I have a list of zip codes, can I use Alteryx to determine the city they're in?
The answer is yes, with a few caveats. Zip Codes can be notoriously difficult to pair with the cities they belong to because they exist in two forms: points and polygons. Point Zip Codes are generally associated with businesses or universities, while polygons generally encompass residential areas. Alteryx data does include point Zip Codes; however, that data is not immediately accessible, and the user will need to do some configuration to get access to it.
Zip Codes are also frequently adjusted, deprecated, and consolidated by the USPS, so depending on the age/vintage of the zip codes in your data as compared to that in the Alteryx Spatial Database, there may be some slight variation there as well. All said, users should still expect a reasonably high match rate.
To start, make sure you either have the Alteryx Data Package installed (available with a license), or you have the 2010 US Census data installed (available for free at http://downloads.alteryx.com/data.html).
With the data installed, the first thing you will bring down is an Allocate Input Tool. Here you will choose the relevant dataset (Experian data or US Census) from the drop down, then check the box for Zip Codes under Pick Geography.
From here you will use the Join Tool to join your data to the Allocate Input data based on your Zip Code field and the Key field of the Allocate Input data. [NOTE: the Zip Code and Key fields will both need to be either String fields or numeric fields. Either is fine as long as it is consistent]
The resulting data that comes from the J output anchor of the Join tool will contain all of your Zip Codes that matched those in the dataset. The field "Name" that comes from the Allocate Input is formatted as the 5 digit Zip Code, followed by the City name. From here a simple Text to Columns tool configured to create 2 columns and parse on the space, will create a field specifically for the City and an extra Zip code field that can be deselected and discarded as the data moves down your workflow.
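For readers who want to see the join-and-split logic outside of the Designer canvas, here is a minimal Python sketch of the same idea. It is an illustration only, not how Alteryx implements the Join or Text to Columns tools. The function names and the sample rows are hypothetical; only the "Key" and "Name" field names come from the description above. It assumes Zip Codes arrive as integers or strings, and it pads them to five characters so that both sides of the join use one consistent type.

```python
def normalize_zip(value):
    """Return a 5-character, zero-padded ZIP string (assumes int or str input)."""
    return str(value).strip().zfill(5)


def match_cities(my_zips, allocate_rows):
    """Join ZIPs to lookup rows on Key, then split Name into ZIP and city.

    allocate_rows is a hypothetical stand-in for the Allocate Input output:
    a list of dicts with a "Key" field and a "Name" field ("<zip> <city>").
    """
    lookup = {normalize_zip(row["Key"]): row["Name"] for row in allocate_rows}
    matched, unmatched = [], []
    for raw in my_zips:
        key = normalize_zip(raw)
        name = lookup.get(key)
        if name is None:
            unmatched.append(key)  # no row with this Key in the lookup
        else:
            # Split on the first space only, so multi-word cities stay intact.
            _zip_part, city = name.split(" ", 1)
            matched.append((key, city))
    return matched, unmatched


# Made-up sample rows, for illustration only.
rows = [
    {"Key": "02134", "Name": "02134 Boston"},
    {"Key": "80301", "Name": "80301 Boulder"},
    {"Key": "10001", "Name": "10001 New York"},
]
matched, unmatched = match_cities([2134, "80301", "99999", "10001"], rows)
```

Note that a numeric Zip Code such as 2134 only matches after it is padded back to "02134", which is one reason to keep both join fields as strings.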
See an example of the process in the attached workflow below.
Have you ever used the Allocate tools and received back some strange looking variable names? You're not alone! The Allocate Rename Fields Macro will allow you to rename your fields into readable variables.
The macro can be downloaded here. Note: This will navigate you to the Alteryx Gallery. Select "Download & Install the Allocate Rename Fields Macro" and follow the prompts to install.
USING THE TOOL
The Allocate tools allow users to enrich their workflows with third party data provided from Experian and the US Census. This data contains demographic and household information by geography. Allocate tools can be found under the “Demographic Analysis” tab in the Alteryx toolbar; they include the Allocate Input, Allocate Append, Allocate Report, and Allocate Metainfo.
Allocate Input and Allocate Append tools allow users to select variables to display by geography. Once configured, the fields returned look something like this:
Add the Allocate Rename macro after the Allocate Input/Append. In the Configuration window, select the Dataset that you are pulling from. Press Run for the magic!
Voila! Your field names are now human-readable.
What if my company blocks access to downloading new tools/macros from the Gallery?
In the case that you cannot download this macro, you can use Alteryx to dynamically rename the field names. See is it possible to get the variable name I see in the Allocate tool?
You know what really stinks? Working with addresses that aren’t standardized or verified. Whether the address was typed in by a human or follows one of the many address formatting standards in the U.S., being stuck with an address you can neither (1) identify nor (2) confirm exists can be a real pain in the… well…
CASS is here to help!
Have some Latitude/Longitude points and not much else? Working with Spatial Objects and need more information than a simple map point? Time to call in the Reverse Geocoding macro. Reverse Geocoding can give some robust information to help make important business decisions when working with spatial data. Case in point: Your “friend” has hacked the Pokémon Go APK and hands you a list of Pokémon with the associated Latitude/Longitude. You can’t make an informed decision on your next weekend Poke-session without first understanding more than just the location on a map!
Mosaic BG Dominant and Mosaic BG Household Distribution counts are balanced to Experian’s census estimates. ConsumerView is a marketing file and therefore doesn’t need to be balanced to the census estimates.
Have you recently updated your Alteryx version, and now are getting an error when you try to run workflows that use Spatial Data? This article outlines the symptoms, diagnosis, and solution for the Error: “The Designer .x64 reported: InboundNamedPipe::ReadFile: Not enough bytes read. Overlapped I/O operation is in progress.” message associated with Spatial Tools.
The ConsumerView Matching macro enables users to match their customer file to the Experian ConsumerView data. Starting with customer information such as name and address you can leverage the ConsumerView macro in Alteryx to append a variety of information about your customers such as household segmentation, home purchase price, presence of children in a home, estimated education and income levels, length of residence, and many more!
Why do values in the ConsumerView HH & Individual fields contradict the Household Composition Code? Sometimes other field values in a CV household record don't support the description of the Household Composition value. For example, there might be two adult females in a household record but the Household Composition is A (HH w/1 adult female). Or another record has 2 adults with a Composition code of B (HH w/1 adult male).
The first thing to note is the ConsumerView file delivers 6 of up to 8 persons in a Living Unit. Not every person is delivered which may impact the Household Composition Code. Another consideration is the description does not define the exact makeup of the household. For example, a Household Composition code of "A" (HH w/1 adult female) does not mean this household is exclusively made up of a single adult female.
Household Composition - This field is calculated based on gender and children present in the Living Unit. On the Household Composition, Experian looks at everyone in the Living Unit provided to them and determines how many adults are present in each Living Unit, gender of those individuals and if children are present in that living unit, coding them according to the codes chart.
* A = HH w/1 adult female (no adult male, no child)
* B = HH w/1 adult male (no adult female, no child)
* C = HH w/1 adult female and 1 adult male (no child)
* D = HH w/1 adult female, 1 adult male and children present
* E = HH w/1 adult female and children present (no adult male)
* F = HH w/1 adult male and children present (no adult female)
* I = HH w/2 or more adult males and children present (This does not mean there is not an adult female present in the household)
* J = HH w/2 or more adult females and children present (This does not mean there is not an adult male present in the household)
* G = HH w/2 or more adult males (An adult female can also be present in household, no child)
* H = HH w/2 or more adult females (An adult male can also be present in household, no child)
* U = Unable to code
Records coded C or D are typically households with one adult male and one adult female, with or without children. Records coded I, J, G, or H could reflect one of the situations below:
1.) Scenarios where an elderly parent lives with their adult children.
2.) Roommates (both male and female, young and old).
3.) Scenarios where a married couple allows an in-law (the brother or sister of a spouse) to reside with them.
4.) Scenarios where a relative (uncle, cousin, etc.) lives with their relatives.
5.) College-age children (ages 19+) living with parents.
When might the Household Composition value not match the household attributes? In one scenario, a household with 2 adults and a code of "A" more than likely contains two female adults. This could be a roommate situation, or it could be a mother and an older daughter. Only one female may be a PDM in a household. Because of this, the record would receive the Household Composition code of “A,” meaning a household with one adult female, even though there are two adult females present.
In the "Question" example of two adult females in a household record but the Household Composition is A (HH w/1 adult female), this household looks as if it is more likely two sisters residing together as the ages are close together. One is listed as the PDM, the other is listed as the other adult. The system assigned the living unit an A denoting the female PDM.
The second "Question" example of 2 adults with a code of "B" may be a household where a parent lives with an adult child. The adult child is coded as the PDM, and the other person appears only as an initial and is coded as the other adult. The system assigned the living unit a B denoting the male PDM.
In situations where fields in specific records are questionable and need further validation, provide those records to firstname.lastname@example.org explaining the issue. Also include the vintage of the ConsumerView Household or Individual file.
Experian's ConsumerView file is built directly from hundreds of public and proprietary sources. They employ rigorous data testing including applying proprietary models and algorithms to ensure only the most deliverable addresses and accurate data elements are on the ConsumerView database. Attached to this article is an excellent User Guide for the ConsumerView dataset. This User Guide is also included in the US Data Bundle documentation.
This question comes up frequently because some of the US Data and Spatial datasets are HUGE and we want to ensure there's enough hard drive space available. To help users better anticipate hard drive requirements, file sizes are included on the installation media, in an Excel workbook in the Documentation folder of the update. The .xlsx is titled Alteryx Qx 201x Variable List and includes file sizes for the United States and Canada in a worksheet labeled File sizes. File sizes reflect the installed file size.
If you have questions or comments about this table, send an email to email@example.com.
Symptoms
This note is to alert users of Alteryx 8.6 to 10.5 of an error message that may appear if a user previously installed the combined US and Canada CASS dataset and then in subsequent data updates separately installs CASS for a single country.
Some clients reported receiving a message in the Alteryx Designer that “CASS is not installed properly or it has expired.” Once CASS is installed for both countries, supporting program files continue to look for the combined dataset even if CASS is uninstalled and a single country is reinstalled.
Diagnosis
The instigating files have been identified and a resolution is being explored; however, this error appears to be inherent to the source data itself.
Solution
Action required:
1. Close open instances of Alteryx
2. Delete the CASS.ini file, which in most cases resides on the user’s local drive (C:\ProgramData\Alteryx\DataProducts\DataSets\CASS\CASS.ini)
3. Locate the most recent CASS installs and re-run the installation executables
4. Relaunch Alteryx and verify the CASS tool is no longer reporting an error
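If you need to repeat the CASS.ini cleanup on several machines, the delete step can be scripted. The following is a minimal Python sketch, not an Alteryx-provided utility. The function name is hypothetical, and the default path is simply the location quoted in the steps above, which may differ on your system. Closing Alteryx and re-running the CASS installers remain manual steps.

```python
from pathlib import Path

# Location quoted in the steps above; assumed default, may differ per machine.
DEFAULT_CASS_INI = Path(r"C:\ProgramData\Alteryx\DataProducts\DataSets\CASS\CASS.ini")


def remove_cass_ini(ini_path=DEFAULT_CASS_INI):
    """Delete the CASS.ini file if it exists.

    Returns True if a file was removed and False if nothing was there.
    Close all Alteryx instances before calling this.
    """
    path = Path(ini_path)
    if path.is_file():
        path.unlink()
        return True
    return False
```

Calling `remove_cass_ini()` with no argument targets the default location; pass a different path if your data products are installed elsewhere.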
This note is to alert users of Alteryx 8.6 to 10.1 of a potential message window that may appear after updating the US and Canadian CASS engine. Subsequent to completing the CASS installation and closing the installer, some clients have reported receiving a message window stating that the program may not have been installed correctly. This message can be disregarded and will not impact dataset functionality.
The source of the problem is being identified and a fix will be included in future Alteryx software releases.
Action
Simply choose ‘Cancel’ to close the Program Compatibility Assistant window. Alteryx can then be opened and workflows can be run as normal.
Was the installation of CASS successful? Yes, CASS was installed fully and all scripts have finished running.
Will this message impact my installation or existing workflows? This message is entirely cosmetic; there are no impacts to the installation or workflows.
Alteryx Data Artisans occasionally identify problems with how street segments are displayed or where addresses may be located. They want to pass along corrections and may reach out to Alteryx to help. Artisans can actually provide feedback directly back to our spatial vendor, TomTom, by logging into this web site: http://www.mapsharetool.com/external-iframe/external.jsp.
You can submit a report after logging in with Facebook, Google or Yahoo. As noted on the linked page, your reports help them keep their maps as accurate as possible.
Question
I have CAPE installed and a newer version of a VGF (with updated names) is available in a new install. I uninstalled that version of the data and ran the new install, but the old VGF names are still visible. I reran the install twice. What am I doing incorrectly? I notice that when I run the install, the process happens more quickly than usual.
Answer
More than likely you have an instance of Alteryx open (employing an Allocate tool) while running the install. Alteryx can prevent files from being overwritten, especially when reinstalling data. Before running a data install, close all open Allocate and Alteryx programs. Closing them should allow the update to proceed as expected.
Here we are in 2015. The 2010 Census is five years behind us and the 2020 Census is five years away. Have you wondered about the next Census? How will data be collected? Will the questionnaire catch up with current technology? What happens to non-responders? Since much of our demographic data is based upon the results from each Census (whether from the Census Bureau or demographic vendors like Experian), I went looking on the Census Bureau's web site for a preview of coming attractions. And I found a page, "A cost-effective 2020 Census," that answered my questions.
The decennial Census is mandated by the U.S. Constitution. If you answered the Census in 2000, you took black/blue pen to paper for either a short or long-form questionnaire. No Internet access back then. In 2010 you still used a black/blue pen on paper and answered 10 simple questions even though the Internet was integrated in much of our day-to-day life. Are we relegated to a black/blue pen on paper to answer the 2020 Census Questionnaire?
Based on information at census.gov, the next Census will encourage self-response via the Internet. Nice! And for those who do not respond, other existing governmental data may be used as a supplement. This equates to cost reductions with fewer physical offices, fewer staff and less follow-up with non-responders. In 2010 there were 500+ Census offices and more than 750,000 staff on the ground. The 2020 Census may have as few as 150 Census offices and 200,000 staff on the ground.
Technology may also influence another component of the U.S. Census - the Topologically Integrated Geographic Encoding and Referencing (TIGER) database. These are reference maps, created for the Census, used to visualize geographic and statistical data. Maps are the basis for companies such as TomTom who offer enhanced versions for licensing and inclusion in navigation products.
Alteryx users can find mapping layers in the Map Input, Reporting and Browse tools as backdrop references for spatial objects. As referenced on census.gov, existing maps and address lists may be updated using technology, data and GPS to collect interviews efficiently. In the past enumerators walked EVERY block in EVERY neighborhood in the United States gathering responses and information. You can read more about the Census Bureau's 155-year history of mapping in "155 years of mapping."
From what I read, these changes have the potential to save taxpayer dollars, maintain a high level of accuracy and make responding to the Census easier.
So what happens next? Testing these new processes began this year on a small-scale and national basis. On April 1, 2017, Congress will be delivered the 2020 Census "topics." On April 1, 2018, "question wording" will be delivered. April 1, 2020 is Census Day! On December 31, 2020 apportionment counts are delivered to the President. Results of the Census were historically not instantaneously available but were released over a period of a few years. But who knows what WILL be available in another 5 years. http://census.gov/ is an excellent resource for information on the Census, American Community Survey (ACS), geographies, news and events.
Question
Are seasonal population figures included in total population counts?
Answer
It is very important to note that the CAPE ‘Seasonal Population’ only refers to the proportion of the population that is temporarily living in housing units that are defined as ‘For seasonal, recreational, or occasional use’. The CAPE ‘Seasonal Population’ therefore needs to be combined with the permanent ‘Residual Population’ to estimate the overall level of the population in each area by quarter.
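The combination described in this answer is a simple per-quarter addition. The short Python sketch below illustrates it, assuming you have already extracted one permanent and one seasonal figure per quarter for an area. The function name, the quarter labels, and the sample numbers are hypothetical and are not actual CAPE field names or values.

```python
def overall_population_by_quarter(permanent, seasonal):
    """Add the seasonal population to the permanent population for each quarter.

    Both arguments are dicts keyed by quarter label; a quarter with no
    seasonal figure is treated as having zero seasonal population.
    """
    return {
        quarter: count + seasonal.get(quarter, 0)
        for quarter, count in permanent.items()
    }


# Made-up figures for one area, for illustration only.
permanent = {"Q1": 12000, "Q2": 12000, "Q3": 12000, "Q4": 12000}
seasonal = {"Q1": 300, "Q2": 900, "Q3": 2500, "Q4": 400}
overall = overall_population_by_quarter(permanent, seasonal)
```

In this made-up example the area's overall population peaks in Q3, even though the permanent population is flat across the year.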
We are trying to understand the difference between employees and daytime population. It looks like some of the population may be double counted. Can you explain which rows are used for the 2014 Total Daytime Population number?
Methodologies are different for Employees and Daytime Population.
Employees & Establishments in Business Summary are sourced from the D&B Business list and summarized to a geographic level although delivered in the Experian CAPE release. The employee counts are as accurate as the D&B employee value but are also subject to block centroid allocation used for population.
Employment fields from the Occupation & Employment folder are based upon the American Community Survey, modeled to a current year value and are part of CAPE.
Daytime Population is sourced from Experian and is a compiled value built from several CAPE fields. The excerpt below is pulled from the Tech Overview delivered to clients.
Daytime Population – Current Year Estimates (CYE)
The Daytime Population database is created using a variety of methodologies applicable for different subsets of the Total Daytime Population. These subsets are then added together to create the Total Daytime Population.
The process starts by identifying key subsets of the residential population that are assumed to stay in or close to their home location during the day. In particular, the following subsets of population are assumed to remain in the same Block Group during the day as the Block Group in which they live (or reside):
Residential Population : Children aged less than or equal to 2
Residential Population : Civilian aged 16+ population that are unemployed
Residential Population : Civilian aged 16+ population that work at home
Residential Population : Population aged 65+ who are retired
Residential Population : Population aged 16+ who are homemakers
Residential Population : Population aged 16+ who are in the Armed Forces
All of the above variables can be directly obtained from previously calculated CAPE – Demographics – Current Year Estimate (CYE) residentially-based variables, except for the ‘Residential Population : Population aged 16+ who are homemakers’. This variable is calculated by applying suitable localized proportions to the existing ‘larger population’ variable of the ‘Civilian aged 16+ population who are ‘Not in Labor Force’. Applying these proportions determines the subset of this ‘larger population’ that are estimated to be homemakers.
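The homemaker estimate described above amounts to multiplying a larger population count by a local share. Here is a minimal Python sketch of that arithmetic. The function name, the share, and the sample count are hypothetical; how Experian derives its "suitable localized proportions" is not described in the excerpt and is not modelled here.

```python
def estimate_homemakers(not_in_labor_force, homemaker_share):
    """Estimate homemakers as a share of the 'Not in Labor Force' population.

    homemaker_share is an assumed local proportion between 0 and 1.
    """
    if not 0.0 <= homemaker_share <= 1.0:
        raise ValueError("homemaker_share must be between 0 and 1")
    return not_in_labor_force * homemaker_share


# Made-up example: 400 people not in the labor force, 25% assumed homemakers.
homemakers = estimate_homemakers(400, 0.25)
```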
Once these initial subsets of Daytime Population who are assumed to stay in their residential Block Group during the daytime are defined and accounted for, the daytime location of the other population types is modelled. It is assumed that these remaining population types are much more likely to travel out of their residential Block Group to reach their typical daytime location than is the case for the population groups previously accounted for. However, flows from home address to daytime address that occur completely within the same Block Group are also possible for these types.
First, the estimate of daytime population at place of work that has already been modelled for the Mosaic Workplace database is accounted for. This variable is:
Daytime Population, Civilian 16+, at Workplace
Within the work to create Mosaic Workplace, this variable is estimated using Census Tract-to-Tract flows of workers from residence to workplace, and National Business Database data to update these flows and allocate them from Tract level to Block Group level.
After the above, the main population groups left to be modelled are:
Daytime Population, Students : Prekindergarten to 8th grade
Daytime Population, Students : 9th grade to 12th grade
Daytime Population, Students : Post-secondary students
Daytime Population: Any remaining Civilian aged 16+ population that are ‘Not in Labor Force’ and have not yet been accounted for.
All three student populations are modelled using a variety of data from the National Center for Education Statistics (NCES) and also information from key institutions (i.e. universities/colleges) themselves. After making allowance for students registered at an institution but very unlikely to travel to that institution on a typical day (for example, students undertaking online courses), this information is compiled and modelled to create an initial estimate of the typical number of students that spend the day at the location (or campus) of each institution. These figures are then calibrated so that the initial estimates of students who spend a typical day at the location of each institution, and those who stay within their residential Block Group during a typical day, are balanced to equal the national number of students within each category (i.e. Prekindergarten to 8th grade, 9th grade to 12th grade, Post-secondary students).
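The excerpt does not say how the calibration to national totals is performed. One common way to balance a set of estimates to a known total is proportional scaling, and the Python sketch below shows that technique as an assumption, not as Experian's documented method. The function name and all figures are hypothetical.

```python
def calibrate_to_total(initial_estimates, target_total):
    """Scale every estimate by the same factor so that they sum to target_total.

    This is plain proportional scaling, used here as an assumed stand-in
    for the calibration step described above.
    """
    current_total = sum(initial_estimates.values())
    if current_total == 0:
        raise ValueError("cannot calibrate estimates that sum to zero")
    factor = target_total / current_total
    return {name: value * factor for name, value in initial_estimates.items()}


# Made-up estimates for one student category, for illustration only.
initial = {"at_institution": 600, "in_residential_block_group": 200}
calibrated = calibrate_to_total(initial, 1000)
```

Proportional scaling preserves the relative split between the groups while forcing the sum to match the target.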
Once all students have been accounted for, current estimates of each relevant daytime population sub-group are tallied and compared to the national estimate of ‘Residential Population: Civilian aged 16+ population that are Not in Labor Force’. The above work does not yet account for a proportion of this population group. The as-yet-unaccounted-for proportion of this group is therefore calculated and assumed to spend a typical day within the Block Group in which they live.
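The remainder calculation in this step can be sketched as a subtraction: take the ‘Not in Labor Force’ total and remove the sub-groups already placed. The Python sketch below is an illustration only. The function name, the sub-group labels, and the figures are hypothetical, and which sub-groups count toward the ‘Not in Labor Force’ total is an assumption, since the excerpt does not list them explicitly.

```python
def unaccounted_not_in_labor_force(nilf_total, accounted_subgroups):
    """Return the part of the 'Not in Labor Force' total not yet placed.

    accounted_subgroups maps a sub-group label to the count already
    allocated to a daytime location. The result is floored at zero.
    """
    remainder = nilf_total - sum(accounted_subgroups.values())
    return max(remainder, 0)


# Made-up figures; the choice of sub-groups here is an assumption.
accounted = {"retired_65_plus": 400, "homemakers": 250, "students_16_plus": 150}
remaining = unaccounted_not_in_labor_force(1000, accounted)
```

The remaining count is the group that the excerpt assumes spends a typical day in its residential Block Group.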
Having allocated all of the relevant subsets of residential population to either the Block Group in which they reside, or to another Block Group which they are estimated to travel to in order to spend a typical day, then the two final variables in the database are calculated:
Daytime Population Aged 16+
Total Daytime Population (i.e. all ages)
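As a rough illustration of how the subsets described in this excerpt add up to a total, here is a minimal Python sketch for a single Block Group. The subset labels paraphrase the variables named above, and the counts are made up. This is not Experian's implementation, and it does not attempt to reproduce the separate ‘Daytime Population Aged 16+’ variable, because the excerpt does not specify which subsets feed into it.

```python
def total_daytime_population(subsets):
    """Sum the daytime population subsets for one Block Group."""
    return sum(subsets.values())


# Made-up counts for one Block Group, for illustration only.
block_group_subsets = {
    "children_aged_2_or_under": 40,
    "unemployed_civilians_16_plus": 25,
    "work_at_home_16_plus": 60,
    "retired_65_plus": 110,
    "homemakers_16_plus": 45,
    "armed_forces_16_plus": 5,
    "civilians_16_plus_at_workplace": 820,
    "students_prek_to_8th": 300,
    "students_9th_to_12th": 0,
    "students_post_secondary": 150,
    "remaining_not_in_labor_force": 35,
}
total = total_daytime_population(block_group_subsets)
```

Because each person is assigned to exactly one subset and one daytime Block Group, summing the subsets does not double count anyone, which is the concern raised in the question.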