The Location Data Knowledge Base will be deprecated soon. Please go to Help Docs for release notes and schedules and the Knowledge Base for in-depth information.

Location Data Knowledge Base

Data methodologies, and Release schedules.
Question
Why do values in the ConsumerView HH & Individual fields contradict the Household Composition Code?

Sometimes other field values in a CV household record don't support the description of the Household Composition value. For example, there might be two adult females in a household record but the Household Composition is A (HH w/1 adult female). Or another record has 2 adults with a Composition code of B (HH w/1 adult male).

Answer
The first thing to note is that the ConsumerView file delivers 6 of up to 8 persons in a Living Unit. Not every person is delivered, which may impact the Household Composition Code. Another consideration is that the description does not define the exact makeup of the household. For example, a Household Composition code of "A" (HH w/1 adult female) does not mean this household is exclusively made up of a single adult female.

Household Composition - This field is calculated based on gender and children present in the Living Unit. For the Household Composition, Experian looks at everyone in the Living Unit provided to them and determines how many adults are present in each Living Unit, the gender of those individuals and whether children are present in that Living Unit, coding them according to the codes chart.
* A = HH w/1 adult female (no adult male, no child)
* B = HH w/1 adult male (no adult female, no child)
* C = HH w/1 adult female and 1 adult male (no child)
* D = HH w/1 adult female, 1 adult male and children present
* E = HH w/1 adult female and children present (no adult male)
* F = HH w/1 adult male and children present (no adult female)
* I = HH w/2 or more adult males and children present (This does not mean there is not an adult female present in the household)
* J = HH w/2 or more adult females and children present (This does not mean there is not an adult male present in the household)
* G = HH w/2 or more adult males (An adult female can also be present in household, no child)
* H = HH w/2 or more adult females (An adult male can also be present in household, no child)
* U = Unable to code

Records coded with C and D are typically scenarios where there's a male and female with/without children.

Codes I, J, G and H could be one of the situations below:
1.) Scenarios where an elderly parent is living with their adult children.
2.) Roommates (both male and female, young and old)
3.) Scenarios where a married couple allows an in-law (the brother or sister of a spouse) to reside with them
4.) Scenarios where a relative (uncle, cousin, etc.) may live with their relatives
5.) College age children (ages 19+) living with parents.

When might the Household Composition value not match the household attributes?

In one scenario, 2 Adults in a household with a Code of "A," more than likely there are two female adults in the household. This could be a roommate situation, or it could be a mother and older daughter. Only one female may be a PDM in a household. Because of this, the record would receive the Household Composition code of “A”, meaning a household with one adult female, even though there are two adult females present.
In the "Question" example of two adult females in a household record but the Household Composition is A (HH w/1 adult female), this household looks as if it is more likely two sisters residing together, as the ages are close together. One is listed as the PDM, the other is listed as the other adult. The system assigned the living unit an A denoting the female PDM.

The second "Question" example of 2 adults with a Code of "B" may be a household with a parent living with an adult child. The adult child is coded as the PDM and the other person is only an initial and coded as the other adult. The system assigned the living unit a B denoting the male PDM.

In situations where fields in specific records are questionable and need further validation, provide those records to data_products@alteryx.com explaining the issue. Also include the vintage of the ConsumerView Household or Individual file.

Experian's ConsumerView file is built directly from hundreds of public and proprietary sources. They employ rigorous data testing, including applying proprietary models and algorithms, to ensure only the most deliverable addresses and accurate data elements are on the ConsumerView database. Attached to this article is an excellent User Guide for the ConsumerView dataset. This User Guide is also included in the US Data Bundle documentation.
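When reviewing records by hand, it can help to keep the code chart from this article as a lookup table. The sketch below is only an illustration: the descriptions are copied from the chart, while the function names and the idea of flagging codes G, H, I and J (the codes whose chart notes say an adult of the other gender may also be present) are assumptions made for this example, not part of any Experian or Alteryx API.

```python
# Household Composition codes, copied from the chart in this article.
HOUSEHOLD_COMPOSITION = {
    "A": "HH w/1 adult female (no adult male, no child)",
    "B": "HH w/1 adult male (no adult female, no child)",
    "C": "HH w/1 adult female and 1 adult male (no child)",
    "D": "HH w/1 adult female, 1 adult male and children present",
    "E": "HH w/1 adult female and children present (no adult male)",
    "F": "HH w/1 adult male and children present (no adult female)",
    "G": "HH w/2 or more adult males (no child)",
    "H": "HH w/2 or more adult females (no child)",
    "I": "HH w/2 or more adult males and children present",
    "J": "HH w/2 or more adult females and children present",
    "U": "Unable to code",
}

# Codes whose chart notes say an adult of the other gender may also be present.
OTHER_GENDER_POSSIBLE = {"G", "H", "I", "J"}


def describe_household_composition(code):
    """Return the chart description for a code, or a fallback for unknown values."""
    key = (code or "").strip().upper()
    return HOUSEHOLD_COMPOSITION.get(key, "Unknown code: " + repr(code))


def other_gender_adult_possible(code):
    """True when the chart explicitly allows an adult of the other gender."""
    return (code or "").strip().upper() in OTHER_GENDER_POSSIBLE


if __name__ == "__main__":
    for sample in ("A", "g", "Z"):
        print(sample, "->", describe_household_composition(sample))
```

Keep in mind that, as explained above, codes A and B describe the PDM and do not rule out a second adult in the delivered record.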
View full article
This question comes up frequently because some of the US Data and Spatial datasets are HUGE and we want to ensure there's enough hard drive space available. To help users better anticipate hard drive requirements, file sizes are included on the installation media in the Documentation folder of the update, in an Excel workbook. The .xlsx is titled Alteryx Qx 201x Variable List and includes file sizes for the United States and Canada in a worksheet labeled File sizes. File sizes reflect the installed file size.

If you have questions or comments about this table, send an email to products@alteryx.com.
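Once the installed size has been read from the File sizes worksheet, the free space on the target drive can be checked before starting an install. This is a minimal sketch using Python's standard library. The function names, the 10% safety margin and the example size are assumptions for illustration only. They do not come from the Alteryx documentation.

```python
import shutil


def has_enough_space(required_bytes, free_bytes, safety_margin=0.10):
    """True when free space covers the installed size plus a safety margin.

    The 10% default margin is an arbitrary assumption, not an Alteryx figure.
    """
    return free_bytes >= required_bytes * (1 + safety_margin)


def check_drive(path, required_gb):
    """Compare free space on the drive holding `path` with a size in GB."""
    free_bytes = shutil.disk_usage(path).free
    return has_enough_space(required_gb * 1024 ** 3, free_bytes)


if __name__ == "__main__":
    # 50 GB is a placeholder; use the value from the File sizes worksheet.
    print("Enough space:", check_drive(".", required_gb=50))
```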
View full article
The Q4 2016 US Data and US Spatial Data installs were mistakenly shipped with an Alteryx Maps macro named TomTomLayerSelection. This macro should not be placed into production under any circumstances.

Users running Alteryx v. 10.1 and earlier versions who have installed the Q4 2016 US Data[1] update may receive an error upon opening Alteryx.

Action: To fix the error, or to remove the macro from the Tool Palette, users will need to delete the folder containing TomTomLayerSelection.yxmc by following these steps:

1. Close Alteryx and navigate to the Q4 Alteryx Maps data install folder, typically located here – Program Files (x86)\Alteryx\DataProducts\AlteryxMap\TomTom_US_2016_Q4
2. Delete the Macros folder.
3. Relaunch Alteryx. You should no longer see the error message.

[1] US Spatial end users will not see this error when opening Alteryx
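For administrators who need to repeat the cleanup on several machines, the manual steps above could be scripted. The following is a hedged sketch, not an Alteryx-supplied tool: the function name and the dry-run behaviour are assumptions, and the install path must be confirmed on each machine because the location given in this article is only the typical one. Close Alteryx before running anything like this.

```python
from pathlib import Path
import shutil

MACRO_NAME = "TomTomLayerSelection.yxmc"


def remove_macros_folder(install_dir, dry_run=True):
    """Delete <install_dir>/Macros if it holds the TomTomLayerSelection macro.

    Returns the folder path when it was (or, in a dry run, would be)
    removed, and None when there is nothing to do.
    """
    macros = Path(install_dir) / "Macros"
    if not macros.is_dir():
        return None
    # Whether the macro sits directly in Macros or in a subfolder is not
    # stated in the article, so search the whole folder tree.
    if not any(p.name == MACRO_NAME for p in macros.rglob("*")):
        return None
    if not dry_run:
        shutil.rmtree(macros)
    return macros


if __name__ == "__main__":
    # Typical location per this article; adjust to the actual install folder.
    target = r"C:\Program Files (x86)\Alteryx\DataProducts\AlteryxMap\TomTom_US_2016_Q4"
    print("Would remove:", remove_macros_folder(target, dry_run=True))
```

The dry run is the default so that the folder is only reported until `dry_run=False` is passed explicitly.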
View full article
Check out this visual of a Guzzler update
View full article
Download and install instructions for the Spatial data installs.
View full article
Download and install instructions for the Data (US) install on AWS3.
View full article
Download and install instructions for the Data (US and Canada) data installs.
View full article
Question
I am curious about how the Gallery (Allocate) estimates ZIP Code demographics that might differ from what the Census Bureau does on their American FactFinder site. Would this be caused by the source of the ZIP Codes? Or are we using a different method of pulling demographics for ZIP Codes?

Answer
ZIP Codes and ZCTAs are different. A major difference is that ZCTAs were created at the time of the 2010 Census and are not typically updated, while ZIP Codes in the Gallery are updated quarterly. So the physical area of the ZIP Codes being compared might be different even though they reference 2010 data. If one could visualize the ZIP Codes and ZCTAs on a map, the physical differences between the geographies would be easier to see. Here are a few informative links from the Census Bureau: https://ask.census.gov/faq.php?id=5000&faqId=10488 and http://www.census.gov/geo/reference/zctas.html.

Another note - The Gallery does not include ZIP Codes with Points but Allocate does (when registered). ZIP Code Points can be Post Office Boxes, business ZIP Codes in an office building, etc. If a ZIP Code cannot be found, check to see if it is categorized as a polygon or point type.

Want to troubleshoot the validity of a ZIP Code on your own? Here are a few links:
Post Office web site: https://tools.usps.com/go/ZipLookupAction!input.action?mode=1&refresh=true
MelissaData web site: http://www.melissadata.com/lookups/cityzip.asp
View full article
In a situation with a large database of thousands of records, some addresses may not be valid. Inevitably, CASS will output the bad addresses along with the good addresses. The issue is that certain addresses double up on the suite. In this example, the input file had the address and suite in the same line. The CASS output would actually find the suite, but not add it to the CASS_Suite output. This normally would not be an issue, but it would add the suite to the CASS_AddressPlusSuite field, so the suite appears twice in that field.

This issue is inherent to the MelissaData engine which drives the CASS tool. The best way to address the issue is to include the CASS_Results field in the output from the CASS Tool. This will give a code that notifies the user of the CASS match level. Good addresses will have a match level of AS01 or AS02, whereas bad addresses will have different codes. The entire listing of all possible CASS_Results codes is available in the Help File by searching for CASS Results. Eliminating the codes that highlight bad addresses will alleviate this issue.
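Outside of a workflow, the same filtering idea can be sketched in a few lines of Python. Only the field name CASS_Results and the good match levels AS01 and AS02 come from this article. The function names, the sample records, the placeholder code "XX99" and the assumption that the field may hold several comma-separated codes are illustrative and should be checked against the Help File.

```python
GOOD_CODES = {"AS01", "AS02"}  # good match levels named in this article


def is_good_address(cass_results):
    """True when the CASS_Results value contains AS01 or AS02.

    Assumes the field may hold several comma-separated codes.
    """
    codes = {c.strip().upper() for c in (cass_results or "").split(",")}
    return bool(codes & GOOD_CODES)


def split_by_match_level(records, field="CASS_Results"):
    """Split records (dicts) into (good, bad) lists by CASS match level."""
    good, bad = [], []
    for rec in records:
        (good if is_good_address(rec.get(field)) else bad).append(rec)
    return good, bad


if __name__ == "__main__":
    # Hypothetical rows; "XX99" stands in for any non-matching result code.
    rows = [
        {"id": 1, "CASS_Results": "AS01"},
        {"id": 2, "CASS_Results": "XX99"},
    ]
    good, bad = split_by_match_level(rows)
    print(len(good), "good,", len(bad), "bad")
```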
View full article
Per Experian:

"CAPE data is always in nominal dollars (i.e. it includes inflation) – for CYE estimates, it cannot be any other way since we release the estimates a few months after the reference date (e.g. in end of September 2015, we released the estimates for the 2015B update that referred to July 1st, 2015). For FYP, however, it’s important to mention that CAPE estimates incorporate inflation projections – this also ensures consistency with CYE estimates. Although we may use ‘real dollar’ values (i.e. stripped from inflation effects) to project Consumer Expenditure target estimates, none of our final CAPE estimates are in ‘real dollars’."

- Thanks to Data Products for getting this information from Experian
View full article
When you drag and drop an Allocate Report tool onto the Alteryx canvas or are browsing reports in the Alteryx Gallery, do you feel overwhelmed by the number of Allocate reports visible?  Your next question might be, "what's included in each report?"  The report name helps somewhat but not always and there isn't a listing in Help to guide you.    Attached is a spreadsheet on the available reports grouped by Type (List, Rank, Summary, Comparison) and displaying report name, description and check marks in columns for key content items (Age, Employment, Income, Retail Demand, etc.).   If you have questions or comments, feel free to contact data_products@alteryx.com.    
View full article
Here we are in 2015. The 2010 Census is five years behind us and the 2020 Census is five years away. Have you wondered about the next Census? How will data be collected? Will the questionnaire catch up with current technology? What happens to non-responders? Since much of our demographic data is based upon the results from each Census (whether from the Census Bureau or demographic vendors like Experian), I went looking on the Census Bureau's web site for a preview of coming attractions. And I found a page at A cost-effective 2020 Census answering my questions.

The decennial Census is mandated by the U.S. Constitution. If you answered the Census in 2000, you took black/blue pen to paper for either a short or long-form questionnaire. No Internet access back then. In 2010 you still used a black/blue pen on paper and answered 10 simple questions even though the Internet was integrated in much of our day-to-day life. Are we relegated to a black/blue pen on paper to answer the 2020 Census Questionnaire? Based on information at census.gov, the next Census will encourage self-response via the Internet. Nice! And for those who do not respond, other existing governmental data may be used as a supplement. This equates to cost reductions with fewer physical offices, fewer staff and less follow-up with non-responders. In 2010 there were 500+ Census offices and more than 750,000 staff on the ground. The 2020 Census may have as few as 150 Census offices and 200,000 staff on the ground.

Technology may also influence another component of the U.S. Census - the Topologically Integrated Geographic Encoding and Referencing (TIGER) database. These are reference maps, created for the Census, used to visualize geographic and statistical data. Maps are the basis for companies such as TomTom who offer enhanced versions for licensing and inclusion in navigation products. Alteryx users can find mapping layers in the Map Input, Reporting and Browse tools as backdrop references for spatial objects.

As referenced on census.gov, existing maps and address lists may be updated using technology, data and GPS to collect interviews efficiently. In the past enumerators walked EVERY block in EVERY neighborhood in the United States gathering responses and information. You can read more about the Census Bureau's 155-year history of mapping here: 155 years of mapping

From what I read, these changes have the potential to save taxpayer dollars, maintain a high level of accuracy and make responding to the Census easier. So what happens next? Testing these new processes began this year on a small-scale and national basis. On April 1, 2017, Congress will be delivered the 2020 Census "topics." On April 1, 2018, "question wording" will be delivered. April 1, 2020 is Census Day! On December 31, 2020 apportionment counts are delivered to the President. Results of the Census were historically not instantaneously available but were released over a period of a few years. But who knows what WILL be available in another 5 years.

http://census.gov/ is an excellent resource for information on the Census, American Community Survey (ACS), geographies, news and events.
View full article
Question
Are seasonal population figures included in total population counts?

Answer
It is very important to note that the CAPE ‘Seasonal Population’ only refers to the proportion of the population that is temporarily living in housing units that are defined as ‘For seasonal, recreational, or occasional use’. The CAPE ‘Seasonal Population’ therefore needs to be combined with the permanent ‘Residual Population’ to estimate the overall level of the population in each area by quarter.
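The combination described in the answer is a simple addition per quarter. The sketch below illustrates it under stated assumptions: the function name, the use of a single permanent figure per area and the numbers are all hypothetical, and the actual CAPE field layout may differ.

```python
def overall_population_by_quarter(residual_population, seasonal_by_quarter):
    """Add the permanent population to each quarter's seasonal population.

    residual_population: one permanent figure for the area (assumption).
    seasonal_by_quarter: mapping such as {"Q1": 120, "Q2": 40, ...}.
    """
    return {
        quarter: residual_population + seasonal
        for quarter, seasonal in seasonal_by_quarter.items()
    }


if __name__ == "__main__":
    # Hypothetical figures for a single area.
    seasonal = {"Q1": 50, "Q2": 200, "Q3": 800, "Q4": 100}
    print(overall_population_by_quarter(1000, seasonal))
```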
View full article
We are trying to understand the difference between employees and daytime population. It looks like some of the population may be double counted. Can you explain what rows are used for the 2014 Total Daytime population #?

Methodologies are different for Employees and Daytime Population.

Employees & Establishments in Business Summary are sourced from the D&B Business list and summarized to a geographic level, although delivered in the Experian CAPE release. The employee counts are as accurate as the D&B employee value but are also subject to the block centroid allocation used for population.

Employment fields from the Occupation & Employment folder are based upon the American Community Survey, modeled to a current year value, and are part of CAPE.

Daytime Population is sourced from Experian and consists of compiled values using several CAPE fields. The excerpt below is pulled from the Tech Overview delivered to clients.

Daytime Population
Daytime Population – Current Year Estimates (CYE)

The Daytime Population database is created using a variety of methodologies applicable for different subsets of the Total Daytime Population. These subsets are then added together to create the Total Daytime Population. The process starts by identifying key subsets of the residential population that are assumed to stay in or close to their home location during the day.
In particular, the following subsets of population are assumed to remain in the same Block Group during the day as the Block Group in which they live (or reside):

* Residential Population : Children aged less than or equal to 2
* Residential Population : Civilian aged 16+ population that are unemployed
* Residential Population : Civilian aged 16+ population that work at home
* Residential Population : Population aged 65+ who are retired
* Residential Population : Population aged 16+ who are homemakers
* Residential Population : Population aged 16+ who are in the Armed Forces

All of the above variables can be directly obtained from previously calculated CAPE – Demographics – Current Year Estimate (CYE) residentially-based variables, except for the ‘Residential Population : Population aged 16+ who are homemakers’. This variable is calculated by applying suitable localized proportions to the existing ‘larger population’ variable of the ‘Civilian aged 16+ population who are Not in Labor Force’. Applying these proportions determines the subset of this ‘larger population’ that are estimated to be homemakers.

Once these initial subsets of Daytime Population who are assumed to stay in their residential Block Group during the daytime are defined and accounted for, then the daytime location of other population types are modelled. It is assumed that these remaining population types are much more likely to travel out of their residential Block Group to reach their typical daytime location than is the case for the population groups previously accounted for. However, flows from home address to daytime address that occur completely within the same Block Group are also possible for these types.

First, the estimate of daytime population at place of work that has already been modelled for the Mosaic Workplace database is accounted for.
This variable is:

* Daytime Population, Civilian 16+, at Workplace

Within the work to create Mosaic Workplace, this variable is estimated using Census Tract-to-Tract flows of workers from residence to workplace, and National Business Database data to update these flows and allocate them from Tract level to Block Group level.

After the above, the main population groups left to be modelled are:

* Daytime Population, Students : Prekindergarten to 8th grade
* Daytime Population, Students : 9th grade to 12th grade
* Daytime Population, Students : Post-secondary students
* Daytime Population: Any remaining Civilian aged 16+ population that are ‘Not in Labor Force’ and have not yet been accounted for.

All of the three student populations are modelled using a variety of data from the National Center for Education Statistics (NCES) and also information from key institutions (i.e. universities/colleges) themselves. After making allowance for students registered at an institution but very unlikely to travel to that institution on a typical day (for example, students undertaking online courses), this information is compiled and modelled to create an initial estimate of the typical number of students that spend the day at the location (or campus) of each institution. These figures are then calibrated so that the initial estimates of students who spend a typical day at the location of each institution, and those who stay within their residential Block Group during a typical day, are balanced to equal the national number of students within each category (i.e. Prekindergarten to 8th grade, 9th grade to 12th grade, Post-secondary students).

Once all students have been accounted for, current estimates of each relevant daytime population sub-group are tallied and compared to the national estimate of ‘Residential Population: Civilian aged 16+ population that are Not in Labor Force’. The above work does not yet account for a proportion of this population group.
The as-yet-unaccounted-for proportion of this group is therefore calculated and assumed to spend a typical day within the Block Group in which they live.

Having allocated all of the relevant subsets of residential population to either the Block Group in which they reside, or to another Block Group which they are estimated to travel to in order to spend a typical day, the two final variables in the database are calculated:

* Daytime Population Aged 16+
* Total Daytime Population (i.e. all ages)
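The excerpt states that the subsets are added together to create the Total Daytime Population. As a rough illustration of that final tally for one Block Group, the sketch below sums the subsets named in the excerpt. The short key names and the function are invented for this example and are not CAPE field names. Any counts passed in would be hypothetical. Which subsets feed the separate Aged 16+ figure is not spelled out in the excerpt, so only the all-ages total is shown.

```python
# Subsets assumed to stay in their residential Block Group during the day.
STAY_IN_BLOCK_GROUP = [
    "children_age_2_or_less",
    "unemployed_civilian_16_plus",
    "work_at_home_civilian_16_plus",
    "retired_65_plus",
    "homemakers_16_plus",
    "armed_forces_16_plus",
]

# Subsets whose daytime location is modelled separately.
MODELLED_DAYTIME_LOCATION = [
    "civilian_16_plus_at_workplace",
    "students_prek_to_8",
    "students_9_to_12",
    "students_post_secondary",
    "remaining_not_in_labor_force",
]

ALL_SUBSETS = STAY_IN_BLOCK_GROUP + MODELLED_DAYTIME_LOCATION


def total_daytime_population(counts):
    """Sum every subset for one Block Group; fail loudly if one is missing."""
    missing = [name for name in ALL_SUBSETS if name not in counts]
    if missing:
        raise KeyError("missing subsets: " + ", ".join(missing))
    return sum(counts[name] for name in ALL_SUBSETS)
```

Because each subset appears in the sum once, a person is only double counted if they were placed in more than one subset upstream. The excerpt's "not yet accounted for" step reads as a guard against that, but this is an interpretation.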
View full article
Households or individuals may be excluded from the ConsumerView file for multiple reasons. If an example list of names is provided to data_products@alteryx.com, they can be validated with Experian. Otherwise, here are examples of exclusions:

* Households at addresses may be renters with no deed information available
* Household only has cell phones and is not in the phone book white pages
* Privacy - many people see to it their name is never on any kind of mailing list when doing business with companies

Below is an excerpt on Experian's Privacy & Compliance:

Experian Marketing Services’ Approach to Privacy
EMS is a steward of the information it collects, maintains, utilizes and shares. Our stewardship is anchored in a values-based approach to privacy. Our information values focus squarely upon the protection of information in our care and the safeguarding of consumer privacy through appropriate and responsible use. For more information regarding our approach to privacy, please visit our web site at http://www.experian.com/privacy/index.html .

Direct Marketing Association
As a member and Board of Directors participant of the Direct Marketing Association (DMA), EMS drives the adoption of, and subsequently abides by, and encourages its clients to adhere to, the DMA’s Privacy Promise and Guidelines for Ethical Business Practices. The Privacy Promise is a public assurance to American consumers that DMA members follow specific practices to protect consumer privacy. Specifically, the Privacy Promise requires member companies to:

* Provide notice of consumers’ ability to opt-out
* Honor consumer opt-out requests
* Maintain an in-house opt-out suppression file
* Use the DMA Preference Service suppression files (e.g., MPS, TPS, e-MPS)
* Promote industry-wide compliance with DMA self-regulatory guidelines

Why are some people not contained in the Experian ...

For additional information, contact data_products@alteryx.com
View full article
If you can't find the ZIP Code you are looking for, it is likely a ZIP point: a ZIP Code assigned to a military base, college campus or other large facility. These ZIP Codes are not registered by default; to be added to Allocate they have to be registered. You will need Admin rights on your computer to do this. If you don't have them, you could try right-clicking the Allocate product and selecting Run as administrator. If this does not work, you will need your IT department to register the file for you.

1. Open the stand-alone Allocate product (outside of Alteryx).
2. Choose the dataset you are using in the first window.
3. Go to the Pick Geography tab and go to File > Manage Virtual Geographies…
4. In the window that pops up, click Register.
5. Go to the Program Files (x86)\Alteryx\DataProducts\Portfolio\[your dataset]\Data folder (or the folder for the dataset you are using).
6. Select the ZIPs with Points VGF files and click Open.
7. Zip Codes w/Points is now loaded in the list. Click OK.
8. Allocate will say it needs to restart. Click OK. Once it is open again, go to the Pick Geography tab; Zip Codes w/Points is now selectable and will also be available in the Allocate tool within Alteryx.
View full article
The attached document contains the Mosaic USA Group and Segment Descriptions
View full article
This two-page document provides concise, color-coded groupings and descriptions of the 71 Mosaic USA clusters.
View full article
Census data is calculated based on census-designated boundaries, which range in increasing size from Blocks - Block Groups - Tracts - Counties - etc. However, when solving most business issues, custom polygons are often used. Since custom polygons almost never perfectly mirror Census Blocks, a method to subdivide Blocks must be created.

Block Centroid Retrieval
Alteryx utilizes Block Centroid Retrieval when allocating demographic data to irregular polygons such as custom radii, ZIP Codes, and custom trade areas. For the US datasets, this retrieval is based on the centroids of the US Census 2010 Blocks. Each record is tagged with the percent of the household and population it represents as a fraction of its associated block group. Development changes in the years since the census are not reflected in this inventory of block centroids.

To address this requirement, Alteryx designed a methodological approach to update block groups to reflect areas of growth. By utilizing the Experian household database of 127 million U.S. consumer households in conjunction with the Census Bureau, Alteryx has created additional points within the block inventory dataset, which Allocate uses to represent population and household growth during the time since the previous census.
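To make the idea concrete, here is a small, simplified sketch of centroid-based retrieval: every block centroid that falls inside a custom polygon contributes its share of the parent block group's population. This is not Allocate's implementation. The ray-casting point-in-polygon test, the planar coordinates, the record layout (`x`, `y`, `block_group`, `pop_share`) and the function names are all assumptions chosen for illustration.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices in order."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def allocate_population(polygon, centroids, block_group_population):
    """Sum population for block centroids that fall inside the polygon.

    centroids: dicts with x, y, block_group and pop_share, where pop_share
    is the fraction of the block group's population the centroid represents.
    block_group_population: mapping of block group id to total population.
    """
    total = 0.0
    for c in centroids:
        if point_in_polygon(c["x"], c["y"], polygon):
            total += c["pop_share"] * block_group_population[c["block_group"]]
    return total


if __name__ == "__main__":
    # Hypothetical trade area and centroids in arbitrary planar units.
    trade_area = [(0, 0), (10, 0), (10, 10), (0, 10)]
    centroids = [
        {"x": 2, "y": 2, "block_group": "BG1", "pop_share": 0.25},
        {"x": 8, "y": 8, "block_group": "BG1", "pop_share": 0.25},
        {"x": 12, "y": 5, "block_group": "BG1", "pop_share": 0.5},
    ]
    print(allocate_population(trade_area, centroids, {"BG1": 1000}))
```

The same pattern would apply to households by swapping in a household share and household totals. Centroids lying exactly on the polygon boundary are an edge case this sketch does not treat specially.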
View full article
In June of 2012, the USPS made a change to how a P.O. Box operates. That change now allows for a street address to be used in lieu of a P.O. Box. This format, known as a P.O. Box Street Address (PBSA), is actually the address of the post office where the P.O. Box is located.

How does this affect Alteryx? One client brought to our attention that this could potentially affect address and demographic analysis. Let's say a user has a small list of competitors and they want to run a competitive analysis. The only issue is that several of these records are using a PBSA. The demographics for these records will revolve around the post office, not the actual business.

How do we combat this? The first thing to keep in mind is that Alteryx is perfect for this sort of scenario. The USPS has posted a list of their post office locations as a .txt file. Attached to this post you will find an Alteryx Zip Package that informs you of where you can find the file (hint: here) and parses it. Once it is parsed, one option would be to merge that with the D&B Business Matching Macro to validate or invalidate your list of businesses. Another option would be to take your business list and compare it to this file to find out if any businesses are using a PBSA. Either way, we have options!

One final note: it is important to remember that these are valid street addresses. While the Alteryx Street Geocoder will mark a P.O. Box as invalid, it will see these and most likely geocode them all the way to the street level.

If you would like to read more about this USPS feature, click here.

Until next time!
Chad
Follow me on Twitter! @AlteryxChad

Also, HUGE THANKS to John H. for his help with this post!
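The second option, comparing a business list against the post office locations, boils down to matching on a normalized street address and ZIP Code. The sketch below shows one possible way to do that outside of Alteryx. The record layout (`street`, `zip`), the function names, the sample addresses and the normalization rules (including stripping a trailing "#1234"-style unit) are assumptions for illustration. The layout of the actual USPS .txt file is not described here, so parsing it is left out.

```python
import re


def normalize(street, zip_code):
    """Build a comparison key from a street line and a ZIP Code."""
    s = street.upper()
    s = re.sub(r"\s*#\s*\w+$", "", s)   # drop a trailing "#1234"-style unit
    s = re.sub(r"[^A-Z0-9 ]", "", s)    # drop punctuation
    s = re.sub(r"\s+", " ", s).strip()  # collapse whitespace
    return (s, zip_code.strip()[:5])    # compare on the 5-digit ZIP only


def flag_possible_pbsa(businesses, post_offices):
    """Return businesses whose address matches a post office address."""
    po_keys = {normalize(p["street"], p["zip"]) for p in post_offices}
    return [b for b in businesses if normalize(b["street"], b["zip"]) in po_keys]


if __name__ == "__main__":
    # Hypothetical records for illustration only.
    post_offices = [{"street": "100 Main St.", "zip": "80301"}]
    businesses = [
        {"name": "A", "street": "100 main st #42", "zip": "80301-1234"},
        {"name": "B", "street": "200 Oak Ave", "zip": "80301"},
    ]
    for b in flag_possible_pbsa(businesses, post_offices):
        print("Possible PBSA:", b["name"])
```

Exact-key matching like this will miss variants such as "Street" versus "St", so in practice the addresses would ideally be standardized first, for example with the CASS tool.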
View full article
The Q3 2018 US Spatial package includes analytics-ready data from TomTom and the US Census as well as data-specific analysis tools to get the most from the spatial data. The documentation package attached includes –

* Release notes, variable list and change log
* Spatial products include documentation on drive time methodology and Alteryx map layers

What's new in this release?

* Sample workflows included with the data installs are now grouped within a top-level "Data Install Samples" category
* Annual updates for Places, Other Name Places, CBSAs, and CCDs/MCDs

Please download and extract the attached 'Q32018_US_Spatial.zip' for the complete documentation.
View full article