Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.
Community is experiencing an influx of spam. As we work toward a solution, please use the 'Notify Moderator' option on the ellipsis menu to flag inappropriate posts.
Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Help mapping values in lookup table to records in a separate table

gth639b
6 - Meteoroid

Hello,

 

I've been unable to accomplish the following task with joins, unions, and the find a replace function. 

Task

 Table 1 contains 70, 000 records of customer orders and each record contains a zipcode with other fields (multiple records can have the same zipcode while all the other fields will be different)

 

example of Table 1 records (72,000 total records):

 

Customer #Order #Zip code
00011000130071
00021000230087
00031000330071
00041000430051

 

Table 2 contains 50,000 records and each records contains just a zipcode, FIPS (Federal Information Processing Standards) code, and a Rural/Urban Code (i.e. "1"= rural area with pop less than 20K, "2" = metro area with pop greater than 100K) The zip codes in table 2 are unique, but multiple zipcodes can have the same Rural/Urban code (i.e. zipcodes 30071, 30087, and 30067 have the Rural/Urban code of "1"

 

Example of Table 2 records (50,000 total records)

 

ZipcodeFIPSRural/Urbanl Code
3007100871
3008700821
3007700812

 

What I want is

Table 1

Customer #  Order #Zip codeRural/ Urban code

0001

10001300711
000210002300871
000310003300711
000410004300772

 

What I get is this (using "find and replace" with the "entire field" value checked and "Append Fields to Record")

Customer #Order #Zip codeRural/ Urban code
000110001300711
000210002300871
000310003300711
000410004300771

 

The mapping of Zip codes to Rural/Urban code in table 1 is incorrect. The Rural/Urban codes range in values from 1-9, but when I use the find and replace function not all the values are used. The total number of records (72,000) is correct but the  Zip codes to Rural/Urban mapping is wrong.

 

When I use the join and union function the mapping is correct but the number of records is wrong. After I run the "join" and "union" function I get a total of 115,000 records instead of the original 72,000. I've attached a copy of the Rural/Urban code lookup table. The correct tab is FIPS  to Zips. Please let me know what I should do.

4 REPLIES 4
MSalvage
11 - Bolide

@gth639b,

 

It seems to me if you join these two tables on Zip code you should get what you want. The only thing that would mess this up is if there are multiple rows in table 2 for the same zip code.

 

You should only be "Unioning" Table 1 side and the joined.

 

Double check to see if there are multiple rows per zip in table 2. I did it with your sample data and it was correct(attached).

 

Best,

MSalvage

 

 

gth639b
6 - Meteoroid

MSalvage,

 

You are right. Table 2  does contain duplicate zip codes. I downloaded table 2 from the U.S. Department of Housing and Urban Development website and after reading your comments and doing more research I found out that a 1 zip code can be associated with multiple Rural/Urban codes because a zipcode can belong to two counties. So basically, 1 zip code can be associated to multiple Rural/ Urban codes and multiple zip codes can be associated to 1 Rural/Urban code.

MSalvage
11 - Bolide

@gth639b,

 

Do you have a hierarchy you are supposed to follow? If so I would just go by that.

 

If it is important to have all the rural/urban codes you can do some manipulation to make them separate fields(attached example). This would allow you to keep all the data but still end with your 72,000 rows.

 

Not Totally sure what the goal is but I hope this has helped.

 

Best,

MSalvage

gth639b
6 - Meteoroid

MSalvage,

 

Thank you this has been a tremendous help. The end goal is to try to predict whether an automobile part will be delivered late or not using logistic regression. I believe the urban/rural value of a customer impacts lateness which is why I wanted to included it as an extra field to run the regression on.

Labels
Top Solution Authors