Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

How do I add a field to my workflow which would be a look up to another data source?

ajaygajree
7 - Meteor

How do I add a field to my workflow which would be a look up to another data source?

 

I have a workflow with say 50,000 records (lines) and say 10 columns, I wish to add a field from a different source which has one field that matches in the workflow

I want to see an output with 50,000 records and 11 Columns, with the 11th Column populated only if a value exists in the 2nd course.

 

I have tried a Join with a Union to the Left Join but this is creating a bigger population.

 

Am I on the right track or is there another better method?

 

Thanks in Advance

 

Ajay

6 REPLIES 6
DultonM
11 - Bolide

Hi @ajaygajree,

 

If I am understanding your situation correctly, it does sound like a basic join/union would be the right thing to do. However, if you are ending up with more than 50,000 records after your join/union, then the field you are joining on is not unique in your data. For example, let's say both sources have a field called "Fruit" that you want to join on. If the first source (coming into the L-input of the join) has only 1 row with "Apple" but the second source (coming into the R-input of the join) has 3 rows with "Apple", your join will return 3 rows when joining these records. In other words, because the join was not 1-to-1, you get additional rows. If necessary, you may need to unique or summarize your data on each side so that your join is 1-to-1 not 1-to-many or many-to-many. If you are still experiencing difficulty, you can attach some sample data and your workflow and the Community can better help you! Good luck!

ajaygajree
7 - Meteor

Thanks a lot

 

Yes looking at my data that does seem to be the issue!

 

 

ajaygajree
7 - Meteor

Hi Again!

 

So in my 50,000 records I have an Employee ID field, and within the data set there are instances of multiple records with the same Employee ID.

It is sickness data so this is fine.

 

The data where I wish to pull a field from also has Employee ID and that is unique in that data set.

 

So for each Employee record in the workflow I want to pull in a value for their FTE from the 2nd data set which is a unique data set. If the workflow has multiple records then I wish to bring in the same FTE for each of those records.

 

Any thoughts on that?

 

Thanks

 

Ajay

 

DultonM
11 - Bolide

Hello again! Hopefully your Join is already accomplishing that!

 

If you could peak into the inner mechanism of the Join, you'd see that it is taking each individual record from the main data set and looking in the 2nd data set for a matching Employee ID. If later on another record comes along in the main data set with the same Employee ID, it'll match it to the same row in the 2nd data set that the other record matched to. In other words, records from the 2nd data set are NOT "used up", so they can be joined to multiple records from the main set. (In all of this I'm assuming your main data is the L-input of the join and the 2nd unique data set is the R-input)

 

If you sort your data by Employee ID after the join, hopefully you'll see the same FTE for a given Employee ID. If not, and assuming the 2nd data set only has 1 row per Employee ID, we'll have to investigate further.

ajaygajree
7 - Meteor

Hi Again

 

Apologies I was off work :)

 

So the 1st Set of Data is the left join and the 2nd set is the right join.

 

The first set of data is Sickness Data so a row is not unique for an employee ID as an Employee can have more than one sick period

 

The 2nd set of data is a set of unique records for each employee

 

So I need to add the FTE from the 2nd Data to every record in the sickness data (The first data set)

 

At the moment I am sending in say 50,000 records and getting out 50,500 for example, I am not sure why I am getting extra records?

DultonM
11 - Bolide

Interesting...the only thing I can think of is that the right side isn't truly unique on employee ID. Each record may be unique across all fields but not the employee ID field alone. Can you attach a workflow with an example of an employee that is appearing more than once in your output?

Labels