Alteryx Designer Desktop Discussions

shaheer · ‎12-16-2022

Hello, I am not sure what I am doing wrong. For example purposes I am simplifying my problem.

I have two files:

File 1:

ID	Name	Manager	Status
1	James	Annie	Open
5	Mark	Tyler	Closed

File 2:

ID	Number	Stage	NDA
1	100	Planning	Yes
2	101	Planning	No
...
100	205	Complete	Yes

NOTE: EVERY ID IN FILE 1 EXISTS IN FILE 2

My goal is to find where the ID's in File 1 exist in File 2, and to join their data. However, when I use the J output, I end up with MORE records than there are in file 1.

So in reality, File 1 has 4000 records. File 2 has 9100 records. After using the J output with the join tool, I end up with 5379 records. I have no clue how or why this is happening. I want there to be only 4000 records, the same number in file 1. Here is what my join tool looks like:

PLEASE NOTE: THERE ARE NO DUPLICATES IN FILE 1 OR 2

I have played around with the three L, J and R but still don't get my desired result. What am I doing wrong?

Felipe_Ribeir0 · ‎12-16-2022

Hi @shaheer

It seems that you have duplicated Project Number`s on your second file, if you did not this would not happen. Try to be sure about that. One way of being sure about it is putting a unique tool after the input tool of each dataset and assign the Project Number column.

EXAMPLE OF DUPLICATE SITUATION

File 1

Project Number

1

2

File 2

Project Number

1

2

JOIN Result

Project Number

1

2

ShankerV · ‎12-16-2022

Hi @shaheer

What you are doing is right, but the problem is with the dataset.

Breaking down to explain.

File 1 has 4000 records.

File 2 has 9000 records.

But the J anchor needs to output only 4000 records, but it gives 5000+ records.

Because ID as in 1 in file 1 is matched with ID as 1 in File 2 twice.

So leading to increase in count.

Many thanks

Shanker V

ShankerV · ‎12-16-2022

Hi @shaheer

In other words to explain you, if you have unique records in File 1 ID and unique records in File 2 ID.

Then only we can expect 4000records in J anchor.

Note: Unique refers here as the same value repeated twice in ID, not both files should have same record count.

Please find the attached examples with small dataset.

Note: The workflow used to achieve the solution is attached which can be downloaded to see how the solution works.

If you believe your problem has been resolved. Please mark helpful answers as a solution so that future users with the same problem can find them more easily!!!!

Many thanks

Shanker V

Alteryx Designer Desktop Discussions

Grab Data From File Where the ID in One File Exists in Another

Zero to Advanced in 20 days

Re: Zero to Advanced in 20 days

Re: Zero to Advanced in 20 days

Re: Unmerge Excel Rows and duplicate data

Re: How separate channel ID