How to compare a csv file and excel file to identify duplicates?
I am new to Alteryx. I want to compare a CSV file with 136-character lines against an Excel worksheet to identify lines with a common identifier. The output should have the same format as the CSV input file, but contain only the rows whose identifiers appear in both files. Please help. I am attaching samples.
As per the samples: the common identifier between the CSV sample and the Excel sample is '00012-0123-45' with name ABCDE. So the output should be the CSV file with the second line removed, i.e. without:
QTY0001205789622021STUVW00000001.23400000000000123.0000000000009999.99000000020000000002379.670000000000000.000000000002345.78.
The output text file should contain only the first line of the samples.
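The matching step described above can be sketched in plain Python. This is a minimal sketch, assuming the 11-digit NDC code sits at a fixed offset inside each 136-character line; the default offset of 3 is inferred from the sample line and may need adjusting for the real file.

```python
def filter_common_lines(text_lines, worksheet_ndcs, ndc_start=3, ndc_len=11):
    """Keep only text-file lines whose embedded NDC-11 appears in the worksheet.

    ndc_start is an assumption from the sample layout, not a confirmed spec.
    """
    # Worksheet NDCs look like '00012-0123-45'; strip the hyphens so they
    # match the 11-digit form embedded in the fixed-width text lines.
    wanted = {ndc.replace("-", "") for ndc in worksheet_ndcs}
    return [ln for ln in text_lines
            if ln[ndc_start:ndc_start + ndc_len] in wanted]
```

In Alteryx terms this corresponds to joining the two inputs on the (hyphen-stripped) NDC key and keeping only the matched side.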
Solved! Go to Solution.
- Labels:
- Workflow
@Sunithaprdp Please provide some more data; a redesign of the workflow based on the data might solve the issue.
Sure, I will modify the sample sheets and resend them. Thanks a lot for your help.
I am providing three sample data sets:
1) Sample worksheet
2) Sample text file
3) Desired output text file (in which I manually removed the unique data from the text file, keeping only the NDC-11 data common with the worksheet)
Requirement: the sample worksheet and sample text file should be compared on their common identifier, NDC-11. The desired output should contain only the data for NDC-11 values present in both. In short, we should identify the duplicates between the worksheet and the text file and delete the unique values from the text file; the result is the output.
Thanks in advance!
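The compare-and-keep-common step is essentially an inner join on NDC-11. A hedged pandas sketch of that logic (the column name "NDC-11" comes from the thread; the DataFrames would come from something like `pd.read_excel` and a fixed-width parse of the text file, which are assumptions here):

```python
import pandas as pd

def keep_common_rows(text_df: pd.DataFrame, sheet_df: pd.DataFrame,
                     key: str = "NDC-11") -> pd.DataFrame:
    """Inner-join the text rows against the worksheet on the NDC-11 key,
    keeping only the text file's columns (like the 'J' output of a Join)."""
    # Deduplicate the worksheet keys first; otherwise each matching text
    # line is multiplied once per duplicate worksheet row.
    common = sheet_df[[key]].drop_duplicates()
    return text_df.merge(common, on=key, how="inner")
```

Deduplicating the join key before merging is the same fix, in code form, as the line-multiplication issue discussed later in this thread.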
It works with the sample, but not with the actual file. When I try the actual file, I get an output in which each line from the base text file is multiplied.
@Sunithaprdp Can you add a Unique tool? Let me know if it still isn't working and I will redesign the workflow.
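The Unique tool keeps the first occurrence of each value and drops the repeats. A stdlib Python sketch of the same idea, applied to whole lines:

```python
def unique_preserving_order(lines):
    """Drop repeated lines, keeping the first occurrence of each --
    roughly what Alteryx's Unique tool does on a single key field."""
    seen = set()
    out = []
    for ln in lines:
        if ln not in seen:
            seen.add(ln)
            out.append(ln)
    return out
```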
No, unfortunately it's not working. We no longer get multiplied lines, but we are not receiving the desired output in which only the common NDCs are picked up. We are receiving all the lines from the input text file as-is, with a header "Field1".
@Sunithaprdp Do you have duplicate entries in your text input file?
Thank you so much, it worked! But the output is not coming out quarter-wise, e.g. all the 22021 rows together and then all the 32021 rows together; instead it comes out jumbled. Also, the output file has a header "Field1".
Is that something we can fix? Can you please help me understand how to do it and the workflow pattern?
Though it is jumbled, the output is otherwise the desired one. Can we find a solution to this too? Please advise.
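Both remaining issues, the quarter ordering and the "Field1" header, can be sketched in Python. The position of the quarter code (e.g. '22021' = quarter 2 of 2021, immediately after the NDC in the sample line) is an assumption and may need adjusting for the real layout:

```python
def sort_by_quarter(lines, q_start=14):
    """Sort lines by year, then quarter. A code like '22021' means
    quarter 2 of 2021: first character is the quarter, the last four
    are the year. q_start is an assumption from the sample layout."""
    return sorted(lines, key=lambda ln: (ln[q_start + 1:q_start + 5],
                                         ln[q_start]))

def write_without_header(lines, path):
    """Write the lines as-is, with no 'Field1' header row. In Alteryx,
    the analogous fix is telling the Output tool not to write field
    names as the first row."""
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
```

Sorting on the derived (year, quarter) key is the code analogue of adding a Sort tool on a parsed quarter field before the output.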
