Duplicate Data removal
Hi Team,
Each row of my data contains duplicate values. For example, "Sam/Sam" repeats a name and "//" repeats the separator; in both cases I need only one copy. However, in a value like Sam/dam/Sam, "Sam" appears twice but not side by side, so it should remain as it is. The result should only drop a value when it appears twice in a row.
I have also attached the expected result in case you need to see it. Please refer to the table below:
Data | Result Should Be |
Sam/Dam/Dam/Dan | Sam/Dam/Dan |
Sam/Dam/Dan/Dan | Sam/Dam/Dan |
Sam/Sam/Dan/Dan | Sam/Dan |
Sam/Sam/Dan//Dan | Sam/Dan |
Dro/Bro/Bro | Dro/Bro |
Dro/Dro/sri | Dro/sri |
KIN/Tin/Tin/Bin/Sin | KIN/Tin/Bin/Sin |
KIN/Tin/Tin/Bin/Sin/Sin | KIN/Tin/Bin/Sin |
KIN/Tin/Tin/Bin/Sin/Sin/kil | KIN/Tin/Bin/Sin/kil |
KIN/Tin/Tin/Bin/Sin/Sin/kil// | KIN/Tin/Bin/Sin/kil |
sin//bin/bin | sin//bin |
dum//cum | dum/cum |
Clu/Clu/dlu/dlu | Clu/dlu |
Ene/Ene//she/hee | Ene/she/hee |
Ene//tre/fre | Ene/tre/fre |
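The rule in the question (drop a value only when it immediately repeats, and treat "//" like "/") can be sketched outside Alteryx in Python. This is a minimal illustration, not the workflow from this thread: the function name `collapse_adjacent` is hypothetical, and the sketch assumes empty segments carry no meaning, which follows the poster's later clarification that 'sin//bin/bin' should become 'sin/bin' rather than 'sin//bin'.

```python
from itertools import groupby


def collapse_adjacent(value, sep="/"):
    """Collapse consecutive repeated segments; hypothetical helper for illustration."""
    # Drop empty segments so "//" (and a trailing "/") behaves like a single "/".
    parts = [p for p in value.split(sep) if p]
    # groupby only groups *consecutive* equal items, so a value that
    # reappears later in the string (not adjacent) is kept.
    return sep.join(key for key, _ in groupby(parts))


if __name__ == "__main__":
    for row in ["Sam/Sam/Dan//Dan", "KIN/Tin/Tin/Bin/Sin/Sin/kil//", "Sam/dam/Sam"]:
        print(row, "|", collapse_adjacent(row))
```

The comparison is case-sensitive, so "Sam" and "SAM" would count as different values here; lowercasing before comparing would be a separate design decision.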
- Labels:
- Regex
Hi Caltang,
Thanks for your reply.
In record 17, the original value is (Dro/klo/sro/klo/dro). Here "klo" is not a duplicate, because its two occurrences are not next to each other. Your result removed the second "klo" and shows (Dro/klo/sro/dro), but it should stay as (Dro/klo/sro/klo/dro).
I would appreciate your help with this as well; the rest looks fine to me. I have millions of rows like this, so there is a real chance of removing data that should be kept. Please include both columns, the original and your result, in the final output so that they are easy to compare.
Thanks again for your help.
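With millions of rows, a row-by-row visual comparison is hard, so one possible audit is to count what was removed. The sketch below is in Python rather than Alteryx and uses hypothetical names (`tokens`, `removed_too_much`); it assumes empty segments from "//" are not real values. For each original/result pair it checks that the number of values removed equals the number of adjacent repeats in the original, which would flag a case like record 17.

```python
def tokens(value, sep="/"):
    """Split on the separator and ignore empty segments (assumed meaningless)."""
    return [p for p in value.split(sep) if p]


def removed_too_much(original, result, sep="/"):
    """Return True when the result's length is not explained by adjacent repeats."""
    orig = tokens(original, sep)
    # Each pair of equal neighbours justifies removing exactly one value.
    adjacent_repeats = sum(a == b for a, b in zip(orig, orig[1:]))
    removed = len(orig) - len(tokens(result, sep))
    return removed != adjacent_repeats


if __name__ == "__main__":
    pairs = [
        ("Dro/klo/sro/klo/dro", "Dro/klo/sro/dro"),  # record 17 as reported
        ("Sam/Sam/Dan//Dan", "Sam/Dan"),
    ]
    for original, result in pairs:
        flag = "CHECK" if removed_too_much(original, result) else "ok"
        print(original, "|", result, "|", flag)
```

This only compares counts, so it is a quick filter for suspicious rows and not a proof that every result is correct.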
One more thing I'd recommend: before you run this workflow, convert your input into a YXDB file, then use the YXDB file as the input for this workflow. It will help your workflow run faster.
Alteryx ACE
https://www.linkedin.com/in/calvintangkw/
You are superb.
Hi, @mmustkee
There is an easy way to do this with a formula:
Trim(Replace(REGEX_Replace([Data], '([[:alpha:]]+)\/{1,}\1(?=$|\/)', '$1'), '//','/'),'/')
By the way, one question: why do you want 'sin//bin/bin' to become 'sin//bin', but 'Ene//tre/fre' to become 'Ene/tre/fre'?
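To see what the expression does step by step, here is a rough Python translation. It is a sketch under assumptions, not the Alteryx implementation: Python's `re` module has no `[[:alpha:]]` class, so `[A-Za-z]+` stands in for it; the function names are hypothetical; and matching here is case-sensitive, which may differ from how REGEX_Replace is configured. The second function is one possible tightening that is not part of the original answer: it anchors each value at a "/" boundary and collapses runs longer than two.

```python
import re

# Faithful translation: a run of letters, one or more "/", the same letters
# again, followed by "/" or end of string.
PAIR = re.compile(r"([A-Za-z]+)/+\1(?=$|/)")

# Possible tightening (assumption, not from the thread): the value must start
# right after a "/" or at the start of the string, and may repeat many times.
RUN = re.compile(r"(?<![^/])([^/]+)(?:/+\1)+(?=/|$)")


def formula_equivalent(data):
    """Mirror Trim(Replace(REGEX_Replace(...), '//', '/'), '/')."""
    step1 = PAIR.sub(r"\1", data)       # collapse an adjacent pair
    step2 = step1.replace("//", "/")    # squeeze doubled separators
    return step2.strip("/")             # trim leading/trailing "/"


def collapse_runs(data):
    """Variant using the tightened pattern."""
    collapsed = RUN.sub(r"\1", data)
    return re.sub(r"/{2,}", "/", collapsed).strip("/")


if __name__ == "__main__":
    for row in ["Sam/Sam/Dan//Dan", "sin//bin/bin", "Sam/Sam/Sam", "Sam/am"]:
        print(row, "|", formula_equivalent(row), "|", collapse_runs(row))
```

In this Python version, the translated pattern reduces 'Sam/Sam/Sam' only to 'Sam/Sam' and turns 'Sam/am' into 'Sam', because the first capture can begin in the middle of a value. Whether the same happens in Alteryx depends on its regex engine and options, so it is worth testing those two inputs there before running the formula on the full data set.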
Ha! My bad, that was a mistake in my example.
The result should contain only one "/", so it should be "sin/bin".
Could you share your workflow, or can I simply use your formula in a Formula tool?
Hi, @mmustkee
Yes, you can just put my expression in a Formula tool to get the result you want.
******
If this helps you get what you need, please mark it as a solution and give it a like.
