Alteryx Designer Desktop Discussions

AlteryxUserFL · ‎05-21-2020

I'm working on a project and I need to remove duplicate user ID's from a concatenated list separated by commas. User ID's can have numbers and letters only. The below formula works great 99% of the time, but on occasion it will remove a user that it should not. The erroneous removals appear to happen when there are two usernames with the same letters, but they end in a different number. For example abcx2, abcx3 would remove abcx3. Any ideas on how to update the regex formula?

regex_replace([Concat ID List],"\b(\w+),(?=.*\b\1,?)","")

Hannah_Lissaman · ‎05-21-2020

Hi @AlteryxUserFL

Difficult to say what the issue is with your code without access to your data to troubleshoot - the RegEx looks fine, and worked as expected including the specific example you gave in your question.

Personally I find it simpler to split out the IDs to rows using Text to Columns, then remove duplicates with Unique, then concatenate back together - I've attached a sample workflow to illustrate what I mean.

RobertOdera · ‎05-21-2020

HI @AlteryxUserFL

Consider changing the \w+ to \.+ in your Regex.

I think that you need \.+ because you need specificity capturing alphanumeric versus just alpha.

Let me know if that works for you.

Cheers!

Alteryx Designer Desktop Discussions

Regex Replace - Remove Duplicates

Re: Row creation

Re: How to select columns dynamically using number...

Re: Batch macro to read 1000+ .xlsx files with var...

Re: Issue when using Block Until Done and Power BI...

Example workflow for setting up a custom list to u...