I'm working on a project and I need to remove duplicate user ID's from a concatenated list separated by commas. User ID's can have numbers and letters only. The below formula works great 99% of the time, but on occasion it will remove a user that it should not. The erroneous removals appear to happen when there are two usernames with the same letters, but they end in a different number. For example abcx2, abcx3 would remove abcx3. Any ideas on how to update the regex formula?
regex_replace([Concat ID List],"\b(\w+),(?=.*\b\1,?)","")
Solved! Go to Solution.
Difficult to say what the issue is with your code without access to your data to troubleshoot - the RegEx looks fine, and worked as expected including the specific example you gave in your question.
Personally I find it simpler to split out the IDs to rows using Text to Columns, then remove duplicates with Unique, then concatenate back together - I've attached a sample workflow to illustrate what I mean.
Consider changing the \w+ to \.+ in your Regex.
I think that you need \.+ because you need specificity capturing alphanumeric versus just alpha.
Let me know if that works for you.
Cheers!