Hello,
I need some assistance building out logic that would parse a string with data in the following format:
Project Description | Project ID |
PS_PSAP002534_Soft | PSAP002534 |
PS_PSEM005740_Black | PSEM00574 |
I want to parse the string to just remove the ID field. The challenge is that the ID field is not always the same character length each time. I think the best approach would be to use regex to find 3 characters and stop at underscore. I need help building out that logic in regex. Please let me know if you an idea.
Kind Regards,
Rahshid
Solved! Go to Solution.
Hi Rahshid
Is the ID always between the two underscores and will there always only be the two underscores?
If so, you can do this in the Formula tool with this (easier to explain this way than using the REGEX tool):
REGEX_Replace([Field], ".*?_(.+)?_.+", "$1")
The .* finds any character up to the underscore.
The ? character forces minimum match so that search stops on the first underscore.
The (.+)? finds all characters after the first underscore and marks them as the replace string to keep.
The _.+ finds all the characters after the second underscore.
"$1" means to use the first character group within the parentheses as the replacement text.
@DavidP Hi David,
Thanks for the prompt response. Text to column will not work in this case because the string values are much longer than what I have pasted above. Most of the values are similar to this :
PS_PSAP002589_Name_Name2_Name3_-_TW_Name_Name2_Name3
PS_PSUS005814_Former_ZZZ_Team_Legacy_project_trk
VS_10CFLL_PSEMV0000735_CARGOZZZ_OTY
Ideally I would not like to spread it across multiple columns. I would think regex would be the best approach since the ID isn't generally in the same place. Let me know what you think at your earliest convenience.
Kind Regards,
Rahshid
Hi @aria_rahshid1 ,
If you are just trying to parse everything inbetween the Underscores, this should work fine. This expression will remove everything before and after the underscores, and grab everything inbetween the underscores.
Hi @Philip ,
This is a good approach, but unfortunately the ID is not necessarily after or before the first underscore . There some values where the ID is embedded after a few underscores and text. An example of this would be :
VS__40FUAM_PSUSV0000849_APPLE_JZ_FZI_PCIDZTS
In this case I dont believe your solution will work. Would happen to have an alternative work around? I appreciate your help.
Kind Regards,
Rahshid
Regex depends on pattern matching. I'd need to know a dependable pattern. Does the ID always start with PS and no other portion starts with PS?
Or try this: if you were looking through a long list of these, how would you know an ID from all the other information?
Hi @DavidP ,
This is amazing. It is exactly what I was looking for. Thanks for your help!
Kind Regards,
Rahshid Aria