My source data has one field. That one field is a file path. There is sub-directory in each file path that uses a serial # for its name. I am trying to isolate that serial #. Nine times out of ten, that serial # is the first child in a file's path. But, nothing is ever that easy. Sometimes, it is the 3rd or 4th child in the path.
I tried using the Text to Columns to parse the serial #. It easily isolates the serial # for those entries where it appears as the first child under the file path. But, how do I extract it for when it is not?
Here is an example of the source data...
The serial # is always 13-digits or 16 characters with the dashes.
How do I extract just the serial #?
Path | What I Want | ||
AIL\5996-01-476-1136 | 5996-01-476-1136 | ||
AIL\5996-01-502-1417 | 5996-01-502-1417 | ||
Another Customer\Their Book 1\1095-00-866-2398 | 1095-00-866-2398 | ||
AIL\5996-56-582-4657 | 5996-56-582-4657 | ||
Another\Yada\Their Yada Yada\5487-98-567-5641 | 5487-98-567-5641 |
Solved! Go to Solution.
Hi @hellyars,
The serial numbers can be extract well by Regex tool:
Method 1: if you are sure that the pattern of the serial number is 4 digits, than a dash, 2 digits, a dash, 3 digits, a dash, 4 digits
what is the pattern of the 16-character serial number?
Method 2: if you are sure the serial will always be all texts after the last occurrence of '\'
@hellyars If your path always has slashes (\) and the serial number is at the end, you could use this formula to get the serial #:
FileGetFileName([Path])
It works on your sample data.
Method 1 worked perfectly.
I stumbled into another solution as well.
I used the Text to Parse tool against the Path. I then filtered the results using the formula regex_match(LEFT([Path2],1),"^\d"). I then further used the Text to Parse tool to parse the results until I was able to isolate the serial #. I then used the Union tool to join with the initial true results.
Both methods seem to work. Your approach is more direct. I am using it now. Thanks.
No, there are always sub-directories and files in the path after the serial #.