Bring your best ideas to the AI Use Case Contest! Enter to win 40 hours of expert engineering support and bring your vision to life using the powerful combination of Alteryx + AI. Learn more now, or go straight to the submission form.
Start Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

How do I extract a serial # from a file path when its not always in the same position?

hellyars
13 - Pulsar

My source data has one field.  That one field is a file path.  There is sub-directory in each file path that uses a serial # for its name.  I am trying to isolate that serial #.   Nine times out of ten, that serial # is the first child in a file's path.   But, nothing is ever that easy.  Sometimes, it is the 3rd or 4th child in the path.  

 

I tried using the Text to Columns to parse the serial #.  It easily isolates the serial # for those entries where it appears as the first child under the file path.  But, how do I extract it for when it is not?

 

Here is an example of the source data...

 

The serial # is always 13-digits or 16 characters with the dashes.  

 

How do I extract just the serial #?

 

PathWhat I Want  
AIL\5996-01-476-11365996-01-476-1136  
AIL\5996-01-502-14175996-01-502-1417  
Another Customer\Their Book 1\1095-00-866-23981095-00-866-2398  
AIL\5996-56-582-46575996-56-582-4657  
Another\Yada\Their Yada Yada\5487-98-567-56415487-98-567-5641  

 

4 REPLIES 4
jasperlch
12 - Quasar

Hi @hellyars

 

The serial numbers can be extract well by Regex tool: 

 

Method 1: if you are sure that the pattern of the serial number is 4 digits, than a dash, 2 digits, a dash, 3 digits, a dash, 4 digits 

what is the pattern of the 16-character serial number?

 

Method 2: if you are sure the serial will always be all texts after the last occurrence of '\' 

 

Capture1.PNG

patrick_digan
17 - Castor
17 - Castor

@hellyars If your path always has slashes (\) and the serial number is at the end, you could use this formula to get the serial #:

FileGetFileName([Path])

It works on your sample data.

hellyars
13 - Pulsar

Method 1 worked perfectly.

 

I stumbled into another solution as well.

 

I used the Text to Parse tool against the Path.   I then filtered the results using the formula regex_match(LEFT([Path2],1),"^\d").  I then further used the Text to Parse tool to parse the results until I was able to isolate the serial #.   I then used the Union tool to join with the initial true results.  

 

Both methods seem to work.  Your approach is more direct.  I am using it now.  Thanks.

hellyars
13 - Pulsar

No, there are always sub-directories and files in the path after the serial #. 

Labels
Top Solution Authors