Get Inspire insights from former attendees in our AMA discussion thread on Inspire Buzz. ACEs and other community members are on call all week to answer!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Regex FileName with different naming conventions

zekharya
6 - Meteoroid

Hi all,

 

I have different files with naming as follows:

Real estates [Data centre] [SGD, RM]

Real estates [Office] (SGD) reduce ROU

Real estates [Properties] [Part 1 (SOA)] [SGD]

Real estates [Properties] [Part 2 (1541)] [SGD]

Real estates [Properties] [Part 3] [SGD]

 

I would like to seek your help on extracting the following outputs in this desired manner:

Real estatesData centre[SGD, RM]
Real estatesOffice(SGD) reduce ROU
Real estatesProperties[Part 1 (SOA)] [SGD]
Real estatesProperties[Part 2 (1541)] [SGD]
Real estatesProperties[Part 3] [SGD]

 

My current Regular Expression (\<\w+\s+\w+\>).(\<\w+\s+\w+\>)*(\[.*\]).*(\[.*\]) is having an issue.

And my formula is Replace(Replace([RegExOut1], "[", ""),"]","")

My current results are as follows:

Real estatesData centre[SGD, RM]
   
Real estatesProperties Part 1 (SOA)[SGD]
Real estatesProperties Part 2 (1541)[SGD]
Real estatesProperties Part 3[SGD]

 

Thank you in advance for your help!

6 REPLIES 6
DataNath
17 - Castor

Hey @zekharya, will this always be the structure of your data? If so you could perhaps look to use something like this:

 

([^[]+)\s\[([^]]+)\]\s(.+)

 

DataNath_0-1669023885261.png

 

We're basically extracting sections of the string here based upon the square brackets.

zekharya
6 - Meteoroid

Hi @DataNath, thank you for the super fast response!

 

However, I made a mistake earlier as the workflow uses the full path instead of file name. Could you assist in providing the same result but with the source as below? Or is there a method in alteryx that allows me to just extract the filename aka just the part in bold first before I can use your given solution as above? Thank you so so much.

 

ie

C:\Users\zgoh022\Downloads\Brand\Alteryx Attempt\Real estates [Data centre] [SGD, RM].xlsx|||'#1 Korea$'

C:\Users\zgoh022\Downloads\Brand\Alteryx Attempt\Real estates [Office] (SGD) reduce ROU.xlsx|||'#3 Amazon$'

C:\Users\zgoh022\Downloads\Brand\Alteryx Attempt\Real estates [Properties] [Part 1 (SOA)] [SGD].xlsx|||'#4 Japan$'

DataNath
17 - Castor

Hey @zekharya, no problem. The RegEx is a simple change, just a slight adaptation to allow for extra bits (i.e. the rest of the filepath) either side of the target pattern(s):

DataNath_0-1669025226527.png

 

.+\\([^[]+)\s\[([^]]+)\]\s(.+)\.

 

Where is this field coming from? If it's the directory tool then that will also output a [FileName] field rather than using the [FullPath].

 

Quick edit: Noticed you had bolded the file extension (.xlsx) so if you want to keep that, you could change the expression to something like this;

 

DataNath_1-1669025388159.png

 

.+\\([^[]+)\s\[([^]]+)\]\s([^|]+)

 

zekharya
6 - Meteoroid

Hi @DataNath , thank you so much for your help. The regular expression you have kindly provided works perfectly!

.+\\([^[]+)\s\[([^]]+)\]\s(.+)\.

 

However, is it possible to include the final part in bold as one of the outputs as well?

C:\Users\zgoh022\Downloads\Brands\Alteryx Attempt\Real estates [Data centre] [SGD, RM].xlsx|||'#1 KL DC$'

 

So that the output is as follows:

Real estatesData centre [SGD, RM]#1 KL DC

 

Was tidying the workflow and realised this issue ><

Thanks once again!

DataNath
17 - Castor

Hey @zekharya, no problem! Try the following:

 

.+\\([^[]+)\s\[([^]]+)\]\s(.+)\.xlsx\|{3}'(.+)\$

zekharya
6 - Meteoroid

Sorry for the late reply, it works, thank you so much for your help!

Labels