Alteryx Designer Desktop Discussions

SOLVED

Need help on a Parse Regex

Yanopoff14
5 - Atom

Hello community,

 

I am having trouble with a RegEx parse on a txt file. The file is a product information list, and I only want to extract some of its columns.

 

- Example (see the attached image or file "Data set")

 

- What I am looking for is underlined (this is just 1 line):

18495746201812PROBIOTICA DARMBALANS CAPS STRIP 30ST 200910 C01A 8711744029548 15527557ONBEKEND CAPSULE 000000000000N.V.T. STRIP 00003000STUK 00001500STUK VEMEDIA# SNEDERLAND 000000000022PS

 

- What I want as output is:

RegEx out 1 | RegEx out 2 | RegEx out 3 | RegEx out 4 | RegEx out 5 | RegEx out 6 | RegEx out 7 | RegEx out 8 | RegEx out 9 | RegEx out 10 | RegEx out 11
18495746 | 201812 | PROBIOTICA | CAPS STRIP 30ST | VEMEDIA# | S | NEDERLAND | 0000000000 | 2 | 2 | PS

 

I managed to parse the first three parts with the expression (\d+)(\d{6})(\u*), but I am stuck on how to continue and extract the rest.
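For anyone who wants to check this expression outside Alteryx, here is a minimal Python sketch run against the sample line from the post. It assumes that Alteryx's \u means "any uppercase letter" and that the ASCII class [A-Z] is an adequate stand-in, since Python's re module has no \u class. The variable names are illustrative only.

```python
import re

# Sample record copied from the post (spacing as shown there).
line = (
    "18495746201812PROBIOTICA DARMBALANS CAPS STRIP 30ST 200910 C01A "
    "8711744029548 15527557ONBEKEND CAPSULE 000000000000N.V.T. STRIP "
    "00003000STUK 00001500STUK VEMEDIA# SNEDERLAND 000000000022PS"
)

# Assumption: [A-Z] replaces Alteryx's \u (uppercase letter).
pattern = re.compile(r"(\d+)(\d{6})([A-Z]*)")

first_three = pattern.match(line).groups()
print(first_three)  # ('18495746', '201812', 'PROBIOTICA')
```

The greedy (\d+) first takes all 14 leading digits and then gives back six of them so that (\d{6}) can match, which is why the split lands at 18495746 / 201812.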

(see my attached workflow "Test regex").

 

Can anyone help me, please? Should I continue with RegEx, or is there another tool that could do the trick?

 

Yanopoff

5 REPLIES
DavidP
17 - Castor

This gets you a bit closer; you just have to figure out a rule for CAPS STRIP 30ST:

 

(\d+)(\d{6})(\u*).+\s(.+)\s(\w)(\w+)\s(\d{10})(\d)(\d)(\w{2})

Ladarthure
14 - Magnetar

Hi,

 

Why don't you use an Input tool with the flat ASCII file format to get the data you want? It seems to me that your "Data set.txt" would be perfect for that.
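To illustrate the idea behind a flat (fixed-width) layout, here is a small Python sketch that slices each record by column position instead of matching patterns. The field names and widths in LAYOUT are hypothetical, and the two records are synthetic; the real widths would have to come from the actual file specification.

```python
from typing import Dict, List, Tuple

# Hypothetical layout as (field name, width); not the real file specification.
LAYOUT: List[Tuple[str, int]] = [
    ("product_id", 8),
    ("period", 6),
    ("name", 20),
    ("supplier", 10),
]

def parse_fixed_width(record: str, layout: List[Tuple[str, int]]) -> Dict[str, str]:
    """Cut a record into fields by position and trim the padding."""
    fields: Dict[str, str] = {}
    pos = 0
    for name, width in layout:
        fields[name] = record[pos:pos + width].strip()
        pos += width
    return fields

# Synthetic records padded to the hypothetical widths above.
full = "18495746" + "201812" + "PROBIOTICA".ljust(20) + "VEMEDIA#".ljust(10)
blank_name = "18495746" + "201812" + "".ljust(20) + "VEMEDIA#".ljust(10)

print(parse_fixed_width(full, LAYOUT))
print(parse_fixed_width(blank_name, LAYOUT))
```

Because every field is located by position, an empty column simply comes back as an empty string and the fields after it stay aligned.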

DavidP
17 - Castor

I like @Ladarthure 's idea, but I've not given up on regex...

 

Granted, I've made a few assumptions that you might have to tweak, but here you go:

 

(\d+)(\d{6})(\u*)\s\w+\s(.+)\s\d{6}\s.+\s(.+)\s(\w)(\w+)\s(\d{10})(\d)(\d)(\w{2})
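As a quick check of the expression above, here is a Python sketch applied to the sample line exactly as it appears in the post. It assumes that Alteryx's \u means "any uppercase letter" and substitutes [A-Z] for it, because Python's re module does not offer that class. The real file may contain wider runs of spaces than the post shows, in which case the results could differ.

```python
import re

# Sample record copied from the post (spacing as shown there).
line = (
    "18495746201812PROBIOTICA DARMBALANS CAPS STRIP 30ST 200910 C01A "
    "8711744029548 15527557ONBEKEND CAPSULE 000000000000N.V.T. STRIP "
    "00003000STUK 00001500STUK VEMEDIA# SNEDERLAND 000000000022PS"
)

# The expression from the reply, with [A-Z] standing in for \u (assumption).
pattern = re.compile(
    r"(\d+)(\d{6})([A-Z]*)\s\w+\s(.+)\s\d{6}\s.+\s(.+)\s(\w)(\w+)\s"
    r"(\d{10})(\d)(\d)(\w{2})"
)

groups = pattern.match(line).groups()
for number, value in enumerate(groups, start=1):
    print(f"RegEx out {number}: {value}")
```

On this one line the eleven groups come out as 18495746, 201812, PROBIOTICA, CAPS STRIP 30ST, VEMEDIA#, S, NEDERLAND, 0000000000, 2, 2 and PS. The expression relies on the standalone six-digit token (200910) and on the twelve digits before PS as landmarks, which are the assumptions mentioned in the reply.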

carlosteixeira
15 - Aurora

Hello @Yanopoff14, maybe this helps!

best regards..

 

Thanks

Carlos A Teixeira
Yanopoff14
5 - Atom

Hi @DavidP , @Ladarthure  and @carlosteixeira ,

 

So great to have some help.

 

Thanks DavidP and carlosteixeira2005 for your regex. It took me some time to try both solutions. Both were great and I learned a lot. However, it looks like RegEx cannot work for my file. From what I understand, this is because some products do not have all the columns filled in (missing or nonexistent data). I tried to modify the expressions using regex101, but I failed!
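The effect of a missing column can be reproduced with a short Python sketch. It uses DavidP's second expression with [A-Z] in place of \u (an assumption about Alteryx's uppercase class), and a hypothetical incomplete record made by removing the six-digit code 200910 from the sample line. The real incomplete rows in the file may look different.

```python
import re

# DavidP's second expression, with [A-Z] standing in for \u (assumption).
pattern = re.compile(
    r"(\d+)(\d{6})([A-Z]*)\s\w+\s(.+)\s\d{6}\s.+\s(.+)\s(\w)(\w+)\s"
    r"(\d{10})(\d)(\d)(\w{2})"
)

# Sample record copied from the post (spacing as shown there).
complete = (
    "18495746201812PROBIOTICA DARMBALANS CAPS STRIP 30ST 200910 C01A "
    "8711744029548 15527557ONBEKEND CAPSULE 000000000000N.V.T. STRIP "
    "00003000STUK 00001500STUK VEMEDIA# SNEDERLAND 000000000022PS"
)

# Hypothetical incomplete record: the 6-digit code is missing.
incomplete = complete.replace(" 200910 ", " ")

complete_matches = pattern.match(complete) is not None
incomplete_matches = pattern.match(incomplete) is not None
print(complete_matches, incomplete_matches)  # True False
```

The expression uses the six-digit token as a landmark, so once that column is empty there is nothing for \s\d{6}\s to anchor on and the whole record fails to parse.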

 

I also tried Ladarthure's solution. It worked just fine and you are right, it fits my file perfectly. Thanks a lot, I did not know about that file format option or how it works.

 

Thanks again everyone for your fast reply ;^)

Yanopoff
