RegEx Question

gtg925j
5 - Atom

Hi everyone,

 

I am pretty new to Alteryx and to the community. The community has been a great resource for learning, but I can't seem to apply other examples to this case. I could use some help with parsing an XML file. There is one column with all the text between the <p> tags in each row. I'd like to pull out the section numbers and their titles into separate columns. It looks like the chapters go down to 4 levels max. Thanks for your help!

 

Here is an example of what the row contains:

 

1.0 Main Chapter Title

1.1 Subtitle 1

1.1.1 Subtitle 2

1.1.1.1 Subtitle 3

4 REPLIES
MarqueeCrew
20 - Arcturus

@gtg925j ,

 

If the data looks like above, then:

 

([\d\.]+)\s(.*)

 

Use the RegEx tool (set to Parse mode) with that expression.

I think that will get you the results you need.
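For readers who want to check the expression outside Alteryx, here is a minimal Python sketch. It assumes that Python's `re.search` behaves like the RegEx tool's Parse mode for this pattern (first match anywhere in the field); the names `PATTERN`, `rows`, and `parse_row` are made up for illustration.

```python
import re

# Expression from the reply above: a run of digits/periods, one whitespace
# character, then the rest of the line.
PATTERN = re.compile(r"([\d\.]+)\s(.*)")

# The four sample rows from the original question.
rows = [
    "1.0 Main Chapter Title",
    "1.1 Subtitle 1",
    "1.1.1 Subtitle 2",
    "1.1.1.1 Subtitle 3",
]

def parse_row(text):
    """Return (section, title) for the first match, or None if nothing matches."""
    match = PATTERN.search(text)
    return match.groups() if match else None

parsed = [parse_row(row) for row in rows]
for section, title in parsed:
    print(section, "|", title)
```

The first capture group becomes the section column and the second becomes the title column, which is how the two output fields of a parse would line up.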

 

Cheers,

 

Mark

 

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
gtg925j
5 - Atom

Thanks, Mark, for the quick reply! This is definitely a step in the right direction. It works for all of the lines with section numbers, but on other lines it also pulls out the first number that is followed by a space as the section and the text after it as the title, or it treats the first period in the row as the section and returns any text behind it as the title.

 

For example:

Copyright 1998 some other text here

March 15, 1997 some more text

Pages 55-57 appendix 

some company, INC. 1234 Address

full sentence. 

see attachment 14.

 

Results in:

Section    Title
1998       Some other text here
1997       Some more text
57         appendix
.          1234 Address
.
14.
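A rough way to see why this happens: the expression is not anchored to the start of the line, so it matches the first place where digits or periods are followed by whitespace. The Python sketch below reproduces that, assuming `re.search` is a fair stand-in for the RegEx tool's Parse mode; the trailing space on the last two lines is also an assumption, and `first_match` is an illustrative name.

```python
import re

# Unanchored expression from the earlier reply.
PATTERN = re.compile(r"([\d\.]+)\s(.*)")

lines = [
    "Copyright 1998 some other text here",
    "March 15, 1997 some more text",
    "Pages 55-57 appendix",
    "some company, INC. 1234 Address",
    "full sentence. ",      # trailing space assumed
    "see attachment 14. ",  # trailing space assumed
]

def first_match(text):
    """Return (section, title) for the first match anywhere in the text."""
    match = PATTERN.search(text)
    return match.groups() if match else None

results = [first_match(line) for line in lines]
for line, result in zip(lines, results):
    print(repr(line), "->", result)
```

Note that "15," and "55-" are skipped because the character after the digits is not whitespace, so the match lands on "1997" and "57" instead.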

 

Thanks again for your input!

 

Josh

MarqueeCrew
20 - Arcturus

Is it possible to post sample data that contains all forms of input, so that a pattern can be written that does the right thing for each kind of record? Given your original input, that expression appears to work.

 

Cheers,

 

Mark

apathetichell
19 - Altair

I don't think anything is going to be perfect, but perhaps: regex_replace([field],"^([\d\.]+\s)","")
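To illustrate what the leading `^` changes, here is a small Python sketch of the same anchored pattern. It uses `re.sub` as an assumed equivalent of `regex_replace`, and the helper names `strip_section` and `extract_section` are hypothetical. With the anchor, only rows that begin with digits or periods are touched.

```python
import re

# Same pattern as the suggestion above, anchored to the start of the field.
LEADING_SECTION = re.compile(r"^([\d\.]+\s)")

def strip_section(text):
    """Remove a leading section number, leaving the title (or the text unchanged)."""
    return LEADING_SECTION.sub("", text)

def extract_section(text):
    """Return the leading section number, or None if the row does not start with one."""
    match = LEADING_SECTION.match(text)
    return match.group(1).strip() if match else None

samples = [
    "1.1.1 Subtitle 2",
    "Copyright 1998 some other text here",
    "Pages 55-57 appendix",
]
for sample in samples:
    print(extract_section(sample), "|", strip_section(sample))
```

As the reply says, this is not perfect: a row that merely starts with a number, such as a year, would still be treated as a section.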
