Alteryx Designer Desktop Discussions

jennahecht · ‎03-22-2021

Hi, I am hoping to parse the following data that I receive. I am familiar with the text to columns tool, and semi familiar with the reg ex tool, but unsure how to parse this data exactly how I would like to:

Jenna Applepie/Disco Ball: White Noise 41.0k -3% [AUDIO STREAMS 11.6m -4%, ALBUM 1.9k -13%](c)

In which I am hoping for to be broken out in 8 columns:

Jenna Applepie

Disco Ball: White Noise

41.0k - 3%

AUDIO STREAMS

11.6m -4%

ALBUM

1.9K - 13%

c

I am able to break the columns using specific delimters like the "/" and "[", but seperating tAny assistance is much appreciated!

Thank you,

Jenna

Jonathan-Sherman · ‎03-22-2021

Hi @jennahecht,

It's a little hard to ensure we're capturing all the idiosyncrasies with one line but this might work for you, you'll want to use the Regex tool in parse mode.

(.*?)\/(.*?)\s(\d.*?%).*?\[(.*?)\s(\d.*?%).*?(\w+).*?(\d.*?%).*?\((.*?)\)

If this solves your issue please mark the answer as correct, if not let me know!

Kind regards,

Jonathan

danilang · ‎03-22-2021

Hi @jennahecht

Can you provide more than 1 line of input? As many as you can would be best? Having more data to examine will help us find any patterns that are common across the rows

Dan

jennahecht · ‎03-22-2021

Sure - here are 13 lines of data - the information is sensitive so ignore my lack of creativity with these lines 😄

1) Jenna Gabrielle/Good Times: The Fun Album 11.0k -6% [AUDIO STREAMS 11.6m -1%, ALBUM 45k -99%](s)

2) Robert Dean/Vail Colorado Skiing 99.9k -1% [AUDIO STREAMS 40.2m -1%, ALBUM 1.0k +9%](u)

3) Blake Larry/Future Crying 16.5k +11% [AUDIO STREAMS 18.9m +16%, ALBUM 1.1k +41%](u)

4) Discoball/After Clock 11.1k +4% [AUDIO STREAMS 44.6m -1%, ALBUM 4.2k +46%](u)

5) Winne the Pooh/Love Party 90.4k -6% [AUDIO STREAMS 98.4m -2%, ALBUM 0.1k -19%](w)

6) Carol Baskin/When It Rains… It Pours 90.9k debut [AUDIO STREAMS 92.0m debut, ALBUM 1.9k debut] D1C

7) Big Mama/Your Turn 19.4k +1% [AUDIO STREAMS 96.8m +1%, ALBUM 0.4k +16%](u)

😎 Papa Bear/Dog Days 19.1k -2% [AUDIO STREAMS 92.4m -6%, ALBUM 0.1k -1%](u)

9) J Money/Workflow Party 12.5k +52% [AUDIO STREAMS 11.2m +90%, ALBUM 10.1k +115%](u)

10) Ignore Names/You See Me Now 16.8k -4% [AUDIO STREAMS 90.4m -4%, ALBUM 1.4k +9%]

11) Luke Neck/The Sun and the Moon . . . 16.2k debut [AUDIO STREAMS 1.6m debut, ALBUM 14.9k debut](w) D1C

12) Brothers/Night Time 16.4k debut [AUDIO STREAMS 11.1m debut, ALBUM 16.0k debut](u) D1C

13) Mom and Dad/Money Machine 16.0k -9% [AUDIO STREAMS 96.9m -9%, ALBUM 1.1k -10%](u)

jennahecht · ‎03-22-2021

This did not work for me

apathetichell · ‎03-22-2021

I think this is best done in a combo of regex and text to columns. There are lot of specific issues which pop up and I'm not sure if they are characteristic of the datasource or one off things. I hardcoded around a few of them and my regex is by no means perfect. To do a proper regex parse on this with all of the exceptions (different spaces and usage of regex special characters in the text) would be exceptionally long I think. I regexed out the artist/album title and then combined them back with the remainder which I text to columned out via splitting on space. Also - If "Audio streams" and "album" are static in every entry then you don't really need to parse them since you can just throw them back in as dummy column later. No idea how the parsing would work with album titles or artist names with splits in them...

Then I tried to clean it up as best as I could.

Hope this helps or at least points you in the right direction...

echuong1 · ‎03-22-2021

Try this:

(\u.+)\/(\u.+)\s(\d.+)\s\[(\u.+)\s(\d.+)\,\s(\u.+)\s(\d.+)\](.*)

estherb47 · ‎03-22-2021

Hi @jennahecht

The challenge with this data is some have parenthesis at the end and some don't

So I did it in two steps.

RegEx tool with the Parse method - .*\)\s+?(.*)\/(.*?)\s+?(\d.*)\s+?\[(.*)\s+?(\d.*),\s(.*?)\s+(\d.*?)\](.*$)

Next a formula tool on the final column. IIF(Contains([RegExOut8], "("),REGEX_Replace([RegExOut8], "$(.*)$", "$1"),"")

Let me know if that works.

Cheers!

Esther

jennahecht · ‎03-23-2021

This works! Thank you so much

jennahecht · ‎03-23-2021

& this works as well! thank you

Alteryx Designer Desktop Discussions

Reg Ex Parsing Help

Zero to Advanced in 20 days

Re: Zero to Advanced in 20 days

Re: Zero to Advanced in 20 days

Re: Single App that filters user selections dynami...

Re: Sorting Sheet Names