Alteryx Designer Desktop Discussions

jimmys · ‎01-14-2016

I have lots of records, each with an ID and HTML code. Each HTML page contains funky chunks of code (the bracketed pieces in the sample below):

more html...

Bacon ipsum dolor amet salami t-bone pancetta, chuck leberkas tenderloin pork loin. Filet mignon strip steak pig venison meatball chuck spare ribs [[--ContentED.9uFvErVUB342rwukUx3H9||Shank sausage pancetta chicken||1758699LJ||Article--]] and [[--ContentED.V1cnQOilHA5234rs4z27ZuA7||Short ribs andouille short loin pork||1753243LJ||Article--]] for more information.

[[[[AssetED.w8dho5pdsfserwerJ0Q7]]]]¶
[[[[AssetED.icqFyEsdf354wtgs0HwYqwM9·height="305"·width="574"]]]]¶

...more html

I'd like to copy the ID (let's say 2345 for the example above) and the four (or whatever number it is) chunks into a table that looks like this:

What methods can you suggest? Thanks! - Jimmy

MarqueeCrew · ‎01-14-2016

I've got an idea brewing. I might replace everything between ] and [ (that isn't another ']') with a delimiter.

That gives you JUNK followed by [. You can eliminate the junk. Then you have ] followed by JUNK. You can eliminate the junk.

You'll then have to format the data as required. I'll check in the morning to see if you've solved this.
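The replace-and-trim idea above can be sketched outside Alteryx in Python. This is only an illustration under assumptions: the function name `bracket_chunks` and the delimiter `DELIM` are hypothetical, and the sketch assumes square brackets occur only in the chunks of interest.

```python
import re

# Hypothetical delimiter: the ASCII unit separator, assumed never to occur in the HTML.
DELIM = "\x1f"


def bracket_chunks(html: str) -> list:
    """Return the bracketed chunks of html using the replace-and-trim idea."""
    # Replace everything between a ']' and the next '[' with the delimiter.
    text = re.sub(r"\][^\[\]]*\[", "]" + DELIM + "[", html)
    # Drop the junk before the first '[' and after the last ']'.
    start, end = text.find("["), text.rfind("]")
    if start == -1 or end < start:
        return []
    return text[start:end + 1].split(DELIM)
```

If the HTML contains other square brackets (inline script arrays, for instance), this approach would pick those up too, so the result still needs checking against real records.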

Mark


michael_treadwell · ‎01-14-2016

The RegEx pattern (\[{1,4}.*?\]{1,4}) should match each string enclosed in 1 to 4 opening and closing brackets, including the brackets themselves:

( begins a marked group
\[{1,4} matches between 1 and 4 left bracket characters
.*? matches any characters in a lazy (non-greedy) way
\]{1,4} matches between 1 and 4 right bracket characters
) ends the marked group

Use the RegEx tool with the pattern above, set the output method to Tokenize, and select 'Split to Rows'. I've also attached the module so that you can see for yourself.
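For anyone who wants to try the pattern outside Alteryx, here is a minimal Python sketch of the same tokenize-and-split-to-rows behavior. The function name `split_to_rows` and the (ID, HTML) record layout are assumptions for illustration, not part of the attached module.

```python
import re

# The same pattern as above; findall returns the contents of the marked group.
TOKEN = re.compile(r"(\[{1,4}.*?\]{1,4})")


def split_to_rows(records):
    """Turn (record_id, html) pairs into one (record_id, chunk) row per match."""
    rows = []
    for record_id, html in records:
        for chunk in TOKEN.findall(html):
            rows.append((record_id, chunk))
    return rows
```

Calling `split_to_rows([(2345, html)])` on a record like the one in the question should give one row per bracketed chunk, each carrying the ID 2345. Note that `.` does not match a newline by default in Python, so a chunk that spans several lines would not be matched by this sketch.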

jimmys · ‎01-15-2016

Michael,

Beautiful. That worked as I had hoped. Loved learning that too. I have been needing to learn RegEx more and this helps motivate me to do it. Thank you! -Jimmy
