find text in html code and put pieces into a table
I have lots of records, each with an ID and HTML code. In that HTML there are funky chunks of code (the bracketed [[...]] and [[[[...]]]] strings, highlighted in red in the original post):
more html...
<p style="padding-left: 30.0px;">Bacon ipsum dolor amet salami t-bone pancetta, chuck leberkas tenderloin pork loin. Filet mignon strip steak pig venison meatball chuck spare ribs <span style="font-family: arial, helvetica, sans-serif; font-size: 10pt;">[[--ContentED.9uFvErVUB342rwukUx3H9||Shank sausage pancetta chicken||1758699LJ||Article--]] and [[--ContentED.V1cnQOilHA5234rs4z27ZuA7||Short ribs andouille short loin pork||1753243LJ||Article--]] for more information.</span></p>
<p style="padding-left: 30.0px;">[[[[AssetED.w8dho5pdsfserwerJ0Q7]]]]</p>
<p style="padding-left: 30.0px;">[[[[AssetED.icqFyEsdf354wtgs0HwYqwM9 height="305" width="574"]]]]</p>
...more html
I'd like to copy the ID (let's say 2345 for the example above) and the four chunks (or however many there are) into a table that looks like this:
What methods can you suggest? Thanks! - Jimmy
That gives you JUNK followed by [. You can eliminate the junk. Then you have ] followed by JUNK. You can eliminate the junk.
You'll then have to format the data as required. I'll check in the morning to see if you've solved this.
Mark
The RegEx pattern (\[{1,4}.*?\]{1,4}) should match any string enclosed in one to four brackets on each side, including the brackets themselves:
- ( begins a marked group
- \[{1,4} matches between 1 and 4 left bracket characters
- .*? matches any characters in a lazy (non-greedy) way, i.e. as few as possible
- \]{1,4} matches between 1 and 4 right bracket characters
- ) ends the marked group
In the RegEx tool, set the output method to Tokenize, use the pattern above, and select 'Split to Rows'. Each bracketed chunk then lands on its own row alongside the record's ID. I've also attached the module so that you can see for yourself.
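For anyone who wants to try the pattern outside Alteryx, here is a minimal Python sketch of the same tokenize-and-split-to-rows idea. It is not the attached module: the function name `split_chunks_to_rows` is made up for illustration, the sample HTML is trimmed from the question, and the ID 2345 is the example ID mentioned there.

```python
import re

# Same pattern as above: 1-4 opening brackets, lazy content, 1-4 closing brackets.
CHUNK_PATTERN = re.compile(r"(\[{1,4}.*?\]{1,4})")


def split_chunks_to_rows(record_id, html):
    """Return one (record_id, chunk) row per bracketed chunk found in html.

    Hypothetical helper: it mimics 'Tokenize' + 'Split to Rows' for one record.
    """
    return [(record_id, chunk) for chunk in CHUNK_PATTERN.findall(html)]


# Sample trimmed from the question; 2345 is the example ID used there.
sample_html = (
    '<p>spare ribs [[--ContentED.9uFvErVUB342rwukUx3H9||Shank sausage pancetta '
    'chicken||1758699LJ||Article--]] and [[--ContentED.V1cnQOilHA5234rs4z27ZuA7||'
    'Short ribs andouille short loin pork||1753243LJ||Article--]] for more '
    'information.</p>\n'
    '<p>[[[[AssetED.w8dho5pdsfserwerJ0Q7]]]]</p>\n'
    '<p>[[[[AssetED.icqFyEsdf354wtgs0HwYqwM9 height="305" width="574"]]]]</p>'
)

rows = split_chunks_to_rows(2345, sample_html)
for row in rows:
    print(row)
```

On this sample the sketch yields four rows, one per chunk, each paired with the ID. Note that `.` does not match a newline by default in Python, so a chunk that wraps across lines would need the `re.DOTALL` flag.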
Michael,
Beautiful. That worked as I had hoped. Loved learning that too. I have been needing to learn more RegEx and this helps motivate me to do it. Thank you! -Jimmy
