Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Parse text from SharePoint Input Tool

newuser980345483
7 - Meteor

I have a SharePoint Input Tool which brings in a list with multiple fields. One field in specific (called 'Additional Items') is a free text field that the end user fills out. Typical structure of the field in SharePoint is as follows: 

 

Description / Description :PRODUCT 1 NAME
Item / Art :12345678
UPC / CUP :12345678901
Retail / P. det :$11.96
Size / Format :EA
  
Description / Description :PRODUCT 2 NAME
Item / Art :23456789
UPC / CUP :23456789012
Retail / P. det :$13.98
Size / Format :EA
  
Description / Description :PRODUCT 3 NAME
Item / Art :34567890
UPC / CUP :34567890123
Retail / P. det :$11.96
Size / Format :EA
  
Description / Description :PRODUCT 4 NAME
Item / Art :45678901
UPC / CUP :45678901234
Retail / P. det :$11.96
Size / Format :EA
  
Description / Description :PRODUCT 5 NAME
Item / Art :56789012
UPC / CUP :56789012345
Retail / P. det :$11.96
Size / Format :EA
  
Description / Description :PRODUCT 6 NAME
Item / Art :67890123
UPC / CUP :67890123456
Retail / P. det :$95.68
Size / Format :EA
  
Description / Description :PRODUCT 7 NAME
Item / Art :78901234
UPC / CUP :78901234567
Retail / P. det :$19.97
Size / Format :EA
  
Description / Description :PRODUCT 8 NAME
Item / Art :89012345
UPC / CUP :89012345678
Retail / P. det :$19.97
Size / Format :EA

 

When the Additional Items field pulls into Alteryx, it looks like this:

 

<div class="ExternalClassDD65885A5A8245CF91E842F2A32D7147"><pre>
&#160;</pre><table><tbody><tr><td>Description / Description &#58;</td><td>PRODUCT 1 NAME</td></tr><tr><td>Item / Art &#58;</td><td>123456789</td></tr><tr><td>UPC / CUP &#58;</td><td>12345678901</td></tr><tr><td>Retail / P. det &#58;</td><td>$11.96</td></tr><tr><td>Size / Format &#58;</td><td>EA</td></tr><tr><td>&#160;</td><td>&#160;</td></tr><tr><td>Description / Description &#58;</td><td>PRODUCT 2 NAME</td></tr><tr><td>Item / Art &#58;</td><td>234567890</td></tr><tr><td>UPC / CUP &#58;</td><td>23456789012</td></tr><tr><td>Retail / P. det &#58;</td><td>$13.98</td></tr><tr><td>Size / Format &#58;</td><td>EA</td></tr><tr><td>&#160;</td><td>&#160;</td></tr><tr><td>Description / Description &#58;</td><td>PRODUCT 3 NAME</td></tr><tr><td>Item / Art &#58;</td><td>345678901</td></tr><tr><td>UPC / CUP &#58;</td><td>34567890123</td></tr><tr><td>Retail / P. det &#58;</td><td>$11.96</td></tr><tr><td>Size / Format &#58;</td><td>EA</td></tr><tr><td>&#160;</td><td>&#160;</td></tr><tr><td>Description / Description &#58;</td><td>PRODUCT 4 NAME</td></tr><tr><td>Item / Art &#58;</td><td>456789012</td></tr><tr><td>UPC / CUP &#58;</td><td>45678901234</td></tr><tr><td>Retail / P. det &#58;</td><td>$11.96</td></tr><tr><td>Size / Format &#58;</td><td>EA</td></tr><tr><td>&#160;</td><td>&#160;</td></tr><tr><td>Description / Description &#58;</td><td>PRODUCT 5 NAME</td></tr><tr><td>Item / Art &#58;</td><td>567890123</td></tr><tr><td>UPC / CUP &#58;</td><td>56789012345</td></tr><tr><td>Retail / P. det &#58;</td><td>$11.96</td></tr><tr><td>Size / Format &#58;</td><td>EA</td></tr><tr><td>&#160;</td><td>&#160;</td></tr><tr><td>Description / Description &#58;</td><td>PRODUCT 6 NAME</td></tr><tr><td>Item / Art &#58;</td><td>678901234</td></tr><tr><td>UPC / CUP &#58;</td><td>67890123456</td></tr><tr><td>Retail / P. det &#58;</td><td>$95.68</td></tr><tr><td>Size / Format &#58;</td><td>EA</td></tr><tr><td>&#160;</td><td>&#160;</td></tr><tr><td>Description / Description &#58;</td><td>PRODUCT 7 NAME</td></tr><tr><td>Item / Art &#58;</td><td>789012345</td></tr><tr><td>UPC / CUP &#58;</td><td>78901234567</td></tr><tr><td>Retail / P. det &#58;</td><td>$19.97</td></tr><tr><td>Size / Format &#58;</td><td>EA</td></tr><tr><td>&#160;</td><td>&#160;</td></tr><tr><td>Description / Description &#58;</td><td>PRODUCT 8 NAME</td></tr><tr><td>Item / Art &#58;</td><td>890123456</td></tr><tr><td>UPC / CUP &#58;</td><td>89012345678</td></tr><tr><td>Retail / P. det &#58;</td><td>$19.97</td></tr><tr><td>Size / Format &#58;</td><td>EA</td></tr></tbody></table></div>

 

From the above, I would like to parse out the number that follow the 'Item/Art' flag (highlighted in red text above). 

 

Any insight on how to cleanly parse out these item numbers, either into one field (without everything else), or onto separate rows? 

 

Thanks! 

3 REPLIES 3
Thableaus
17 - Castor
17 - Castor

Hi @newuser980345483 

 

Is this big html text in a single row?

 

 

Cheers,

Thableaus
17 - Castor
17 - Castor

Hi @newuser980345483 

 

Here's a solution.

 

clipboard_image_1.png

 

- Use Regex Tokenize Tool to split your data to rows

([^><]+)(?:<[^>]+>|$)

 

This expression basically selects everything that is not between (<>), these html specifiers.

 

- Use Multi-Row formula field to flag the pair of Item Art and respective number.

 

IF Contains([Field], "Item / Art") THEN 0
ELSEIF Contains([Row-1:Field],"Item / Art") THEN 1
ELSE NULL() ENDIF

 

- User Filter to Flag only the numbers

 

- Use Select Tool to have only a single field.

 

WF attached. Version 2019.3

 

Cheers,

danilang
19 - Altair
19 - Altair

Hi @newuser980345483 

 

For extra credit, here's a dynamic solution based on @Thableaus excellent start that parses all the fields in the dataset and transposes them by product

 

WF.png

 

Start by removing the blank lines and restoring the colon character ":".  Then group the rows by product and assign a line number within each product.  The odd numbered rows become the column headers and the even ones are the values so the Filter tool splits them into 2 sets.  The 2 sets are joined by record position and the crosstab tool flips the data into columns.  The Unique tool and the Dynamic Rename work around the fact the default behaviour of the Crosstab tool is to  reorder the columns alphabetically and replace all non- alpha characters with underscores. 

 

The results are as follows

Results.png

 

Dan

  

Labels