Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Search String for a Substring and return the preceding number value

ChrisB_dup_72
7 - Meteor

Hi,

I have a table in and two of the columns have a lot of text.  I'm trying to get specific information from those columns that might come in different forms.

 

Here is an example of the text in 1 cell:

AHU-1 (Basement floor) - 4,280 CFM's w/VFD, The 15 HP VFD was added in 2014, 10HP on the return side. heating and cooling coils, ducted distribution, diffusers and plenum return. Poor access to unit.

 

I need to search for CFM and HP then I need the preceding number values.  There may or may not be a space between the number and the acronym.  

This particular example has two HP values.  In this case I would want the larger number.  So I would like two resulting fields, "CFM"  with a value of 4,280 and "HP" with a value of 15.

 

Any help would be appreciated.

Thank you

 

 

2 REPLIES 2
estherb47
15 - Aurora
15 - Aurora

Hi @ChrisB_dup_72 

Great challenge here!

First step is replace commas with nothing (I did this with a formula tool. There are many other ways)

Then, we need to separate into columns what you'd like to extract. A RegEx tool, Tokenizing into rows, works really well here. I used the following as my statement: (\d+\s*(?:CFM|HP))

Find a number, which may or may not be followed by a space, then followed by CFM or HP. The ?: is an unmarked group, which is telling alteryx to look for either "CFM" or "HP" as a group of letters

EstherB47_0-1588215648931.png

 



With the example you gave, that gives 3 rows. Another RegEx tool extracts the number and text into separate columns. You could also use a formula tool and regex_replace to pull the numbers into one column, and the text into another. Just be sure to set the number column as Integer type.

EstherB47_1-1588215680107.png

 



A summarize tool finds the greatest number per text qualifier. If you'd like that to be in columns, add a Cross Tab tool afterwards.

 



Let me know if this helps.

Cheers!

Esther

ChrisB_dup_72
7 - Meteor

Hi Ester,

I just got back on this workflow tonight.  You're solution worked out beautifully.  Thank you so much!

 

I'm new the RegEx tool.  Is there a good source to interpret statements like (\d+\s*(?:CFM|HP))?  I'd like to learn more about that because I'm going to need to use in on datasets similar to the one in this example.

 

Thanks again,

Chris

Labels