Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Filtering Fields based on Text

AustinRiggs94
8 - Asteroid

Hi All,

 

I am trying to filter columns containing certain software. Is there a way to not only filter the column based on a text i.e., SAP but to also delete all other text in that cell except for the Text (text in the filter criteria). 

 

Below is an example of text in a cell. Please let me know, thanks

 

FileMaker Pro, ITTIA ODBC, Microsoft Access, Microsoft SQL Server, Microsoft SQL Server 2008, Microsoft Team Foundation Server (TFS), MongoDB, Oracle Database (RDBMS), SAP HANA, Sybase, UDB

9 REPLIES 9
Kenda
16 - Nebula
16 - Nebula

You cannot filter and modify the field's value at the same time. You would have to use a Filter tool then add a Formula tool to get rid of the extra text you don't want.

 

Hope this helps!

AustinRiggs94
8 - Asteroid

That makes sense. Thanks. I am having an issue though, im getting no true outputs. Below is the formula im using an below that is an example of a field. Please let me know, thanks

(Column name is Business Intelligence/Big Data)

 

 

[Business Intelligence/Big Data] IN ("SAP Business Intelligence (BI)","SAP BusinessObjects","SAP Crystal Reports","SAP NetWeaver BPM","SAP BusinessObjects Web Intelligence")

 

 

Adobe Analytics, Adobe SiteCatalyst (Omniture), Apache Hadoop, Clearfit, ClickTale, Domino, Gerber AccuMark, Google Analytics, IBM Cognos Enterprise, IBM SPSS Advanced Statistics, iDashboards, Keen, Keynote, L2, MEGA, MicroStrategy Business Intelligence, Oracle BI Publisher, Oracle BlueKai, Oracle Business Intelligence Discoverer, Oracle Business Intelligence Enterprise Edition, Oracle OBIEE, PARIS, SAP BusinessObjects, SAP BusinessObjects Web Intelligence, SAP Crystal Reports, SAS Enterprise BI Server, SAS Visual Analytics, Splunk, Tableau, Tableau Desktop, Twitter Universal Website Tag
Kenda
16 - Nebula
16 - Nebula

@AustinRiggs94  Because you're using the IN() function, Alteryx is looking for the keywords you're passing to it to be the entire value of the field. Instead, you'll want to use the Contains() function to check if your values are in any part of the field.

AustinRiggs94
8 - Asteroid

@Kenda Thanks for your help, i got the formulas working, however im having trouble removing the extra text within the field. I understand the RegEx is probably my best option, but i do not know how to properly write a RegEx formula to look for text such as "SAP BusinessObjects". Do you have an example or a link that explains? Please let me know, thanks.

Kenda
16 - Nebula
16 - Nebula

Hey @AustinRiggs94 

 

You're right, RegEx is probably your best bet. Using your above example, I was able to get rid of the extra text with this expression in a Formula tool:

REGEX_Replace([Business Intelligence/Big Data], "(.*)(SAP BusinessObjects)(.*)", "$2")

This is basically splitting up your field into three sections (each set of parenthesis) then tells Alteryx to only keep what's within the second set.

 

Hope this helps!

AustinRiggs94
8 - Asteroid

Thanks! @Kenda im still understanding how Regex works and how to write certain expressions. Is there a way for RegEx to pull multiple phrases such as SAP BusinessObjects, SAP Crystal Reports, SAP HANA? Or would i have to write an expression for each phrase?

Kenda
16 - Nebula
16 - Nebula

@AustinRiggs94  If you're wanting to keep multiple phrases, I have an alternative way to do this that might be better. 

 

Instead of editing the existing field to parse it down to only the phrases you want, what if you create a new field and build it using only the phrases you want? For example, create a new field in a Formula tool with this expression:

IIF(Contains([Business Intelligence/Big Data], "SAP BusinessObjects"), "SAP BusinessObjects", "")

Then, create another expression in the same Formula tool to edit the field you just created like this:

IIF(Contains([Business Intelligence/Big Data], "SAP Crystal Reports"), [programs]+", SAP Crystal Reports", [programs])

Capture.PNG

 

 

Continue doing this with all of the phrases you want to search for. In the end, you will get a new field that only has a list of your desired programs that are contained in the original field. In the picture below, I only have two because I only checked for two.

Capture.PNG

 

 

estherb47
15 - Aurora
15 - Aurora

Hi @AustinRiggs94 ,

 

Great work so far with @Kenda . Regex is an amazing tool, and I highly recommend looking into the Weekly Challenges to learn more about RegEx and see how people are leveraging its power.

 

I took a very different approach. Took the long string and parsed into rows using Text to Columns, and cleaned up extra spaces with Data Cleansing (unnecessary, I just prefer that). Then used a filter for all rows containing SAP. Finally, if you want them all in the same cell, concatenate back together with a summarize tool, where the separator is ", "  a comma followed by a space.

 

If you have multiple rows of this complicated cell, add a Record ID tool before the Text to Columns, and then in the summarize tool, use the Record ID field as a "Group By" field.

 

No RegEx, still flexible. The beauty of Alteryx is being able to solve the same challenge in many ways.

Cheers!

Esther
image.png

AustinRiggs94
8 - Asteroid

Hi All,

 

Thanks @Kenda and @estherb47 i tried both of your ways, and both of them worked. I ended up having to do a set of formulas for each software name. The excel i had, had multiple columns of these fields, for multiple entities. (see attached). For future reference is there a way to Regex_replace or parse an entire spreadsheet? Im sure in the future i will come across this and i would like to be prepared. I know there has to be someway to do this, i just do not know how yet. Please let me know, thanks again for all of your help.

 

Please note, there are multiple columns in the original excel that look like the one given.

Labels