Get Inspire insights from former attendees in our AMA discussion thread on Inspire Buzz. ACEs and other community members are on call all week to answer!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Filter out data from one column based on values from another column

Chirag_Gandhi07
8 - Asteroid

Hi everyone, 

 

I have a column (Software Title) that contains software titles and another column (Q4) that contains the values yes or no based on email addresses. I would like to filter out any software title that appears more than once and has yes and no for (Q4). Below is an example;

 

Email                         Q4                   Software Title

test@test.com           Yes                  Tableau

test1@test.com         No                    Tableau

test2@test.com         Yes                   Alteryx        

test3@test.com         No                     Visual Studio

 

So the filtered data would be:

 

Email                        Q4                     Software Title

test@test.com          Yes                    Tableau

test1@test.com        No                      Tableau

 

(The filtered data is as is because Tableau shows up twice AND has both yes and no for Q4). Any help on how this can be achieved would be much appreciated. Thanks!

19 REPLIES 19
Kenda
16 - Nebula
16 - Nebula

Hey @Chirag_Gandhi07!

 

I have attached a workflow that should get you the output you're looking for. I added one more sample row of data to show what happens when you have a duplicate software title but with the same Q4 response. The Browse on the right matches the output you provided in your post while the left Browse shows the rest of the results. Note you need the CReW Only Unique macro to use this solution. 

 

Capture.PNG

 

If you don't have the CReW macros or don't want to use them, an alternative would be to use a Summarize to GroupBy and Count Software Title then Filter to only keep the counts greater than one. Then you could Join back with your original data and would get the same result as you would from the Only Unique tool. I have provided this solution in a Tool Container to the far right of the workflow.

 

Hope this helps!

Chirag_Gandhi07
8 - Asteroid
 
Kenda
16 - Nebula
16 - Nebula

@Chirag_Gandhi07 After the right Join, just add a Formula tool to the J output to create a new field with "Split" as the expression. Then output to the Union tool. The other records will be null for this new field. You can leave them that way or give them a different value in the same way you made this one "Split."

 

Capture.PNGCapture1.PNG

Chirag_Gandhi07
8 - Asteroid

@KendaJust an update, rather than doing all this, is it possible to add a formula tool to the original data and add a column split that will populate "Split" into the rows where software titles occur more than once and have both yes and no for Q4? 

Kenda
16 - Nebula
16 - Nebula

@Chirag_Gandhi07 Will there ever be more than 2 records per Software Title? 

 

If yes:

See the attached workflow (the left stream of data). I took it down to just 3 main tools. I first use a Summarize to GroupBy both Software Title and Q4, in that order. Then I added a Multi-Row Formula tool with this expression:

iif(([Q4]="No" && [Row+1:Q4]="Yes") || ([Row-1:Q4]="No" && [Q4]="Yes"),"Split",null())

and grouping by Software Title. To get all of the records back, I added a Join tool.

 

 

If no:

See the attached workflow (the right stream of data). You could simply add a Sort tool to Sort by Software Title first then Q4. Then add the same Multi-Row Formula tool as above and you should get your desired output.

 

Hope this helps!

Chirag_Gandhi07
8 - Asteroid

@KendaThanks for the response. For my Q4, the is a third value of unrespondent along with yes and no. When I use your solution, it only shows software titles with yes and no. I could have an example of software title a with different emails and yes, yes, no, yes, no, unrespondent for Q4.  I didn't think that would make a difference with your solution by it did and only displayed the software titles with yes and no that appear once.

Kenda
16 - Nebula
16 - Nebula

@Chirag_Gandhi07 Are those the only 3 possible values for the Q4 field? If so, the attached workflow should do the trick. I just changed the expression in the Multi-Row Formula tool from the previous post in order to account for the third possible value but still checking that yes and no are both present to apply the split. The expression is quite long, so let me know if you need help deciphering it.

Chirag_Gandhi07
8 - Asteroid

@KendaThanks! Thee thing is that if the value for Q4 is respondent, it shouldn't have spilt. It should be null. Only if it is yes and no, than split should be displayed in that column. (The software Title can have multiple yes and no's)

Kenda
16 - Nebula
16 - Nebula

@Chirag_Gandhi07 The left flow of data in second to last workflow that I attached previously should work, then. The one where I wrote out the code in the post. I have reattached here for convenience. Is this not what you are looking for?

 

Input dataInput data

 

 

OutputOutput

Labels