Showing results for 
Search instead for 
Did you mean: 

Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Comparing data sets


I am trying to build a workflow to compare two data sets(first input is the excel file and other input is connected to a query in database) to see if the account in one data set exists in the other and return the matched and unmatched records. Any suggestions to do this better would be appreciated. I am totally new to Alteryx and in the process of learning things to do more efficiently. Thank you!

Alteryx Certified Partner
Alteryx Certified Partner

Hi @aparna0208 

I think a Join will give you what you're looking for.

Take an input data tool and read the excel file, connect it to the L input connector of the Join Tool.

Then grab another one and query the database and connect it to the R input connector of the Join.


Then try them to join by account number and you'll get all results in three ways:

- From the L output connector: All records from the L input dataset (the Excel file) that doen't exist on your DB Query

- From the R output connector: All records in the DB Query that doesn't exists in the Excel

- From the J output connector (The middle one): All existing in both datasources. 


This Venn diagrams below the Join tool field config give you an idea of what you're going to get from the outputs (Left UNJOINED, joined and right UNJOINED)


Hope this helps.




This will definitely help as this is exactly what I was looking for. Thank you so much @aguisande

Alteryx Certified Partner
Alteryx Certified Partner

Great! Anytime @aparna0208 !


Hi Aguisande,


Is there any other option to check the two values and trigger an error message or stop the flow, if it is "false"?





I have a version table which has one record for every change done for one or more column in the table at same time.

I am trying to use Alteryx to find what columns were changed in each version and the corresponding value.

Is it possible to do in Alteryx.




100A1B2 11-Jan-2019
100A2B13 12-Jan-2019


In the above example, for ID:100, value changed for COL1 in version 2 and value changed for both COL1 and COL2 in version 3.


I want to show each version wise the attribute whose value changed and the value of same.




Alteryx Certified Partner
Alteryx Certified Partner

Hi @cheerspankaj 

What is awesome with Alteryx is that the answer to the question "Is it possible to do in Alteryx." is YES!


What I'd do is :

- First: Filter all UPDATED_DATE that are not Empty or Null (This should mean that that record had changed)

- Then: Use a Summarize Tool, and Group by: UPDATED_DATE, VERSION, COL1 (Assuming is the original Date) and get COL2 using Concatenate as the Aggragation method.


Hope this helps.

Hi, @aguisande this is helpful.

I can group by specific column in a table and concatenate all values for each column for different records in the table for group by column, so I am able to see all attribute and its value in each version as per group by column.


Is there a way I can make it more dynamic where Alteryx can only shows those columns and its value which are changed between versions, see the example below.

100MEMBER_X110-Jan-2019 1PACity112345
101MEMBER_Y110-Mar-2019 1NJCity378645


Now, the result expected should have like: ( only showing columns whose value has changed)



So In above table, we need to display only columns whose value has changed between two version of records ( user would like to see only attributes whose value changed between versions).


Anther possible output format could be: Only showing column whose value is changed.


100Creation Date10-Jan-201911-Jan-2019
101Creation Date10-Mar-201911-Mar-2019


I think creating above output format may be complex, where columns will increase automatically depending on more and more versions of records created in the source table.


Let me know your thoughts, how we can create above output formats dynamically.

Alteryx Certified Partner
Alteryx Certified Partner



To get the first expected result, I'll suggest this:


- Filter the modified records (UPDATE_DATE is Not Null). Based on the data, this gives you ONLY those records that registered a modification.

- Group the records by MEMEBER_ID and concatenate the values (CREATION_DATE, VERSION, etc).




- Once you have both streams, join them together by MEMBER_ID and deselect all Left incoming records



You'll end up with this:



The only shortcoming of this is that it can't evaluate that STATE remains unchanged.