I am trying to build a workflow to compare two data sets(first input is the excel file and other input is connected to a query in database) to see if the account in one data set exists in the other and return the matched and unmatched records. Any suggestions to do this better would be appreciated. I am totally new to Alteryx and in the process of learning things to do more efficiently. Thank you!
Solved! Go to Solution.
Hi @aparna0208
I think a Join will give you what you're looking for.
Take an input data tool and read the excel file, connect it to the L input connector of the Join Tool.
Then grab another one and query the database and connect it to the R input connector of the Join.
Then try them to join by account number and you'll get all results in three ways:
- From the L output connector: All records from the L input dataset (the Excel file) that doen't exist on your DB Query
- From the R output connector: All records in the DB Query that doesn't exists in the Excel
- From the J output connector (The middle one): All existing in both datasources.
This Venn diagrams below the Join tool field config give you an idea of what you're going to get from the outputs (Left UNJOINED, joined and right UNJOINED)
Hope this helps.
This will definitely help as this is exactly what I was looking for. Thank you so much @aguisande
Great! Anytime @aparna0208 !
Hi Aguisande,
Is there any other option to check the two values and trigger an error message or stop the flow, if it is "false"?
Thanks
Hi,
I have a version table which has one record for every change done for one or more column in the table at same time.
I am trying to use Alteryx to find what columns were changed in each version and the corresponding value.
Is it possible to do in Alteryx.
Example:
ID | COL1 | COL2 | VERSION | CREATION_DATE | UPDATED_DATE |
100 | A | B | 1 | 10-Jan-2019 | |
100 | A1 | B | 2 | 11-Jan-2019 | |
100 | A2 | B1 | 3 | 12-Jan-2019 | |
In the above example, for ID:100, value changed for COL1 in version 2 and value changed for both COL1 and COL2 in version 3.
I want to show each version wise the attribute whose value changed and the value of same.
Thanks,
Pankaj
What is awesome with Alteryx is that the answer to the question "Is it possible to do in Alteryx." is YES!
What I'd do is :
- First: Filter all UPDATED_DATE that are not Empty or Null (This should mean that that record had changed)
- Then: Use a Summarize Tool, and Group by: UPDATED_DATE, VERSION, COL1 (Assuming is the original Date) and get COL2 using Concatenate as the Aggragation method.
Hope this helps.
Hi, @Aguisande this is helpful.
I can group by specific column in a table and concatenate all values for each column for different records in the table for group by column, so I am able to see all attribute and its value in each version as per group by column.
Is there a way I can make it more dynamic where Alteryx can only shows those columns and its value which are changed between versions, see the example below.
MEMBER_ID | MEMBER_NAME | CREATION_DATE | UPDATE_DATE | VERSION | STATE | CITY | ZIP |
100 | MEMBER_X1 | 10-Jan-2019 | 1 | PA | City1 | 12345 | |
100 | MEMBER_X1 | 11-Jan-2019 | 11-Jan-2019 | 2 | PA | City2 | 45678 |
101 | MEMBER_Y1 | 10-Mar-2019 | 1 | NJ | City3 | 78645 | |
101 | MEMBER_Y1 | 11-Mar-2019 | 11-Mar-2019 | 2 | NJ | City4 | 89564 |
Now, the result expected should have like: ( only showing columns whose value has changed)
MEMBER_ID | CREATION_DATE | VERSION | CITY | ZIP |
100 | 10-Jan-2019|11-Jan-2019 | 1|2 | City1|City2 | 12345|45678 |
101 | 10-Mar-2019|11-Mar-2019 | 1|2 | City3|City4 | 78645|89564 |
So In above table, we need to display only columns whose value has changed between two version of records ( user would like to see only attributes whose value changed between versions).
Anther possible output format could be: Only showing column whose value is changed.
MEMBER_ID | ColumnName | Version1 | Version2 |
100 | Creation Date | 10-Jan-2019 | 11-Jan-2019 |
100 | City | City1 | City2 |
100 | Zip | 12345 | 45678 |
101 | Creation Date | 10-Mar-2019 | 11-Mar-2019 |
101 | City | City1 | City2 |
101 | Zip | 78645 | 89564 |
I think creating above output format may be complex, where columns will increase automatically depending on more and more versions of records created in the source table.
Let me know your thoughts, how we can create above output formats dynamically.
To get the first expected result, I'll suggest this:
- Filter the modified records (UPDATE_DATE is Not Null). Based on the data, this gives you ONLY those records that registered a modification.
- Group the records by MEMEBER_ID and concatenate the values (CREATION_DATE, VERSION, etc).
- Once you have both streams, join them together by MEMBER_ID and deselect all Left incoming records
You'll end up with this:
The only shortcoming of this is that it can't evaluate that STATE remains unchanged.