Alteryx Designer Desktop Discussions

aparna0208 · ‎01-23-2019

I am trying to build a workflow to compare two data sets(first input is the excel file and other input is connected to a query in database) to see if the account in one data set exists in the other and return the matched and unmatched records. Any suggestions to do this better would be appreciated. I am totally new to Alteryx and in the process of learning things to do more efficiently. Thank you!

Aguisande · ‎01-23-2019

Hi @aparna0208

I think a Join will give you what you're looking for.

Take an input data tool and read the excel file, connect it to the L input connector of the Join Tool.

Then grab another one and query the database and connect it to the R input connector of the Join.

Then try them to join by account number and you'll get all results in three ways:

- From the L output connector: All records from the L input dataset (the Excel file) that doen't exist on your DB Query

- From the R output connector: All records in the DB Query that doesn't exists in the Excel

- From the J output connector (The middle one): All existing in both datasources.

This Venn diagrams below the Join tool field config give you an idea of what you're going to get from the outputs (Left UNJOINED, joined and right UNJOINED)

Hope this helps.

aparna0208 · ‎01-23-2019

This will definitely help as this is exactly what I was looking for. Thank you so much @aguisande

Aguisande · ‎01-23-2019

Great! Anytime @aparna0208 !

ssabapa5 · ‎04-27-2019

Hi Aguisande,

Is there any other option to check the two values and trigger an error message or stop the flow, if it is "false"?

Thanks

cheerspankaj · ‎05-14-2019

Hi,

I have a version table which has one record for every change done for one or more column in the table at same time.

I am trying to use Alteryx to find what columns were changed in each version and the corresponding value.

Is it possible to do in Alteryx.

Example:

ID	COL1	COL2	VERSION	CREATION_DATE	UPDATED_DATE
100	A	B	1	10-Jan-2019
100	A1	B	2		11-Jan-2019
100	A2	B1	3		12-Jan-2019

In the above example, for ID:100, value changed for COL1 in version 2 and value changed for both COL1 and COL2 in version 3.

I want to show each version wise the attribute whose value changed and the value of same.

Thanks,

Pankaj

Aguisande · ‎05-14-2019

Hi @cheerspankaj

What is awesome with Alteryx is that the answer to the question "Is it possible to do in Alteryx." is YES!

What I'd do is :

- First: Filter all UPDATED_DATE that are not Empty or Null (This should mean that that record had changed)

- Then: Use a Summarize Tool, and Group by: UPDATED_DATE, VERSION, COL1 (Assuming is the original Date) and get COL2 using Concatenate as the Aggragation method.

Hope this helps.

cheerspankaj · ‎05-15-2019

Hi, @Aguisande this is helpful.

I can group by specific column in a table and concatenate all values for each column for different records in the table for group by column, so I am able to see all attribute and its value in each version as per group by column.

Is there a way I can make it more dynamic where Alteryx can only shows those columns and its value which are changed between versions, see the example below.

MEMBER_ID	MEMBER_NAME	CREATION_DATE	UPDATE_DATE	VERSION	STATE	CITY	ZIP
100	MEMBER_X1	10-Jan-2019		1	PA	City1	12345
100	MEMBER_X1	11-Jan-2019	11-Jan-2019	2	PA	City2	45678
101	MEMBER_Y1	10-Mar-2019		1	NJ	City3	78645
101	MEMBER_Y1	11-Mar-2019	11-Mar-2019	2	NJ	City4	89564

Now, the result expected should have like: ( only showing columns whose value has changed)

MEMBER_ID	CREATION_DATE	VERSION	CITY	ZIP
100	10-Jan-2019\|11-Jan-2019	1\|2	City1\|City2	12345\|45678
101	10-Mar-2019\|11-Mar-2019	1\|2	City3\|City4	78645\|89564

So In above table, we need to display only columns whose value has changed between two version of records ( user would like to see only attributes whose value changed between versions).

Anther possible output format could be: Only showing column whose value is changed.

MEMBER_ID	ColumnName	Version1	Version2
100	Creation Date	10-Jan-2019	11-Jan-2019
100	City	City1	City2
100	Zip	12345	45678
101	Creation Date	10-Mar-2019	11-Mar-2019
101	City	City1	City2
101	Zip	78645	89564

I think creating above output format may be complex, where columns will increase automatically depending on more and more versions of records created in the source table.

Let me know your thoughts, how we can create above output formats dynamically.

Aguisande · ‎05-15-2019

@cheerspankaj

To get the first expected result, I'll suggest this:

- Filter the modified records (UPDATE_DATE is Not Null). Based on the data, this gives you ONLY those records that registered a modification.

- Group the records by MEMEBER_ID and concatenate the values (CREATION_DATE, VERSION, etc).

- Once you have both streams, join them together by MEMBER_ID and deselect all Left incoming records

You'll end up with this:

The only shortcoming of this is that it can't evaluate that STATE remains unchanged.

MEMBER_ID	CREATION_DATE	VERSION	CITY	ZIP
100	10-Jan-2019\|11-Jan-2019	1\|2	City1\|City2	12345\|45678
101	10-Mar-2019\|11-Mar-2019	1\|2	City3\|City4	78645\|89564

Alteryx Designer Desktop Discussions

Comparing data sets

Zero to Advanced in 20 days

Re: Zero to Advanced in 20 days

Re: Zero to Advanced in 20 days

Passed the Advanced Certification Exam!

Re: Identify duplicates in a specific column, and ...