Alteryx Designer Desktop Discussions

ssphv · ‎02-11-2019

I have 2 tables as below and below is my requirement:

1. Find if there are duplicates for any column in any row. For example, row1 in table1 and row1 in Table2 have ‘2’ in common. Pull such records and generate a record combining both and removing duplicate like row1 in Table3(Final output)

2. One condition is when combining such records if total values exceed 20, split into 2 records. Like Row4 in Table1 and Row4 in Table2 together after removing duplicates have 22 values, from Table3 you can see that it is split into 2 rows.

Table1:

A	B	C	D	E	F	G	H	I	J	K	L	M	N	O	P	Q	R	S	T
2	3
1	11	56	48	22	38
81	54
10	9	8	7	6	5	12	13	14	15	16	17	18	19	20	21	23	25

Table2:

A	B	C	D	E	F	G	H	I	J	K	L	M	N	O	P	Q	R	S	T
77	2	78
60	61	62	63
88	89	54	86	90
29	30	31	10	9	8	45

Final Output(Table3):

A	B	C	D	E	F	G	H	I	J	K	L	M	N	O	P	Q	R	S	T
2	3	77	78
1	11	56	48	22	38
81	54	88	89	86	90
60	61	62	63
10	9	8	7	6	5	12	13	14	15	16	17	18	19	20	21	23	25	29	30
31	45

How can I achieve this. Thanks in advance!

MarqueeCrew · ‎02-11-2019

Try this approach:

Use a TRANSPOSE tool to group by ROW (RECORDID will help). This will give you RecordID plus name value pairs. Filter out all blank values. Repeat for table 2. Union the data together and then CROSSTAB the data back to a single row by GROUPING on Record ID, using Name as Header and Value as Value. In the crosstab, you can select FIRST as the method. This will remove duplicates. If it doesn't, then use a unique tool (all 3 fields checked) and try it using concatenate.

This should get you past your basic problem. Next you'll have to deal with limits of 20 fields.

Cheers,

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.

Thableaus · ‎02-11-2019

Hi @ssphv

Not the easiest solution here, but it might work.

Check the workflow details and let me know if it suits you. Hard to explain, but it was quite a challenge.

Cheers,

estherb47 · ‎02-11-2019

Hi @ssphv

Wow!! What a challenge this one was. I wanted to build something that would match any value in any row or column, as per your request, which took a few transpose, joins, cross tabs, and unions. May be more convoluted than it needs to be. But since Row 5 could match to row 1, transposing and cross-tabbing back together wouldn't work.

Figuring out the "over 20 columns" was fun too. Decided to reassign numbers to the headers, and anything over 20 had the numbers restart at 1. Those records also got a letter appended to their record ID so that the data would build back in the proper order.

Please let me know if any of this doesn't make sense, and if it works for you.

cheers!

Esther

estherb47 · ‎02-11-2019

Just tried it with different sets of numbers, and it still works :)

ssphv · ‎02-14-2019

This works perfectly and yes I forgot to mention that it should match any value in any row and any column. I appreciate your help!! Thank you!

ssphv · ‎02-20-2019

@estherb47 Thanks for providing the solution. Today I noticed a new case which is not being handled by this workflow. I updated my input data.

Table 1 : 5th row

Table 2: 5th and 6th rows.

5th row in table1 matches with both 5 and 6 in table2. In this case the result is not expected and I couldn't figure out why it is aggregating values. Any help will be appreciated.

Thank you in advance!

estherb47 · ‎02-25-2019

Hi @ssphv ,

I'll take a look at this today. Apologies for the lengthy delay - was travelling last week.

Cheers!

Esther

ssphv · ‎02-26-2019

No problem. Thank you in advance!

estherb47 · ‎02-26-2019

Thought about this some more. I think to get the results of matching to multiple rows, not just one row, you'd need a batch macro that compares one row of the bottom input against the whole top data set, then moves down to the next row and repeats the process. After the macro runs, you could remove the duplicate rows.

I think you can just turn my original flow into this batch macro, and have the second input feed the Control parameter. Let me know if you need more help with implementing (and then please let me know which version of Alteryx you're running so I can be sure the macro will work for you)

Cheers!

Esther

Alteryx Designer Desktop Discussions

Find duplicates in different rows and different columns

Re: Date Time Function - Prioritization Base on Du...

Re: Running multiple alteryx workflows within alte...

Re: Selecting the columns coming after a specific ...

Re: Regex(?) formula to remove values matching the...

Re: Python ECC SAP Extract into Alteryx Workflow

A	B	C	D	E	F	G	H	I	J	K	L	M	N	O	P	Q	R	S	T
2	3
1	11	56	48	22	38
81	54
10	9	8	7	6	5	12	13	14	15	16	17	18	19	20	21	23	25

A	B	C	D	E	F	G	H	I	J	K	L	M	N	O	P	Q	R	S	T
2	3	77	78
1	11	56	48	22	38
81	54	88	89	86	90
60	61	62	63
10	9	8	7	6	5	12	13	14	15	16	17	18	19	20	21	23	25	29	30
31	45

A	B	C	D	E	F	G	H	I	J	K	L	M	N	O	P	Q	R	S	T
2	3
1	11	56	48	22	38
81	54
10	9	8	7	6	5	12	13	14	15	16	17	18	19	20	21	23	25

A	B	C	D	E	F	G	H	I	J	K	L	M	N	O	P	Q	R	S	T
2	3	77	78
1	11	56	48	22	38
81	54	88	89	86	90
60	61	62	63
10	9	8	7	6	5	12	13	14	15	16	17	18	19	20	21	23	25	29	30
31	45

A	B	C	D	E	F	G	H	I	J	K	L	M	N	O	P	Q	R	S	T
2	3
1	11	56	48	22	38
81	54
10	9	8	7	6	5	12	13	14	15	16	17	18	19	20	21	23	25

A	B	C	D	E	F	G	H	I	J	K	L	M	N	O	P	Q	R	S	T
2	3	77	78
1	11	56	48	22	38
81	54	88	89	86	90
60	61	62	63
10	9	8	7	6	5	12	13	14	15	16	17	18	19	20	21	23	25	29	30
31	45