Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Find duplicates in different rows and different columns

ssphv
7 - Meteor

I have 2 tables as below and below is my requirement:

 

1.       Find if there are duplicates for any column in any row. For example, row1 in table1 and row1 in Table2 have ‘2’ in common.  Pull such records and generate a record combining both and removing duplicate like row1 in Table3(Final output)

2.       One condition is when combining such records if total values exceed 20, split into 2 records. Like Row4 in Table1 and Row4 in Table2 together after removing duplicates have 22 values, from Table3 you can see that it is split into 2 rows.

Table1:

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

R

S

T

2

3

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

1

11

56

48

22

38

 

 

 

 

 

 

 

 

 

 

 

 

 

 

81

54

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

10

9

8

7

6

5

12

13

14

15

16

17

18

19

20

21

23

25

 

 

 

Table2:

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

R

S

T

77

2

78

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

60

61

62

63

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

88

89

54

86

90

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

29

30

31

10

9

8

45

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Final Output(Table3):

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

R

S

T

2

3

77

78

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

1

11

56

48

22

38

 

 

 

 

 

 

 

 

 

 

 

 

 

 

81

54

88

89

86

90

 

 

 

 

 

 

 

 

 

 

 

 

 

 

60

61

62

63

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

10

9

8

7

6

5

12

13

14

15

16

17

18

19

20

21

23

25

29

30

31

45

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

How can I achieve this. Thanks in advance!

18 REPLIES 18
MarqueeCrew
20 - Arcturus
20 - Arcturus

Try this approach:  

 

Use a TRANSPOSE tool to group by ROW (RECORDID will help).  This will give you RecordID plus name value pairs.  Filter out all blank values.  Repeat for table 2.  Union the data together and then CROSSTAB the data back to a single row by GROUPING on Record ID, using Name as Header and Value as Value.  In the crosstab, you can select FIRST as the method.  This will remove duplicates.  If it doesn't, then use a unique tool (all 3 fields checked) and try it using concatenate.

 

This should get you past your basic problem.  Next you'll have to deal with limits of 20 fields.

 

Cheers,


Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
Thableaus
17 - Castor
17 - Castor

Hi @ssphv 

 

Not the easiest solution here, but it might work.

 

Check the workflow details and let me know if it suits you. Hard to explain, but it was quite a challenge.

 

Solution.PNG

 

Cheers,

estherb47
15 - Aurora
15 - Aurora

Hi @ssphv 

 

Wow!! What a challenge this one was. I wanted to build something that would match any value in any row or column, as per your request, which took a few transpose, joins, cross tabs, and unions. May be more convoluted than it needs to be. But since Row 5 could match to row 1, transposing and cross-tabbing back together wouldn't work.

Figuring out the "over 20 columns" was fun too. Decided to reassign numbers to the headers, and anything over 20 had the numbers restart at 1. Those records also got a letter appended to their record ID so that the data would build back in the proper order.

Please let me know if any of this doesn't make sense, and if it works for you.
image.png
cheers!

Esther

estherb47
15 - Aurora
15 - Aurora

Just tried it with different sets of numbers, and it still works :)

ssphv
7 - Meteor

This works perfectly and yes I forgot to mention that it should match any value in any row and any column. I appreciate your help!! Thank you!

ssphv
7 - Meteor

@estherb47  Thanks for providing the solution. Today I noticed a new case which is not being handled by this workflow. I updated my input data. 

 

Table 1 : 5th row

Table 2: 5th and 6th rows. 

 

5th row in table1 matches with both 5 and 6 in table2. In this case the result is not expected and I couldn't figure out why it is aggregating values. Any help will be appreciated.

 

Thank you in advance!

estherb47
15 - Aurora
15 - Aurora

Hi @ssphv ,

 

I'll take a look at this today. Apologies for the lengthy delay - was travelling last week.

 

Cheers!

Esther

ssphv
7 - Meteor

No problem. Thank you in advance!

estherb47
15 - Aurora
15 - Aurora

Thought about this some more. I think to get the results of matching to multiple rows, not just one row, you'd need a batch macro that compares one row of the bottom input against the whole top data set, then moves down to the next row and repeats the process. After the macro runs, you could remove the duplicate rows.

 

I think you can just turn my original flow into this batch macro, and have the second input feed the Control parameter. Let me know if you need more help with implementing (and then please let me know which version of Alteryx you're running so I can be sure the macro will work for you)

 

Cheers!

Esther

Labels