We are celebrating the 10-year anniversary of the Alteryx Community! Learn more and join in on the fun here.
Start Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

De-Duping Relationship and Inverse Relationship Records

kcroninkrein
5 - Atom

Hi all,

 

I have a Relationship export that has two records representing each unique relationship (example: one record for Parent-Child and another for Child-Parent). I need to de-dupe this dataset so that each relationship only has 1 record. The RelationshipType and InverseRelationships are set up such that a given RelationshipType can have only 1 possible InverseRelationship, and vice versa. I've included an example of the dataset below:

 

LegacyContactIdLegacyRelatedContactIdRelationshipTypeInverseRelationship
123456ChildParent
456123ParentChild
7891011BusinessOwner
1011789OwnerBusiness
12131415SiblingSibling
14151213SiblingSibling

 

Note that for simplicity sake, I have put the relationship pair of records next to each other, but in the real dataset they may be far apart. 

 

I thought that I could resolve this using a self-join with the Join tool, connecting Left: LegacyContact = Right: LegacyRelatedContactId and Left: LegacyRelatedContactId = Right: LegacyContactId, but that gives me the same number of records when I would have expected the count to decrease by half.

 

How would y'all suggest that I approach this? Thank you!

2 REPLIES 2
PangHC
13 - Pulsar

add 2 new columns, sort them by formula

 

formula

1: if [LegacyContactId]<[LegacyRelatedContactId] then [LegacyContactId] else [LegacyRelatedContactId] endif

2: if [LegacyContactId]<[LegacyRelatedContactId] then [LegacyRelatedContactId] else [LegacyContactId] endif

 

 

Screenshot 2023-09-19 094609.png

then add unique tool and keep unique by the new 2 column.

 

kcroninkrein
5 - Atom

Thank you, Pang_Hee_Choy!  That seems to have worked. I added in a third column that added in a formula for

  • IF [RelationshipType] < [InverseRelationship] THEN [RelationshipType] ELSE [InverseRelationship] ENDIF

This pulls in whichever relationship is alphabetically earlier than the other (which works because each Relationship has only one InverseRelationship option). That way it isn't removing instances where the two accounts might have a second valid relationship (example: Parent-Child and Team Leader-Team Member). That facet wasn't present in the example dataset, but is in the real data so wanted to mention it for others who might encounter a similar issue. 

Labels
Top Solution Authors