Get Inspire insights from former attendees in our AMA discussion thread on Inspire Buzz. ACEs and other community members are on call all week to answer!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Tag Duplicate Records based on Specific Criteria

jrabjohns
5 - Atom

Hi All,

 
First time posting here! I am facing an issue I cannot seem to crack around deduplication of records based on specific criteria. 
 
I am trying to implement the logic that if a caller [Caller Number] calls a store [Shop#], then the call will be tagged as a lead. then any call from that caller to the same store would be considered a duplicate until 60 days [StartDateTime] have passed from the initial call. After 60 days, this record can be considered a new lead, then all subsequent records from the same caller to the same shop would again be considered a duplicate for 60 days. This would continue until a new caller + shop# combination starts, then the process would repeat. 
 
Example cases:
1) So if a phone number (123-345-6789) calls the same shop twice, once on 3/01/19 and once on 4/01/19, only the 3/01/19 call should be considered a lead. If they called again on 5/01/19 (more than 60 days after, that would be considered an additional lead.

2)If a phone number (111-111-1212) calls shop1 and shop2 within 1 minute of each other, these should be considered 2 separate leads.      
 
I was able to ID the original call of a given caller + shop combination, but cannot figure out how to have Alteryx reference that lead and dedup accordingly. I have attached an example workflow of tagging this first call along with a sample input and output with only the first call tagged. 
 
I also attached a file that is the ideal output of which should be tagged as a lead or a duplicate. 
 
Looking forward to any advice or thoughts anyone can offer!
4 REPLIES 4
The_Dev_Kev_Env
9 - Comet

Not my most elegant answer, but this should be functional. Few steps here. Calculate the running days after first call for the same shop and same caller. I divide that number by 60 and take the ceiling so I can tell the first call after that 60 day period. Then another multi row tool that will analyze that field as well as new combinations of shop and user. 

 

See example attached

danilang
19 - Altair
19 - Altair

Hi @jrabjohns 

 

I think this workflow gives what you're looking for.  The key here is to get the data sorted in the proper order. 

 

WF.png

 

 

Start by adding a Record ID so that you can put the data in the original order at the end of the process.  Next,  sort by shop, caller and day.  A Multi-row tool, grouped by shop and caller, then generates a running StartDate until 60 days have elapsed, then changes to a new StartDate.  The next Multi-row tool, also grouped by shop and caller, looks at the StartDate.  If it's the same as the StartDate from the previous row, it's a Duplicate otherwise it's a Lead.  Then, sort into the original order and cleanup the extra fields.

 

Results

 

Results.png

 

Dan

jrabjohns
5 - Atom

Hi Dan,

 

This worked and solved my issue perfectly! Thank you so much for your workflow and solution. 

 

James

jrabjohns
5 - Atom

Hi The_Dev_Kev_Env ,


Thank you for your prompt solution suggestion. This worked for the sample data but unfortunately when I applied this to the large data set it was calculating the number of days differently than the actual difference of the days but much appreciated!

Labels