Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

How to join based on closest date

l_blumberger
7 - Meteor

Hi all!

 

I'm having trouble joining my two datasets based on some date logic. I have one dataset with survey response data and another with visitation data. They both have the same unique respondent ID fields, however there are multiple visits and only one response date. I need to match the response data to the visit that happened directly prior to that date. In other words, I need to join the second dataset by respondent ID and the date that is closest (but before) the date in the first dataset.

 

Based on the similar post "How to Join on a Date Range" I thought maybe I could generate dates for the 7 days prior to the response rate and then do a standard join? I'm not sure how to generate these rows. 

 

Thanks in advance for the help!!

 

Lauren

6 REPLIES 6
BCarley
7 - Meteor

Hi Lauren,

 

Would you be able to post some example data and I would be happy to put together a workflow for you.

 

Ben Carley - Keyrus UK

l_blumberger
7 - Meteor

For sure!

Attached is a sample of each dataset.

 

BCarley
7 - Meteor

Lauren, 

 

I am having trouble accessing the gallery to upload an example workflow but I have approached the task in a slightly different way. I have joined the two data sources on client ID with no other field. This means you will match every item to every other item and end up duplicating your data. 

 

Once you have done this you will be able to filter all the results where the response data is before your visitation date (as I am assuming that you do not want responses on visits that haven't happened)

 

Once you have filtered out the invalid rows you want to pick just one of the rows for each pair, so sort by customer ID, Response Date, Visitation Date and the use the sample tool to pick the first row as grouped by Visitation ID and Customer ID.

 

By using the combination of Sort and Sample you should pick only the most recent response for each visitation. Does that make sense or am I misunderstanding your data? (or making a mistake!)

 

-Ben

 

workflow example.PNG

BCarley
7 - Meteor

Lauren, 

 

Attached is a workflow that should be appropriate, I've noticed that some times you have two visits on the same day and we don't have logic for picking which visit the response is for. How do you want to treat that?

 

Ben - Keyrus UK

l_blumberger
7 - Meteor

I think for a visit that is on the same day I need to combine the other metrics (dwell time and number of stores visited) so there is only one visit per day.

BCarley
7 - Meteor

Great, you could put all of your visit results through a summarize and use a sum where you group on visitation date. I think that should work for you.

Labels