Join / Merge two data sets based on multiple field criteria
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Question-
Is Alteryx able to join / merge two data sets based on multiple field criteria?
With one field criteria being relative/ range of time.
Problem -
Trying to find an automated way to associate Caller Name and Call reason to our telephone records.
Data set 1 is Telephone records
Data set 2 is Agent records
Fields in each data set are absolute in regards to Client, Start Date, Service Rep.
“Start Time” in Agent records can be 0-5min after the telephone record start time; therefore not an exact match
(I.e. – Telephone record has 9:01am and Agent record has 9:03am)
Data set 1 Telephone Records–
Client | Start Date | Start Time | Calling Party | Service Rep |
BBF | 5/25/2017 | 9:10:20 AM | 9707815688 | Smith, John |
BBF | 5/25/2017 | 9:10:49 AM | 9048991238 | Wallace, Andrew |
BBF | 5/25/2017 | 9:20:20 AM | 3039875688 | Tibby, Tim |
BBF | 5/25/2017 | 9:20:30 AM | 3039851238 | Summer, Anne |
ABC | 5/25/2017 | 9:25:15 AM | 3039851234 | Smith, John |
BBF | 5/26/2017 | 9:01:49 AM | 2126486878 | Vasil, George |
BBF | 5/26/2017 | 9:05:30 AM | 2128514568 | Wallace, Andrew |
BBF | 5/26/2017 | 9:07:51 AM | 2038689638 | Summer, Anne |
BBF | 5/26/2017 | 9:08:00 AM | 2124486448 | Smith, John |
BBF | 5/26/2017 | 9:10:55 AM | 6106488568 | Murphy, Jessica |
ABC | 5/26/2017 | 9:15:28 AM | 2038689846 | Smith, John |
Data Set 2 Agent Records-
Client | Start Date | Start Time | Service Rep | Caller Name | Call Reason (1) |
BBF | 5/25/2017 | 9:10:56 | Smith, John | Kramer, Cosmo | Account Balance |
BBF | 5/25/2017 | 9:11:48 | Wallace, Andrew | Seinfield, Jerry | Policy/Procedure |
BBF | 5/25/2017 | 9:21:49 | Tibby, Tim | Costanza, George | Redemption |
BBF | 5/25/2017 | 9:22:52 | Summer, Anne | Benes, Elaine | Redemption |
BBF | 5/26/2017 | 9:02:40 | Vasil, George | Bing, Chandler | Policy/Procedure |
BBF | 5/26/2017 | 9:08:37 | Summer, Anne | Tribiani, Joe | Account Balance |
BBF | 5/26/2017 | 9:10:49 | Smith, John | Gellar, Monica | Transaction Inquiry |
BBF | 5/26/2017 | 9:13:08 | Murphy, Jessica | Green, Rachel | Beneficiary Inquiry |
ABC | 5/26/2017 | 9:16:01 | Smith, John | Chan, Jackie | Account Balance |
Ideal Outcome-
Client | Start Date | Start Time | Calling Party | Answering Agent | Start Time | Caller Name | Call Reason (1) |
BBF | 5/25/2017 | 9:10:20 AM | 9707815688 | Smith, John | 9:10:56 AM | Kramer, Cosmo | Account Balance |
BBF | 5/25/2017 | 9:10:49 AM | 9048991238 | Wallace, Andrew | 9:11:48 AM | Seinfield, Jerry | Policy/Procedure |
BBF | 5/25/2017 | 9:20:20 AM | 3039875688 | Tibby, Tim | 9:21:49 AM | Costanza, George | Redemption |
BBF | 5/25/2017 | 9:20:30 AM | 3039851238 | Summer, Anne | 9:22:52 AM | Benes, Elaine | Redemption |
ABC | 5/25/2017 | 9:25:15 AM | 3039851234 | Smith, John |
|
|
|
BBF | 5/26/2017 | 9:01:49 AM | 2126486878 | Vasil, George | 9:02:40 AM | Bing, Chandler | Policy/Procedure |
BBF | 5/26/2017 | 9:05:30 AM | 2128514568 | Wallace, Andrew |
|
|
|
BBF | 5/26/2017 | 9:07:51 AM | 2038689638 | Summer, Anne | 9:08:37 AM | Tribiani, Joe | Account Balance |
BBF | 5/26/2017 | 9:08:00 AM | 2124486448 | Smith, John | 9:10:49 AM | Gellar, Monica | Transaction Inquiry |
BBF | 5/26/2017 | 9:10:55 AM | 6106488568 | Murphy, Jessica | 9:13:08 AM | Green, Rachel | Beneficiary Inquiry |
ABC | 5/26/2017 | 9:15:28 AM | 2038689846 | Smith, John | 9:16:01 AM | Chan, Jackie | Account Balance |
Solved! Go to Solution.
- Labels:
- Common Use Cases
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
To clarify, I'm trying find a automated way to associate Caller name and Call Reason to the phone records.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Can you Trim out the seconds from Start Time, and just compare based on Hour and Minutes? Would that be enough precision for matching purposes if you had a new field "StartTimeMatch" to join on, leaving your actual Start Time data with hours:minutes:seconds intact?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Alex,
Building on GCs idea, I think rounding time to minutes into new fields in both data sets is your starting point. Then we just need to deal with the possible variation in time between sets. You're probably always going to run into problems with close cases (i.e. when an agent completes two phone calls within 5 minutes) so we just need to trap those cases and deal with them manually. Based on your sample data, that doesn't look like it's common for your agents to speak that quickly with two clients.
My solution would be to take your records in one table (consequence of that choice discussed later) and create five copies of each record, with each copy having its RoundedTime value increased/decreased by 1 minute. We'll say that you chose to create duplicates of your Table 2, Agent Records. If there was a record in that table with a Start Time at 9:22:35, you would have RoundedTime fields of 9:22, 9:21, 9:20, 9:19, 9:18. Do this by feeding the original record set into five formula tools that are set to subtract 0,1,2,3, or 4 minutes, and then union the results back together. Then you can join on Agent Name and RoundedTime fields in both tables. Because you have unique records in Data set 1, your join should produce 1:1 matches on the RoundedTime fields.
Based on your ideal outcome, I'd suggest doing the 'multiple copies' on Agent Records, because it's easier to deal with the unjoined records in Telephone Records. If you want to do it the other way, you may just have to unique unjoined records (and remember to increase your RoundedTime rather than decrease it.)
This is a pretty brutish approach, but I think it's a simple possible solution. I would be sure to add a unique ID to both tables, and then filter our duplicate records after your join. Duplicates indicate you've got some overlap between two Agent records and would be worth investigating. If you need an example of this, I can probably put one together this weekend (at Inspire right now!)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
To me this seems like a join/sort. Since you will never have perfect data to join on time and since the time can vary wildly it might be a pain to put blocks of time around each call for a more definitive join.
I got your results with this join on rep/client/date and then sorting results by date/time/caller/rep etc. The edge case if a call is at 11:59pm and the report is next day, could be an issue depending on your hours.his:
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thanks everyone for your responses!
Anthony, this is very close to what I need.
I added another Telephone record to the sample (11:15am) and the join forumula associates it to a 9am agent record.
I tried creating new field using the DateTimeDiff formula to create a formula to null out the agent records if the difference is greater than X.
Unfortunately, it does not like the excel time data and provides the errror "DATETIMEDIFF1: "09:10:20" is not a valid DateTime
Any thoughts?
Ideal Outcome (Revised to include 11:15am record)-
Client | Start Date | Start Time | Calling Party | Service Rep | Start Time | Caller Name | Call Reason (1) |
BBF | 5/25/2017 | 9:10:20 AM | 9707815688 | Smith, John | 9:10:56 AM | Kramer, Cosmo | Account Balance |
BBF | 5/25/2017 | 9:10:49 AM | 9048991238 | Wallace, Andrew | 9:11:48 AM | Seinfield, Jerry | Policy/Procedure |
BBF | 5/25/2017 | 9:20:20 AM | 3039875688 | Tibby, Tim | 9:21:49 AM | Costanza, George | Redemption |
BBF | 5/25/2017 | 9:20:30 AM | 3039851238 | Summer, Anne | 9:22:52 AM | Benes, Elaine | Redemption |
ABC | 5/25/2017 | 9:25:15 AM | 3039851234 | Smith, John | |||
BBF | 5/26/2017 | 9:01:49 AM | 2126486878 | Vasil, George | 9:02:40 AM | Bing, Chandler | Policy/Procedure |
BBF | 5/26/2017 | 9:05:30 AM | 2128514568 | Wallace, Andrew | |||
BBF | 5/26/2017 | 9:07:51 AM | 2038689638 | Summer, Anne | 9:08:37 AM | Tribiani, Joe | Account Balance |
BBF | 5/26/2017 | 9:08:00 AM | 2124486448 | Smith, John | 9:10:49 AM | Gellar, Monica | Transaction Inquiry |
BBF | 5/26/2017 | 9:10:55 AM | 6106488568 | Murphy, Jessica | 9:13:08 AM | Green, Rachel | Beneficiary Inquiry |
ABC | 5/26/2017 | 9:15:28 AM | 2038689846 | Smith, John | 9:16:01 AM | Chan, Jackie | Account Balance |
BBF | 5/26/2017 | 11:15:45 AM | 2038688657 | Summer, Anne |
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
This is a nice little macro to convert various time formats to iso time format. I used this on a 100M row table and it took a long time so best to clean up large tables with other methods. But for millions of rows it works great.
Date formulas only work on iso time so any excel formats will need to be converted first.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
I learned this when trying to work with your data, but some of the date time functions seem to need a formal DateTime datatype to work (or at least need the underlying data to be properly formatted as a complete Date + Time value). You can create one from your Date and Time fields by following the instructions here: https://community.alteryx.com/t5/Data-Preparation-Blending/Combining-a-Time-and-Date-Field/td-p/3615...
Just set a new field to DateTime and do [Date] + ' ' + [Time] before running your functions.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
You can actually do this without a join. Instead, union the data and sort to find the "order" then use a multi-row tool to check that the dates are within a certain time frame and then mark the corresponding "pairs" that fit the criteria.
See the attached workflow for details as it's a little hard to explain in text.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi Jarrod,
This is great! I will need to study it a few times to fully absorb.
Thanks!
