Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Join / Merge two data sets based on multiple field criteria

AlexL
7 - Meteor

Question-

Is Alteryx able to join / merge two data sets based on multiple field criteria?

With one field criteria being relative/ range of time.

 

Problem -

Trying to find an automated way to associate Caller Name and Call reason to our telephone records.

 

Data set 1 is Telephone records

Data set 2 is Agent records

 

Fields in each data set are absolute in regards to Client, Start Date, Service Rep.

“Start Time” in Agent records can be 0-5min after the telephone record start time; therefore not an exact match

(I.e. – Telephone record has 9:01am and Agent record has 9:03am)

 

Data set 1 Telephone Records–

Client

Start Date

Start Time

Calling Party

Service Rep

BBF

5/25/2017

9:10:20 AM

9707815688

Smith, John

BBF

5/25/2017

9:10:49 AM

9048991238

Wallace, Andrew

BBF

5/25/2017

9:20:20 AM

3039875688

Tibby, Tim

BBF

5/25/2017

9:20:30 AM

3039851238

Summer, Anne

ABC

5/25/2017

9:25:15 AM

3039851234

Smith, John

BBF

5/26/2017

9:01:49 AM

2126486878

Vasil, George

BBF

5/26/2017

9:05:30 AM

2128514568

Wallace, Andrew

BBF

5/26/2017

9:07:51 AM

2038689638

Summer, Anne

BBF

5/26/2017

9:08:00 AM

2124486448

Smith, John

BBF

5/26/2017

9:10:55 AM

6106488568

Murphy, Jessica

ABC

5/26/2017

9:15:28 AM

2038689846

Smith, John

 

Data Set 2 Agent Records-

Client

Start Date

Start Time

Service Rep

Caller Name

Call Reason (1)

BBF

5/25/2017

9:10:56

Smith, John

Kramer, Cosmo

Account Balance

BBF

5/25/2017

9:11:48

Wallace, Andrew

Seinfield, Jerry

Policy/Procedure

BBF

5/25/2017

9:21:49

Tibby, Tim

Costanza, George

Redemption

BBF

5/25/2017

9:22:52

Summer, Anne

Benes, Elaine

Redemption

BBF

5/26/2017

9:02:40

Vasil, George

Bing, Chandler

Policy/Procedure

BBF

5/26/2017

9:08:37

Summer, Anne

Tribiani, Joe

Account Balance

BBF

5/26/2017

9:10:49

Smith, John

Gellar, Monica

Transaction Inquiry

BBF

5/26/2017

9:13:08

Murphy, Jessica

Green, Rachel

Beneficiary Inquiry

ABC

5/26/2017

9:16:01

Smith, John

Chan, Jackie

Account Balance

 

Ideal Outcome-

Client

Start Date

Start Time

Calling Party

Answering Agent

Start Time

Caller Name

Call Reason (1)

BBF

5/25/2017

9:10:20 AM

9707815688

Smith, John

9:10:56 AM

Kramer, Cosmo

Account Balance

BBF

5/25/2017

9:10:49 AM

9048991238

Wallace, Andrew

9:11:48 AM

Seinfield, Jerry

Policy/Procedure

BBF

5/25/2017

9:20:20 AM

3039875688

Tibby, Tim

9:21:49 AM

Costanza, George

Redemption

BBF

5/25/2017

9:20:30 AM

3039851238

Summer, Anne

9:22:52 AM

Benes, Elaine

Redemption

ABC

5/25/2017

9:25:15 AM

3039851234

Smith, John

 

 

 

BBF

5/26/2017

9:01:49 AM

2126486878

Vasil, George

9:02:40 AM

Bing, Chandler

Policy/Procedure

BBF

5/26/2017

9:05:30 AM

2128514568

Wallace, Andrew

 

 

 

BBF

5/26/2017

9:07:51 AM

2038689638

Summer, Anne

9:08:37 AM

Tribiani, Joe

Account Balance

BBF

5/26/2017

9:08:00 AM

2124486448

Smith, John

9:10:49 AM

Gellar, Monica

Transaction Inquiry

BBF

5/26/2017

9:10:55 AM

6106488568

Murphy, Jessica

9:13:08 AM

Green, Rachel

Beneficiary Inquiry

ABC

5/26/2017

9:15:28 AM

2038689846

Smith, John

9:16:01 AM

Chan, Jackie

Account Balance

 

9 REPLIES 9
AlexL
7 - Meteor

To clarify, I'm trying find a automated way to associate Caller name and Call Reason to the phone records.

gc
9 - Comet

Can you Trim out the seconds from Start Time, and just compare based on Hour and Minutes? Would that be enough precision for matching purposes if you had a new field "StartTimeMatch" to join on, leaving your actual Start Time data with hours:minutes:seconds intact?

david_fetters
11 - Bolide

Alex,

 

Building on GCs idea, I think rounding time to minutes into new fields in both data sets is your starting point.  Then we just need to deal with the possible variation in time between sets.  You're probably always going to run into problems with close cases (i.e. when an agent completes two phone calls within 5 minutes) so we just need to trap those cases and deal with them manually.  Based on your sample data, that doesn't look like it's common for your agents to speak that quickly with two clients.

 

My solution would be to take your records in one table (consequence of that choice discussed later) and create five copies of each record, with each copy having its RoundedTime value increased/decreased by 1 minute.  We'll say that you chose to create duplicates of your Table 2, Agent Records.  If there was a record in that table with a Start Time at 9:22:35, you would have RoundedTime fields of 9:22, 9:21, 9:20, 9:19, 9:18.  Do this by feeding the original record set into five formula tools that are set to subtract 0,1,2,3, or 4 minutes, and then union the results back together.  Then you can join on Agent Name and RoundedTime fields in both tables.  Because you have unique records in Data set 1, your join should produce 1:1 matches on the RoundedTime fields.

 

Based on your ideal outcome, I'd suggest doing the 'multiple copies' on Agent Records, because it's easier to deal with the unjoined records in Telephone Records.  If you want to do it the other way, you may just have to unique unjoined records (and remember to increase your RoundedTime rather than decrease it.)

 

This is a pretty brutish approach, but I think it's a simple possible solution.  I would be sure to add a unique ID to both tables, and then filter our duplicate records after your join.  Duplicates indicate you've got some overlap between two Agent records and would be worth investigating.  If you need an example of this, I can probably put one together this weekend (at Inspire right now!)

anthony
11 - Bolide

To me this seems like a join/sort. Since you will never have perfect data to join on time and since the time can vary wildly it might be a pain to put blocks of time around each call for a more definitive join.

 

I got your results with this join on rep/client/date and then sorting results by date/time/caller/rep etc. The edge case if a call is at 11:59pm and the report is next day, could be an issue depending on your hours.Capture.PNGhis:

AlexL
7 - Meteor

Thanks everyone for your responses!

 

Anthony, this is very close to what I need.  

 

I added another Telephone record to the sample (11:15am) and the join forumula associates it to a 9am agent record.

I tried creating new field using the DateTimeDiff formula to create a formula to null out the agent records if the difference is greater than X.

Unfortunately, it does not like the excel time data and provides the errror "DATETIMEDIFF1: "09:10:20" is not a valid DateTime

 

Any thoughts?

 

 Filter Join- Sample 1.jpg

 

Filter Join- Sample 2.jpg

 

Ideal Outcome (Revised to include 11:15am record)-

ClientStart DateStart TimeCalling PartyService RepStart TimeCaller NameCall Reason (1)
BBF5/25/20179:10:20 AM9707815688Smith, John9:10:56 AMKramer, CosmoAccount Balance
BBF5/25/20179:10:49 AM9048991238Wallace, Andrew9:11:48 AMSeinfield, JerryPolicy/Procedure
BBF5/25/20179:20:20 AM3039875688Tibby, Tim9:21:49 AMCostanza, GeorgeRedemption
BBF5/25/20179:20:30 AM3039851238Summer, Anne9:22:52 AMBenes, ElaineRedemption
ABC5/25/20179:25:15 AM3039851234Smith, John   
BBF5/26/20179:01:49 AM2126486878Vasil, George9:02:40 AMBing, ChandlerPolicy/Procedure
BBF5/26/20179:05:30 AM2128514568Wallace, Andrew   
BBF5/26/20179:07:51 AM2038689638Summer, Anne9:08:37 AMTribiani, JoeAccount Balance
BBF5/26/20179:08:00 AM2124486448Smith, John9:10:49 AMGellar, MonicaTransaction Inquiry
BBF5/26/20179:10:55 AM6106488568Murphy, Jessica9:13:08 AMGreen, RachelBeneficiary Inquiry
ABC5/26/20179:15:28 AM2038689846Smith, John9:16:01 AMChan, JackieAccount Balance
BBF5/26/201711:15:45 AM2038688657Summer, Anne   

 

 

anthony
11 - Bolide

This is a nice little macro to convert various time formats to iso time format. I used this on a 100M row table and it took a long time so best to clean up large tables with other methods. But for millions of rows it works great.

https://community.alteryx.com/t5/Engine-Works-Blog/Introducing-BB-Date-Date-Parsing-for-the-Rest-of-...

 

Date formulas only work on iso time so any excel formats will need to be converted first.

david_fetters
11 - Bolide

I learned this when trying to work with your data, but some of the date time functions seem to need a formal DateTime datatype to work (or at least need the underlying data to be properly formatted as a complete Date + Time value).  You can create one from your Date and Time fields by following the instructions here: https://community.alteryx.com/t5/Data-Preparation-Blending/Combining-a-Time-and-Date-Field/td-p/3615...

 

Just set a new field to DateTime and do [Date] + ' ' + [Time] before running your functions.

jarrod
ACE Emeritus
ACE Emeritus

You can actually do this without a join. Instead, union the data and sort to find the "order" then use a multi-row tool to check that the dates are within a certain time frame and then mark the corresponding "pairs" that fit the criteria.

 

See the attached workflow for details as it's a little hard to explain in text.

AlexL
7 - Meteor

Hi Jarrod,

 

This is great!  I will need to study it a few times to fully absorb.

 

Thanks!

Labels