Hi all,
I need help grouping data on a consumer level based on if they received communications within a 30 day window, with the 30 days starting from the 1st communication. It is meant to study how effective sending multiple communications is to a Consumer. The challenge is to create a differentiating group - if the same consumer receives another communication 30 days from the 1st communication. The example raw data looks like this:
customerid | comm_date | comm_type |
1234 | 1/1/2022 | |
1234 | 1/2/2022 | Direct |
1234 | 1/15/2022 | |
1234 | 2/4/2022 | |
1234 | 2/5/2022 | Direct |
1234 | 3/10/2022 | Direct |
2345 | 1/1/2022 | |
2345 | 2/5/2022 | |
2345 | 2/6/2022 | Direct |
The first step is to group together the following records and then have them assigned to a RecordID.
The 2/4/22 Email communication falls +30 days from the very first 1/1/22 Email Communication for Consumer "1234" and is thus, a "new" group for that consumer. Then the 3/10/22 Direct communication is another group because 3/10 is +30 days from 2/4/2022 in the previous group. I am having a bit of a trouble dynamically creating this min_comm_date based on each group and starting the 30 day window from that point on.
The final dataset I wish to achieve, looks something like this:
customerid | min_comm_date | max_comm_date | no_touches | Direct | RecordID | |
1234 | 1/1/2022 | 1/15/2022 | 3 | 2 | 1 | 1 |
1234 | 2/4/2022 | 2/5/2022 | 2 | 1 | 1 | 2 |
1234 | 3/10/2022 | 3/10/2022 | 1 | 0 | 1 | 3 |
2345 | 1/1/2022 | 1/1/2022 | 1 | 1 | 0 | 4 |
2345 | 2/5/2022 | 2/6/2022 | 2 | 1 | 1 | 5 |
If this isn't explain well, please let me know if you have any questions. Thanks all for your help!
Solved! Go to Solution.
@Felipe_Ribeir0 the Generate Rows was perfect to assign the groups. I wouldn't have gotten that on my own, so thank you so much for your help.
Hi @Felipe_Ribeir0 , I noticed something may have gotten incorrectly grouped. The result of the workflow mis-grouped some of the records.
customerid | comm_date | comm_type | RecordID | FirstDate_Ideal | EndDate_Ideal | RecordID_Result | FirstDate | EndDate |
1234 | 7/6/2022 | 1 | 7/6/2022 | 8/4/2022 | 1 | 7/6/2022 | 8/4/2022 | |
1234 | 7/7/2022 | Direct | 1 | 7/6/2022 | 8/4/2022 | 1 | 7/6/2022 | 8/4/2022 |
1234 | 8/9/2022 | 2 | 8/9/2022 | 9/8/2022 | 2 | 8/5/2022 | 9/3/2022 | |
1234 | 10/3/2022 | 3 | 10/3/2022 | 11/2/2022 | 3 | 9/4/2022 | 10/3/2022 | |
1234 | 10/6/2022 | Direct | 3 | 10/3/2022 | 11/2/2022 | 4 | 10/4/2022 | 11/2/2022 |
1234 | 11/8/2022 | 4 | 11/8/2022 | 12/8/2022 | 5 | 11/3/2022 | 12/2/2022 |
RecordID is the desired the result, the current output I'm getting is in the "RecordID_Result". We want to start the "count" again if the next communication is past 30 days from the first initial communication. So since the 10/3/22 communication is +30 days from 8/9/22 Email Communication, we want to group any communication together 30 days from 10/3/22, if that makes sense. So the 10/3/22 and 10/6/22 communication need to be together.
Hope this makes sense, appreciate your help!