Hi Community,
I’m working on a workflow where I need to group rows based on two conditions:
- Group by `[Name]`
- Within each name, group rows where the date difference between records is **10 days or less**
If the gap between rows is **more than 10 days**, a **new group ID** should be assigned.
Here’s a simplified example of my data:
My expected output are as below:
Thanks in advance for any help or suggestions!
Here is a way of doing it using the Multi-Row Tool:
First convert your dates into the standard format for Alteryx, then create a 'Grouping' column with a default value of 1, then using the Multi-row tool update the Grouping column with the below formula
IF (DateTimeDiff([DateTime_Out],[Row-1:DateTime_Out],"days") <= 10) AND ([Name] = [Row-1:Name])
THEN [Row-1:Grouping]
ELSE [Row-1:Grouping]+1
ENDIF
The way that @davidskaife has done it, is the way that I would as well. Nice, clean and simple! It's the way that I would tell people to do it if running a training session as well.
I will note though:
Hi @davidskaife ,
Thanks for your help with my earlier issue. I’ve tested your workflow using my own dataset, but it seems the method isn’t working as expected on this new data.
Specifically, for Ali’s data:
Group #4 should only include dates up to 2021-01-11, which is 10 days from Ali's first date (2021-01-01). However, the workflow is currently including 2021-01-13 in the same group, which is incorrect.
The correct behavior should be:
Group #4 includes dates from 2021-01-01 to 2021-01-11.
Group #5 should then begin with 2021-01-13, continuing to include any dates within the next 10 days from that point.
Appreciate your help in looking into this again!
Hey @HakimRazalieq1
This differs from the original requirement; to confirm what it should be doing is checking if its within 10 days of the first record for that grouping, not within 10 days of the previous record?
Hi @davidskaife
Yes, that’s correct. The logic should group records based on whether each date is within 10 days of the first date in the group, not just 10 days from the previous record.
Here you go, one approach to it! I've built a small Iterative Macro that cycles through the data and assigns the 'First_DateTime_Out' for that named group. i.e all Ali are assigned to 2021-01-01 as they are within 10 days, but not 2021-01-13 or 2021-01-24 - these records iterate through and get checked again, and repeat!
I've tidied up the main workflow a bit as well, removed the Grouping = 1 Formula tool (this is handled by the Multi-row tool now, where Record ID = 1), and updated the Grouping Formula so it checks both the Names and the DateTime_Out rows match.
IF [RecordID] = 1
THEN 1
ELSEIF [Name] = [Row-1:Name] AND [First_DateTime_Out] = [Row-1:First_DateTime_Out]
THEN [Row-1:Grouping]
ELSE [Row-1:Grouping]+1
ENDIF
You end up with this grouping, which should be what you're looking for?
I'd create a workflow in the general lines of:
Ingest File => Summarize Min Date per User => Join with Original Date Set => Filter Out Records Older than Min Date + 10 => Select Columns per Desired Layout
User | Count |
---|---|
106 | |
82 | |
70 | |
54 | |
40 |