The idea is to sum the Count column for each ID, but only within 15-day windows. In the end, I want to identify cases where a single ID made, for example, more than 10 transactions in any 15-day period.
Input:
Expected Output:
Here you go. Join the file to itself on id = id, then filter so the second datetime falls within 15 days before or after the first datetime, then aggregate to get the count and join it back. There's another way to do this with the Multi-Row Formula tool, but this is quicker and easier IMO.
I just realized I misinterpreted your ask: I used a 15-day window both before and after the datetime, where you only wanted the 15 days after. So change the formula tool to this: [Right_DateTime_Out] >= [DateTime_Out] and [Right_DateTime_Out] < datetimeadd([DateTime_Out],15,'days') and it should meet your need.
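The thread builds this with Alteryx tools, but the join-filter-aggregate flow can be sketched in pandas using the corrected forward-looking window. The sample data below is made up; the column names `DateTime_Out` and `Count` are borrowed from the formula above:

```python
import pandas as pd

# Hypothetical sample data standing in for the attached file.
df = pd.DataFrame({
    "id": [1, 1, 1, 2],
    "DateTime_Out": pd.to_datetime(
        ["2023-01-01", "2023-01-10", "2023-02-01", "2023-01-05"]),
    "Count": [1, 1, 1, 1],
})

# Join the file to itself on id; right-hand columns get a _Right suffix.
joined = df.merge(df, on="id", suffixes=("", "_Right"))

# Keep only right-hand rows inside the forward-looking 15-day window,
# mirroring the corrected formula-tool expression.
window = joined[
    (joined["DateTime_Out_Right"] >= joined["DateTime_Out"])
    & (joined["DateTime_Out_Right"]
       < joined["DateTime_Out"] + pd.Timedelta(days=15))
]

# Aggregate: total Count in the 15 days starting at each transaction.
rolling = (window.groupby(["id", "DateTime_Out"], as_index=False)
                 ["Count_Right"].sum()
                 .rename(columns={"Count_Right": "Count_15d"}))
print(rolling)
```

From here, flagging heavy IDs is just a filter such as `rolling[rolling["Count_15d"] > 10]`.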
You apparently have 27 rows in your dataset where the recordid and datetime are exactly the same, just so you know in case it isn't expected.
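If it helps to reproduce that duplicate check, here is one way to surface exact id/datetime duplicates in pandas; the data and the column names `recordid` and `DateTime_Out` are assumptions:

```python
import pandas as pd

# Made-up data with one pair of rows sharing recordid and datetime.
df = pd.DataFrame({
    "recordid": [1, 1, 2],
    "DateTime_Out": pd.to_datetime(
        ["2023-01-01", "2023-01-01", "2023-01-02"]),
})

# keep=False marks every member of a duplicate group, not just repeats.
dupes = df[df.duplicated(subset=["recordid", "DateTime_Out"], keep=False)]
print(len(dupes))
```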
Amazing, thank you for the quick reply. This works perfectly for the IDs and dates that have only 1 Count (I should have specified this is not always the case). It seems that those with counts greater than 1 get added twice.
Here's an example with ID 55850:
Any idea how to fix this?
Yeah, I was afraid of that, which is why I mentioned it. Slap a group by before the first join on the second file that looks like this:
That will eliminate the duplicate join matches and get you 6 instead of 12 in your result set.
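In pandas terms, the fix amounts to grouping the join input down to one row per (id, datetime) before the self-join, so duplicate rows can't multiply the matches. A sketch with made-up data for the ID from the example above:

```python
import pandas as pd

# Hypothetical data: the same id/datetime appears twice, so a plain
# self-join would double the count for that transaction.
df = pd.DataFrame({
    "id": [55850, 55850, 55850],
    "DateTime_Out": pd.to_datetime(
        ["2023-03-01", "2023-03-01", "2023-03-10"]),
    "Count": [1, 1, 1],
})

# Collapse to one row per (id, datetime) first -- the pandas
# equivalent of the group-by suggested above (here applied to both
# sides of the join).
grouped = (df.groupby(["id", "DateTime_Out"], as_index=False)
             ["Count"].sum())

joined = grouped.merge(grouped, on="id", suffixes=("", "_Right"))
window = joined[
    (joined["DateTime_Out_Right"] >= joined["DateTime_Out"])
    & (joined["DateTime_Out_Right"]
       < joined["DateTime_Out"] + pd.Timedelta(days=15))
]
rolling = (window.groupby(["id", "DateTime_Out"], as_index=False)
                 ["Count_Right"].sum()
                 .rename(columns={"Count_Right": "Count_15d"}))
print(rolling)
```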
Hope this works for you! Please mark my post as accepted as a solution if it does.
Hi, my needs have changed a bit. I wanted to ask if there is a way to calculate the sum over the 15-day period so that it forms clusters. What I mean is that when I take the sum of the column, I want each transaction counted only once: every transaction after the first in a 15-day period should report 0 or null, while any transaction outside that 15-day period should still be counted, even if the ID is the same. Here is an example of what I expect the output to be:
I was trying a Multi-Row Formula tool to modify the values I already have, but I imagine there is a better way to calculate this. In summary, the sums of both columns should be the same.
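One reading of that clustering logic: within each ID, sorted by datetime, the first transaction anchors a 15-day cluster; every later transaction inside that window rolls its count into the anchor row and reports 0, and the first transaction beyond the window starts a new cluster. A pandas sketch under that assumption, with made-up data:

```python
import pandas as pd

# Hypothetical sample: two 15-day clusters for one id.
df = pd.DataFrame({
    "id": [1, 1, 1, 1],
    "DateTime_Out": pd.to_datetime(
        ["2023-01-01", "2023-01-05", "2023-01-20", "2023-01-25"]),
    "Count": [1, 1, 1, 1],
})

def cluster_counts(group):
    group = group.sort_values("DateTime_Out").copy()
    sums = []
    anchor_ts = None
    anchor_idx = None
    for ts, cnt in zip(group["DateTime_Out"], group["Count"]):
        if anchor_ts is None or ts >= anchor_ts + pd.Timedelta(days=15):
            anchor_ts = ts           # this transaction opens a new cluster
            sums.append(cnt)
            anchor_idx = len(sums) - 1
        else:
            sums[anchor_idx] += cnt  # roll its count into the cluster's first row
            sums.append(0)           # later transactions in the window report 0
    group["Cluster_Sum"] = sums
    return group

out = pd.concat(cluster_counts(g) for _, g in df.groupby("id"))
print(out["Cluster_Sum"].tolist())
```

Because every transaction's count lands in exactly one cluster row, the totals of the two columns match, which is the check described above.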