Hello Alteryx community,
I'm working on a dataset containing dates, visitor IDs, and visit IDs, and my objective is to calculate the difference, in hours, between visits. Here's an example pertaining to a single visitor:
Date | Time | Visitor_ID | Visit_ID |
2019-09-01 | 9:00:00 AM | 123456789 | A |
2019-09-01 | 10:00:00 AM | 123456789 | B |
2019-09-01 | 11:00:00 AM | 123456789 | C |
2019-09-02 | 4:00:00 PM | 123456789 | D |
2019-09-02 | 7:00:00 PM | 123456789 | E |
2019-09-03 | 3:00:00 PM | 123456789 | F |
As you can see, Visitor 123456789 visited my site in six different sessions over the course of three days. Of course, my dataset contains hundreds of thousands of Visitor_IDs, so it's not possible to break them out and work on them individually. Ultimately, I'd like to get to this:
Date | Time | Visitor_ID | Visit_ID | Hours Since Last Visit |
2019-09-01 | 9:00:00 AM | 123456789 | A | First Visit |
2019-09-01 | 10:00:00 AM | 123456789 | B | 1 |
2019-09-01 | 11:00:00 AM | 123456789 | C | 1 |
2019-09-02 | 4:00:00 PM | 123456789 | D | 29 |
2019-09-02 | 7:00:00 PM | 123456789 | E | 3 |
2019-09-03 | 3:00:00 PM | 123456789 | F | 20 |
Hopefully my example here is straightforward. I've tried a few ways but what I seem to get stuck on is ensuring that the calculation of hours since last visit stays within the Visitor_ID, i.e. Alteryx does not just calculate the time difference between each row regardless of the ID that groups it.
As a final note, you can see that I marked the first row as 'First Visit.' For each Visitor_ID, there's going to be a row with nothing to calculate previous to it. I'm open to suggestions on the right way to label that, as obviously I don't want to set it to 0.
Solved! Go to Solution.
@essemMLB I like @BrandonB 's solution. The key part to this is to concatenate the Date+Time in the formula tool by using the DateTimeParse([DateTime],'%Y-%m-%d %I:%M:%S %p') formula and then using the multi-row formula tool to look at the below rows.
In the multi-row formula tool, you can use the datetimediff() function to calculate the different in hours for the below rows.
if isnull([Row-1:New Field]) then 'First Visit' else datetimediff([X],[Row-1:X],'Hours') endif
Hey! Thanks for the quick reply.
I'm getting this error when applying the formula to my workflow:
ConvError: Formula (42): DATETIMEPARSE: Cannot convert "2019-01-12 05:27:54" to a date/time with format "%Y/%m/%d %I:%M:%S %p": DateTime error: read an hour 1 to 12 (%I), but did not read an AM/PM indicator (%P): '%Y/%m/%d %I:%M:%S %p'
It's important that I keep these times in 24-hour format, though. Is there a way to rectify this error without losing that format?
Actually, what I did to solve my error was: read the documentation and switch the %I to %H.
Who would have thought that the solution would be to actually read Alteryx's documentation?! What a novel concept! haha
@essemMLB Sorry didn't get a chance to reply, yes that should do it!