Hello Community!
I'm looking to create a new column called "Minutes" that would compare each row against every other row to determine the minutes difference for the ones that overlapped. I used Python to determine if a time was overlapping based on "Provider Name", "Start" and "End Times". If "Overlapped?" = "Did not overlap", minutes can return 0 since we don't really care about it.
In python, I would do the below to find the overlapping intervals. Now that we've purchased Alteryx, I'd like to do the minutes part here. Just can't figure it out...
df['Overlapped?'] = False
for i, row in df.iterrows():
start, end = row["Start"], row["End"]
df.loc[(df['Providers Name'] == row['Providers Name'])
& (((df["Start"] >= start) & (df["Start"] <= end)) |
((df["End"] >= start) & (df["End"] <= end)) |
((end >= df["Start"]) & (end <= df["End"])) |
((start >= df["Start"]) & (start <= df["End"]))), 'Overlapped?'] = True
The use case for this new output would be that a user would filter on Minutes where Minutes is >= 15 and consider that as a TRUE overlap. I tried DateTimeDiff([Row+1:Start], [End], "minutes") within a Multi-Row Formula which gets close to what I want.
I'm attaching a file where the first tab is my sample data and the second tab is the expected output. I'm fairly new to Alteryx so if you could attach a workflow with your suggestion so I can follow along, it would be much appreciated!!
Thank you in advance.
Solved! Go to Solution.
Hi @JarekSkudrzyk,
I've been struggling a bit to get this solution to work on my actual data and maybe I'm not understanding something. I'm hoping an explanation would get me back on the right track. At the end of the "Determine which records overlap" container, you put in a Text to Columns tool which split the values into two separate columns: Record_1 and Record_2.
I can't tell why my values have more than 2 values which forces me to change "Number of Columns to +10" where I have to join 10+ Record IDs at the end. Any idea where I could be going wrong?
Hi,
if you have more than 2 values after text to columns it probably means that for this time period there are more than 2 overlapping records.
If you provide an updated sample I can try to figure out how to update the workflow to make it work.