Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Find Overlapping Timestamps

peddy
7 - Meteor

Hi,

 

I couldn't find the exact solution to my problem, so creating a new topic.

 

I have the following dataset:

 

Customer IDProduct IDStart DateEnd DateValue
11111120200131202012311
11122220200131202012312
11133320200131202012313
22211120200131202012314
22222220200131202012315
22222220200401202004156
22233320200131202012317
33311120200131202012318
33311120210131202112319

 

I need to create a workflow which adds a column at the end and flags the records that overlap with other ones. So in the example above records with values 5 and 6 should be flagged.

 

FYI...The real dataset contains around 100k records.

 

Thanks,

Pedram

16 REPLIES 16
atcodedog05
22 - Nova
22 - Nova

Hi @peddy 

 

Sorry jumped too fast.

atcodedog05
22 - Nova
22 - Nova

Hi @peddy 

 

Have a doubt why are last 2 rows not colored. Same Customer and product ID but different time.

Can you be clear with the logic.

 

33311120200131202012318
33311120210131202112319
jdunkerley79
ACE Emeritus
ACE Emeritus

I would suggest:

jdunkerley79_0-1602155286282.png

 

First, sort the data by Customer, Product, Start Date

Use a multirow formula you can create a new column which is the current max end date of a period

Then a second multi-row can detect overlaps

 

Have attached a sample

 

peddy
7 - Meteor

@atcodedog05 

because one is 2020 and the other is 2021

peddy
7 - Meteor

@jdunkerley79 

 

Thanks - However, your solutions doesn't fully work. As you can see below line 2, 10 and 15 should have been true. I'm assuming the formula in the second multi-formula tool needs to be changed, but not 100% sure to what exactly. 

 

peddy_0-1602255625226.png

I did try a slightly different approach by summarizing on customer and product to get the max end date which seemed to resolve one of false negatives, but that's as far as I managed to get...

 

peddy_0-1602256301395.png

 

 

 

 

jdunkerley79
ACE Emeritus
ACE Emeritus

First things to check, make sure Group By is selected in Multi Row formula tools

 

Can you post a little more sample data and your workflow then will check through?

peddy
7 - Meteor

There's a few attempts in my workflow 🙂

 

I should mention that the end state needs to be that the dates are changed so there are no overlaps

 

so for example, if we have the following overlap

 

20200101202012318.37
20200301202003319.21
202004012020043011.45

 

The data set needs to change to...

 

20200101202002298.37
20200301202003319.21
202004012020043011.45
20200501202012318.37

 

So basically throughout 2020 the price is 8.37 expect in March and April where it's 9.21 and 11.45 respectively.

jdunkerley79
ACE Emeritus
ACE Emeritus

OK thats a little fiddlier - specifically how to split record into parts. WIll have a play and come back to you

jdunkerley79
ACE Emeritus
ACE Emeritus

Ok, have built a workflow but this one is complicated... 

 

Going to go through it in bits and then you can see if it works on your full set.

jdunkerley79_0-1602276798349.png

So start by ordering the records Customer, Product, Start

Add a grouped record ID within each Customer, Product (using a Multi Row formula)

Next transpose so Start and End dates are on separate rows

Now sort into Customer, Product, Date (either start or end)

 

jdunkerley79_1-1602276979156.png

 

The next tool is where it gets complicated.

Using a multi-row formula you track which rows are still active. This involves keeping a string and adding a new entry at the end or removing one.

After this use a sample tool to eliminate any days where multiple events occur

Finally, a formula tool is used to get the last active period and add CustomerID and ProductID to make a key

 

jdunkerley79_2-1602277286128.png

Now, back to more normal processes...

Use a find and replace to get the value for the current record

A little more working out to put together a start and end date and then some filtering.

 

I think this is close to what you need, though I admit it is pretty complicated.

 

 

 

Labels