Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Help with looking for matching records and date filtering

JW
6 - Meteoroid

I am pretty new to Alteryx and trying to figure out how to do something and have no idea where to start. I've got a list of records with a Unique ID, Date/Time, Status, and Value. The Status can be either Pass or Fail. There might be 4 records that were a Fail, and then 1 or more that was a Pass. Or no passes.

 

Sample data:

Unique ID     Date/Time                 Status     Value

12345           1/1/2021 14:00:01       Fail         5

12345           1/1/2021 14:00:02       Fail         5

12345           1/1/2021 14:03:07       Fail         5

12345           1/1/2021 14:05:01       Fail         5

12345           1/1/2021 14:06:00       Pass       5

12345           1/2/2021 12:00:00       Pass       5

12345           1/3/2021 13:05:01       Pass       5

 

I want to match the fail records to the next passing record that shares the same Unique ID and value. If it passed by the end of the following day, then it's good and a "match". So like in this example, the desired output would look like this:

 

Unique ID     Date/Time                 Status     Value      Match Status        Matching Unique ID       Matching Date/Time       Match Status     Match Value

12345           1/1/2021 14:00:01       Fail         5                Match                       12345                            1/1/2021 14:06:00              Pass                       5

12345           1/1/2021 14:00:02       Fail         5                Match                       12345                            1/2/2021 12:00:00              Pass                       5

12345           1/1/2021 14:03:07       Fail         5               No Match

12345           1/1/2021 14:05:01       Fail         5               No Match

 

The output could also be like this if it's easier.

Unique ID     Date/Time                 Status     Value     Match Status

12345           1/1/2021 14:00:01       Fail         5                Match

12345           1/1/2021 14:06:00       Pass       5                Match

12345           1/1/2021 14:00:02       Fail         5                Match

12345           1/2/2021 12:00:00       Pass       5                Match

12345           1/1/2021 14:03:07       Fail         5             No Match

12345           1/1/2021 14:05:01       Fail         5             No Match

12345           1/3/2021 13:05:01       Pass       5             No Match

 

The goal is just to be able to identify which fail records didn't have a pass that occurred after the fail but before the end of the next day, but being able to see the details of which record matched to the fail would be nice (hence the 1st sample output which I like)

 

Was thinking to do something like.. filter the fails to their own group, filter the passes to their own group, and then do a join on Unique ID, Value. But the date thing is throwing me for a loop (The pass can only show as a match if it's 1) after the fail and 2) by the end of the next day), and the other thing is that each pass can only show as a match one time ever - so after it is a match for something, it has to be removed from being considered for other matches. It's like I need to look at my Fail data, and for each Fail record, iterate through each Pass to see if it's a match, and if it is, remove it from the Pass list and then continue on through my list of Fails or something. I might be thinking about this wrong though. 

 

Am a little bit lost at how to go about this so any help or direction would be appreciated!

Thanks

 

6 REPLIES 6
JarekSkudrzyk
11 - Bolide

@JW 

Hi, please find the solution attached.

Jarek_0-1646919135846.png

If this solves your problem please mark this answer as a solution.

If it does not - please let me know, I will try to improve it.

JW
6 - Meteoroid

Thanks Jarek!

I ran this as is and with the sample data it works as expected, but ported this over to my real data and the final result was not as expected. I went back to this workflow that you shared and if I add another 1 "fail" record to the sample data with another Unique ID, it seems to end up in the results 3 times and also shows that it matches to one of the other unique ID's and amounts that are different. 

 

 

The record I added is:
Unique ID: 54321

Date/Time 12/1/2021 14:00:00

Status: Fail    (note it is actually " Fail " - that threw me off for a second :) )

Value: 4

DailyDate: 2021-12-01

Time: 14:00:00

 

Any thoughts?

Thank you!

 

JarekSkudrzyk
11 - Bolide

@JW 

the issue was with the join tool - I have added second field to the configuration (uniqueID):

JarekSkudrzyk_0-1646987285748.png

It should work now (I also added data cleansing for status to get rid of the problem with spaces).

 

Hope it is ok now.

JarekSkudrzyk
11 - Bolide

you can also add a "RecordID" tool at the beginning of the workflow and then at the end of the workflow add "sort" tool to sort the data by this added column to have the order of the records unchanged.

JW
6 - Meteoroid

Jarek,

Your help is very much appreciated. I have been going through step by step to see what each part of this is doing and testing with my real data. One thing I found, and I don't know how to account for this, is in this flow I need to only match a fail record to a pass record that occurred after it (along with the pass occurring by the next day). One criteria of the logic here is calling it a match if the datetimediff is < 1 day. But in my real data I have instances of things like..

 

UniqueID          DateTime                       Status         Value

123                 1/1/2022 20:20:00           Pass            100

123                 1/1/2022 20:22:00           Pass            100

123                 1/1/2022 20:25:00           Fail              100

123                 1/1/2022 20:26:00           Fail              100

123                 1/1/2022 20:27:00           Pass            100

 

This would match the 1st 2 fails to the passes that occurred before it on the same day which is not what I am looking to do. In this case the 1st fail should match to the 20:27:00 pass and the 2nd fail should have no match. I thought to adjust the AuxFlag calc to say 

IF DateTimeDiff([Matching_DateTime],[DateTime],"days") >1
THEN "no match"
elseif [Matching_DateTime] < [DateTime] then "no match"
elseif isnull([Matching_RecordID]) then "no match"
ELSE "match" ENDIF

 

That doesn't quite work correctly though because of how the logic of the flow works. I am not sure how to articulate my thoughts.. only want to check for matches on fail records that have a pass that occurred after them, but I think this matches the fails to passes that may have occurred before them. Does that make sense? 

JarekSkudrzyk
11 - Bolide

@JW 

Sure, It makes sense now, I must have missed this piece of information before.

 

I had to change the workflow quite a bit - it now contains interative macro, but appears to be working fine.
When you will be testing it on your data please adjust the maximum number of iterations to the amount of "Fail" records in your data or higher.

If you will set this amount so that it is lower than number of fails - not all fails will be matched. On the other hand please note that the higher the number, the longer the workflow will take to process the data.

NB I think 10 million is the highest number of iterations that Alteryx allows - if it is a problem, please let me know I ll try to figure out some other solution.


In order to adjust the number of iterations you need to:

- open the macro (attached in the yxzp package)

- open "Interface Designer" view (ctrl + alt + d)

- click the "Properties" tab and adjust "Maximum number of iterations" as shown on print screen below.

Hope it now works fine:) In case it doesn't - feel free to let me know:)

JarekSkudrzyk_0-1647295003602.png

 

Labels