Hello,
I am currently using sort tool and a multi row formula tool to get my data set into a partition format along with row numbers. However, I want to compare rows belonging to a partition and pick specific row based on certain condition in that partition as well as the entire data set for all partitions.
Sample data set below. I need to compare rows at the partition level of Name in this case where I will pick row for oscar with flag as 1. However if rows partitioned at name are having flag same values then I will pick the one with earliest date and if the dates are same too then I will pick the row that has highest priority value for that partition. If priority is same as well then I will pick any row.
Name | Number | Priority | date | Flag |
Oscar | 1234567890 | 1 | 01/01/2011 | 1 |
Oscar | 1234567890 | 2 | 01/01/2011 | 0 |
Jane Joe | 2345678901 | 4 | 01/02/2011 | 1 |
Jane Joe | 2345678901 | 3 | 01/02/2011 | 1 |
Don Jon | 3456789012 | 3 | 03/03/2013 | 0 |
Don Jon | 3456789012 | 2 | 03/04/2013 | 0 |
Bill Will | 4567890123 | 2 | 04/05/2012 | 0 |
Bill Will | 4567890123 | 1 | 04/06/2012 | 1 |
Banner Stark | 5678901234 | 2 | 10/10/2020 | 1 |
Banner Stark | 5678901234 | 1 | 10/11/2020 | 1 |
Tony Bruce | 6789012345 | 2 | 10/20/2021 | 1 |
Tony Bruce | 6789012345 | 1 | 10/20/2021 | 1 |
Is this possible if to do dynamically?
TIA
Solved! Go to Solution.
@Yuri24 The first thing I did was convert the date field to an actual date so I could find the minimum/earliest per Name. Then I filtered data so only those records where Flag=1 passed. After that, I used a Summarize tool to group by Name and find Min_date, then joined that back to the output of the Filter tool on Name=Name and date=Min_date. After that join, I used the same type of logic to determine which of the duplicate names have priority.
Hi @Yuri24
I actually saw your post yesterday and I wanted to send you my workflow. But somehow, your post was deleted and I couldn't send it to you. But I still kept my workflow, and I'm not sure if you added more context. Please find it attached.
@caltang @Prometheus sorry if i confused by not putting the required output
Name | Number | Priority | date | Flag |
Oscar | 1234567890 | 1 | 1/1/2011 | 1 |
Jane Joe | 2345678901 | 4 | 1/2/2011 | 1 |
Don Jon | 3456789012 | 3 | 3/3/2013 | 0 |
Bill Will | 4567890123 | 2 | 4/5/2012 | 0 |
Banner Stark | 5678901234 | 2 | 10/10/2020 | 1 |
Tony Bruce | 6789012345 | 2 | 10/20/2021 | 1 |
both solutions are close but not exact.
If only 1 flag is 1 in a partition then pick that row, if both flags are 1 then pick the one with earliest date (oldest date). If both flags are 0 then again pick the one with earliest date.
If both flags are 1 or if both flags are 0 and both dates are same then pick the one with highest priority.
Based on that logic, shouldn't it be:
Name | Number | Priority | date | Flag |
Oscar | 1234567890 | 1 | 1/1/2011 | 1 |
Jane Joe | 2345678901 | 4 | 1/2/2011 | 1 |
Don Jon | 3456789012 | 3 | 3/3/2013 | 0 |
Bill Will | 4567890123 | 1 | 4/6/2012 | 1 |
Banner Stark | 5678901234 | 2 | 10/10/2020 | 1 |
Tony Bruce | 6789012345 | 2 | 10/20/2021 | 1 |
Bill has two records. One record already has Flag = 1, therefore it takes precedence.
Based on your logic, the order of precedence:
I've attached the fixed version of the workflow. PFA.
@caltang thanks again. I am also trying to add another condition where for a given partition of rows if number in row1 and row2 has a matching 6 or more digit sequence then mark the second row as 0 in a new field.
I used multi row formula as shown below:
If [Name] = [Row-1:Name] and
(REGEX_Match([Number], '.*(\d{6}).*') = REGEX_Match([Row-1:Number], '.*(\d{6}).*'))
then
0
Else
1
Endif
Sample input
Name | Phone |
Oscar | 234567890 |
Oscar | 345678901 |
Zoro | 1122333444 |
Zoro | 9988776600 |
expected output:
Name | Phone | result |
Oscar | 234567890 | 1 |
Oscar | 345678901 | 0 |
Zoro | 1122333444 | 1 |
Zoro | 9988776600 | 1 |
but I am getting
Name | Phone | result |
Oscar | 234567890 | 0 |
Oscar | 345678901 | 1 |
Zoro | 1122333444 | 0 |
Zoro | 9988776600 | 1 |