Alteryx Designer Desktop Discussions

Yuri24 · ‎10-10-2023

I am trying to solve for a condition where for a given partition of rows if number in row1 and row2 has a matching 6 or more digit sequence then mark the second row as 0 in a new field.

I used multi row formula as shown below:

If [Name] = [Row-1:Name] and
(REGEX_Match([Number], '.*(\d{6}).*') = REGEX_Match([Row-1:Number], '.*(\d{6}).*'))
then
0
Else
1
Endif

Sample input

Name	Phone
Oscar	234567890
Oscar	345678901
Zoro	1122333444
Zoro	9988776600

expected output:

Name	Phone	result
Oscar	234567890	1
Oscar	345678901	0
Zoro	1122333444	1
Zoro	9988776600	1

but I am getting

Name	Phone	result
Oscar	234567890	0
Oscar	345678901	1
Zoro	1122333444	0
Zoro	9988776600	1

Appreciate any solution. TIA

caltang · ‎10-10-2023

Hi @Yuri24 - you've posted this as a comment here: https://community.alteryx.com/t5/Alteryx-Designer-Desktop-Discussions/Compare-data-in-a-parition/m-p...

Just tidying up the logic

Calvin Tang
Alteryx ACE
https://www.linkedin.com/in/calvintangkw/

caltang · ‎10-10-2023

Hi @Yuri24 can you confirm the logic below?

If you are saying 6 digits repetition, are you saying for BOTH in the cell repetition and count of repetition across two rows?

In addition, will there by cases whereby a name exists more than 2 rows?

Calvin Tang
Alteryx ACE
https://www.linkedin.com/in/calvintangkw/

caltang · ‎10-10-2023

Based on your logic, shouldn't it be:

Sample input

Name	Phone
Oscar	234567890
Oscar	345678901
Zoro	1122333444
Zoro	9988776600

expected output:

Name	Phone	result
Oscar	234567890	1
Oscar	345678901	1
Zoro	1122333444	1
Zoro	9988776600	1

For Oscar, it's between rows having 34567890 as matching values.

For Zoro, there's 9 duplicate numbers within each of their rows.

Something doesn't line up in your logic. Can you give more examples and check again?

Calvin Tang
Alteryx ACE
https://www.linkedin.com/in/calvintangkw/

Yuri24 · ‎10-10-2023

@caltang Second row for oscar should be 0 since the 6 or more digit sequence number matches for oscar's second row number. Basically looking for any 6digit or more sequence matches in both rows.

In zoro's case there is no match of a 6 or more digit number or sequence hence both rows should be 1. Yes there could be cases where there are more than 2 rows for same name

PangHC · ‎10-10-2023

try to find a regex, no luck on it. hence here the super long ways.

1. duplicate each rows. (as 1st row compare to 2nd row and 2nd row compare to 3rd row, hence 2nd require twice)

2. assign compareID (1st vs 2nd as 1, 2nd vs 3rd as 2, etc...)

3. duplicate len-6 rows (to generate all possible any 6 char)

4. sumarize with compareID and 6char. if more than 2. then it match.

5. use join, union, unique etc. to return to clean table.

Screenshot 2023-10-11 110119.png

Screenshot 2023-10-11 110128.png

caltang · ‎10-11-2023

Took me awhile @Yuri24 but I got something:

Major caveats:

My method works because I'm assuming the data size you're working with is not that huge (sub 10k). Because I'm using REGEX to split your phone number to Rows, each person will have 10 rows. So if you have 10k customers, you will have 10k x 10 = 100k rows. It gets bigger and more exponential as you increase the person count, and the frequency of which they appear in the data.
Now, I am only using Name Sequence (Their appearance # - for instance, Oscar appears twice and Zoro appears twice. So 1st and 2nd for each of their unique names). Because of the Name Sequence, I am currently only accounting for 2 rows at a time. The more rows you have, the more you have to add Filter tools and Summarize + Join Tools before the 2nd last Union tool. This will get you the proper count and if they are J, it means the appearance after the 1st will be flagged.
So in this case, I've added a third row for you for now to test. If you have scenarios that have people with more than 3 phone numbers then you need to add more Filters.

I understand that this may not be a fully dynamic solution nor the best, but for its business use case and logic, I thought of this to achieve what you want as you can see in the results pane.

Let me know if this works.

Best,

Calvin

Calvin Tang
Alteryx ACE
https://www.linkedin.com/in/calvintangkw/

Yuri24 · ‎10-12-2023

@PangHC , @caltang Thanks. However, if there are continuous numbers irrespective of the actual numbers it seems to pickup the next row as a match. For example if I add 11111111111 or 9999999999 next to zoro it is considering that number as a match too in both workflows.

I am wondering why the multi row formula is not working.

caltang · ‎10-12-2023

Hmm, perhaps you can give us more data and more expected output, then we will use the multi-row again to see.

Otherwise your 2 examples are not enough... because as you confirmed, there are scenarios where there are more than 2 rows.

Calvin Tang
Alteryx ACE
https://www.linkedin.com/in/calvintangkw/

dwstada · ‎10-13-2023

hey @Yuri24 please find attached a workflow that solves your problem

there's two options: numbers should only match the first number or numbers should match the previous number

I wasn't sure which one you wanted so I included both (probably the second?)

also regarding your question why the multi row formula is not working as you intended - the regex_match function only returns TRUE or FALSE, not the actual matched sequence. So your formula just looks if there are 6 digits in the row above it and if there are 6 digits in its own row, but not what digits are in those rows.

I am not sure if there is a way to call back to capture groups from one expression to another expression, so my workflow is a short workaround for that

Alteryx Designer Desktop Discussions

Multi row formula not working

Re: Date Time Function - Prioritization Base on Du...

Re: If formula to determined a value based on date...

Re: Running multiple alteryx workflows within alte...

Re: Selecting the columns coming after a specific ...

Re: Regex(?) formula to remove values matching the...