Hi all- I'm working on identifying financial accounts that might be 'Sample' accounts. I've been finding account numbers in my data like '123456' or '5678901' that I've confirmed with business users are indeed fictitious accounts used for testing purposes. What I would like to do is find a way to isolate the digits in these account numbers to test if they are sequential so I can review to see if they are testing/sample accounts.
I'm thinking there's probably something I can do in RegEx to help me but I'm not very experienced with RegEx and am unsure where to start. I'm sure there's also a macro solution, but am feeling a little stuck overall. I would love guidance/opinions on the best approach and how to get started. Thank you in advance!
Account Number (Input) | Is Sequential? (Desired Output) |
Z5160 | FALSE |
2345 | TRUE |
7890123 | TRUE |
012345 | TRUE |
12345678 | TRUE |
1589673 | FALSE |
987654 | TRUE |
Solved! Go to Solution.
Interesting challenge. Thanks! Instead of a "solution", I put together a possible path to the solution. In the workflow (attached), I count the observations of numbers ascending, descending and equal when you compare each digit to the prior. You could modify to ascending 13579 as an example and not flag that number because the numbers differ by more than 1 digit. If 9 out of 9 digits are ascending, our 9 out of 16 digits are ascending, that might be a hit. Your business logic rules. I'm just demonstrating an approach to be refined.
Cheers,
Mark
Hi Kayers,
You could try FindString("1234567890123456789",[String])<0
Hope this helps If does, can I ask you to mark it as the solution? this will help other users to find it and will allow us to close the thread. Many thanks!
Best,
Diego
Thank you so much @MarqueeCrew - I didn't realize you could use the RegEx expression '.' to split every character. That's useful to know! I think this will get me really close to what I need - I appreciate the help and guidance.
That's an interesting approach @DiegoParker - the only limitation is that it won't flag an account number with a descending count (e.g. 98765) but I bet I could use another formula - FindString("9876543210987654321",[String]) to catch those, too and flag any account where either of those fields is less than 0. Thank you!
Hi @kelsey_kincaid
Absolutely! What I did was create a new field to reorder the numbers, in case they are descending, so that they'd all be ascending. You can then split into individual components using a Regex Parse, Tokenize method into rows, where the code is simply .
A multirow formula checks if the next number in the sequence is one greater than the previous by simple subtraction, with a check for zeros. Then a Summarize tool to check the min and max on that multi row result. If both are 1, then the numbers are sequential.
Let me know if this helps.
Cheers!
Esther
These are all really cool solutions!!! @kelsey_kincaid , mine takes into account descending numbers too.
Cheers!
Esther
Hi @kelsey_kincaid ,
Great challenge!
Looks like a long solution, but here is mine!
Best,
Fernando Vizcaino
@estherb47 Thank you! This looks like it would work great, too. I'll likely try all 3 and see what works best - I have over 60k account numbers to check so the likelihood is high that there's some nuance in the data that one of these approaches will handle better than the others. It's always interesting to see how different people approach the same problem. I always learn something new!