Hello,
I'm trying to create networks/groups of people based on shared sequences of numbers. I have thousands of transactions that I need grouped by the Identifier column. However, the data set is so large that I do not know all the number combinations in advance. I need to group people who share an identifier so I can see whether there is a connection and identify/mitigate risk.
Data Example:
Identifier | Name | Acct No. |
123456 | Jane | 1234 |
654321 | Ben | 1235 |
112225 | Kurt | 6541 |
856740 | Joe | 1237 |
856740 | Frank | 1238 |
654321 | Sarah | 1239 |
654321 | Cole | 1230 |
654321 | Amy | 3210 |
123456 | Harry | 3215 |
856740 | Jake | 3258 |
856740 | Don | 8626 |
856740 | Luke | 8542 |
555660 | Bonnie | 9653 |
856740 | Ron | 7518 |
112225 | Flynn | 1234 |
Output I'm looking for:
Identifier | No. of Accts | Acct Nos. |
123456 | *number of accts using identifier* | 1234 3215 |
654321 | *number of accts using identifier* | 1235 1239 1230 3210 |
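For reference, here is a minimal sketch of the grouping I have in mind, written in plain Python with the rows from the data example typed in by hand. The function name `group_accounts` and the choice to store identifiers and account numbers as strings are my own assumptions for illustration; the real data would be loaded from a file or database, and the identifiers do not need to be known ahead of time because they are discovered while reading the rows.

```python
from collections import defaultdict

# Sample rows copied from the data example above:
# (identifier, name, account number), all kept as strings (an assumption).
rows = [
    ("123456", "Jane", "1234"),
    ("654321", "Ben", "1235"),
    ("112225", "Kurt", "6541"),
    ("856740", "Joe", "1237"),
    ("856740", "Frank", "1238"),
    ("654321", "Sarah", "1239"),
    ("654321", "Cole", "1230"),
    ("654321", "Amy", "3210"),
    ("123456", "Harry", "3215"),
    ("856740", "Jake", "3258"),
    ("856740", "Don", "8626"),
    ("856740", "Luke", "8542"),
    ("555660", "Bonnie", "9653"),
    ("856740", "Ron", "7518"),
    ("112225", "Flynn", "1234"),
]


def group_accounts(rows):
    """Map each identifier to the distinct account numbers seen with it.

    Hypothetical helper: identifiers are collected as they appear,
    so no list of known identifiers is needed up front.
    """
    groups = defaultdict(list)
    for identifier, _name, acct in rows:
        # Skip repeats so each account is counted once per identifier.
        if acct not in groups[identifier]:
            groups[identifier].append(acct)
    return dict(groups)


groups = group_accounts(rows)

# Print one line per identifier in the layout of the desired output.
print("Identifier | No. of Accts | Acct Nos. |")
for identifier, accts in groups.items():
    print(f"{identifier} | {len(accts)} | {' '.join(accts)} |")
```

Account numbers are listed in the order they first appear for each identifier. If the data is already in a table tool or a data frame, the same result should be reachable with a group-by on the Identifier column; the loop above is only meant to show the logic.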