Hello,
I'm trying to create networks/groups based on common sequences of numbers. I have thousands of transactions that I need grouped by the Identifier column. However, because the data set is so large, I don't know all the number combinations in advance. I need to group people so I can see whether there is a connection between them and identify/mitigate risk.
Data Example:
| Identifier | Name | Acct No. |
| --- | --- | --- |
| 123456 | Jane | 1234 |
| 654321 | Ben | 1235 |
| 112225 | Kurt | 6541 |
| 856740 | Joe | 1237 |
| 856740 | Frank | 1238 |
| 654321 | Sarah | 1239 |
| 654321 | Cole | 1230 |
| 654321 | Amy | 3210 |
| 123456 | Harry | 3215 |
| 856740 | Jake | 3258 |
| 856740 | Don | 8626 |
| 856740 | Luke | 8542 |
| 555660 | Bonnie | 9653 |
| 856740 | Ron | 7518 |
| 112225 | Flynn | 1234 |
Output I'm looking for:
| Identifier | No. of Accts | Acct Nos. |
| --- | --- | --- |
| 123456 | *number of accts using identifier* | 1234 6541 3258 |
| 654321 | *number of accts using identifier* | 1235 1547 |
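For reference, here is a rough sketch in Python of the kind of grouping I mean, using only the sample rows from the data example. The function name `group_accounts` and the tuple layout are just placeholders I made up for illustration, and the real data would presumably be loaded from a file or database rather than typed in. It only collects accounts per identifier; it does not try to link different identifiers that share an account number.

```python
from collections import defaultdict

# Sample rows from the data example: (identifier, name, account number).
# Values are kept as strings so any leading zeros would survive.
rows = [
    ("123456", "Jane", "1234"),
    ("654321", "Ben", "1235"),
    ("112225", "Kurt", "6541"),
    ("856740", "Joe", "1237"),
    ("856740", "Frank", "1238"),
    ("654321", "Sarah", "1239"),
    ("654321", "Cole", "1230"),
    ("654321", "Amy", "3210"),
    ("123456", "Harry", "3215"),
    ("856740", "Jake", "3258"),
    ("856740", "Don", "8626"),
    ("856740", "Luke", "8542"),
    ("555660", "Bonnie", "9653"),
    ("856740", "Ron", "7518"),
    ("112225", "Flynn", "1234"),
]


def group_accounts(rows):
    """Map each identifier to the distinct account numbers seen with it,
    in first-seen order. (Hypothetical helper, not from any library.)"""
    groups = defaultdict(list)
    for identifier, _name, acct in rows:
        if acct not in groups[identifier]:
            groups[identifier].append(acct)
    return dict(groups)


summary = group_accounts(rows)

# One line per identifier: identifier, number of accounts, account numbers.
for identifier, accts in summary.items():
    print(identifier, len(accts), " ".join(accts))
```

The identifiers do not need to be known ahead of time, since each new value simply becomes a new key as the rows are read.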

