Hi Everyone
I have a dataset where the data in multiple fields contains delimiters; see below.
Name | ID | Record ID |
John Smith|John Smith | 1|23 | 1|2 |
Jane Smith|Jane Smith | 2|50 | 3|4 |
Anyone have any thoughts how to split all the fields into multiple rows? Please see below for the output I am trying to achieve.
Name | ID | Record ID |
John Smith | 1 | 1 |
John Smith | 23 | 2 |
Jane Smith | 2 | 3 |
Jane Smith | 50 | 4 |
Please let me know. Thank you!
EY
Solved! Go to Solution.
How about the attached?
The workflow will add a ReocrdID (uniquely identify source rows), Transpose to a Name-Value pair for every cell, add a NameID (uniquely identify fields), Summarize Group by NameID,Name for use in renaming the fields later, split the Value field on the | deliminator into rows, add a ValueID (uniquely identify each value and sequence in a record's cell), cross tab into normal structure, deselect unnecessary field, and rename fields.
You might notice that I do not let the Cross Tab tool use the Name field to name the fields. This is because it will change the order of the fields to alphabetical and it changes the field names, replacing some characters with an underscore.
Hey, Joe
I made a slight change to your answer, only because in my case it was staggering the rows (all of the data of the first column would populate, with the remaining columns showing as null until the first column finished). To fix this I edited the 2nd formula (ValueID: [Row-1:ValueID]+1) to IF [Row-1:Name]==[Name] THEN [Row-1:ValueID] +1 ELSE 1 ENDIF
Hi Joe_Mako,
Is there any way to make this work when there are other additional columns of data which you want to be duplicated in the additional rows which have been inserted? I have adjusted EYamada's example below to explain.
Before
Name | ID | Record ID | City | Country |
John Smith|John Smith | 1|23 | 1|2 | New York | USA |
Jane Smith|Jane Smith | 2|50 | 3|4 | London | UK |
After
Name | ID | Record ID | City | Country |
John Smith | 1 | 1 | New York | USA |
John Smith | 23 | 2 | New York | USA |
Jane Smith | 2 | 3 | London | UK |
Jane Smith | 50 | 4 | London | UK |
Really appreciate your help.
The attached macro is similar to the above workflow, but with some improvements and the added capability you are looking for.
- Field Info to get the field names, so the fields are in the same order on output as input.
- After creating the ValueID, tho Summarize tools get the Max ValueID per Record, and per RecordID-FieldID combination, and joining on both RecordID and Max ValueID to find the cells that need to be filled down.
- Generate Rows starts at ValueID+1 so there is no duplication.
- Everything else is the same logic as detailed above.
Hi Joe,
I am using this macro and its amazing!
Thank you for this solution!
I have 1 question, I have some columns where the data is cut after the transformation.
For example, the word "Global" in 1 column becomes "Glob" (missing al)
However, other columns that have greater content like paragraphs don't get cut.
I noticed most of them are from my first few columns.
What should I do?
Awesome! many thanks for this solution. It helped me in similar kind of case.
User | Count |
---|---|
19 | |
15 | |
13 | |
9 | |
8 |