I have the following data set, which I extracted from client statements using Image to Text and PDF to Text.
Some client names on the statements are so long that, although they are extracted on the same page, they are split across two rows, as shown below (e.g. pages 88 and 91):
Page | Row | Client |
77 | 1 | John Smith |
78 | 1 | Annie |
79 | 1 | Jeffrey Boesch |
80 | 1 | Lalala Land |
81 | 1 | Dixie Normous |
82 | 1 | Wolverine |
83 | 1 | Joyce M. Hellraiser |
84 | 1 | Bradley Looper dtd 9/18/2019 |
85 | 1 | Ivannah Umplot |
86 | 1 | Kendrick Lamar |
87 | 1 | Barbenheimer |
88 | 1 | Walter White & Walter White Junior Trust dtd 12/20/2013 |
88 | 2 | #89.291-34802204 |
89 | 1 | Kirby Nintendo |
90 | 1 | Imagine Dragons |
91 | 2 | -XX/YY/ZZZZ |
91 | 1 | abcdefughijklmnopqrstuvwxyzyxwvu |
92 | 1 | Leo Dicaps |
How do I go about getting it into the desired format below?
Page | Row | Client |
77 | 1 | John Smith |
78 | 1 | Annie |
79 | 1 | Jeffrey Boesch |
80 | 1 | Lalala Land |
81 | 1 | Dixie Normous |
82 | 1 | Wolverine |
83 | 1 | Joyce M. Hellraiser |
84 | 1 | Bradley Looper dtd 9/18/2019 |
85 | 1 | Ivannah Umplot |
86 | 1 | Kendrick Lamar |
87 | 1 | Barbenheimer |
88 | 1 | Walter White & Walter White Junior Trust dtd 12/20/2013 #89.291-34802204 |
89 | 1 | Kirby Nintendo |
90 | 1 | Imagine Dragons |
91 | 1 | abcdefughijklmnopqrstuvwxyzyxwvu - XX/YY/ZZZZ |
92 | 1 | Leo Dicaps |
Thank you in advance; any help would be appreciated!
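Add a Summarize tool after that, group by Page, and then use String > Concatenate on the Client field. This will concatenate the multiple rows for a page into one row. Set the separator to a space (the default is a comma) so the pieces join the way your desired output shows. To make sure the pieces are joined in the right order, I would add a Sort tool before the Summarize tool and sort on Page, then Row; your page 91 rows, for example, arrive with Row 2 ahead of Row 1.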
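If it helps to see the same logic outside of the workflow, here is a minimal Python sketch of the sort-then-concatenate idea. It is only an illustration, not what the tools do internally: the `merge_rows` function, the tuple layout, and the choice to report the merged line as Row 1 are all assumptions made for this example.

```python
from itertools import groupby
from operator import itemgetter


def merge_rows(records, sep=" "):
    """Merge (page, row, client) tuples so each page ends up on one line.

    Hypothetical helper for illustration: sorts by page then row, and
    joins the client fragments of each page with `sep`.
    """
    # Sort first so fragments are joined in row order (the Sort tool step).
    ordered = sorted(records, key=itemgetter(0, 1))
    merged = []
    # Group on page and concatenate the client text (the Summarize step).
    for page, group in groupby(ordered, key=itemgetter(0)):
        parts = [client.strip() for _, _, client in group]
        # Assumption: the merged line is reported as Row 1.
        merged.append((page, 1, sep.join(parts)))
    return merged


records = [
    (88, 1, "Walter White & Walter White Junior Trust dtd 12/20/2013"),
    (88, 2, "#89.291-34802204"),
    (89, 1, "Kirby Nintendo"),
    (91, 2, "-XX/YY/ZZZZ"),  # arrives before row 1, as in the sample data
    (91, 1, "abcdefughijklmnopqrstuvwxyzyxwvu"),
]

result = merge_rows(records)
for page, row, client in result:
    print(f"{page} | {row} | {client} |")
```

Without the sort, page 91 would come out with the date fragment in front of the name, which is the reason for putting the Sort tool ahead of the Summarize tool.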