Hello - I'm trying to parse these fields into 5 columns using RegEx, however, sometimes the data in column 5 is blank. The spacing between each field is not consistent as its a PDF converted to text.
sdfrgh Doe, Jane Yes Sales Rep I Disney-Land |
123566 Smith, John No Consultant Disney-World,Disney-Land |
12dvs6 Smith, Jane Yes Sales Rep |
Desired Output
sdfrgh | Doe, Jane | Yes | Sales Rep I | Disney-Land |
123566 | Smith, John | No | Consultant | Disney-World,Disney-Land |
12dvs6 | Smith, Jane | Yes | Sales Rep |
@elclark
It appears to be difficult for me to tell the diifference between last 2 columns.
Any chance that the last column always starts from Disney or null?
Hi, @elclark
A dynamic resolution for you :
And if your first line of input isn't the field name, it's easier to deal with.
hi @elclark
As @Qiu says, it is difficult to parse the last 2 columns. My idea is:
First, parse data by the following Regex. You will get total 4 columns(last 2 columns are still combined)
In the 3rd column, value should be Yes or No so its nature can be used as a good anchor in regex expression.
^(.+?)\s{2,}(.+?)\s+(Yes|No)\s{2,}(.+)$
Second, parse the 4th column by Text to Column tool into 2 columns, and you will get the desired output.
@flying008 the dynamic resolution almost worked, the scenario I ran into is sometimes the name section is blank but the format is consistent, unique id followed by name (or a blank space where the name should be) and then followed by either No or Yes (and this field is never blank and only those options).
@Qiu The last column always starts with 1 of 9 locations (so like Disney-xxx, Paris-xxxx, Rome-xxx, etc, those 9 locations are always the same but what follows after the - can be different) or its Null