I have been playing with this data set for quite some time and just can't get it right, so I wanted to reach out and see if I could get some assistance from a wiz in the community.
1. This is mocked-up data that roughly matches the characteristics of the real data I'm dealing with.
2. Not all tags are on a new line, so you will see in the sample file that the first line of table data sits at the end of the last table header.
3. The characters in the sample text (- , " ' . ~ & ; ) are the only special characters I have found in the source, so find/replace can't be used on these values, but you can use ^ $ @ | if you need to create generic break points.
4. All relevant table row tags have a closing tag such as </tr>, and in the actual file they are never on a new line. For the sample, I have modified the layout so that it is easier to see the break points and validate the parsing.
5. The <BR> tags are not necessary and can be removed entirely. These values will go into one cell as a comma-separated string.
Thanks in advance for any assistance you could provide <3
Ideally I would like to have each of the TH rows to be the table header (8 columns) and all of the data within each tr to be listed as a value under each of these columns.
Please let me know if you have any questions
@aasmith8116
I guess the trick here is to understand the HTML/XML markup so you know what to do.
TH - Table Header
TR - Table Row
TD - Table Data = cell
<TD> opens a cell and </TD> closes it. The same goes for <TR> and </TR> around a row.
So now that you know where a table row ends, you also know where the next row starts.
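To illustrate the row-boundary idea outside of Alteryx, here is a minimal Python sketch. It assumes the tags run together on one line, as described in the question; the `sample` string and the `split_rows` helper are made up for illustration and are not part of the original workflow.

```python
import re

# Hypothetical sample: tags are not on separate lines, as in the described file.
sample = "<tr><th>Name</th><th>City</th></tr><tr><td>Ann</td><td>Oslo</td></tr>"


def split_rows(html: str) -> list[str]:
    """Return the inner markup of each <tr>...</tr> block, in document order."""
    # Non-greedy match so each row stops at its own closing </tr>;
    # IGNORECASE covers <TR> vs <tr>, DOTALL lets a row span line breaks.
    return re.findall(r"<tr\b[^>]*>(.*?)</tr>", html, flags=re.IGNORECASE | re.DOTALL)


rows = split_rows(sample)
for number, row in enumerate(rows, start=1):
    print(number, row)
```

Each element of `rows` is one table row, so its position in the list can serve as the row flag mentioned below.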
So what you need to do is create some flags for each of the rows and get the values that sit between > and <. Then, with the Cross Tab or Summarize tool, you can concatenate the rows, and with Text To Columns you get the 8 columns. But first you will need to strip all of the markup from the data so that only the values remain.
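For anyone who wants to check the logic outside of Alteryx, the same steps can be sketched in Python. This is only a rough equivalent under assumptions, not the workflow itself: the `parse_table` and `clean_cell` helpers and the two-column `sample` are hypothetical (the real data has 8 columns), and it treats any row made up entirely of TH cells as the header.

```python
import re

FLAGS = re.IGNORECASE | re.DOTALL


def clean_cell(inner: str) -> str:
    """Turn <BR>-separated values into one comma-separated string, markup removed."""
    parts = re.split(r"<br\s*/?>", inner, flags=re.IGNORECASE)
    parts = [re.sub(r"<[^>]+>", "", part).strip() for part in parts]
    return ", ".join(part for part in parts if part)


def parse_table(html: str):
    """Return (header, records) from markup whose tags may share a line."""
    header, records = [], []
    for row in re.findall(r"<tr\b[^>]*>(.*?)</tr>", html, flags=FLAGS):
        # Each match is (tag name, inner text); \1 requires the matching close tag.
        cells = re.findall(r"<(th|td)\b[^>]*>(.*?)</\1>", row, flags=FLAGS)
        if not cells:
            continue
        values = [clean_cell(inner) for _tag, inner in cells]
        if all(tag.lower() == "th" for tag, _inner in cells):
            header = values  # assumption: an all-TH row is the header row
        else:
            records.append(values)
    return header, records


# Hypothetical sample with only two columns to keep it short.
sample = (
    "<table><tr><th>Name</th><th>Tags</th></tr><tr><td>Ann</td>"
    "<td>red<BR>blue<BR></td></tr>\n<tr><td>Bob</td><td>green</td></tr></table>"
)

header, records = parse_table(sample)
print(header)
for record in records:
    print(record)
```

Regular expressions are enough here because the rows and cells are not nested. For nested tables, a real HTML parser would be the safer choice.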
Thanks Cog, I have been struggling with the RegEx tool but this does the trick and is easy to follow.