Hi,
I need to load a file where the last field contains a description.
The description is a free-text comment on a meeting with the client, in which the user can press Enter, Tab, and so on.
In other words, the field contains all sorts of characters and markup that I do not need (e.g. <P> </P>).
Furthermore, these special characters make my data inconsistent: my fields shift into the next record...
What is the best way to treat this field?
It's a .dat file with delimiters, and the records are already messed up the moment I read the file.
Thanks,
Dirk
Hi Dirk
This data format doesn't sound ideal. Are there any text qualifiers or anything else to identify the fields? Is the end-of-line indicator different from the special characters used in the field?
Can you share a few example rows?
Regards
Chris
Hi Chris,
"Not ideal" is putting it mildly.
I've generated four rows and took out the confidential data, but kept the special characters.
The first line is one that works without a problem, the remaining 3 show the complexity of the data and possible problems.
If you know a solution to this, I would be very grateful.
Best Regards,
Dirk
Give this a try...
I assumed that every record starts with "ID#&", so that's how I built it. If that is not the case, the concept should still work as long as you have something consistent you can use as a 'primary line' identifier.
You basically group on the 'primary' record and identify the rest of the lines that go with that primary record. Then with some transposition and summarization, you bring all of the 'extra junk' into one record and tack it on to the 'junk field' in the primary record.
Also, in the sample data you sent, it looks like you might have put a period in place of a pipe, which moved that field over into the comments in my process. Not sure if that is correct, but hopefully you can get somewhere with this concept.
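For anyone who wants to try the same concept outside a workflow tool, here is a minimal Python sketch of the idea. It is not the attached workflow, just an illustration under a few assumptions: every record starts with "ID#&", the delimiter is a pipe, and the comment is the last field. The names merge_records, clean_comment and parse are made up for this example.

```python
import re

PRIMARY_PREFIX = "ID#&"   # assumption: every real record starts with this marker
DELIMITER = "|"           # assumption: fields are pipe-delimited
TAG_RE = re.compile(r"<[^>]+>")  # matches simple markup such as <P> and </P>


def merge_records(lines, prefix=PRIMARY_PREFIX):
    """Glue continuation lines onto the primary line that precedes them."""
    records = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(prefix) or not records:
            # A new primary line (or the very first line) starts a record.
            records.append(line)
        else:
            # Anything else is spill-over from the comment field.
            records[-1] += " " + line
    return records


def clean_comment(text):
    """Drop markup tags and collapse tabs/newlines/repeated spaces."""
    text = TAG_RE.sub(" ", text)
    return " ".join(text.split())


def parse(lines, n_fields, delimiter=DELIMITER):
    """Return one list of fields per record, with a cleaned last field."""
    rows = []
    for record in merge_records(lines):
        # Split at most n_fields - 1 times so pipes inside the comment
        # stay in the last field instead of creating extra columns.
        fields = record.split(delimiter, n_fields - 1)
        fields[-1] = clean_comment(fields[-1])
        rows.append(fields)
    return rows
```

To use it on a file, open the file and pass the file object (or a list of lines) to parse together with the number of fields you expect per record. If your records start with something other than "ID#&", pass a different prefix to merge_records; the approach only needs some consistent way to recognise a primary line.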
Super. I'll have a look at this.