Remove Non printable character from text file
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Dear Friends,
i used to work with the clients ERP data which i used to receive in text file format and at time the file will be in broken because of Non printable characters(non keyboard characters) , New line , Carriage return, white space.
I am finding difficult to handle that data cleanse work , i need your suggestion/ if any workflow to help me handle this.
Thanks for your support.
Solved! Go to Solution.
- Labels:
- Data Investigation
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Please copy the below data after the commented line (delimiter is "#|#") and paste in txt file, as i couldn't upload due to some restrictions,
"�" non printable characters like this to be removed (line above 104)
Remove New line & Carriage return & White space/blanks
'--------------------------------------------------------------------------------------------------------------------------------------------
MANDT#|#MBLNR#|#MJAHR#|#ZEILE#|#PARENT_ID#|#BWART#|#XAUTO#|#MATNR#|#WERKS#|#LGORT#|#CHARG#|#INSMK#|#ZUSTD#|#SOBKZ#|
103#|#1111111113#|#2020#|#11111#|#000000#|#222#|##|#000000000010002222#|#XX02#|#YY01#|#X444444#|##|##|##|##|#1000111111#|##|#000000#|#0000#|#Z#|#EUR#|#11.21#|#11.55#|#0.00#|##|#5.000#|#BLS#|#5.000#|#CAR#|##|#00000#|##|##|#0000#|##|##|#0000#|##|#8315#|##|#2020#|#7929#|##|#000#|##|##|##|#L#|##|##|##|##|#0001364101#|#0000101401#|#00#|#000000000010002042#|#MA01#|#WA01#|#X#|#X#|#9074.000#|#136490.29#|#S
�#|#0.00#|##|#F
104#|#111111111�#|#2020#|#11111#|#000000#|#222#|##|#000000000010002222#|#XX02#|#YY01#|#X444444#|##|##|##|##|#1000111111#|##|#000000#|#0000#|#Z#|#EUR#|#11.21#|#11.55#|#0.00#|##|#5.000#|#BLS#|#5.000#|#CAR#|##|#00000#|##|##|#0000#|##|##|#0000#|##|#8315#|##|#2020#|#7929#|##|#000#|##|##|##|#L#|##|##|##|##|#0001364101#|#0000101401#|#00#|#000000000010002042#|#MA01#|#WA01#|#X#|#X#|#9074.000#|#136490.29#|#S#|#0.00#|##|#F
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi,
You can use the below to cleanse the data to remove the unnecessary data from the files. Please let me know if this helps or if you were looking for something else. You can give this in a multi field formula to apply over all the fields post parsing with a delimiter.
regex_Replace([field],"[^\w+]",'')
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Or if you're trying to get rid of those question marks,
REGEX_Replace([Field1], '[^\x00-\x7f]', '')
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thanks for your response but the solution you provided to handle the non printable characters is not working.
Can you try with the Data i provided in the previous post, copy and paste in a text file and create a solution if possible.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thanks for your response but the solution you provided to handle the non printable characters is not working. i tried using replace too but still its not handled.
Can you try with the Data i provided in the previous post, copy and paste in a text file and create a solution if possible.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thanks Philip,
It worked using the same you provided but i implemented in formula.
