This question is related to the behavior of the Output Data tool.
When creating CSV output with the Code Page option set to Unicode UTF-8, the output file begins with a byte order mark (BOM). In UTF-16 and UTF-32 this sequence explicitly indicates the endianness of the text; in UTF-8 it acts only as an encoding signature. Many consider including it in a UTF-8 file to be bad practice. Is there a way to configure Alteryx not to output the BOM? If not, could a feature be considered that lets the user toggle it?
Here is more information about BOM:
http://en.wikipedia.org/wiki/Byte_order_mark
http://stackoverflow.com/questions/2223882/whats-different-between-utf-8-and-utf-8-without-bom
This is causing some pain, as I am creating very large text files in UTF-8 format (many GB) and have to run a post-Alteryx scripting process to remove the BOM.
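For anyone who needs the same workaround, here is a minimal sketch of that kind of post-processing step in Python. It is not part of Alteryx; the function name `strip_utf8_bom` and the chunk size are my own choices for illustration. It copies the file in chunks, so it should cope with multi-GB output without loading it all into memory, but it does need enough free disk space for a temporary copy.

```python
import os
import shutil
import tempfile

UTF8_BOM = b"\xef\xbb\xbf"  # the three bytes that encode U+FEFF in UTF-8


def strip_utf8_bom(path, chunk_size=1024 * 1024):
    """Rewrite `path` without a leading UTF-8 BOM.

    Hypothetical helper for illustration. Returns True if a BOM was
    found and removed, False if the file was left untouched.
    """
    with open(path, "rb") as src:
        if src.read(len(UTF8_BOM)) != UTF8_BOM:
            return False  # no BOM at the start, nothing to do
        # src is now positioned just past the BOM; stream the rest
        # into a temporary file in the same directory.
        dir_name = os.path.dirname(os.path.abspath(path))
        fd, tmp_path = tempfile.mkstemp(dir=dir_name)
        with os.fdopen(fd, "wb") as dst:
            shutil.copyfileobj(src, dst, chunk_size)
    # Swap the BOM-free copy into place once both files are closed.
    os.replace(tmp_path, path)
    return True
```

Calling `strip_utf8_bom("output.csv")` after the workflow finishes would be the typical usage; the file name here is just a placeholder.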
I'd love to see this option as well as we have the same issue.
Hey @ben_stroud,
Good news! With 11.0 we're releasing the ability to output a .csv file that uses UTF-8 without a byte order mark (BOM) via a new option in the Output Data tool. You will also be able to read a .csv file without the BOM via an Input Data tool by selecting the UTF-8 code page.
We hope this helps with the issue you described!
- @KatieH
Now if we can just add the ability to put in a header record and trailer record in the output my life will be much easier!
I'm having a similar problem now with JSON files: UTF-8 encoding is writing the BOM, and a client language doesn't like it.
Are there plans to roll out the "Include BOM" toggle for other file formats?
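In the meantime, when the consuming side happens to be Python and can be changed, one possible stopgap is to decode with the `utf-8-sig` codec, which skips a leading BOM if one is present and otherwise behaves like plain UTF-8. This is only a sketch under that assumption; the function name `load_json_tolerating_bom` is made up for illustration, and it does not help clients written in other languages.

```python
import json


def load_json_tolerating_bom(path):
    """Load a JSON file whether or not it starts with a UTF-8 BOM.

    Hypothetical helper: the "utf-8-sig" codec drops a leading BOM on
    decode and is otherwise identical to "utf-8".
    """
    with open(path, "r", encoding="utf-8-sig") as f:
        return json.load(f)
```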
I would also like this to be available as a setting for the output file, with no need for workarounds.