

Why can't we use the UTF8 \x07 (BELL) character for delimiter parsing?

dubblej21
5 - Atom

Hi All,

 

We use both Alteryx and Python in our ETL process.
There is no ideal file format for passing data between the two: we use Parquet in Python, but Alteryx doesn't support it, so we need an intermediate format for the transfer. The usual tricks for embedding Python in Alteryx don't really work for us, because our files are larger than pandas can handle, so that route is either unworkable or not performant enough.

We therefore ended up converting to CSV files with a rare delimiter as the intermediate step. The 'cleansing tool' is not the most performant tool in the suite, and handling the problem with a 'rare delimiter' lets us avoid it. Other tricks, such as quoting, often cause issues of their own. We started using the \x07 (BELL) character as the delimiter for Python processing, which works great. The problem now is that Alteryx doesn't support it.
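For reference, a minimal sketch of what the Python side of such a transfer might look like, using only the standard library csv module. The function names and the sample columns are made up for illustration, and quoting is switched off on the assumption that the BEL character never occurs in the data itself.

```python
import csv

BEL = "\x07"  # ASCII BEL control character, a single byte in UTF-8


def write_bel_delimited(path, header, rows):
    """Write rows to a text file using BEL as the field delimiter.

    Quoting is disabled, so commas and quote characters in the data are
    written as-is. A field containing BEL or a line break raises csv.Error.
    """
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(
            f, delimiter=BEL, quoting=csv.QUOTE_NONE, quotechar=None
        )
        writer.writerow(header)
        writer.writerows(rows)


def read_bel_delimited(path):
    """Read a BEL-delimited file back into a list of rows (lists of str)."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(
            f, delimiter=BEL, quoting=csv.QUOTE_NONE, quotechar=None
        )
        return list(reader)
```

Because nothing is quoted or escaped, values that contain commas, semicolons, pipes or double quotes pass through unchanged, which is the appeal of a control character as the delimiter.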

I was wondering why we can't use the "\x07" character for delimiter parsing.
We use it often, since it's a single-byte character and a very safe option for large files.

It is therefore close to being the preferred delimiter in our internal pipelines, which are based partly on Python processing and partly on Alteryx.

Is there any way to read/write this effectively, at least through the 'standard' method?


Also, is there any plan on the roadmap to start supporting these 1 byte characters?

 

Regards!

1 REPLY
NeoInfiniTech
11 - Bolide

Hi @dubblej21,

 

I also encountered an error when I tried to use the bell character (\x07) as a CSV delimiter in Alteryx. I can recommend another rarely used character instead, such as the interrobang (or reverse interrobang). I sometimes use it when working with tab-delimited data: replacing the tabs with the interrobang character makes it easy to identify inconsistencies such as unexpected line breaks and additional tabs in description fields.

 

As for Parquet input/output, it is supported by Alteryx Designer since version 2024.1.
