Alteryx Designer Desktop Discussions

Jake5 · ‎11-04-2022

Hello,

Is there a way for Alteryx to output as a field value the type of csv file of the input file? More specifically, I would like to know if the csv type is CSV (Comma delimited), CSV UTF-8, CSV (Macintosh), or CSV (MS-DOS). Thanks.

IraWatt · ‎11-04-2022

Hey @Jake5,

Interesting question, is there any way to differentiate these files in the file properties?

Could you attach a CSV (Macintosh)? To see if there is any way to tell them apart.

Jake5 · ‎11-04-2022

Not sure either. I have a CSV that originated as CSV (Comma Delimited) and did a save as CSV UTF-8 (Comma Delimited) and another as CSV (Macintosh)...but even in Windows all 3 files just show as CSV but do not call out the specific type of CSV.

TimN · ‎11-04-2022

Hi,

Take a look here. It says in general It's not possible but you can try with Python.

https://chardet.readthedocs.io/en/latest/faq.html

danilang · ‎11-06-2022

Hi @Jake5

You're looking at two different standards here. Windows, Mac and Linux formats deal with line termination characters. ISO, UTF-8 and UTF-16 have to do with character encoding, i.e. the way each byte or group of bytes maps to a displayable character.

For the first one, Windows line ending format is CR\LF (0D0A). Original Mac format used CR(0D). Linux and now Mac OS X use LF(0A). So if you find a CR\LF pair in your file it's Windows, CR only it's original Mac and LF only it's Linux\Mac OS X.

For the encoding formats, UTF-8 starts the file with the three bytes EF,BB,BF. UTF-16 starts with FF,FE and the ISO formats don't have any leading characters.

You can't use an Input Data tool to analyze these, because it performs all kinds of behind the scenes magic to figure these things out and present you with the data, without telling you the formats. However, you can use a Blob Input, which reads the raw data of the file, followed by a Blob Convert to convert the blob to a series of hex bytes. Then you just have look at the first few bytes to determine the encoding and then look for the various combinations of 0D0A to find the line ending format

I've also attached the 4 csv files that I tried it with

Dan

Jake5 · ‎11-08-2022

Thank you, @danilang

Using your Mac.csv, I'm getting a LineEnding value of Windows which I gather is happening because it has located "ODOA" within the blob. I was expecting it to return "Mac." I've included a screenshot to help illustrate. Also please clarify if your Elseif formula for the Type field should be updated from 2 to 4 as shown within my screenshot. Thanks.

Alteryx Designer Desktop Discussions

Detecting various CSV file types

Re: Is there any way the computer vision tools can...

Re: Batch Macro

Re: How to get cell reference address from excel

Re: Replacing Forecast columns with Actual Data

Re: Row creation