Alteryx Designer Desktop Discussions

Hans020 · ‎10-28-2019

Hello,

I have the following problem, and since I'm new to Alteryx I was hoping to get some advice.

In a directory, we have some comma separated files (extensions .dat and .csv).
Files have different layouts, so not all files have same number of columns / column names.
Some files are Unix type files (record delimiter = CR), some files are DOS type (record delimiter = CR / LF).

We also have an Excel file that contains some basic metadata on the files:

Filename	Header
File_A.dat	col1,col2,col3
File_B.csv	colA,colB,colC,colD

The idea is to follow these steps:

Use Input to read filename + expected header from Excel
Feed filename to Dynamic Input and read only 1 row (containing the header data)
Compare the actual header record with the expected header

However, the Dynamic input requires an Input Data Source Template. How do i configure this to cope with both Unix and DOS type files?

Basically, from each file, we only want the first record. Record length is variable.

Any guidance, thoughts, or examples would be greatly appreciated.

Hans

BrandonB · ‎10-28-2019

It might be easier to use a macro like the one here: https://community.alteryx.com/t5/Alteryx-Designer-Discussions/Input-Batch-Macro-AKA-the-Extra-Dynami...

By using a macro like this, you don't need a template. It leverages a control parameter to pull a list of files and stacks based on common headers. Given that you want to read in the header data, you will need to set the input data tool inside of the macro so the box is checked for "First Row Contains Data". Then you can use a sample tool where you pick the first row and check the box to group by file path. This will give you the header rows for each file that has been read in.

Hans020 · ‎10-29-2019

Hi Brandon,

Thanks for your input. I installed the macro and made the proposed modification ( "First Row Contains Data").

The output of the Input Batch is indeed showing the header for each selected file. However, it's in this format:

Filename	Field_1	Field_2	Field_3	Field_4	Field_n
File_A.csv	col1	col2	col3	null	null
File_B.csv	col1	col2	col3	col4	null
File_C.csv	col1	col2	null	null	null

I'm looking for something like this:

Filename	header
File_A.csv	col1,col2,col3
File_B.csv	col1,col2,col3,col4
File_C.csv	col1,col2

Is there a way to concatenate a (variable) number of columns into 1 comma delimited string?

Hans

jrgo · ‎10-29-2019

Hi @Hans020

In your batch macro, update the Input tool's configuration with the delimiter set to "\0" which signifies no delimter. You'll also want to update the field length to something larger than the default 254. I usually just add some 0's, but adjust to whatever works for your use case.

Hope this helps!

Jimmy
Teknion Data Solutions

BrandonB · ‎10-29-2019

Great suggestion Jimmy! I was going to talk through a transpose, filter out null, then cross tab back with concatenate as the aggregation but reading the fields in without the delimiter in the first place is perfect!

Hans020 · ‎10-29-2019

Hi,

While experimenting, I discovered basically what you are suggesting Brandon: see screenshot.

I will look into the other suggestion, might be bit easier.

Thanks,

Hans

BrandonB · ‎10-29-2019

Yep! Summarize at the end would work for this purpose as well as you have found. The other suggestion is even a bit better because it reads the file in without delimiters and therefore all of the fields will be comma separated from the beginning.

Alteryx Designer Desktop Discussions

Use of Dynamic Input

Re: Date Time Function - Prioritization Base on Du...

Re: Running multiple alteryx workflows within alte...

Re: Selecting the columns coming after a specific ...

Re: Regex(?) formula to remove values matching the...

Re: Python ECC SAP Extract into Alteryx Workflow