
Parse column from Hadoop using flat file layout

SOLVED
David-UHG
Meteoroid

I designed a somewhat complex Alteryx workflow that takes the latest .txt file on a NAS drive. This file has several lines that need to be parsed, and each line has specific string lengths based on its record key. I built it using a dynamic input tool, and everything worked out great with the template and multi-row logic for looping through the specific record keys.
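For readers unfamiliar with this kind of file, here is a minimal Python sketch of the general idea outside Alteryx: each line is sliced into fields using a layout chosen by its record key. The record keys, field names, widths, and the two-character key length are all hypothetical, made up for illustration; they are not taken from the actual file or .flat layout discussed in this thread.

```python
# Hypothetical layouts: record key -> list of (field name, start, length).
# These keys, names, and widths are invented for illustration only.
LAYOUTS = {
    "01": [("record_key", 0, 2), ("member_id", 2, 8), ("name", 10, 12)],
    "02": [("record_key", 0, 2), ("claim_id", 2, 6), ("amount", 8, 9)],
}

KEY_LENGTH = 2  # assumption: the record key is the first two characters


def parse_line(line):
    """Slice one fixed-width line using the layout selected by its record key."""
    key = line[:KEY_LENGTH]
    layout = LAYOUTS.get(key)
    if layout is None:
        raise ValueError(f"no layout defined for record key {key!r}")
    # Trim the padding that fixed-width formats add to each field.
    return {name: line[start:start + length].strip()
            for name, start, length in layout}


def parse_lines(lines):
    """Parse every non-blank line, dropping the trailing newline first."""
    return [parse_line(line.rstrip("\n")) for line in lines if line.strip()]


if __name__ == "__main__":
    sample = ["0112345678Jane Doe    \n", "02C00001   125.50\n"]
    for row in parse_lines(sample):
        print(row)
```

The same slicing can be applied to a single column pulled from another source, as long as each value still carries its record key at a known position.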

 

However, the long-term solution was to store these files in Hadoop. I sweep Hadoop by pulling the latest load datetime stamp. When I try to use the dynamic input tool and connect to the Hadoop instance, there is no File/Field Layout selection to bring in the .flat file that specifies the fixed lengths for each individual file. What am I missing here? I searched the community, and I apologize if there is already a thread on this.

 

Many Thanks,

David

Alteryx

Hi @David-UHG

Currently we only support CSV and Avro files for HDFS, which is why you are not seeing the option to map in the layout from the .flat file.

 

Is it possible to store your files in Avro format instead?

 

Henriette Haigh
Sr. Customer Support Engineer, Alteryx
David-UHG
Meteoroid

Thanks Henriette,

 

Anticipating that answer, I added an output tool to this workflow that writes the Hadoop column to an ASC file on a secured NAS drive. I can then bring that file into my parsing workflow, which uses the dynamic input tool. After that, I will need to join back to the original Hadoop table to bring in all the other columns associated with the newly parsed rows. The next step is to work with our IT department to allow us to install the runner and conditional runner macros. Unfortunately, I don't think this will be easy.

 

Unfortunately, the original flat file is located on a highly secured Linux server that I don't have, and can't get, access to.

 

Two part question:

1 - Any future plans to add fixed format files to other data sources? I know I can create formulas and trim per the fixed format spec, but that can be very tedious depending on the number of tables/columns per use case.

 

2 - Any future plans to bring in the runner macro or something similar as native in Alteryx?

 

My apologies for adding another question that will not be easily searchable.

 

Many Thanks,

David

Alteryx

Hi @David-UHG

 

I can't answer roadmap questions, but you can post both of those to our ideas forum, where the product owners look for things our users request. It looks like somebody has already asked for the CReW macros to be included in the product here. Feel free to chime in on that discussion.

 

Also, if you have desktop automation enabled or a server install, you don't have to rely on the runner macros to kick off workflows. You can use the events in the workflow itself to kick off another workflow. It's a little fussier than the runner macros, but it works without installing anything else. This article explains how to do that.

Henriette Haigh
Sr. Customer Support Engineer, Alteryx