Converting parquet data file to csv

knobsdog
8 - Asteroid

I have a workflow that currently runs from a csv file, but the incoming file is set to change to a parquet file in a month or so. Is there a way to take that parquet file and convert it to a csv file? I have read through the "Will It Alteryx" article about using parquet files in a Hadoop connection, but this is different because the data is being sent to us as a file; we won't actually be connecting to Hive/Hadoop at all. I appreciate you guys looking at this question.

Emil_Kos
17 - Castor
knobsdog
8 - Asteroid

Not sure why my reply didn't stick, but I'll send it again. I've looked through these articles, but they don't address what I'm looking for. I am not connecting to Hadoop/Hive and pulling down parquet files. I have a vendor sending me data in parquet format via email, just as they might email me a csv file, and I need to convert it to csv. I'm not sure whether Alteryx can do that, but I'm hoping someone has used it for this before.

Emil_Kos
17 - Castor

Hi @knobsdog,

OK, I get it now.

This article describes a scenario like yours:

https://russellchristopher.com/alteryx-and-parquet-sure-why-not/

It links to a GitHub repository with a workflow that should help you:

https://github.com/russch/alteryx-parquet/blob/master/parquet_to_csv.yxmd

I haven't downloaded a workflow from GitHub in a long time, but if I remember correctly, the workflow should work once you download it.

Aguisande
15 - Aurora

Hi @knobsdog 

You could try pandas' read_parquet (https://pandas.pydata.org/docs/reference/api/pandas.read_parquet.html) to load the file and DataFrame.to_csv (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html) to write it back out as csv. They should work for your use case.
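
For what it's worth, here is a minimal sketch of that pandas approach. The function name parquet_to_csv and the file names in the usage note are made up for illustration, and it assumes a parquet engine (pyarrow or fastparquet) is installed alongside pandas, since read_parquet relies on one.

```python
import pandas as pd


def parquet_to_csv(parquet_path, csv_path):
    """Read a parquet file and write its contents to a csv file.

    Returns the number of rows written.
    """
    # read_parquet needs a parquet engine (pyarrow or fastparquet) installed.
    df = pd.read_parquet(parquet_path)
    # index=False keeps the pandas row index out of the csv output.
    df.to_csv(csv_path, index=False)
    return len(df)
```

You would call it with something like parquet_to_csv("vendor_data.parquet", "vendor_data.csv"), where both paths are placeholders for wherever you save the emailed file. The same two calls could probably also go inside a Python tool in the workflow, though I haven't tested that here.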
