
Alteryx Designer Ideas

Share your Designer product ideas - we're listening!

Add Parquet data format as input & output

Please add the Parquet data format (https://parquet.apache.org/) as a read/write option for Alteryx.

 

Apache Parquet is a columnar storage format available to any project in the Hadoop ecosystem, regardless of the choice of data processing framework, data model or programming language.

 

Thank you.

 

Regards,

Cristian.

12 Comments
Alteryx Partner

The Parquet format is becoming more popular in the Hadoop world by the day. The ability to read and write this format on NFS and HDFS storage would add a lot of value to the product.

 

Thanks,

Sandeep.

Meteoroid

Hi Guys.

 

Any progress since 2015?

 

You would really benefit from supporting Parquet for BDE.

Nebula

Hey all,

Not sure about Parquet directly, but we have successfully tested using a Kudu database (which is also a columnar store in the Apache stack) and also using Spark SQL. That may give you a route into Parquet?

Hi, we are just now evaluating Alteryx, and I was curious how to add Parquet as a file input/output format.

 

Thanks

Alteryx Alumni (Retired)

Hi @Rajabhathor,

 

Data in Parquet format can be stored in Hive tables and accessed from Alteryx Designer via the Hive ODBC driver.


Create a table in Hive with "STORED AS PARQUET" (Hive 0.13 and later).
Alteryx can read and write data from these tables with the Hive ODBC driver.

Check the create table syntax in this article


For files already stored in the Parquet format in HDFS, use "LOAD DATA" to load the data from the HDFS file into a Hive table.


To write the results of an Alteryx workflow back to a Hive table in the Parquet format, set "hive.default.fileformat=PARQUET" in the Server Side Properties of the ODBC driver configuration.
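
As a rough illustration of the three steps above, here is a small Python sketch that only builds the HiveQL text you would run in Hive and the key/value pair you would enter under Server Side Properties. The table name, column names, and HDFS path are hypothetical placeholders, and the helper functions are mine, not part of Alteryx or of the Hive ODBC driver. It does not connect to anything.

```python
# Sketch only: builds HiveQL strings for the steps described above.
# The table, columns, and HDFS path are hypothetical placeholders.

def create_parquet_table(table, columns):
    """HiveQL for a Parquet-backed table (STORED AS PARQUET needs Hive 0.13+)."""
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return f"CREATE TABLE {table} ({cols}) STORED AS PARQUET"


def load_parquet_files(hdfs_path, table):
    """HiveQL that moves Parquet files already in HDFS into the table."""
    return f"LOAD DATA INPATH '{hdfs_path}' INTO TABLE {table}"


def server_side_property():
    """Key/value pair to enter in the ODBC driver's Server Side Properties."""
    return ("hive.default.fileformat", "PARQUET")


if __name__ == "__main__":
    # Hypothetical example table and location.
    print(create_parquet_table("sales", [("id", "INT"), ("amount", "DOUBLE")]) + ";")
    print(load_parquet_files("/data/sales_parquet", "sales") + ";")
    key, value = server_side_property()
    print(f"{key}={value}")
```

Note that LOAD DATA only moves the files into the table's location without converting them, so the files have to be Parquet already and match the table's columns.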

 

Hope these help.

Meteoroid

Thank you, Durga S.

Unfortunately, this is a very weak suggestion. ODBC can handle simple things, but what we need is to upload a file of about 50 GB in a compressed columnar storage format.

Alteryx Partner

Hi all, 

 

Do you have any updates on this subject? We are looking forward to being able to read and write the Parquet data format easily!

 

 

Alteryx
Status changed to: Not Planned

Hi,

 

Thanks for the idea. Other than the ODBC option mentioned by Durga, we don't have plans to add Parquet support, as our engine is not optimized to handle columnar data at this time.

 

Best,

Alex

Comet

"as our engine is not optimized to handle columnar data at this time"

 

https://github.com/elastacloud/parquet-dotnet

Runs on all flavors of Windows, Linux, and mobile devices (iOS, Android) via Xamarin

 

 

Atom

@DurgaS,

 

We tried setting up the server side properties for Parquet, but somehow it's not working: Alteryx is still creating the table in text format and not Parquet. We are using Hive version 1.2. I tried writing to tables both ways (through In-DB and through the Output tool).

 

Can you suggest whether anything else needs to change?