Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Write In-DB (Hive) taking too much time

JRamos
7 - Meteor

Hi all!

I'm writing data from Excel or an Alteryx database file to Hive. A few months ago it took 6-10 hours to write 3 million rows, but now it's much slower: after 23 hours only 15% of the data has been written. My disk space is available and I have optimized my workflow, since I only need the data in Hive. I also have to connect through a VPN with Kerberos authentication (it's my work computer).

 

Could you guide me on what I'm doing wrong, or what I can do to improve the write time?

 

Workflow:

JRamos_0-1677165137478.png

 

Hive ODBC Driver Advanced Options:

JRamos_1-1677165205202.png

 

Manage In-DB Connections:

JRamos_2-1677165352845.png

 

Thanks in advance!!!

4 REPLIES
simonaubert_bd
13 - Pulsar

Hello @JRamos 

 

Usually I have two In-DB connections for Hive:
1/ one using ODBC for classic use (when the data already lives In-DB)
2/ the other one writing to HDFS instead of ODBC (when the data comes from "in-memory", like an Excel import). This is the solution you should use here. Write time can be reduced by a factor of 5, 10, 20...
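To see why the HDFS write path is so much faster, here is a minimal sketch (not Alteryx-specific; it uses the stdlib `sqlite3` module as a stand-in database) of the difference between row-at-a-time inserts, which is roughly what a plain ODBC write does against HiveServer2, and a single bulk load, which is what dropping a file onto HDFS amounts to:

```python
# Illustration only: row-by-row inserts vs. one bulk load.
# The same principle is why an HDFS file write beats ODBC inserts into Hive.
import sqlite3
import time

rows = [(i, f"name_{i}") for i in range(100_000)]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (id INTEGER, name TEXT)")

# Row-at-a-time: one statement per record (what a plain ODBC insert does,
# with a network round trip per row on a real Hive connection).
t0 = time.perf_counter()
for r in rows:
    con.execute("INSERT INTO t VALUES (?, ?)", r)
con.commit()
slow = time.perf_counter() - t0

con.execute("DELETE FROM t")
con.commit()

# Batched: one call for the whole set (analogous to writing a single
# Parquet file to HDFS and letting Hive pick it up).
t0 = time.perf_counter()
con.executemany("INSERT INTO t VALUES (?, ?)", rows)
con.commit()
fast = time.perf_counter() - t0

print(f"row-by-row: {slow:.3f}s  batched: {fast:.3f}s")
```

On a real Hive connection the gap is far larger than in this local sketch, because every per-row insert also pays network, VPN, and Kerberos overhead.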

Best regards,

Simon

JRamos
7 - Meteor

Hi @simonaubert_bd,

 

Writing to HDFS like the one below? Is it correct to use the same connection string as the read (ODBC) one?

 

JRamos_0-1677167032907.png

 

simonaubert_bd
13 - Pulsar

@JRamos About the HDFS connection:

-no, it's not the same string at all. It looks more like

simonaubert_bd_0-1677167552260.png

 

Please contact your datalake admin for the exact path/configuration.

However, Alteryx should show a window like this when you choose HDFS Parquet and click the black arrow

 

image.png

-use Parquet, not Avro

-use two In-DB aliases, like HIVE_PROD_ODBC and HIVE_PROD_HDFS, since Alteryx does not distinguish between creating a table and inserting data from in-memory. See this idea: https://community.alteryx.com/t5/Alteryx-Designer-Ideas/In-DB-Connexion-windows-should-be-divided-in...
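As a rough picture of the two-alias setup (the alias names and write options below are illustrative only; the exact HDFS host, port, and path must come from your datalake admin):

```text
HIVE_PROD_ODBC   Read:  Hive ODBC DSN (Kerberos)
                 Write: same Hive ODBC DSN
                 -> use when the data already lives In-DB

HIVE_PROD_HDFS   Read:  same Hive ODBC DSN
                 Write: HDFS (Parquet) against your cluster's HDFS endpoint
                 -> use when streaming data in from Designer
                    (Excel, .yxdb, Data Stream In)
```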

 

JRamos
7 - Meteor

Thank you @simonaubert_bd, I have changed the write driver to use the exact HDFS path and now it's working!
