
Alteryx Designer Desktop Ideas

Hive: how to get the metadata faster

As you may know, querying Hive for its metadata is currently very slow in Alteryx.

A first improvement (at least in the Visual Query Builder) has been proposed here:

Smartest VQB

But the real issue for Hive is the way Alteryx queries the metadata: it runs a "SHOW TABLES" query for every database. On our cluster, that means more than 400 queries that each take about 0.5 seconds, so the user has to wait about 4 minutes.
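To make the cost of that pattern concrete, here is a minimal sketch of the arithmetic, using only the figures quoted above (400 queries, 0.5 seconds each). The function name is just an illustration, and the model assumes the queries run strictly one after another.

```python
def sequential_metadata_seconds(query_count: int, seconds_per_query: float) -> float:
    """Total wait when the metadata queries run one after another."""
    return query_count * seconds_per_query


# Figures from the post: more than 400 queries at about 0.5 s each.
total = sequential_metadata_seconds(400, 0.5)
print(f"{total:.0f} s, about {total / 60:.1f} min")  # prints "200 s, about 3.3 min"
```

Since 400 is a lower bound on the number of queries, this is roughly in line with the wait of about 4 minutes reported above, and it grows linearly with the number of databases.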

A solution: use a Java API, where one is available, to query the Hive metastore directly (this could be another tab in the In-Database configuration). Our cluster admin has an example of a Thrift API in Java that we can share with you.

Result: 2 seconds for 38,700 tables in more than 500 databases!

3 Comments
simonaubert_bd
13 - Pulsar

Update:
According to these pages:
https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.4/using-hiveql/content/hive_query_information_sc...

https://www.adaltas.com/fr/2019/07/25/hive-3-fonctionnalites-conseils-astuces/

in Hive 3.x, we can query information_schema to retrieve the metadata quickly instead of running SHOW TABLES.

Does Alteryx plan to take that into account?
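For illustration, the information_schema approach mentioned in this comment could look roughly like the sketch below: one query for the whole cluster instead of one SHOW TABLES per database. This is an assumption-laden sketch, not how Alteryx works today. It assumes Hive 3.x exposes information_schema.tables with table_schema and table_name columns, and that `cursor` is any Python DB-API 2.0 cursor connected to Hive; the function name is hypothetical.

```python
from collections import defaultdict

# Assumption: Hive 3.x exposes information_schema.tables with
# table_schema and table_name columns.
TABLES_QUERY = "SELECT table_schema, table_name FROM information_schema.tables"


def fetch_tables_by_database(cursor):
    """Run a single query and group the table names by database.

    `cursor` is assumed to be a DB-API 2.0 cursor connected to Hive;
    no particular driver is implied.
    """
    cursor.execute(TABLES_QUERY)
    tables = defaultdict(list)
    for schema, table in cursor.fetchall():
        tables[schema].append(table)
    return dict(tables)
```

The result is a dictionary mapping each database name to its list of tables, built from a single round trip. Any DB-API driver for Hive (PyHive, for example) should be able to supply the cursor.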

steven4320555
8 - Asteroid

Just came across this great idea by @dataprep and the informative update by @simonaubert_bd!

Are there any updates on this?

AlteryxCommunityTeam
Status changed to: Accepting Votes