Alteryx Designer Ideas


Hive: how to get the metadata faster

As you may know, querying Hive for its metadata is currently very slow in Alteryx.

 

A first improvement (at least in the Visual Query Builder) has been proposed here:

Smartest VQB

 

But the real issue for Hive is the way Alteryx queries the metadata: it issues a "SHOW TABLES" query for every database. On our cluster, that means more than 400 queries that each take about 0.5 seconds, so the user has to wait about 4 minutes.
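To make the cost concrete, here is a minimal Python sketch of the per-database pattern described above. It assumes a DB-API style cursor obtained from some Hive client library; the function names `list_tables_per_database` and `estimated_wait_seconds` are hypothetical, and this is an illustration of the behaviour, not Alteryx's actual code.

```python
def list_tables_per_database(cursor):
    """Collect table names with one SHOW TABLES round trip per database.

    `cursor` is assumed to be a DB-API style cursor connected to Hive.
    """
    cursor.execute("SHOW DATABASES")
    databases = [row[0] for row in cursor.fetchall()]

    tables = {}
    for db in databases:
        # One extra query per database: N databases means N + 1 round trips.
        cursor.execute(f"SHOW TABLES IN `{db}`")
        tables[db] = [row[0] for row in cursor.fetchall()]
    return tables


def estimated_wait_seconds(query_count, seconds_per_query):
    """Rough lower bound on the wait when the queries run one after another."""
    return query_count * seconds_per_query
```

With the figures from this post, `estimated_wait_seconds(400, 0.5)` gives 200 seconds, a little over 3 minutes, which is in line with the roughly 4 minutes observed for more than 400 queries.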

A solution: use a Java API, if one is available, to query the Hive metastore directly (this could be another tab in the In-Database configuration). Our cluster admin has an example of a Thrift API in Java that we can give you.

Result: 2 seconds for 38,700 tables in more than 500 databases!

2 Comments
simonaubert_bd
10 - Fireball

Update:
According to these pages:
https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.4/using-hiveql/content/hive_query_information_sc...

 

https://www.adaltas.com/fr/2019/07/25/hive-3-fonctionnalites-conseils-astuces/

 

In Hive 3.x, we can query the information_schema database to retrieve the metadata quickly, instead of running SHOW TABLES.
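As a rough illustration of that single-query approach, here is a minimal Python sketch. It assumes that information_schema is enabled on the cluster, that its `tables` view exposes `table_schema` and `table_name` columns (worth checking against your Hive version), and that a DB-API style cursor is available; the function name `list_tables_single_query` is hypothetical.

```python
from collections import defaultdict

# Assumed layout: information_schema.tables with table_schema / table_name.
METADATA_QUERY = (
    "SELECT table_schema, table_name "
    "FROM information_schema.tables "
    "ORDER BY table_schema, table_name"
)


def list_tables_single_query(cursor):
    """Fetch every table name in one round trip and group it by database."""
    cursor.execute(METADATA_QUERY)
    tables = defaultdict(list)
    for schema, table in cursor.fetchall():
        tables[schema].append(table)
    return dict(tables)
```

The point of the design is that the number of queries no longer grows with the number of databases: one query returns the whole list, and the grouping happens on the client side.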

 

Does Alteryx plan to take that into account?

steven4320555
8 - Asteroid

Just came across this great idea by @dataprep and the informative update by @simonaubert_bd!

Are there any updates on this?