Alteryx Connect Ideas

AndreyBaburov · ‎07-20-2020

Our Data Catalogue in Connect has about 2 million items (tables, views, columns).

I see the following issues:

1. We collect metadata from more than 10 DBMS. After each Metadata loader run, Alteryx Connect starts the load_alteryx_db script and processes all the staging area (DB_*) tables, not only the metadata set just extracted from a single DBMS. This leads to huge redundancy.
2. Following from the first issue: the one-by-one comparison of loaded metadata takes a lot of time in a real environment with 1-2 million items (an ordinary situation in a large bank). This comparison is executed once per loader run, so the redundancy grows with the number of DBMS servers.
3. All queries in this script that take a column or table name as a parameter (e.g. src.TABLE_NAME='${query_table_name}' AND src.COLUMN_NAME='${query_column_name}') are executed as many times as there are columns in the Data Catalogue (millions of times). This is very slow because of the sheer number of queries executed.

Can you optimize this process somehow?
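
To illustrate the kind of change I mean, here is a minimal sketch of a set-based comparison restricted to one source. It is not the actual load_alteryx_db logic: the table names STG_COLUMNS and CAT_COLUMNS, the SOURCE_ID column and the function names are hypothetical, and Python's sqlite3 is used only to keep the example self-contained. A single anti-join finds every staged column missing from the catalogue, instead of one parameterized lookup per column.

```python
import sqlite3


def find_new_columns(conn, source_id):
    """Return (table, column) pairs staged for one source but absent from the catalogue.

    One set-based anti-join replaces a separate lookup query per column,
    and the SOURCE_ID filter limits work to the DBMS that was just extracted.
    """
    sql = """
        SELECT s.TABLE_NAME, s.COLUMN_NAME
        FROM STG_COLUMNS AS s
        LEFT JOIN CAT_COLUMNS AS c
          ON c.SOURCE_ID = s.SOURCE_ID
         AND c.TABLE_NAME = s.TABLE_NAME
         AND c.COLUMN_NAME = s.COLUMN_NAME
        WHERE s.SOURCE_ID = ?
          AND c.COLUMN_NAME IS NULL
        ORDER BY s.TABLE_NAME, s.COLUMN_NAME
    """
    return conn.execute(sql, (source_id,)).fetchall()


def demo():
    """Build a tiny in-memory staging area and catalogue, then diff one source."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE STG_COLUMNS (SOURCE_ID TEXT, TABLE_NAME TEXT, COLUMN_NAME TEXT);
        CREATE TABLE CAT_COLUMNS (SOURCE_ID TEXT, TABLE_NAME TEXT, COLUMN_NAME TEXT);
    """)
    conn.executemany(
        "INSERT INTO STG_COLUMNS VALUES (?, ?, ?)",
        [
            ("oracle1", "ACCOUNTS", "ID"),
            ("oracle1", "ACCOUNTS", "BALANCE"),
            ("pg1", "CLIENTS", "ID"),  # staged by another source, must be ignored
        ],
    )
    conn.executemany(
        "INSERT INTO CAT_COLUMNS VALUES (?, ?, ?)",
        [("oracle1", "ACCOUNTS", "ID")],
    )
    return find_new_columns(conn, "oracle1")


if __name__ == "__main__":
    print(demo())
```

With an index on (SOURCE_ID, TABLE_NAME, COLUMN_NAME) in both tables, the database can resolve the comparison in one pass per source rather than in millions of round trips.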

KylieF · ‎09-03-2020

Thank you for your feedback and idea!

We’re currently working diligently to ensure all product ideas are reviewed and commented on by Alteryx when the necessary criteria are met. If you haven’t yet, check out our Submission Guidelines, which go over the idea boards in a bit more detail.


Metadata loading optimization