community
cancel
Showing results for 
Search instead for 
Did you mean: 

Alteryx Designer Ideas

Share your Designer product ideas - we're listening!

Make In-DB Queries much more efficient (dont nest everything or use Select *!)

I reported this to the support team but was told it was by design and to post here.

 

In-DB Inefficient SQL

I would like to report that the In-DB tools are generating horribly inefficient SQL code for simple operations.  It seems no matter what tools you use every statement is starting with a nested 'Select * From'.

 

Example Simple workflow:

  Support1.jpgSupport2.jpg

 

This is a simple Select and Group by but the SQL Generated is:

 

SELECT "ShipTo", "ShipTo_Name", SUM("ECM_3PL_OVERHEADS_Unit") AS "Sum_ECM_3PL_OVERHEADS_Unit"

FROM (SELECT * FROM "_SYS_BIC"."shell.app.gsap.FL000_LSC.FL002_CTS.INT.RPT/CA_CTS_RPT_MAIN_001") AS "a"

GROUP BY "ShipTo", "ShipTo_Name"

 

This is taking a very long time to execute:

 

Statement 'SELECT "ShipTo", "ShipTo_Name", SUM("ECM_3PL_OVERHEADS_Unit") AS "Sum_ECM_3PL_OVERHEADS_Unit" FROM ...'

successfully executed in 15.752 seconds  (server processing time: 15.699 seconds)

 

Whereas if I take the same query and remove the nested Select *:

 

SELECT "ShipTo", "ShipTo_Name", SUM("ECM_3PL_OVERHEADS_Unit") AS "Sum_ECM_3PL_OVERHEADS_Unit"

FROM "_SYS_BIC"."shell.app.gsap.FL000_LSC.FL002_CTS.INT.RPT/CA_CTS_RPT_MAIN_001" AS "a"

GROUP BY "ShipTo", "ShipTo_Name"

 

It is very quick:

 

Statement 'SELECT "ShipTo", "ShipTo_Name", SUM("ECM_3PL_OVERHEADS_Unit") AS "Sum_ECM_3PL_OVERHEADS_Unit" FROM ...'

successfully executed in 1.211 seconds  (server processing time: 1.157 seconds)

 

So Alteryx is generating queries up to x13 slower than they should be thereby defeating the point of using In-DB.  As you can imagine in a workflow where we have multiple Connect In-DB tools this is a really substantial amount of time.  Example used above is from SAP HANA DB has 1.9m rows and ~90 columns but we have much bigger tables/views than this.

 

If you look you will see its same behaviour for all In-DB tools where each tool creates another nested Select with its particular operator.

 

MY SUGGESTION:

So my suggestion is that Alteryx should combine the SQL of the first few tools and avoid using SELECT * completely unless no Select tools have been used.  So it should combine:

- Connect In-DB + Select

- Connect In-DB + Filter

- Connect In-DB + Summarise

 

Preferably it should combine/flatten everything up until the first join or union.  But Select + Filter are a must!

 

Note it seems some DB's can cope OK with un-nesting these big nested queries in the query plans for some Tables but normally not for Views.  But some cannot cope at all and so the In-DB tools cannot even be used to Browse 100 records (due to select *).

4 Comments

Totally agree. X13 the speed really defeats the object of moving to In-DB tools.

Alteryx Certified Partner
Alteryx Certified Partner

+1

Meteoroid

+1, please simplify the SQL generation. The purpose of Alteryx is to simplify the ETL process. The code needs to be optimized for a copy and paste into whatever ETL engine we use.

Alteryx
Alteryx

+1