Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!
The Product Idea boards have gotten an update to better integrate them within our Product team's idea cycle! However this update does have a few unique behaviors, if you have any questions about them check out our FAQ.

Alteryx Designer Desktop Ideas

Share your Designer Desktop product ideas - we're listening!
Submitting an Idea?

Be sure to review our Idea Submission Guidelines for more information!

Submission Guidelines

Automatic Pushdown Optimization

I find it a little strange that we need to utilize a different set of tools (the In-DB tools) in order to achieve push-down optimization. This is not the case, for instance (as far as I know), in SPSS, perhaps others.

 

My request is to automatically push logic to the database layer if and when it's possible to do so.

 

In order to maintain operation similar to existing, it could be implemented as an overall workflow setting (on or off) such that leaving it turned off would result in tools continuing to behave as they do today.  Turning it on simply enacts a process of combining as many tools as possible into SQL that is pushed to the database server, and assumes whatever permissions are necessary to make that happen; (e.g. ability to write temp tables or etc...)

 

Thanks!

 

6 Comments
SeanAdams
17 - Castor
17 - Castor

Hey John - so what you're suggesting is that you pre-optimize the SQL based on the workflow?

 

In other words - if I have a Select * from table; followed by a select tool that throws away all of the fields; followed by a filter that throws away 90% of the rows - the question is "Can we parse the workflow to spot this, and then query the data more intelligently by pushing these filters and column selects down into the query".

 

I think that this is a great suggestion, and would allow people to get far better performance out of their flows!

 

I've suggested something similar (some related links below) to build the parser which reverse compiles the Alteryx flow to find which fields are never used and suggests removing them from the flow (see first link below) - with your suggestion then pushing this down to the server this would be very powerful.

 

Related ideas that may be worth supporting too:

 

 

JohnJPS
15 - Aurora

Hi Sean,

I guess I hadn't thought much about the implementation details, but yes: regardless of the initial query in the input tool; all preceeding tools simply becomes subqeries for the current tool; thus a simple workflow with INPUT TOOL > SELECT > FILTER > SUMMARY > SORT > TOP 100 might look like this behind the scenes:

SELECT top 100 * FROM (
    SELECT * FROM (
        SELECT sum(d.col1) as sumd1, d.col2, d.col3 FROM (
            SELECT * FROM (
                SELECT b.col1, b.col2, b.col3 FROM (
                    SELECT * FROM (
                        "INPUT TOOL QUERY"
                    ) as a
                ) as b
            ) as c
            WHERE "FILTER TOOL OPTIONS"
        ) as d
        GROUP BY d.col2, d.col3
    ) as e
    ORDER BY e.sumd1, e.col2, e.col3
) as f

It wouldn't be trivial to implement given that every database will have different syntax... but it can be done with existing In-DB tools, so it would be nice if it were simply "built-in".

SeanAdams
17 - Castor
17 - Castor

it's interesting @JohnJPS - if you unpick how the inDB connector creates queries - it does exactly this.   It doesn't nest it, but creates temporary aliases to do the same thing (same outcome, it just reads more linearly, and probably is easier for the SQL optimizer to work through).

 

I'm thinking something more along the lines of a code-scanner.   We built one of these for VB6 once - where it would check for all the variables declared, and then scan backwards from the end of any given code-scope to see if that variable was used.    you could do the same thing here  if a variable is untouched between read, and a filter which is 20 steps down the flow - you could then back-propagate the filter backwards.

 

@Ned - I presume that you and the team have looked at this kind of back-propagation as part of your ongoing improvement work - does this idea make functional sense?   If it's worth talking this through in more detail, would be happy to spend the time.

 

Cheers

Sean

JohnJPS
15 - Aurora

@SeanAdams,

Cool stuff. My example is based on what I see Hibernate/NHibernate doing in terms of auto-generated query strings.

 

Your backprop optimizer is interesting beyond push-down: reducing unnecessary data would help performance, no doubt... I'm tempted to think Alteryx already does this: I seem to recall observing filters reducing row counts earlier than expected... not sure though (and posting from my phone so can't recreate here bow to validate my memory).

 

Fun stuff.  Cheers,

John

Community_Admin
Alteryx
Alteryx
Status changed to: Inactive
 
Community_Admin
Alteryx
Alteryx

The status of this idea has been changed to 'Inactive'. This status indicates that:

 

1. The idea has not had activity in the form of likes or comments in over a year.

2. The idea has not reached ten likes.

3. The idea is still in the 'New Idea' status. 

 

However, this doesn't mean your idea won't be implemented! The Community can still like and comment on this idea. With enough renewed interest, this idea can be brought back into the 'New Idea' status. 

 

Thank you for contributing to the Alteryx Community and the Alteryx Product Idea Boards!