We use the Pre-SQL statement of the Input Data tool to set connection parameters. Sadly, we cannot do that in an In-DB workflow. This would be a game-changing feature for us.
That's actually really good to know! Not being able to run Pre-SQL is increasingly a problem and is forcing us to consider other tools.
At our company, we have data in Hadoop queried via Hive or Impala. I use the In-DB tools for efficiency, but some of the tables in the Hadoop data schemas need to be REFRESHED by issuing a SQL command before the queries are run. Since the In-DB tools don't have a Pre-SQL option, I first use the Input Data tool to run the table refresh in Pre-SQL and then run the In-DB tools. This way, I don't have to intervene manually and can schedule the full workflow.
Use the Input Data tool (Pre-SQL is only available here at this time) just to run your Pre-SQL code. Then you can switch to the In-DB tools to run your queries. Depending on how complex your Pre-SQL code is, this might work. See figure below.
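For anyone scripting the refresh step described above, here is a minimal sketch of generating the Pre-SQL text. It assumes Impala's `REFRESH <table>` statement is the refresh command in question. The helper name `build_refresh_pre_sql` and the table names are hypothetical, and the output is only a string to paste into the Input Data tool's Pre-SQL box, not anything Alteryx-specific.

```python
def build_refresh_pre_sql(tables):
    """Return one REFRESH statement per table, one per line."""
    statements = []
    for table in tables:
        name = table.strip()
        if not name:
            # Skip blank entries instead of emitting an invalid statement.
            continue
        statements.append(f"REFRESH {name};")
    return "\n".join(statements)


# Hypothetical table names, for illustration only.
pre_sql = build_refresh_pre_sql(["sales.orders", "sales.customers"])
print(pre_sql)
```

Keeping the table list in one place means the scheduled workflow only needs that list updated when a new table has to be refreshed.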
Hello @KylieF, good to know, thanks for the update.
@mscuaycong I usually use Pre-SQL to pass SET instructions tied to the session, such as container size or queue in Hive. This won't work if I use an Input Data tool, because it will be two different sessions.
Thank you both. We have tried that workaround and it works for some things but, as @saubert pointed out, it doesn't work for everything. It is truly an issue that we don't have a Pre-SQL tool.
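To illustrate why the two-session problem matters, here is a small sketch using SQLite as a stand-in (this is an analogy, not Hive, and not how Alteryx connects). A session-scoped object created on one connection is simply not there on a second connection to the same database, which is the same reason a Hive `SET` issued from the Input Data tool would not carry over to the In-DB tools. All names in it are made up for the demo.

```python
import os
import sqlite3
import tempfile


def temp_table_visible(conn):
    """Return True if the session-scoped table exists on this connection."""
    try:
        conn.execute("SELECT * FROM session_settings")
        return True
    except sqlite3.OperationalError:
        # "no such table": the object belongs to another session.
        return False


db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

# Session A plays the role of the Input Data tool running Pre-SQL.
session_a = sqlite3.connect(db_path)
session_a.execute("CREATE TEMP TABLE session_settings (name TEXT, value TEXT)")

# Session B plays the role of the In-DB tools: same database, new session.
session_b = sqlite3.connect(db_path)

same_session = temp_table_visible(session_a)
other_session = temp_table_visible(session_b)
print(same_session, other_session)

session_a.close()
session_b.close()
```

The first check succeeds and the second fails, so anything that only lives in the session has to be issued on the same connection that runs the queries.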
Having Pre-SQL for In-DB would be especially useful for connections to, e.g., Snowflake, where you can use different warehouses to access different compute resources.
This directly affects how much you pay while working or developing.
Right now the only workaround I see here would be to have different connections set up and choose the one you need, but that's annoying if you have a lot of options...
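As a rough sketch of what a Pre-SQL option would allow instead of one connection per warehouse: a single connection plus a generated Snowflake `USE WAREHOUSE` statement. The warehouse names, the `WAREHOUSES` mapping and the helper `use_warehouse_pre_sql` are all hypothetical, and the identifier check only covers simple unquoted names.

```python
import re

# Hypothetical mapping from workload size to warehouse name.
WAREHOUSES = {
    "small": "DEV_XS_WH",
    "large": "PROD_L_WH",
}

# Simple unquoted identifiers only (letters, digits, underscore, dollar sign).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def use_warehouse_pre_sql(workload):
    """Return the statement that selects the warehouse for a workload."""
    name = WAREHOUSES[workload]
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unexpected warehouse name: {name!r}")
    return f"USE WAREHOUSE {name};"


print(use_warehouse_pre_sql("small"))
```

With something like this, choosing cheaper compute for development would be a one-line change rather than a separate connection to maintain.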
Functionality like this is hugely beneficial, especially when working with large datasets or centrally managed databases. Is there a roadmap for Alteryx's In-DB capability? What are the priorities?
I guess this idea is less database-dependent and, once implemented, could benefit a lot of use cases.