Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!
The Product Idea boards have gotten an update to better integrate them within our Product team's idea cycle! However this update does have a few unique behaviors, if you have any questions about them check out our FAQ.

Alteryx Designer Desktop Ideas

Share your Designer Desktop product ideas - we're listening!
Submitting an Idea?

Be sure to review our Idea Submission Guidelines for more information!

Submission Guidelines

In-database and output tool : Partition support

Hello,

According to wikipedia :

 

A partition is a division of a logical database or its constituent elements into distinct independent parts. Database partitioning is normally done for manageability, performance or availability reasons, or for load balancing. It is popular in distributed database management systems, where each partition may be spread over multiple nodes, with users at the node performing local transactions on the partition. This increases performance for sites that have regular transactions involving certain views of data, whilst maintaining availability and security. 

 

 

Well, basically, you split your table in several parts, according to a field. it's very useful in term of performance when your workflows are in delta or when all your queries are based on a date. (e.g. : my table helps me to follow my sales month by month, I partition my table by month).

So the idea is to support that in Alteryx, it will add a good value, especially in In-DB workflows.

Best regards,

Simon

4 Comments
adriansilk
6 - Meteoroid

Simon, SQL table partitioning is an architectural principal - I'd buy one of your DBAs a coffee in exchange for an explanation!

 

Partitioning is designed into the physical design of the data table (so that) different tranches of data can exist potentially even in different file systems. The In-Database tools in Alteryx exist for blending, filtering etc data on the server, and they should never need or be able to physically change the underlying storage of the data tables.

 

i_love_databases
8 - Asteroid

Hello @adriansilk ,

 

For your information, on 90% of my Alteryx projects, I create a table on a database (oracle, hive, vertica...), either in in-memory or in-db. Then, I query this "front" table with tools such as Tableau.

 

When you use in-db category or worflows with several steps, you have also temporary tables or intermediate level tables.

On all these use cases, partitioning would benefit either an other alteryx workflow either a data visualization query.

So, of course, the idea is to support partitionning when creating a table, as an option.

Best regards,

 

Simon

AlteryxCommunityTeam
Alteryx Community Team
Alteryx Community Team
Status changed to: Accepting Votes
 
JMB001
8 - Asteroid

Upvote!

 

We use BigQuery a lot for pulling and pushing data in workflows, and every instance with Write In-DB and Output tool using ODBC (Simba Driver) with Overwrite mode will always overwrite the partitioning and clustering set-up in the table in BigQuery and thus remove it. This behavior makes sense since you are essentially pushing a DDL statement into BQ and re-creating the table. 

 

I've tried injecting this back in using Post-SQL in the Output tool but the latest version of the Simba Driver (2.5.2) always throws a "Operation is not allowed in this transaction state" even on the simplest of workflows. This is a separate problem which needs to be addressed because it seems the Google pushed a driver upgrade without consulting anyone. (Typical?)

 

Since the upgraded version of Alteryx has the Bulk (read: big data) feature for BigQuery, it would seem appropriate to incorporate Partitioning and Clustering as features of the Output tool.

 

Thanks,

Joe