
Alteryx Designer Desktop Ideas


In-Database Sort Tool

When working with In-Database (In-DB) workflows, there is a need to view sorted data. This sorting could be done in one of two ways: in a Browse tool, or as a stand-alone Sort tool. Either would address the need. Without such a tool, the only way to sort the data is to "Data Stream Out" and then view the data in Alteryx. However, this defeats the purpose of the In-DB toolkit, which is to keep your data in the database and process it with the database engine. Streaming out big data just to add a sort is not efficient.

 

Granted, In-DB processing doesn't care whether data is sorted or not. However, when trying to find extreme values after an aggregation, or to check something as simple as whether null values are present in a field, a sort becomes extremely useful, and a necessary tool for human consumption of data (regardless of the database's processing needs).

 

Thanks very much for hearing my idea!

9 Comments
Atabarezz
13 - Pulsar

Sorting is probably the most resource-consuming task and the most frequently used of all heavy tasks... If the data is huge, the heavy sorting operation must be handled by the remote DB machine.

 

Though there is a handy capability: you can use the Sample tool, select 100% of the records, and sort at the same time...

It doesn't sound like the most user-friendly Alteryx tool, but there you go.

 

Here is a community post for the workaround: http://community.alteryx.com/t5/Alteryx-Knowledge-Base/In-Database-Sorting/ta-p/13935

 

Best 

ARich
Alteryx Alumni (Retired)
Status changed to: Not Planned

Hi @zdavis,

 

Thanks for the idea. At this time, since there's a simple workaround via the Sample In-DB tool or Data Stream Out tool, we're not considering adding a separate sort tool to the In-DB tools.

 

Best,

Alex

bradley_slaughter
8 - Asteroid

@ARich  The work-around you reference doesn't seem to be an option for IBM Netezza users.  I have always seen the "percentage" option on the sample in-db tool.  When I try to pass data from Netezza into that tool, it removes the Percentage option.

 

[Attached screenshot: Capture.PNG]

Atabarezz
13 - Pulsar

From time to time SQL Server refuses to sort data too...

davidhenington
10 - Fireball

@ARich  sort via stream out is not really an option because you're no longer in-db. 

 

As has been established, neither is using sample, since sample by % is not supported in all in-db database types. 

 

The biggest thing Alteryx could do right now is bring all in-db platforms to parity! 

bradley_slaughter
8 - Asteroid

One thing I've learned for sorting is that you can use the IN-DB Formula tool to sort if you use ROW_NUMBER (or the comparable formula for your database).  You can use that formula tool to create a unique row number for each record then order by the field you want.  

 

ROW_NUMBER() OVER (ORDER BY field1 DESC)
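
For anyone who wants to try the ROW_NUMBER idea outside Alteryx, here is a minimal sketch that runs it against an in-memory SQLite database through Python's sqlite3 module. The table name sales and the columns region and field1 are hypothetical, the syntax may differ on your database, and SQLite 3.25 or later is assumed for window functions. The column name is left unquoted because standard SQL treats a single-quoted 'field1' as a string literal rather than a column.

```python
import sqlite3


def rank_rows(rows):
    """Return (region, field1, row_num) tuples numbered by field1, descending."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table standing in for the in-database source.
    conn.execute("CREATE TABLE sales (region TEXT, field1 INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    query = """
        SELECT region,
               field1,
               ROW_NUMBER() OVER (ORDER BY field1 DESC) AS row_num
        FROM sales
        ORDER BY row_num
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


ranked = rank_rows([("east", 10), ("west", 30), ("north", 20)])
print(ranked)
```

The row_num column records each record's position in the sort. The outer ORDER BY is only there so the sketch prints rows in that order; the window function by itself does not oblige the database to return them sorted.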

 

 

gabrielvilella
14 - Magnetar

@AlexRi sample by % is not supported in all in-db database types, therefore there is no way to sort in all scenarios. I am using Redshift and this is not supported for some reason. The reason you marked this idea as Not Planned is not valid. 

C3PO
Alteryx Alumni (Retired)

This is not ideal, but it looks like the number limit for the in-db sample tool is 999,999,999...this should work for many user scenarios...
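
As a rough illustration of why a very large row limit behaves like a full sort, the sketch below runs ORDER BY with a 999,999,999-row limit on an in-memory SQLite table from Python. It assumes the Sample In-DB tool's first-N option ends up as an ORDER BY plus a row limit in the generated SQL, which nothing in this thread confirms; the table t and column amount are hypothetical.

```python
import sqlite3

# The limit mentioned above; any table smaller than this comes back whole.
ROW_LIMIT = 999_999_999


def sort_with_limit(values, limit=ROW_LIMIT):
    """Return values sorted descending, keeping at most `limit` rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical single-column table standing in for the in-database source.
    conn.execute("CREATE TABLE t (amount INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(v,) for v in values])
    rows = conn.execute(
        "SELECT amount FROM t ORDER BY amount DESC LIMIT ?", (limit,)
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


all_rows = sort_with_limit([5, 12, 7])
top_two = sort_with_limit([5, 12, 7], limit=2)
print(all_rows, top_two)
```

With the default limit every row is returned in sorted order; a table with more rows than the limit would be truncated, which is the "not ideal" part.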

kaianderson
8 - Asteroid

I agree. I'm working with a Hadoop database and I cannot use the Sample tool to sort. Sorting via the Data Stream Out tool comes too late in the process to improve efficiency. I agree that this needs to be reopened.