community
cancel
Showing results for 
Search instead for 
Did you mean: 

Alteryx Designer Knowledge Base

Definitive answers from Designer experts.
Community v19.9

The latest release includes several enhancements designed to improve your Community experience!

Learn More

In-Database Sorting

Alteryx
Alteryx
Created on

Suppose you're using the In-DB tools and at some point you want to sort your data... in database.  You notice there isn't a Sort In-DB tool available.  (NOTE: you're not alone... see http://community.alteryx.com/t5/Alteryx-Product-Ideas/In-Database-Sort-Tool/idi-p/13160).  Do you need to stream your data out, sort your data and then stream your data back in to the database?  You could.  But you don't have to.  You can get the same results using the Sample In-DB tool.  (Help regarding how to configure the Sample In-DB tool is available here: http://help.alteryx.com/current/index.htm#LockInSample.htm?Highlight=in-db%20sort).

 

In the example below, we're connecting to a Teradata database and reading a retail customer file containing 1500 records but the process is database independent.  The Sample In-DB configuration is the important piece to the solution.  Select 'Percent' in the dropdown and sample 100%.  Check the 'Sample records based on order' checkbox and select the field(s) want to sort and what order you want the data sorted (Ascending/Descending).

 

Sort In-DB Sample 6.png

 

If you decide to stream your data out as some point, you have the option of sorting your data in the Data Stream Out tool as well.

 

Sort In-DB Sample 7.png

 

If you attach a Browse In-DB tool to your Sample In-DB, set the 'Browse first N records' to 0.  This will display all records in the order you specified.  At the time of this writing, any value other than 0 may not display results in the proper sort order (solution pending). 

 

Sort In-DB Sample 8.png

 

 

The attached workflow will need to be configured to your specific database environment and logon credentials to work properly. 

Attachments
Comments
Quasar
Quasar
This is a nice workaround that will certainly save processing time.
Meteoroid

Hi,

 

When I use sample 100% to order my table, the in-db browse tool was not showing ordered table after sorting. Do you know why it happens like that?

did the ability to sample in-db by % of records go away? I only see number as an option. 

 

Or am I missing something? It is Friday..

 Seems likely that this is because i'm using Greenplum and either it's just not possible, or it got missed in dev. 

Moderator
Moderator

Hi @davidhenington,

 

Can you kindly send us an email at support@alteryx.com so we can get a case created for you and work with you to investigate?

 

Thanks,

Yeah, jumping in to bump this. It appears that the percent option has been taken away in version 11.8. 

 

Please bring this back! 

Alteryx
Alteryx

Hi @bryanbumgardner

I can see the option in 2018.1: 

3-27-2018 10-06-35 AM.png

 

Can you please open a case with support by emailing support@alteryx.com so we can work with you to investigate? Please include details on the database you are connecting to and version of Alteryx. 

 

Thank you! 

Atom

I am using Alteryx Server version 2018.2. I do not see the ability to sample in-db by % of records

 

 

Alteryx
Alteryx

@shoeb

 

Not all databases support sampling by % of records which is why it is not always available for the in-db tools. 

@HenrietteH I'm using Alteryx 10.6 and connecting to Hive, and I don't see the percentage option. Is it normal ? If it's not supported by Hive, what other way is there to sort results in-db ?
Alteryx
Alteryx

@Thomas_simonet

 

As mentioned above, sorting by percent is not available for all databases. 

 

You can still use the Sample In-DB tool to sort but only for a sample based on a total # of records to sample. You can also use the Data Stream Out tool to sort data before streaming it out of the database. 

 

So the need to sort in-db brought me back to this thread, which reminds me that not only can you not sample by % in-db with greenplum, bulk load is also not available. 

 

It kinda feels like the easiest path was sought in delivering in-db for greenplum. We didn't get the "full" in-db experience. That's disappointing. 

 

In this use case, if you want to use this workaround and you know the record count, you can just choose more records in the sample than exist in the data set.