Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Knowledge Base

Definitive answers from Designer Desktop experts.

In-Database Sorting

WayneWooldridge
Alteryx Alumni (Retired)
Created

How to sort In-Database


Suppose you're using the In-DB tools and at some point you want to sort your data... in database. You notice there isn't a Sort In-DB tool available. Do you need to stream your data out, sort your data and then stream your data back in to the database? You could, however you don't have to. You can get the same results using the Sample In-DB tool .


Prerequisites

   

Procedure


In the example below, we're connecting to a Teradata database and reading a retail customer file containing 1500 records but the process is database independent. The Sample In-DB configuration is the important piece to the solution.

  1. Select 'Percent' in the dropdown and sample 100%.
  2. Check the 'Sample records based on order' checkbox
  3. Select the field(s) want to sort and what order you want the data sorted (Ascending/Descending).

image.png

 

  • Please note: If you decide to stream your data out as some point, you have the option of sorting your data in the Data Stream Out tool as well. See below:

image.png

  1. Next, Attach a Browse In-DB tool to your Sample In-DB, set the 'Browse first N records' to 0. This will display all records in the order you specified. At the time of this writing, any value other than 0 may not display results in the proper sort order (solution pending).

image.png


The attached sample workflow is an example showing the tools used to apply sorting. Please note you need to configure the In-DB Connect tool to your specific database environment and set field and other parameters to requirements.

 


Additional Resources

Attachments
Comments
dataMack
12 - Quasar
This is a nice workaround that will certainly save processing time.
jennyaster93
6 - Meteoroid

Hi,

 

When I use sample 100% to order my table, the in-db browse tool was not showing ordered table after sorting. Do you know why it happens like that?

davidhenington
10 - Fireball

did the ability to sample in-db by % of records go away? I only see number as an option. 

 

Or am I missing something? It is Friday..

davidhenington
10 - Fireball

 Seems likely that this is because i'm using Greenplum and either it's just not possible, or it got missed in dev. 

JessicaS
Alteryx Alumni (Retired)

Hi @davidhenington,

 

Can you kindly send us an email at support@alteryx.com so we can get a case created for you and work with you to investigate?

 

Thanks,

bryanbumgardner
8 - Asteroid

Yeah, jumping in to bump this. It appears that the percent option has been taken away in version 11.8. 

 

Please bring this back! 

HenrietteH
Alteryx
Alteryx

Hi @bryanbumgardner

I can see the option in 2018.1: 

3-27-2018 10-06-35 AM.png

 

Can you please open a case with support by emailing support@alteryx.com so we can work with you to investigate? Please include details on the database you are connecting to and version of Alteryx. 

 

Thank you! 

shoeb
5 - Atom

I am using Alteryx Server version 2018.2. I do not see the ability to sample in-db by % of records

 

 

HenrietteH
Alteryx
Alteryx

@shoeb

 

Not all databases support sampling by % of records which is why it is not always available for the in-db tools. 

Thomas_simonet
7 - Meteor
@HenrietteH I'm using Alteryx 10.6 and connecting to Hive, and I don't see the percentage option. Is it normal ? If it's not supported by Hive, what other way is there to sort results in-db ?
HenrietteH
Alteryx
Alteryx

@Thomas_simonet

 

As mentioned above, sorting by percent is not available for all databases. 

 

You can still use the Sample In-DB tool to sort but only for a sample based on a total # of records to sample. You can also use the Data Stream Out tool to sort data before streaming it out of the database. 

 

davidhenington
10 - Fireball

So the need to sort in-db brought me back to this thread, which reminds me that not only can you not sample by % in-db with greenplum, bulk load is also not available. 

 

It kinda feels like the easiest path was sought in delivering in-db for greenplum. We didn't get the "full" in-db experience. That's disappointing. 

 

In this use case, if you want to use this workaround and you know the record count, you can just choose more records in the sample than exist in the data set. 

Gualigee
8 - Asteroid

搜狗截图20201030064030.png

we can use stream data out tool to sort, right ?