Alteryx Designer Desktop Knowledge Base

WayneWooldridge · ‎02-19-2016

How to sort In-Database

Suppose you're using the In-DB tools and at some point you want to sort your data... in database. You notice there isn't a Sort In-DB tool available. Do you need to stream your data out, sort your data and then stream your data back in to the database? You could, however you don't have to. You can get the same results using the Sample In-DB tool .

Prerequisites

Alteryx Designer
- Versions All
In-Database
- Sample In-DB Tool

Procedure

In the example below, we're connecting to a Teradata database and reading a retail customer file containing 1500 records but the process is database independent. The Sample In-DB configuration is the important piece to the solution.

Select 'Percent' in the dropdown and sample 100%.
Check the 'Sample records based on order' checkbox
Select the field(s) want to sort and what order you want the data sorted (Ascending/Descending).

Please note: If you decide to stream your data out as some point, you have the option of sorting your data in the Data Stream Out tool as well. See below:

Next, Attach a Browse In-DB tool to your Sample In-DB, set the 'Browse first N records' to 0. This will display all records in the order you specified. At the time of this writing, any value other than 0 may not display results in the proper sort order (solution pending).

The attached sample workflow is an example showing the tools used to apply sorting. Please note you need to configure the In-DB Connect tool to your specific database environment and set field and other parameters to requirements.

Additional Resources

dataMack · ‎02-19-2016

This is a nice workaround that will certainly save processing time.

jennyaster93 · ‎07-05-2017

Hi,

When I use sample 100% to order my table, the in-db browse tool was not showing ordered table after sorting. Do you know why it happens like that?

davidhenington · ‎12-15-2017

did the ability to sample in-db by % of records go away? I only see number as an option.

Or am I missing something? It is Friday..

davidhenington · ‎12-15-2017

Seems likely that this is because i'm using Greenplum and either it's just not possible, or it got missed in dev.

JessicaS · ‎12-28-2017

Hi @davidhenington,

Can you kindly send us an email at support@alteryx.com so we can get a case created for you and work with you to investigate?

Thanks,

bryanbumgardner · ‎03-26-2018

Yeah, jumping in to bump this. It appears that the percent option has been taken away in version 11.8.

Please bring this back!

HenrietteH · ‎03-27-2018

Hi @bryanbumgardner

I can see the option in 2018.1:

3-27-2018 10-06-35 AM.png

Can you please open a case with support by emailing support@alteryx.com so we can work with you to investigate? Please include details on the database you are connecting to and version of Alteryx.

Thank you!

shoeb · ‎08-21-2018

I am using Alteryx Server version 2018.2. I do not see the ability to sample in-db by % of records

HenrietteH · ‎08-21-2018

@shoeb

Not all databases support sampling by % of records which is why it is not always available for the in-db tools.

Thomas_simonet · ‎12-05-2018

@HenrietteH I'm using Alteryx 10.6 and connecting to Hive, and I don't see the percentage option. Is it normal ? If it's not supported by Hive, what other way is there to sort results in-db ?

HenrietteH · ‎12-05-2018

@Thomas_simonet

As mentioned above, sorting by percent is not available for all databases.

You can still use the Sample In-DB tool to sort but only for a sample based on a total # of records to sample. You can also use the Data Stream Out tool to sort data before streaming it out of the database.

davidhenington · ‎04-24-2019

So the need to sort in-db brought me back to this thread, which reminds me that not only can you not sample by % in-db with greenplum, bulk load is also not available.

It kinda feels like the easiest path was sought in delivering in-db for greenplum. We didn't get the "full" in-db experience. That's disappointing.

In this use case, if you want to use this workaround and you know the record count, you can just choose more records in the sample than exist in the data set.

Gualigee · ‎10-30-2020

搜狗截图20201030064030.png

we can use stream data out tool to sort, right ?

Alteryx Designer Desktop Knowledge Base

In-Database Sorting

How to sort In-Database

Prerequisites

Procedure

Additional Resources