on 02-19-2016 08:23 AM - edited on 08-23-2021 11:29 AM by miteshnarottam
In the example below, we're connecting to a Teradata database and reading a retail customer file containing 1500 records, but the process is database independent. The Sample In-DB configuration is the key piece of the solution.
The attached sample workflow shows the tools used to apply sorting. Please note that you need to configure the In-DB Connect tool for your specific database environment and set the fields and other parameters to match your requirements.
Hi,
When I use a 100% sample to order my table, the In-DB Browse tool does not show the table in sorted order. Do you know why that happens?
Did the ability to sample in-DB by % of records go away? I only see number as an option.
Or am I missing something? It is Friday..
Seems likely that this is because I'm using Greenplum, and either it's just not possible or it got missed in dev.
Hi @davidhenington,
Can you kindly send us an email at support@alteryx.com so we can get a case created for you and work with you to investigate?
Thanks,
Yeah, jumping in to bump this. It appears that the percent option has been taken away in version 11.8.
Please bring this back!
I can see the option in 2018.1:
Can you please open a case with support by emailing support@alteryx.com so we can work with you to investigate? Please include details on the database you are connecting to and version of Alteryx.
Thank you!
I am using Alteryx Server version 2018.2. I do not see the ability to sample in-DB by % of records.
Not all databases support sampling by % of records, which is why it is not always available in the In-DB tools.
As mentioned above, sampling by percent, and therefore sorting with a 100% sample, is not available for all databases.
You can still use the Sample In-DB tool to sort, but only with a sample based on a fixed number of records. You can also use the Data Stream Out tool to sort data before streaming it out of the database.
So the need to sort in-DB brought me back to this thread, which reminds me that not only can you not sample by % in-DB with Greenplum, bulk load is also not available.
It kinda feels like the easiest path was sought in delivering in-DB for Greenplum. We didn't get the "full" in-DB experience. That's disappointing.
In this use case, if you want to use this workaround and you know the record count, you can just choose more records in the sample than exist in the data set.
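To illustrate why that workaround returns the whole table in sorted order, here is a minimal sketch in plain SQL terms. It is not the SQL that Alteryx generates (the thread does not show that), and it uses an in-memory SQLite database rather than Teradata or Greenplum. The `customers` table, its columns, and the `sorted_sample` helper are all hypothetical names made up for the example. The idea is simply that an ordered "first N records" query with N larger than the row count returns every row, sorted.

```python
import sqlite3


def sorted_sample(conn, table, order_by, n_records):
    """Return up to n_records rows from table, ordered by order_by.

    Hypothetical helper for illustration. Identifiers cannot be bound as
    parameters, so table and order_by must come from trusted input.
    """
    query = f'SELECT * FROM "{table}" ORDER BY "{order_by}" LIMIT ?'
    return conn.execute(query, (n_records,)).fetchall()


# Stand-in for the retail customer table (1500 records), loaded in
# reverse order so that the effect of the sort is visible.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, spend REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(i, float(i % 97)) for i in range(1500, 0, -1)],
)

# Ask for more records than exist: all 1500 rows come back, sorted.
rows = sorted_sample(conn, "customers", "customer_id", 10_000)
print(len(rows), rows[0][0], rows[-1][0])  # prints: 1500 1 1500
```

`LIMIT` is the SQLite/PostgreSQL spelling; other databases express "first N rows" differently (Teradata uses `TOP`, for example), so treat the query text as an assumption to adapt rather than something to copy as-is.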
We can use the Data Stream Out tool to sort, right?