We recently purchased Alteryx to add to our analytical stack, and we're loving the toolset. Our stack has Hortonworks Hadoop, Teradata Database, Teradata Aster, and Microsoft SQL Server as data sources, and we've been able to easily write data in and out of Hadoop, Teradata Database, and SQL Server, but we run into issues when attempting to write into Aster.
Creating a table to write into is problematic, as the ODBC driver supplied by Aster doesn't seem to give Alteryx any way to add the required DISTRIBUTE BY clause (either BY HASH or BY REPLICATION).
Writing into a new table throws an error, and appending into an existing table simply fails to run.
Has anyone managed to combine the two and write data into Aster from an Alteryx analytical workflow? The whitepapers indicate this should be possible, but from what I've seen there doesn't seem to be a way to do it.
I'm not presenting myself as someone who has much experience with Aster, but below are some comments from some Alteryx-internal emails that might help...
Loading data to Aster is a challenge because the uploaded data must be distributed across all of the worker nodes, so every CREATE TABLE statement needs a DISTRIBUTE BY HASH clause at the end.
Since this is not included in our standard ODBC output implementation, the loading process typically fails and does not load all the data. It will load some rows (very slowly) but ultimately fail with the standard Create Table approach.
What folks can do to get around this is issue a Pre SQL statement that creates the table in Aster with the DISTRIBUTE BY HASH clause, and then upload the data using Append Existing as the Output option. This is not ideal because you'll need to know the table structure in advance in order to write the Create Table statement.
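As a sketch, the Pre SQL statement might look like the following. The table and column names here are made up for illustration; the column list must match what the workflow actually writes, and the distribution column should be chosen for your own data.

```sql
-- Hypothetical Pre SQL: create the target table with the required
-- distribution clause before the workflow runs in Append Existing mode.
CREATE TABLE sales_staging (
    sale_id   BIGINT,
    store_id  INTEGER,
    amount    NUMERIC(12,2)
)
DISTRIBUTE BY HASH (sale_id);

-- Small lookup tables can instead be copied in full to every worker node:
-- CREATE TABLE store_dim ( ... ) DISTRIBUTE BY REPLICATION;
```

With the table created this way, the Output tool only appends rows and never has to generate its own CREATE TABLE.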
With all that said, the best (fastest) way to load data to Aster from Alteryx is via the Run Command tool, where we call ncluster_loader.exe and pass it some command arguments to load the data.
And in answer to a related question about reading data from another database and loading it into Aster...
Because Aster is not a traditional relational database platform, our ODBC connection isn't the greatest for writing out lots of data, so going directly from a relational DB to Aster can have issues. (This is not a 9.5 or 10.0 Alteryx issue; it's more about how we handle the Aster ODBC driver.)
I have attached a 'template' that shows how to write data to Aster using the ncluster_loader.exe utility that ships with Aster, called from within a Run Command tool.
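A rough sketch of what the Run Command step might invoke is below. Everything here is a placeholder: the host, database, credentials, table, and file names are made up, and the flag names are assumptions that should be checked against the ncluster_loader help for your Aster client version before use.

```shell
#!/bin/sh
# Sketch of the Run Command step, assuming the Aster client tools are on PATH.
# All values are placeholders; verify flag names against your version of
# ncluster_loader before running this against a real cluster.
ASTER_HOST="10.0.0.10"        # queen node address (placeholder)
ASTER_DB="beehive"            # target database (placeholder)
TARGET_TABLE="sales_staging"  # must already exist with a DISTRIBUTE BY clause
DATA_FILE="C:/Temp/sales.csv" # file written by the upstream Alteryx workflow

# Assemble the command first so it can be logged and inspected before running.
# (-h host, -d database, -U user, -w password, -c CSV input: assumed flags.)
CMD="ncluster_loader -h $ASTER_HOST -d $ASTER_DB -U loader_user -w secret -c $TARGET_TABLE $DATA_FILE"
echo "$CMD"

# Uncomment to actually run the load:
# $CMD
```

Because the loader bypasses ODBC entirely and streams the file to the worker nodes itself, this route avoids the row-by-row slowness described above.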
One other idea that was posed was to write directly from SQL Server to a staging table in Teradata and then import from there (my understanding is that Teradata has functionality that allows the interaction between Teradata and Aster to happen faster).
I have attached the example workflow referenced above.