I have connected my Alteryx Designer to a corporate Oracle database, I need to fetch datasets sometimes with more than 10 million records. The typical number is between 1 and 5 million. Nr. of columns is usually between 20 and 40.
I am testing now the connection, and it takes me around 50-60 minutes to read a dataset of 1.3 million records. This means that a larger set would not be available during a full working day if I started in the morning.
When I am using a competing similar software (KNIME), which uses the same driver (stored in a different folder of course), the same SQL command, the same sample took only 5 minutes.
Reading previous posts in the community, it seems this issue has never been really solved, more just different workarounds are being proposed. I've tried everything I could find here, but no real success.
Unfortunately this topic is a show stopper for me, unless I can get Alteryx to read around 10 million records per hour (or max 2h), I simply cannot use it in my work.
Any help would be appreciated, thanks in advance.
Solved! Go to Solution.
We had a POC last year comparing Powercenter informatica and Alteryx in combination with a Oracle db :-), feel free to send me a pm.
Greetings,
Seb
Jimmy,
I have a similar scenario to yours did you find a solution to this.
regards
Dear @Jimmy_Payyapilly ,
sorry for not getting back on this, but we have new colleagues in our team and we had to re-do this exercise a few times. Here's my learnings;
1. Install Oracle drivers from downloads.alteryx.com (it's called Simba after install)
2. Install Visual C++ Redistributable for Visual Studio 2015-2019 AND Oracle 19.25 client as instructed on this support page: Simba Oracle Driver gives Error 126
3. Search for ODBC data sources in your windows system
4. Select tab System DSN
5. Click once on Simba Oracle ODBC DSN, and click 'Configure'
6. Enter your connection details, test and click 'OK'.
7. In Alteryx Input-node, setup new connection, I used Oracle ODBC (and NOT the quick setup)
8. Enter 'Data Source Name' and select from the ODBC DSN -box the driver you just setup (should mostly be 'Simba Oracle ODBC DSN').
9. Build your workflow to test this, I ran it WITHOUT the AMP engine just to be sure. The performance was REALLY good, 2.5 million rows in 42 seconds. My colleague with a slightly better laptop got 34 seconds.
10. Somewhere in between always worth to restart the PC.
As a general note, setting up the ODBC -connection, it takes connection details directly from the driver, which is clearly the key here for the speed. If you setup the "quick setup", it will use connection details from the input node that you setup in Alteryx - which is much slower.
Have a nice day all. :-)
Carlos
Dear @drmonaali ,
please take a look at the last post in this thread, I hope it helps!
Best regards,
Carlos