community
cancel
Showing results for 
Search instead for 
Did you mean: 

Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Handling big data in alteryx

Hi There,

 

I am looking for extracting and loading 7.4 GB data with 4.5 M records and 124 fields, currently with input/in-DB tool it takes approximately 1 to 1.5 hrs. Can anyone tell me best approach to do that. Anyone tried the connectivity in python tool and doing it with python tool . Please guide.

Alteryx Certified Partner
Alteryx Certified Partner

What type of database are you loading to? Check to see if a bulk connection is available (note: not all ODBC drivers support the bulk load connections. Try to the native drivers for best results). 

 

20190620-BulkConnections.png

@CharlieS which alteryx verson is this? I have only Teradata bulk option not other.

Fireball

Is it only the data input that takes that long, or is it the rest of the downstream processing?

 

Some downstream processes can make it appear that the whole thing slows down, when really a bottleneck down the road holds everything up. Try putting a block until done after your input to see if the input finishes faster, from there you can try optimizing your workflow.

Alteryx Partner

Are you using Input tools or inDB tools (you mention both) ?

 

Can you include a screenshot of your workflow and show where it's slow ?

 

4.5M records isn't really big, but it will be slow if the 124 fields have big datatypes.  

 

 

Alteryx Certified Partner
Alteryx Certified Partner

That screenshot is from 2018.3. Those are bulk load options for the Output tool, the Input tool only has Teradata bulk. 

But I want to read the data faster in input tool as its taking almost more than an hour to read it from SQL server. Is there a faster way to load it?

Fireball

Unfortunately I think this starts bleeding over into SQL management territory. Is it alteryx that's slow, or does it take a similar amount of time from another query tool using the same query (you can pull the query text from the query builder)? If it still takes about an hour in another tool then it's the database that needs optimization, not alteryx.

 

Database administration isn't my thing but my first thoughts would be: can you optimize the query better? Can you add a primary key to your database to help it search better?

In my Input tool, I am just doing Select * from <table_name>

The same query if I run in SQL server side it takes only 5 minutes to load records

But in Alteryx it takes more than an hour to do it. I am just curious to know if we can do it faster loading or any other alternative or any other tool should I use in alteryx. If you can provide steps would be great help

Highlighted
Fireball

So the query is fast on the server machine itself (good sign), but is it as fast on your machine? Maybe its a data transport issue? I'm not aware of much inside SQL server, but is there possibly a rate limiting function it's using to try and save itself from getting pounded by a single user?

 

 

Labels