Alteryx Designer Desktop Discussions


Data file save and load slow on network drives

haraldharders
9 - Comet

Some of my data sources are only available on network drives, and I also need to share data with colleagues, so I have to load data (especially Excel files) from network drives and save files to them. Unfortunately, this is very, very slow.

If I compare saving the same file to a local drive and to a network drive, the network drive takes orders of magnitude longer. As a workaround, I have started to manually copy the input files from the network drive to my local hard drive, run the script, and manually copy the result files back to the network drive. Even with all the manual steps, this is still way faster than saving directly to the final destination.

I know that Excel import and export is slow, but the same happens (on a different scale) with Alteryx's native database format.

I have also observed that even saving Alteryx scripts to a network drive takes very long compared to saving them locally. (Gallery is not an option for me because it constantly asks me for PKI credentials.)

 

I have the impression that Alteryx has a fundamental problem with slow data connections/network drives. 

If a direct speed-up is not possible, why doesn't Alteryx just do in the background what I do manually: copy the input files to a temporary directory, run the script, and copy the output files back to the destination?
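The copy, run, copy-back idea can be sketched in a few lines of Python. This is only an illustration of the staging pattern, not something Alteryx provides: `run_staged` and the `process` callback are hypothetical names, and `process` stands in for whatever actually does the work on the local copies.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def run_staged(network_input: Path, network_output: Path,
               process: Callable[[Path, Path], None]) -> None:
    """Copy the input to a local temp dir, process it there, copy the result back."""
    with tempfile.TemporaryDirectory() as tmp:
        local_in = Path(tmp) / network_input.name
        # Prefix the output name so it cannot collide with the input name.
        local_out = Path(tmp) / ("out_" + network_output.name)

        shutil.copy2(network_input, local_in)    # one bulk read from the share
        process(local_in, local_out)             # all heavy I/O stays on local disk
        shutil.copy2(local_out, network_output)  # one bulk write to the share
```

The temporary directory is removed automatically when the `with` block ends, so no local copies are left behind.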

 

I would like to know if others have similar experiences and maybe workarounds/solutions.

 

(Unfortunately, I cannot find any labels that fit my issue.)

Luke_C
17 - Castor

Hi @haraldharders 

 

I've noticed this as well, and I believe it's more a product of the network/VPN. If you open the files on the network drive directly in Excel, I'd expect them to take longer than an Excel file on your local machine too. I came to this conclusion because the same workflows ran much faster on Alteryx Server (which is on the same network/server as the network drive where the files are).

 

Unfortunately I don't have a suggestion though. 

haraldharders
9 - Comet

File format: Excel xlsx

Spreadsheet size: 19 x 1048575 cells

File size: 69,990 KB

Runs: 3 to 4 per task, except Alteryx from network: 1

The Alteryx script contains only one Input Data tool.


Results (all times in MM:SS):

| Task | Set up Input Data tool in Alteryx | Click on existing tool in Alteryx | Run task: max time | Run task: average time | Run task: median |
|---|---|---|---|---|---|
| Copy from network to local disk | N/A | N/A | 00:49 | 00:29 | 00:22 |
| Open in Excel from local disk | N/A | N/A | 00:30 | 00:30 | 00:30 |
| Open in Excel from network | N/A | N/A | 01:32 | 01:19 | 01:15 |
| Open in Alteryx from local disk | 00:10 | | 00:47 | 00:44 | 00:44 |
| Open in Alteryx from network | 19:39 | 16:02 | 41:35 | 41:35 | 41:35 |

 

If we compare the run times, the following interesting observations can be made:

  • Loading time in Excel increases by a factor of 2.5 when opening from the network drive compared to the local disk.
  • Loading time in Alteryx increases by a factor of 55 when opening from the network drive compared to the local disk.
  • Opening in Alteryx from the local disk is 1.5 times slower than in Excel.
  • Opening in Alteryx from the network drive is 31 times slower than in Excel.
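As a quick check, the factors can be recomputed from the results table. The sketch below uses the median column only; the dictionary keys are labels made up for this example.

```python
def to_seconds(mmss: str) -> int:
    """Convert an MM:SS string from the table into seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Median run times taken from the results table (MM:SS).
median = {
    "excel_local": "00:30",
    "excel_network": "01:15",
    "alteryx_local": "00:44",
    "alteryx_network": "41:35",
}
t = {name: to_seconds(value) for name, value in median.items()}

ratios = {
    "Excel, network vs. local": t["excel_network"] / t["excel_local"],
    "Alteryx, network vs. local": t["alteryx_network"] / t["alteryx_local"],
    "Local, Alteryx vs. Excel": t["alteryx_local"] / t["excel_local"],
    "Network, Alteryx vs. Excel": t["alteryx_network"] / t["excel_network"],
}

for label, value in ratios.items():
    print(f"{label}: {value:.1f}x")
```

With the medians this gives roughly 2.5, 57, 1.5 and 33, in line with the rounded factors quoted above. Using the average column instead shifts the values slightly (for example, about 32 for the last one).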

Even if I could accept long run times when executing the process, I cannot accept that Alteryx needs so long just to add the Input Data tool to my workflow or to click on it.

 

Let me try to interpret: if the data rate were what slows Alteryx down, the slowdown factor should be similar to the one in Excel (around 2 to 3). So there must be a different cause.

I suspect that the data loading algorithm in Alteryx copes badly with high latency. I believe Alteryx loads lots of small packets synchronously rather than loading one big block of data at a time.
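A toy model shows why many small synchronous reads would be hurt far more by latency than by bandwidth. It is only a back-of-the-envelope sketch, not a description of how Alteryx actually reads files: the latency, bandwidth and chunk sizes below are assumed values chosen for illustration, and only the file size comes from the post.

```python
def transfer_time(size_bytes: int, chunk_bytes: int,
                  latency_s: float, bandwidth_bytes_per_s: float) -> float:
    """Estimated time if every chunk costs one synchronous round trip."""
    n_requests = -(-size_bytes // chunk_bytes)  # ceiling division
    return n_requests * latency_s + size_bytes / bandwidth_bytes_per_s


FILE_SIZE = 69_990 * 1024   # 69,990 KB, the file size given in the post
LATENCY = 0.020             # assumed: 20 ms round trip over VPN
BANDWIDTH = 10_000_000      # assumed: 10 MB/s effective throughput

small = transfer_time(FILE_SIZE, 4 * 1024, LATENCY, BANDWIDTH)     # assumed 4 KiB reads
large = transfer_time(FILE_SIZE, 1024 * 1024, LATENCY, BANDWIDTH)  # assumed 1 MiB reads

print(f"4 KiB chunks: {small:.0f} s, 1 MiB chunks: {large:.0f} s")
```

Under these assumptions the bandwidth term is identical in both cases, so the whole difference comes from the number of round trips. That matches the pattern of a slowdown factor far larger than the one Excel shows.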

 

My conclusion is: the Alteryx team should look into the data loading and saving algorithms and change them so that they load or save big blocks of data and avoid synchronous communication with the server.

 

Is there a way to make sure that this post reaches the Alteryx development team?

 

Luke_C
17 - Castor

Hi @haraldharders 

 

Support@alteryx.com is your best bet. I can certainly commiserate with what you're seeing. Unfortunately I don't have a workaround, but maybe someone else does.

gvotsmi
5 - Atom

We have the same issue with slow transfers to/from network drives over VPN to server drives. This works for me: I copy the source data to a local drive and perform all analysis and file saves locally. Then, at the end of the workflow, I use a Run Command tool with ROBOCOPY.EXE to move the results to the network drive. ROBOCOPY works great for large files over a network.
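For anyone scripting this outside the Run Command tool, here is a minimal Python sketch of such a ROBOCOPY call. The function names are hypothetical and the retry settings are just example values; the switches themselves (`/MOV`, `/Z`, `/R`, `/W`) are standard ROBOCOPY options.

```python
import subprocess
from typing import List


def robocopy_move_command(src_dir: str, dst_dir: str,
                          pattern: str = "*.*") -> List[str]:
    """Build a ROBOCOPY command that moves matching files from src_dir to dst_dir."""
    return [
        "robocopy", src_dir, dst_dir, pattern,
        "/MOV",  # delete each file from the source after it has been copied
        "/Z",    # restartable mode, useful on flaky connections
        "/R:3",  # example value: retry a failed copy 3 times
        "/W:5",  # example value: wait 5 seconds between retries
    ]


def robocopy_succeeded(exit_code: int) -> bool:
    """ROBOCOPY exit codes are bit flags; values below 8 mean nothing failed."""
    return exit_code < 8


def move_results(src_dir: str, dst_dir: str) -> bool:
    """Run ROBOCOPY (Windows only) and report whether the move worked."""
    # No check=True here: ROBOCOPY returns 1 when files were copied successfully.
    result = subprocess.run(robocopy_move_command(src_dir, dst_dir))
    return robocopy_succeeded(result.returncode)
```

Note that a non-zero exit code is normal for ROBOCOPY, so anything that treats every non-zero code as an error will report false failures.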

HomesickSurfer
12 - Quasar

Hi @haraldharders 

 

No issues at our end. All of our data is on NAS...even SharePoint.

The file that you reference in this post...is Alteryx reading in 16384 or so fields/columns, mostly blank/null/empty?

I've experienced this. The source file's sheet was formatted from A1:XFD1048576.

If so, you will need to reset the last used cell. I've used VBA to do that.
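To see whether a workbook is affected before opening it, one option is to read the used range that each sheet declares inside the .xlsx file (which is a zip archive). The sketch below is a diagnostic only and does not reset anything. It assumes the sheets are stored under `xl/worksheets/` with an unprefixed `<dimension ref="...">` element, which is typical for files saved by Excel but is optional in the format, and whether Alteryx relies on this element is not confirmed here. `declared_ranges` is a hypothetical helper name.

```python
import re
import zipfile
from typing import Dict

# Matches e.g. <dimension ref="A1:XFD1048576"/> near the top of a sheet part.
_DIMENSION = re.compile(rb'<dimension[^>]*\bref="([^"]+)"')


def declared_ranges(xlsx_path) -> Dict[str, str]:
    """Map each worksheet part to the range declared in its <dimension> element."""
    ranges: Dict[str, str] = {}
    with zipfile.ZipFile(xlsx_path) as archive:
        for name in archive.namelist():
            if name.startswith("xl/worksheets/") and name.endswith(".xml"):
                with archive.open(name) as part:
                    head = part.read(4096)  # only the start of the part is needed
                match = _DIMENSION.search(head)
                if match:
                    ranges[name] = match.group(1).decode("ascii")
    return ranges
```

A declared range ending in column XFD (the 16384th column) on a sheet that really holds only a handful of columns would point to the formatting problem described above.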
