Alteryx Designer Desktop Discussions

dschneider1 · ‎12-05-2018

How should one go about reading a large file via the Python tool?

Using Alteryx.read("#1") results in a memory error. Normally, I would read the file line by line to avoid this, but I am not sure how to do so within the syntax of reading from the Alteryx object.

Thanks,

-David

OldDogNewTricks · ‎12-05-2018

How large is the file?

What is the actual error that you receive?

Here is a 'hacky' solution: split the data into multiple chunks.

dschneider1 · ‎12-06-2018

About 12GB. That seems like the best solution as of now.

OldDogNewTricks · ‎12-06-2018

Have you tried the proposed solution?

Did it work for you?

I still propose that you share more about the actual errors and create a ticket with Alteryx so they know about the limitation/error.

I'm not sure whether the error comes from the Python virtual environment running out of space, Jupyter notebooks, Alteryx, or somewhere in between. It also depends on your machine: if you only have 8 GB of RAM, that is obviously a problem.

dschneider1 · ‎12-06-2018

Yes, I am using something similar to the proposed solution in that I am batching out data to read in via separate connections. The issue is that I am trying to read the whole file into memory at once given the layout of Alteryx, unless there is a way to index connection objects that I am not aware of. I would run into the same issue if I did the same thing in any other Python environment; it is simply bad practice. Normally, I would avoid this by reading the file line by line, but since I can only work with the single connection object, I am not sure how to do that within Alteryx. At this point I will likely just write Python code to do it correctly and execute it via Run Command.
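For anyone following along, the line-by-line approach described above could look roughly like the sketch below when run as a standalone script (for example via Run Command). It uses only the standard library. The function name stream_filter, the file paths, and the keep_row predicate are hypothetical names chosen for illustration; nothing here is an Alteryx API.

```python
import csv


def stream_filter(src_path, dst_path, keep_row):
    """Copy rows from src_path to dst_path one at a time.

    Only one row is held in memory at any moment, so file size
    does not drive memory use. keep_row is a caller-supplied
    (hypothetical) predicate that receives a row as a list of strings.
    Returns the number of data rows written.
    """
    kept = 0
    with open(src_path, newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)

        # Copy the header row through unchanged, if the file has one.
        header = next(reader, None)
        if header is None:
            return 0
        writer.writerow(header)

        # Iterating the reader pulls one line at a time from disk.
        for row in reader:
            if keep_row(row):
                writer.writerow(row)
                kept += 1
    return kept
```

The smaller output file could then be brought back into the workflow with an ordinary Input Data tool, assuming the filtering or reduction step shrinks the data enough to fit in memory.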

dschneider1 · ‎12-06-2018

Here is how I would do it in pandas, since that is most closely aligned with how Alteryx handles data:

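import pandas as pd
# Read the file in chunks of 1,000,000 rows, then combine the chunks into one DataFrame
reader = pd.read_csv("LARGEFILE", sep=',', chunksize=1000000)
master = pd.concat(chunk for chunk in reader)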
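One caveat worth noting: concatenating every chunk still ends up with the full DataFrame in memory. If the end result is an aggregate or a filtered subset, each chunk can be reduced as it arrives and then discarded. The sketch below assumes a CSV with a numeric column; the function name chunked_column_sum and the column name passed to it are hypothetical and chosen only for illustration.

```python
import pandas as pd


def chunked_column_sum(path, column, chunksize=1_000_000):
    """Sum one column of a CSV without loading the whole file.

    Only one chunk (at most `chunksize` rows) is held in memory
    at a time; each chunk is reduced to a number and then dropped.
    Returns (total, row_count).
    """
    total = 0
    rows = 0
    # With chunksize set, read_csv returns an iterator of DataFrames.
    for chunk in pd.read_csv(path, chunksize=chunksize):
        total += chunk[column].sum()
        rows += len(chunk)
    return total, rows
```

The same loop shape works for filtering: append the filtered chunk to an output file with to_csv(mode="a") instead of accumulating a sum, so the combined result never has to fit in memory.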

vijaysuryav93 · ‎02-16-2023

Is there any solution to this memory issue? I face a similar issue with my 8M records. Sometimes the workflow runs fine on Alteryx Server/Gallery, and sometimes it fails with a memory error. I believe it has something to do with the RAM of the Alteryx Server machine, but I wanted to ask whether anyone has found a solution.

Reading large files with Python tool