Alteryx Designer Desktop Discussions

Reading multiple files from a folder where each file contains more than 400 columns

satish2gopal
5 - Atom

I am trying to read text files from a folder that contains more than 50,000 files, each with close to 500 attributes. I have set up a Directory tool to identify the files, which takes roughly 8 minutes, followed by logic to pick the most recent set of files (more than 120 files for each business date). Reading and consolidating the contents for each day then takes more than 50 minutes via Dynamic Input or a Batch Macro. Is there a smarter way to speed up this process, potentially via the Python tool in Alteryx?

2 REPLIES
apathetichell
18 - Pollux

You probably need more RAM and a faster machine. This is a pretty huge workload - I'd say 50 minutes sounds fine.

CodeMonkey
8 - Asteroid

If you're trying to read more than one file "at once", you could use the Python tool, as you suggested, to potentially achieve some speed-up.

Not sure what your Python comfort level is, but there are some good Stack Overflow answers using the multiprocessing library that you may be able to adapt. This would basically allow you to process multiple files at the same time, but you'll still be limited by how fast your machine can read from the disk.

 

Examples:

https://stackoverflow.com/questions/7776293/read-txt-file-with-multi-threaded-in-python

https://stackoverflow.com/questions/48756691/how-to-read-multiple-files-into-multiple-threads-proces...
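
To give a rough idea of the shape this could take, here is a minimal sketch, not a tested Alteryx workflow. It uses a thread pool from the standard library's concurrent.futures rather than multiprocessing itself, since the job is mostly waiting on disk reads. The function names read_one and read_many, the pipe delimiter, the source_file column, and the worker count are all assumptions for illustration; adjust them to match your actual files.

```python
import csv
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_one(path, delimiter="|"):
    """Read one delimited text file into a list of dict rows.

    The delimiter is an assumption; the original post does not say
    how the text files are laid out.
    """
    with open(path, newline="", encoding="utf-8") as fh:
        rows = list(csv.DictReader(fh, delimiter=delimiter))
    # Tag each row with its source file so the combined output stays traceable.
    for row in rows:
        row["source_file"] = Path(path).name
    return rows


def read_many(paths, max_workers=8, delimiter="|"):
    """Read several files concurrently and return one combined list of rows."""
    combined = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map() yields results in the same order as the input paths.
        for rows in pool.map(lambda p: read_one(p, delimiter), paths):
            combined.extend(rows)
    return combined
```

Inside the Python tool, the list of paths would presumably come from the Directory tool's output (for example the FullPath column read via Alteryx.read("#1")), and the combined rows could be turned into a pandas DataFrame and passed on with Alteryx.write. If parsing rather than disk access turns out to be the bottleneck, swapping ThreadPoolExecutor for ProcessPoolExecutor is the closer match to the multiprocessing approach in the links above.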
