This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
Showcase Designer’s ability to read 7,000 files and 14.8 million rows of data in less than a minute *on a desktop*
Demonstrate additional utility of the Directory Tool & Dynamic Input tandem
Last week I wanted to build dynamic reports around historical stock data that I found from Kaggle.com. The folder of data contained over 7,000 files (each for an individual stock) with a variable amount of data depending on the stock’s IPO date. Inputting these files one by one would be a nightmare, so I decided to use a wildcard character. Unfortunately, I was greeted with errors when the input tool came across files without any data.
The good news is these errors can be avoided by being more selective with the files we read in. This is where the Directory Tool -> Dynamic Input tandem comes in to play.
The Directory Tool returns a listing of all the files in a specified directory. This list of files can be used as the input to a Dynamic Input tool which allows you to read multiple files at once with a single tool (assuming those files follow the same schema). In this example, we will read over 7,000 files each containing historical data about an individual stock in less than 1 minute.
It sounds like a wildcard character should work, but we run into an error, and we don’t have the ability to be as selective a we want to be. In this particular case, we have data files in our folder without any data. In other cases, there may be files in that folder with a different schema. Below is a screenshot of our example:
Find Files within Folder
Filter out Files with no Data
Read in all valid Files with Dynamic Input
Another way the dynamic Input tool allows us to be more selective is demonstrated with a simple reporting case. First, we select 10 stocks out of 7,000 available that we want to analyze. This can be done by using a Text Input and a Join Tool. I prefer this method over using a filter because it shows us which data is unmatched on both sides of the join. Second, we want to make sure the stocks we selected have data in those files, so we filter out the files where the Size field from our Directory Tool is equal to 0. Then we can use the Dynamic Input tool to bring in our data and prepare the data for reporting.
Below is a screenshot of that workflow:
Attached is the Workflow, and the PDF output for reference.
In order to run the workflow, you will need to download the data from Kaggle and be sure to have the Stock Market Data folder in the same location as the Workflow. Unfortunately, I am restricted to providing a direct download because of the file size. Here is a link to the Kaggle data set: