SOLVED
Pulling Billions of Rows of Data to Store and Then Use
nafong
6 - Meteoroid
2 weeks ago
I wanted some help or suggestions on how I should approach this issue I am facing:
- Currently, my workflow connects to a database (Impala SQL) using the Dynamic Input tool. I have around 15 distinct queries that pull different information per person depending on the Group ID I enter into the Text Box Interface tool (it replaces a "dummy group id" in the query with the value typed into the text box).
- It takes about 50 minutes to run them all, because other people are running queries against the same data lake, which is restricted in how many queries can run at once due to a memory limit.
- Essentially, the queries scan billions of rows of data looking for a snippet that matches certain conditions, and then throw the billions of rows away.. multiplied by however many people are running similar queries at once.
Is there a way I can pull the billions of rows once and then look for those snippets of data matching certain conditions without "throwing it all away", to save on run time while also not being restricted by memory issues?
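For reference, each of my queries follows roughly the pattern sketched below: a template with a dummy group ID that gets swapped out and sent to Impala. This is a minimal sketch, assuming a connection through the impyla package; the host, table, and column names are placeholders, not my real ones.

```python
# Minimal sketch of the current per-group pull. Assumes impyla for the Impala
# connection; host, port, table, and column names are placeholders.
from impala.dbapi import connect

QUERY_TEMPLATE = """
SELECT member_id, event_ts, event_detail
FROM warehouse.member_events              -- placeholder table
WHERE group_id = '{dummy_group_id}'       -- token the text box swaps out
  AND event_detail LIKE '%condition%'     -- "snippet" condition
"""

def run_for_group(group_id: str):
    sql = QUERY_TEMPLATE.format(dummy_group_id=group_id)
    conn = connect(host="impala.example.com", port=21050)  # placeholder host
    try:
        cur = conn.cursor()
        cur.execute(sql)        # Impala scans billions of rows for this one group
        return cur.fetchall()   # only the matching snippet comes back
    finally:
        conn.close()

rows = run_for_group("GROUP_12345")  # repeated ~15 times with different templates
```

Every run repeats that full scan, which is what I am trying to avoid.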
Ideas:
- Pulling the data once into a CSV file and then using that file as the reference
  - May not be feasible because those files are limited to about 2 million rows
- Using the Cache Dataset macro -- is this the right way to approach this?
  - How can I use this tool so that people without an Alteryx license can run the workflow through the Alteryx Server?
  - How can I set it up so that the large dataset is pulled only once, and when another user runs the app portion (entering a Group ID in the text box), it starts from what was cached? (A rough sketch of this "pull once, filter locally" idea follows below.)
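To make that last idea concrete, here is a rough sketch of what I mean by "start from what was cached", assuming the one-time heavy pull could be written out to a Parquet file that later runs just filter; the path and column names are made up for illustration (pandas with the pyarrow engine assumed):

```python
# Rough sketch of "pull once, filter many times". Assumes the big extract has
# already been written once to a shared Parquet file; path and column names
# are placeholders.
import pandas as pd

EXTRACT_PATH = "//shared/cache/member_events.parquet"  # written once by the heavy pull

def snippet_for_group(group_id: str) -> pd.DataFrame:
    # The group filter is pushed down to the file scan (pyarrow engine), so a
    # per-group run never has to load the full extract into memory.
    return pd.read_parquet(EXTRACT_PATH, filters=[("group_id", "==", group_id)])

# Each app run (Group ID typed into the text box) becomes a cheap local filter
# instead of another multi-billion-row scan against the data lake.
df = snippet_for_group("GROUP_12345")
```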
I appreciate any ideas or help provided. Thank you!
Labels: Datasets, Gallery, Help, Optimization
2 REPLIES
lwolfie
11 - Bolide
2 weeks ago
You could try running it once and storing the data as a .yxdb file, then have a separate workflow that references that .yxdb file. I've also broken the data up and stored it in separate files based on year (for example), then brought those individual files back in together to continue the research.
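If it helps to see the split-and-recombine idea outside of Alteryx, a sketch of the same pattern with Parquet files (not yxdb; the directory and column names are made up) could look like this:

```python
# Sketch of the split-by-year / recombine idea, illustrated with Parquet files
# rather than yxdb; directory and column names are placeholders.
from pathlib import Path
import pandas as pd

CACHE_DIR = Path("shared/event_cache")  # one-time extract lives here

def write_yearly_files(big_extract: pd.DataFrame) -> None:
    # The one heavy pull is broken up by year so no single file gets huge.
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    for year, chunk in big_extract.groupby(big_extract["event_ts"].dt.year):
        chunk.to_parquet(CACHE_DIR / f"events_{year}.parquet", index=False)

def read_all_years() -> pd.DataFrame:
    # Later workflows just stitch the yearly files back together.
    files = sorted(CACHE_DIR.glob("events_*.parquet"))
    return pd.concat((pd.read_parquet(f) for f in files), ignore_index=True)
```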
2 weeks ago
This worked beautifully, thank you!
