Alteryx Designer Desktop Discussions

SOLVED

Long-running workflow due to the huge amount of data imported into Alteryx

SH_94
11 - Bolide

Hi Community,

 

I want to run my workflow after importing data and adding some tools. However, I realised that a complete run takes more than an hour, and we can only see the results once it finishes. I have two questions:

 

1. Are there any tips and tricks we can apply before running the workflow to shorten the run time? Could someone please advise how you deal with very large data sets when running a workflow?

 

2. Some of the tools are not applicable to certain reports, and I don't want them to run when generating those reports. Is there a way to keep these tools in the workflow without deleting them, while still controlling whether they run in a given situation? For instance, using the screenshot below as an example: in certain special scenarios I don't want the tools to run, but I don't want to delete them either.

 

Jacob_94_0-1615162455671.png

 

4 REPLIES
Emil_Kos
17 - Castor

Hi @SH_94,


Based on your workflow, I think the Join tool is taking most of the resources. I am not sure you can do much about it besides adding an Auto Field tool somewhere near the beginning of the workflow:

 

Emil_Kos_0-1615189171169.png

 

 

May I ask what you do with the Data Cleansing tool? To the best of my knowledge, it isn't the fastest tool in the tool palette, and if you can replace it with something else, I think you would see a speed improvement.

 

Regarding the second question, you could use an analytic app to choose how the workflow should run. Can you tell me what this formula is doing? I am not an expert in this area, but I can search for some articles/resources.
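To make the idea of switching steps on and off concrete, here is a minimal Python sketch. It is not Alteryx code and does not show how an analytic app works internally; the step names (`trim_names`, `drop_blank_names`) and the `enabled` mapping are hypothetical and only illustrate keeping every step in the pipeline while choosing per run which ones execute.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]
Step = Callable[[List[Row]], List[Row]]


def trim_names(rows: List[Row]) -> List[Row]:
    """Hypothetical step: strip surrounding spaces from the 'name' field."""
    return [{**row, "name": row["name"].strip()} for row in rows]


def drop_blank_names(rows: List[Row]) -> List[Row]:
    """Hypothetical step: remove rows whose 'name' field is empty."""
    return [row for row in rows if row["name"].strip()]


def run_pipeline(
    rows: List[Row],
    steps: List[Tuple[str, Step]],
    enabled: Dict[str, bool],
) -> List[Row]:
    """Apply each step in order, skipping any step switched off in `enabled`."""
    for name, step in steps:
        if enabled.get(name, True):  # steps not mentioned stay switched on
            rows = step(rows)
    return rows


# Every step stays defined; only the per-run switches change.
STEPS = [("trim_names", trim_names), ("drop_blank_names", drop_blank_names)]
data = [{"name": " Ann "}, {"name": "  "}]

full_report = run_pipeline(data, STEPS, {})
special_report = run_pipeline(data, STEPS, {"drop_blank_names": False})
```

In this sketch `full_report` runs both steps, while `special_report` keeps the blank row because `drop_blank_names` was switched off for that run only.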

 

If you are still working on this workflow, I can imagine you wait ages for each run to finish. I have two tips for you. First, you can click on the Input tool and load the data into memory, so it does not need to be loaded again on every run. Please see the screenshot below for reference:

Emil_Kos_1-1615189473010.png
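The general idea behind this tip, reusing an input that has already been loaded instead of reading it again on every run, can be sketched in Python as follows. This is only an illustration under assumptions: it caches to a temporary pickle file, which is not necessarily how Alteryx does it, and `load_with_cache` and `slow_loader` are hypothetical names.

```python
import os
import pickle
import tempfile
from typing import Callable, List


def load_with_cache(cache_path: str, loader: Callable[[], List[dict]]) -> List[dict]:
    """Return cached rows if the cache file exists; otherwise load and cache them."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as fh:
            return pickle.load(fh)
    rows = loader()
    with open(cache_path, "wb") as fh:
        pickle.dump(rows, fh)
    return rows


calls = {"count": 0}


def slow_loader() -> List[dict]:
    """Hypothetical stand-in for an expensive input step."""
    calls["count"] += 1
    return [{"id": i} for i in range(3)]


cache_file = os.path.join(tempfile.mkdtemp(), "input_cache.pkl")
first = load_with_cache(cache_file, slow_loader)   # runs the slow loader
second = load_with_cache(cache_file, slow_loader)  # served from the cache file
```

The second call returns the same rows without invoking `slow_loader` again, which is the saving you are after while iterating on the downstream tools.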

 

 

Alternatively, you can load only part of the data into Alteryx. To do so, set the record limit in the Input tool:

 

Emil_Kos_3-1615189631485.png
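For comparison, the same "develop on a small sample" idea looks like this in plain Python. It is a rough sketch, not Alteryx functionality: `read_first_records` is a hypothetical helper, and the inline CSV text stands in for a large input file.

```python
import csv
import io
from itertools import islice
from typing import Dict, List


def read_first_records(handle, limit: int) -> List[Dict[str, str]]:
    """Read at most `limit` data rows from an open CSV file handle."""
    reader = csv.DictReader(handle)
    # islice stops after `limit` rows, so the rest of the file is never parsed.
    return list(islice(reader, limit))


# Small in-memory sample standing in for a large input file.
sample = io.StringIO("id,amount\n1,10\n2,20\n3,30\n4,40\n")
preview = read_first_records(sample, 2)
```

Once the logic works on the preview, the limit can be removed for the full run.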

 

Hope this helps!

SH_94
11 - Bolide

Hi @Emil_Kos,

 

Thank you for your detailed explanation.

 

May I know what the main difference is between the Auto Field tool and the Select tool? I always use the Select tool instead of the Auto Field tool.

 

Secondly, the reason I added the Data Cleansing tool is that I learned from an article that it is best to cleanse your data before using it. Do we necessarily need to use the Data Cleansing tool for every imported data set? I can't tell in which circumstances we need it.

 

As for the workflow, I am still adjusting it and will share it with you once it is complete.

 

 

Thanks again for the input

Emil_Kos
17 - Castor

Hi,

 

The Auto Field tool automatically identifies the smallest possible data type for each column. In the Select tool, you pick the data type manually.

 

Actually, the best practice would be to identify the smallest possible data type with the Auto Field tool and afterwards replace it with a Select tool set to those types. You can also just keep the Auto Field tool, as I don't think it takes up a lot of resources, so you probably won't waste much time by using it.
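To illustrate what "smallest possible data type" means, here is a small Python sketch. It is an assumption-laden approximation, not the Auto Field tool's actual algorithm: it only distinguishes whole numbers from text, and `smallest_type` and `INT_TYPES` are hypothetical names.

```python
from typing import List

# Integer ranges, listed from smallest to largest storage size.
INT_TYPES = [
    ("Byte", 0, 255),
    ("Int16", -2**15, 2**15 - 1),
    ("Int32", -2**31, 2**31 - 1),
    ("Int64", -2**63, 2**63 - 1),
]


def smallest_type(values: List[str]) -> str:
    """Pick the narrowest type that can hold every value in a text column."""
    longest = max((len(v) for v in values), default=0)
    try:
        numbers = [int(v) for v in values]
    except ValueError:
        # Not every value is a whole number: keep text, sized to the longest value.
        return f"String({longest})"
    for name, low, high in INT_TYPES:
        if all(low <= n <= high for n in numbers):
            return name
    # Whole numbers too large for Int64 stay as text in this sketch.
    return f"String({longest})"
```

For example, `smallest_type(["1", "200"])` gives `Byte`, `smallest_type(["-5", "30000"])` gives `Int16`, and `smallest_type(["abc", "de"])` gives `String(3)`. Narrower types mean less data to move through the later tools, which is why this step sits near the start of the workflow.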

 

Regarding the Data Cleansing tool: sometimes your data is already clean and you don't need it. Do you know whether your data actually needs cleaning, for example because it contains extra spaces?
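One way to answer that question before adding a cleansing step is to count how many values actually carry extra spaces. The Python sketch below shows the check on a couple of made-up rows; `fields_with_extra_spaces` is a hypothetical helper and the sample data is invented for illustration.

```python
from typing import Dict, List


def fields_with_extra_spaces(rows: List[Dict[str, str]]) -> Dict[str, int]:
    """Count, per field, the values that have leading or trailing whitespace."""
    counts: Dict[str, int] = {}
    for row in rows:
        for field, value in row.items():
            if value != value.strip():
                counts[field] = counts.get(field, 0) + 1
    return counts


# Invented sample rows: only the 'customer' field has stray spaces.
rows = [
    {"customer": "Acme ", "city": "Oslo"},
    {"customer": " Bolt", "city": "Rome"},
]
report = fields_with_extra_spaces(rows)
```

Here `report` lists only `customer`, so only that field would need trimming. An empty result would suggest the cleansing step can be skipped for this input.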

 

One more question for you, because this workflow looks quite simple and I don't think it should run for an hour (unless the data set is really huge): are the fields you use in the Join tool unique? I am not sure whether you are familiar with how joins can create duplicate rows in databases/Alteryx.

 

The easiest way to check whether you are creating duplicates is to compare the number of rows going into the Join tool with the number of rows coming out of it. In other words, check whether the record count on the input side of the Join tool equals the record count on its output side.
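The row-count check described above can be sketched in Python like this. It is a simplified illustration on invented data, not how the Join tool is implemented; `duplicate_keys` and `inner_join` are hypothetical helpers.

```python
from collections import Counter
from typing import Dict, List

Row = Dict[str, str]


def duplicate_keys(rows: List[Row], key: str) -> Dict[str, int]:
    """Return each key value that appears more than once, with its count."""
    counts = Counter(row[key] for row in rows)
    return {value: n for value, n in counts.items() if n > 1}


def inner_join(left: List[Row], right: List[Row], key: str) -> List[Row]:
    """Inner join: one output row per matching (left, right) pair."""
    index: Dict[str, List[Row]] = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Invented data: id "1" appears twice on the right side.
left = [{"id": "1", "name": "A"}, {"id": "2", "name": "B"}]
right = [
    {"id": "1", "amount": "10"},
    {"id": "1", "amount": "15"},
    {"id": "2", "amount": "20"},
]
joined = inner_join(left, right, "id")
rows_multiplied = len(joined) > len(left)
```

Two rows go in on the left and three come out, so the join has multiplied rows, and `duplicate_keys(right, "id")` points at the repeated key. On large inputs with many repeated keys on both sides, this multiplication can make the output far larger than either input, which would explain a long run time.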

 

dvdadmin
5 - Atom

Hello All,

I created a workflow and it is taking a long time to run.

Workflow details:

It extracts the data for two segments from one table. The number of rows for the two segments is 120 million.

I would like to extract all the data quickly, say in 10-12 hours or less. Is that possible? If so, are there any options or tools in Alteryx that would help?
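As a quick sanity check on that target, the required throughput can be worked out directly. The sketch below assumes the 120 million rows are the combined total for both segments, which the post does not state explicitly; `required_rows_per_second` is a hypothetical helper.

```python
def required_rows_per_second(total_rows: int, hours: float) -> float:
    """Sustained rows per second needed to move `total_rows` within `hours`."""
    return total_rows / (hours * 3600)


TOTAL_ROWS = 120_000_000  # assumed to be the total across both segments

rate_10h = required_rows_per_second(TOTAL_ROWS, 10)
rate_12h = required_rows_per_second(TOTAL_ROWS, 12)
```

That works out to roughly 3,333 rows per second for a 10-hour window and roughly 2,778 rows per second for a 12-hour window. Comparing these figures with the rate observed on a smaller test extract gives a rough idea of whether the target is realistic.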
