Hi everyone,
I need some tips/guidance on how to proceed with some CSV files that I am working on in Alteryx. Each dataset has 2,205 records, but there are 6,000 fields per record.
When I import the data file, I reduce the configured field size to 10 characters, but the workflow still takes a fair amount of time to complete. Even after the workflow completes, Alteryx tends to freeze or lock up when I navigate to other parts of the workflow.
Is this file too big for Alteryx, or is there anything I can do to make this work faster and stop Alteryx from locking up? The file comes in as a tab-delimited file, and I actually only need the first 10 fields or so. Is there a way to limit the number of columns the Input Data tool initially brings in, or anything else I can do to stop Alteryx from freezing up?
Thanks in advance!
Mike
@apathetichell I did try your approach and workflow, but I am still getting the same issues with slow speed and Alteryx freezing up.
The test file I am using is a tab-delimited text file, and I could not see where the Input Data tool in your workflow limited the data being brought in.
I was able to modify the configuration in the solution that uses the Python tool, so I now get my desired output using both the R Tool and the Python Tool.
I am posting a copy of the file I am using, in case you would like to try it with your workflow. I am not sure whether I am configuring the tool wrong, but it does not work with this file.
Thanks again so much for your help!
Ran in about 15 seconds for me. Make sure that you have the delimiter set to \0 in your Input Data tool, so that each line is read in as a single field. Also, it looks like your first row contains data, not headers, so make sure "First Row Contains Field Names" isn't selected.
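For anyone who wants to see the idea outside Alteryx, here is a minimal Python sketch of it: read each line whole, then split off only the leading fields instead of parsing all 6,000. The function name `first_fields` and the sample data are hypothetical, and this is not the attached workflow, just an illustration of the same logic.

```python
import io


def first_fields(lines, n_fields=10, delimiter="\t"):
    """Yield the first n_fields values of each line, ignoring the rest."""
    for line in lines:
        line = line.rstrip("\r\n")
        # maxsplit=n_fields stops splitting after n_fields delimiters, so the
        # remaining columns stay in one unsplit tail, which the slice drops.
        yield line.split(delimiter, n_fields)[:n_fields]


# Tiny in-memory stand-in for the real file (made-up data, no header row).
sample = io.StringIO("a\tb\tc\td\n1\t2\t3\t4\n")
rows = list(first_fields(sample, n_fields=2))
```

With a real file, the same function can be fed an open file handle in place of the `StringIO` object.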
@apathetichell You are a genius! I made the adjustments you suggested and mine now runs in about 0.4 seconds, not even one full second.
This is great stuff! Glad I got the exposure to the R and Python tools (I was able to get both of those to work, thanks again @Yoshiro_Fujimori for the great tips and I can now add the R and Python tools to my Go To tools).
But this solution actually runs the fastest of the three approaches, so I will use this for the other files I need to import.
Thanks again so much for your great help!
Mike
Hi @mkeiffer ,
Good to know you found a solution.
The simplest solution is always the best. (Thanks to @apathetichell !)
I tried to import the data you attached, and parsed it into columns.
It has 6,000 columns in total, so it really is a big table.
If you need only the first 10 columns, split each record into rows, sample the first 10 rows for each record, and Cross Tab as below.
(Keeping 6000 columns is difficult to deal with... Just browsing takes minutes.)
Workflow
Output
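In case it helps to see the sample-then-Cross-Tab logic written out, here is a rough Python sketch of the same steps: split each record into rows, keep the first few rows per record, and pivot back to columns. The names `to_long`, `cross_tab`, and the `col_N` labels are hypothetical and only stand in for what the Alteryx tools do.

```python
from collections import defaultdict


def to_long(records, delimiter="\t"):
    """Split each record into rows of (record_id, position, value)."""
    for record_id, line in enumerate(records, start=1):
        for position, value in enumerate(line.split(delimiter), start=1):
            yield record_id, position, value


def cross_tab(long_rows, keep=10):
    """Keep the first `keep` positions per record and pivot back to wide form."""
    wide = defaultdict(dict)
    for record_id, position, value in long_rows:
        if position <= keep:
            wide[record_id][f"col_{position}"] = value
    return dict(wide)


# Made-up records standing in for lines of the real file.
records = ["a\tb\tc\td", "1\t2\t3\t4"]
table = cross_tab(to_long(records), keep=2)
```

Here `table` maps each record ID to a small dictionary holding only the kept columns.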
Good luck!
@Yoshiro_Fujimori Thank you for posting your updated solution, using the approach that @apathetichell developed.
I still found your solutions using both the R Tool and the Python Tool extremely helpful, and I think seeing different approaches like this really helps with learning how to apply Alteryx.
I appreciate both of you very much taking the time to work on this and for posting the solutions.
Thanks again!
Mike