community
cancel
Showing results for 
Search instead for 
Did you mean: 

Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.
Upgrade Alteryx Designer in 10 Steps

Debating whether or not to upgrade to the latest version of Alteryx Designer?

LEARN MORE

Transpose and Summarize to create field inventories takes a long time

Alteryx Partner

Hi community,

 

I am using Transpose and then Summarize to create fields inventories, it actually works... BUT the problem is that the files I'm using are big, and the transpose tool runs very slow as it writes lots of data.

 

Let me clarify with an example:

Input:

Code1Code2Code3
111
241
133
334
125

 

Output (After Transpose and Summarize):

CodeValueQty
Code113
Code121
Code131
Code212
Code232
Code221
Code313
Code333

 

What happens on the transpose tool:

Code11
Code12
Code11
Code13
Code11
Code21
Code21
Code23
Code23
Code22
Code31
Code31
Code33
Code33
Code31

 

It might not look like a big deal on these small example tables, but I just ran 100,000 records on the input (about 200Mb), the transpose tool creates 21,900,000 records (Over 1.3Gb), sure, the summary tool then shrinks that into about 4,000 records for the inventory, but the transpose tool is really dragging down the process. 

 

Do anyone else have an alternative to make the inventory work without writing insane amounts of temporal data on the transpose? 

 

Thank you,

Highlighted
Alteryx
Alteryx

A more efficient way is to have a summarize with a group by function in with the count/sum for each variable. You can use the formula tool to hard code in the field name.

 

Having 10 summarize tools will run a lot quicker than a transpose and one summarize for 10 fields.

Hi 

 

I believe that, if you do the Summarize first and then Transpose, it solves your problem.

 

Sample workflow is attached for your reference.

 

Thanks

Kaviyarasan_P

Alteryx
Alteryx

@Kaviyarasan_P has gone along the same track as me. However, using a formula to add the name of the field should be a little more efficient still, as you can just hard code it as a string value. However it will be a little more work for you as an Alteryx designer.

Based on the spec listed on the question (100,000 rows and 219 columns), I think a more dynamic approach is going to be required.

 

I agree best approach is to summarise and then use formula tool, but think would need to generate workflow dynamically first!

Alteryx
Alteryx

Great point @jdunkerley79

 

At my old place of work I had such a workflow. I'll see if I can find time to build another.

Alteryx Partner

Thank you all for your replies and interest.

 

I would love to summarize first and then transpose. It would work perfectly for a handful of columns. But there are too many columns to summarize first, having 200+ summarize tools or entering the 200+ fields as "Group by" and "Count" on a single summarize is not an option.

 

Has anyone tried a different approach to create an inventory of a file, maybe programatically instead?

Alteryx
Alteryx

I have built the attached workflow, that will create a workflow with as many summarizes and formulas as needed.

 

I have done some light testing, so should hopefully work for you.

 

What you need to do is replace the text input with your file, if you don't want to run it on some fields remove them using a select tool before the field info.

 

This will then create a new workflow, again change the text input tool to be your data, and the result from the union should be what you need.

 

I'll look to write up a new post detailing it all a little more, but hopefully the above is enough for you to work through.

Labels