

Creating Dummy Variable with large Number of Values

Dear Alteryx Experts 


I want to use linear regression for prediction, but I have a problem with the predictor variables (X-axis): three of them are categorical, and each has at least 60 distinct values. If I use the Formula tool to create the dummy variables, I believe it will take more than 2 or 3 days. Can anyone tell me how to solve this faster?

Note: if a macro is needed, please provide it with a workflow example of dummy creation for 60 values.

I suggest:




First, create a unique list of the values with the Summarize tool

Add a record ID for grouping by

Use an Append Fields tool to make a Cartesian join of every row to all the values needed (note: you must set 'Allow All Appends' in the Append Fields tool)

A simple formula tool to create the 1/0 values

Finally a Cross Tab to put the variables back on a single row
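
For readers who want to see the logic outside Designer, the steps above can be sketched in plain Python. This is only an illustration of the same idea, not the attached workflow: the function name `make_dummies` and the sample values are made up, and `Shop_id` is borrowed from the field names mentioned later in the thread.

```python
from itertools import product

def make_dummies(rows, field):
    """One-hot encode `field` via unique list, cross join, flag, and pivot."""
    # Step 1: unique list of values (the role of the Summarize tool)
    levels = sorted({row[field] for row in rows})
    # Step 2: attach a record ID to every input row
    indexed = list(enumerate(rows, start=1))
    # Step 3: Cartesian join of every record with every unique value
    pairs = product(indexed, levels)
    # Steps 4 and 5: compute the 1/0 flag and pivot back to one row per record ID
    wide = {rid: dict(row) for rid, row in indexed}
    for (rid, row), level in pairs:
        wide[rid][f"{field}_{level}"] = 1 if row[field] == level else 0
    return [wide[rid] for rid, _ in indexed]

# Hypothetical sample data, not taken from the poster's file
sample = [{"Shop_id": 5}, {"Shop_id": 12}, {"Shop_id": 5}]
encoded = make_dummies(sample, "Shop_id")
```

Each output row keeps the original field and gains one 0/1 column per distinct value, so a field with 60 values yields 60 new columns.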


Sample attached.


When you say create a dummy variable, do you mean create a binary variable for each value?


If so, the Linear Regression tool will already do this for you, within the macro.


You could also use a transpose and crosstab tool, I have attached an example.
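
Roughly, the transpose-then-crosstab idea looks like this in plain Python. The attachment is not reproduced here, so this is one plausible reading of the approach rather than the actual workflow; the helper names `transpose` and `crosstab` and the sample values are assumptions for illustration.

```python
def transpose(rows, key_fields):
    """Long form: one (record_id, name, value) triple per field per record."""
    return [(rid, name, row[name])
            for rid, row in enumerate(rows, start=1)
            for name in key_fields]

def crosstab(long_rows):
    """Pivot back to one row per record with a 1/0 column per name_value header."""
    headers = sorted({f"{name}_{value}" for _, name, value in long_rows})
    wide = {}
    for rid, name, value in long_rows:
        # Start every record with all headers set to 0, then flag the match
        record = wide.setdefault(rid, dict.fromkeys(headers, 0))
        record[f"{name}_{value}"] = 1
    return wide

# Hypothetical sample data using two of the field names from this thread
sample = [
    {"Shop_id": 5, "date_block_num": 0},
    {"Shop_id": 12, "date_block_num": 0},
]
wide = crosstab(transpose(sample, ["Shop_id", "date_block_num"]))
```

Because the header is built from both the field name and the value, several categorical fields can be encoded in one pass without their values colliding.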




EDIT: got distracted for a bit and then got beaten to it by James!

@jdunkerley79, @JoeS Thank you both for your help. I tried to implement it on this shared file:

but with @jdunkerley79's solution I got an error in the Append Fields tool ("Error: Append Fields (14): There were more than 16 records in the source"), and I still do not understand how to apply @JoeS's solution to my case. The fields I want to turn into dummy variables are:


Shop_id, date_block_num and right_item_category_id.


Thank you


@MAAbdullahAlMubarah Change the option at the bottom of the Append Fields tool to 'Allow All Appends'.

@JoeS You are right that the "Linear Regression tool will do this already for you".

When I started to implement the linear regression, it reported that the matrix is too large (XX GB) to be handled. That is why I thought I needed to create the dummy variables myself. In this case, what should I do to solve the large-matrix issue?


@MAAbdullahAlMubarah Yeah, R is going to struggle with memory to create the matrix.
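
To get a feel for why, here is a back-of-the-envelope estimate in Python. It assumes a dense design matrix of 8-byte doubles, an intercept, and k - 1 dummy columns per categorical field with k values (R's default treatment coding). The row count below is purely hypothetical, since the thread does not state the size of the dataset, and `dense_matrix_bytes` is a made-up helper name.

```python
def dense_matrix_bytes(n_rows, levels_per_factor, n_numeric=0, bytes_per_cell=8):
    """Approximate size of a dense design matrix with treatment-coded factors."""
    # One intercept column, the numeric predictors, and (levels - 1) dummies per factor
    n_cols = 1 + n_numeric + sum(k - 1 for k in levels_per_factor)
    return n_rows * n_cols * bytes_per_cell

# Hypothetical: 1,000,000 rows and three categorical fields with 60 values each
size = dense_matrix_bytes(1_000_000, [60, 60, 60])
print(f"{size / 1024**3:.2f} GiB for the matrix alone")
```

This counts a single copy of the matrix only; fitting the model typically needs additional working copies, so the real requirement is likely to be higher.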


I have re-attached my workflow with your file at the bottom. It still takes a while to run (12 mins on my laptop), but it should achieve what you are looking for.

Make sure Allow All Appends is set in the Append Fields tool:


@JoeS How can I resolve the memory allocation problem?


I am not 100% sure you can with a matrix that size in open-source R. I think you will need to use Microsoft R instead, but I am not completely sure.