
Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.
SOLVED

Creating Dummy Variable with large Number of Values

Dear Alteryx Experts 

 

I want to use linear regression for prediction, but I have a problem with the predictor variables (X-axis): 3 of them are categorical, and each has at least 60 distinct values. If I use the Formula tool to create the dummy variables, I believe it will take 2 or 3 days. Can anyone tell me a faster way to solve this?

Note: if a macro is needed, please provide it with a workflow example that creates dummies for 60 values.

I suggest:

 

2019-01-02_10-10-08.jpg

 

First create a unique list of values, produced by the Summarize tool

Add a Record ID to group by

Use an Append Fields tool to make a Cartesian join between the rows and all the values needed (note you must allow all appends in the Append Fields tool)

A simple formula tool to create the 1/0 values

Finally a Cross Tab to put the variables back on a single row

 

Sample attached.
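
For readers who want to see the logic outside Designer, here is a minimal Python sketch of the same five steps. It is not the attached workflow; the function name `make_dummies` and the sample values are made up for illustration, and only the field name Shop_id comes from this thread.

```python
from itertools import product


def make_dummies(rows, field):
    """One-hot encode `field` using the unique-list / Cartesian-join / cross-tab steps."""
    # Step 1: unique list of values (Summarize tool, grouped by the field)
    levels = sorted({row[field] for row in rows})
    # Step 2: a record ID per input row (Record ID tool)
    indexed = list(enumerate(rows, start=1))
    # Step 3: Cartesian join of every row with every level (Append Fields)
    # Step 4: 1/0 flag for "this row has this level" (Formula tool)
    long_form = [
        (record_id, level, 1 if row[field] == level else 0)
        for (record_id, row), level in product(indexed, levels)
    ]
    # Step 5: pivot back to one row per record ID (Cross Tab tool)
    wide = {}
    for record_id, level, flag in long_form:
        wide.setdefault(record_id, {})[f"{field}_{level}"] = flag
    return [{"RecordID": rid, **cols} for rid, cols in sorted(wide.items())]


# Hypothetical sample data, not taken from the shared file
sample = [{"Shop_id": 5}, {"Shop_id": 12}, {"Shop_id": 5}]
dummies = make_dummies(sample, "Shop_id")
```

Note that the Cartesian join produces rows x levels records before the pivot, which is why this approach can be slow on large inputs.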

Alteryx

When you say create a dummy variable, do you mean, create a binary variable for each?

 

If so, the Linear Regression tool will already do this for you, within the macro.

 

You could also use a Transpose and a Cross Tab tool; I have attached an example.
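
As a rough illustration of the transpose-then-cross-tab idea, here is a small Python sketch. It is an assumption about how such a workflow could be laid out, not a copy of the attached example; the helper names `transpose` and `cross_tab` and the sample rows are hypothetical.

```python
def transpose(rows, key, fields):
    """Turn each row into (key, Name, Value) records, one per field."""
    return [(row[key], name, row[name]) for row in rows for name in fields]


def cross_tab(long_rows):
    """Pivot (key, Name, Value) records into one 1/0 column per Name_Value pair."""
    headers = sorted({f"{name}_{value}" for _, name, value in long_rows})
    wide = {}
    for key, name, value in long_rows:
        # Start each record with every dummy column set to 0
        row = wide.setdefault(key, dict.fromkeys(headers, 0))
        row[f"{name}_{value}"] = 1
    return wide


# Hypothetical sample rows using two of the field names from this thread
rows = [
    {"RecordID": 1, "Shop_id": 5, "date_block_num": 0},
    {"RecordID": 2, "Shop_id": 12, "date_block_num": 0},
]
wide = cross_tab(transpose(rows, "RecordID", ["Shop_id", "date_block_num"]))
```

Because only the values that actually occur on a row are written to the long table, this avoids the Cartesian join and handles several categorical fields in one pass.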

 

Workflow.png

 

EDIT: got distracted for a bit and then got beaten to it by James!

@jdunkerley79, @JoeS Thank you both for your help. I tried to implement it on this shared file:

 

https://drive.google.com/open?id=1F_sNYLXrEzS6tAjhWkYCm87UdQcNo1ks

but with @jdunkerley79's solution I got an error in the Append Fields tool ("Error: Append Fields (14): There were more than 16 records in the source"), and I still do not understand how to apply @JoeS's solution to my case. The fields I want to turn into dummy variables are:

 

Shop_id, date_block_num and right_item_category_id.

 

Thank you
 

Alteryx Certified Partner

@MAAbdullahAlMubarah change the option at the bottom of the Append Fields tool to 'Allow All Appends'.

@JoeS You are right that the "Linear Regression tool will do this already for you".

When I started to run the linear regression, it said the matrix is too large (XX GB) to be handled. That is why I thought I needed to create the dummy variables myself. In this case, what should I do to solve the large-matrix issue?

Alteryx

@MAAbdullahAlMubarah Yeah, R is going to struggle with memory to create the matrix.
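
To get a feel for why the matrix gets so large, here is a back-of-the-envelope sketch in Python. It assumes a dense matrix of 8-byte doubles with treatment coding (each categorical variable adds levels - 1 columns, plus one intercept column). The row count below is purely hypothetical, since the size of the shared file is not stated in this thread, and `dense_matrix_bytes` is a made-up helper name.

```python
def dense_matrix_bytes(n_rows, levels_per_factor, bytes_per_value=8):
    """Estimate the memory needed for a dense dummy-coded design matrix."""
    # Treatment coding: each factor contributes (levels - 1) columns, plus an intercept
    n_cols = 1 + sum(levels - 1 for levels in levels_per_factor)
    return n_rows * n_cols * bytes_per_value


n_rows = 1_000_000  # hypothetical row count, not taken from the shared file
size = dense_matrix_bytes(n_rows, [60, 60, 60])  # three fields with 60 values each
size_gb = size / 1e9
```

Under these assumptions, three 60-level variables give 178 columns, so one million rows would need roughly 1.4 GB for a single copy of the matrix, and fitting a model typically needs more than one copy.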

 

I have re-attached my workflow, with your file added at the bottom. It still takes a while to run (12 mins on my laptop), but it should achieve what you are looking for.

Make sure Allow All Appends is set in the Append Fields tool:

2019-01-02_11-53-51.jpg

@JoeS How can I resolve the memory allocation problem?

Alteryx

I am not 100% sure you can with a matrix that size in open-source R. I think you will need to use Microsoft R instead, but I am not completely sure.
