Alteryx Designer Desktop Discussions
SOLVED

Interaction terms in regressions

davidb
5 - Atom

Hi all, new Alteryx user here. I am setting up a workflow around some predictive models I have: both a linear and a logistic model. Normally (in R) I have interaction terms in these models, but I cannot see a convenient way of implementing them, short of creating the variables separately.

 

i.e. the logistic regression tool can happily make me:

 

lm(y ~ x1 + x2)

 

but not 

 

lm(y ~ x1 * x2)

 

Am I missing something or is there a good work around? I suppose I could make the interactions as separate fields but this will be very cumbersome for some of my larger models with hundreds of regressors.

 

Relatedly, is there a more convenient way of log-transforming regressors than simply making new fields?

 

Thanks

JessicaS
Alteryx Alumni (Retired)

Hello,

 

Unfortunately, you would need to create these extra fields or run the model in the R tool. If desired, the macros can be opened and adapted by right-clicking the tool on the canvas, selecting 'open macro', and then saving a copy.

Jess Silveri
Manager, Technical Account Management | Alteryx
deargle
7 - Meteor

I'm teaching a class in a business school with Alteryx, using the Predictive Modeling tools. I could pop open R and do this myself easily enough, but the reason for using Alteryx is that these students have no programming background. Is this really a `WILL NOT FIX` situation? It's so, so easy to do in R. You could do it like SPSS and have another tab in the tools that lets you specify interaction effects, feeding in several variables.

 

And what do you mean "edit the macro"? It's fed a list of features from an Alteryx GUI. How are you suggesting to edit the macro to combine arbitrary incoming features for interaction effects?

Charity_K_Wilson
10 - Fireball

I'm with you. I would love for Alteryx to be able to do interactions. In the meantime, when you right-click on the Linear icon, you will see an option at the bottom to open the macro. You can then write your custom R script.

 

[Screenshot: Open Linear Macro (2).jpg]

deargle
7 - Meteor

Sorry, by "what do you mean 'edit the macro'?", I meant "what do you mean, 'edit the macro'?!", as in "what kind of a suggestion is that!"

 

After posting, I dug into the Logistic Regression tool and hit a roadblock. The R node in that macro calls `AlteryxPredictive:::runLogisticRegression(inputs, config)`, at which point the trail is lost, so v1.1 is not editable. Logistic Regression v1.0 does have all the code right in there, but what I would be doing is hacking line #234 to specify my case-sensitive `x.vars` with the interaction term.

That also involves saving a custom version of the macro and having two identical-looking macros with the same name in the ribbon (one of which uses hard-coded features), unless I have students edit the macro name as well to say which hard-coded features it uses. The whole process then has to be repeated for every other model, crossing fingers that the place to edit the formula is as "discernible" to students as it is in Logistic Regression v1.0.

I don't feel comfortable asking my students to do this. It would be easier to just not use Alteryx and instead do the predictions straight in R. But even that requires learning syntax and scripting, which I think is out of reach for my students, at least in a one-semester crash course in business analytics. They're having a hard enough time with the Alteryx tools that have nice GUIs.
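
For readers wondering what such an edit amounts to, here is a minimal sketch in plain R. The v1.0 macro code is not reproduced in this thread, so everything except the name `x.vars` is an assumption: the field names, the simulated data, and the way the formula is assembled are purely illustrative and may not match what the macro actually does.

```r
# Hypothetical predictor names; in the macro these would come from the GUI.
x.vars <- c("x1", "x2")

# Assumed manual edit: append the interaction term by hand.
# Names are case-sensitive and must match the incoming fields exactly.
x.vars <- c(x.vars, "x1:x2")

# Assemble a formula from the character vector (an assumption about the
# general approach, not a copy of the macro's code).
model.formula <- as.formula(paste("y ~", paste(x.vars, collapse = " + ")))

# Simulated data, for illustration only.
set.seed(1)
d <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
d$y <- rbinom(200, 1, plogis(0.5 * d$x1 - 0.5 * d$x2 + d$x1 * d$x2))

fit <- glm(model.formula, data = d, family = binomial())
names(coef(fit))
```

The hard-coded `"x1:x2"` is the problem described above: it ties the saved macro copy to one specific pair of field names.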

deargle
7 - Meteor

My reply has been removed, in which I clarified that I am aware of how to use the "edit macro" functionality, but that it is not reasonable, manageable, and sometimes not even feasible to customize the Alteryx R macros to do things such as specify interaction effects. In my now-deleted post, I gave specific examples of the process of hacking the v1.0 version of the logistic regression tool, to illustrate the difficulty that a business school student who has no coding skills would have with the process, and managing the collection of identically-named hard-coded tools that would then start appearing in the ribbon. Is not the target customer base of Alteryx those who have little to no coding skills? The suggestion to "edit the macro" appears in several places on the discussion forums as if it is tenable, while I argue that it is not. I think the community deserves a discussion about this.

DanM
Alteryx Community Team

@deargle, we apologize; your response went to spam for some reason. I have released it from spam and you can see it above.

 

Regarding your comments, the stock Predictive tools are only intended for creating models, not for creating variables. You can create an interaction variable (term) by using a Formula tool to generate it before the data reaches the Regression tools.
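
To illustrate why a precomputed field works for numeric predictors, here is a small sketch in plain R with simulated data. The column name `x1_x2` is hypothetical; it stands in for a field that a Formula tool expression such as `[x1] * [x2]` would create upstream of the model.

```r
# Simulated data, for illustration only.
set.seed(42)
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
d$y <- 1 + 2 * d$x1 - d$x2 + 0.5 * d$x1 * d$x2 + rnorm(100)

# The interaction as a separate, precomputed field (hypothetical name).
d$x1_x2 <- d$x1 * d$x2

# Model fitted on the precomputed field ...
fit_manual <- lm(y ~ x1 + x2 + x1_x2, data = d)

# ... and the same model using R's interaction syntax.
fit_formula <- lm(y ~ x1 * x2, data = d)

# The coefficient estimates agree; only the term label differs.
all.equal(unname(coef(fit_manual)), unname(coef(fit_formula)))
```

For categorical predictors the manual route is more work, because each dummy column needs its own product field.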

 

Another option is to create your own tool (macro) using the R tool.

You can create a macro containing the R code that you want to use and deploy it to all of your students. You can set the tool's name, its icon, and the category in which it is stored.

 

Regards

deargle
7 - Meteor

Thank you, that's a good idea. I recorded a lecture showing them how to dummy-code -- it's probably good for them to understand the process behind this, although tedious and potentially error-prone (if they miss checking the box for one of the main effects or interaction terms for an interaction effect that includes a term with many levels, etc). Maybe I'll create my own macro that creates interaction terms and that does dummy-coding all in one step... one day.
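
As a rough picture of how many columns are involved, the sketch below uses base R's `model.matrix` on a made-up data frame (the `region` and `x1` fields are hypothetical). It shows the dummy columns for a three-level factor and the extra columns that an interaction with a numeric field adds, which is where a missed check box would silently drop part of the effect.

```r
# Made-up data: one three-level categorical field and one numeric field.
d <- data.frame(
  region = factor(c("north", "south", "west", "north", "west", "south")),
  x1     = c(1.0, 2.5, 3.0, 4.5, 5.0, 6.5)
)

# Main effects only: a k-level factor becomes k - 1 dummy columns
# (the first level, "north", is the reference).
mm_main <- model.matrix(~ region + x1, data = d)
colnames(mm_main)

# With the interaction: one extra column per dummy, each the product
# of that dummy with x1.
mm_int <- model.matrix(~ region * x1, data = d)
colnames(mm_int)
```

With default treatment contrasts, a factor with many levels adds one dummy and one interaction column per non-reference level, so the number of boxes to check grows quickly.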

deargle
7 - Meteor

Thought about this a bit more. I am comfortable with R, but I don't have strong motivation to learn Alteryx macro language to implement features that should arguably already be available in Alteryx if predictive analytics is a competitive focus of "Alteryx the company", given that the mission of Alteryx is to accelerate the analytics process. If predictive analytics is a primary focus, isn't Alteryx motivated to dedicate employee time to create official tools for dummy coding and for interaction effects? I've seen a few dummy coding tools floating around on the community posts, but my hunch is that it would be easier to create interaction terms in the same breath as dummy-coding. I would have to look more closely at how R functions like glm do it on the fly to confirm that hunch.

DrDan
Alteryx Alumni (Retired)

@deargle,

 

Providing a formula-based interface for users familiar with R syntax is something we have discussed internally, and we have even created a couple of early prototypes. You indicated in one email that doing variable interactions should be as easy as doing zero-one encoding of categorical variables. In terms of the back end, that statement is true (in R both are implemented using the model.matrix function), but from a user interface perspective it is not. You may have run into a thread where I provided a macro to do zero-one categorical variable encoding (if not, I can add it to this thread). From an interface perspective that is fairly straightforward, since it lends itself to a check box list of the variables in string columns.

However, for doing interactions, and possibly inline transformations as well, the UI requirements are a lot more challenging. In fact, it has only recently become possible to do this in Alteryx at all, as a result of efforts to embed HTML5 and JavaScript into Alteryx to create user interfaces. A good UI should not require the user to remember the variable names and type them in, and it should keep track of the variables separately from any inline transformations (such as natural log, square root, etc.). These requirements are much more challenging than just having the user check the variables they would like to zero-one encode. Moreover, the number of Alteryx users who would benefit from that type of interface is currently not large, which makes the cost-benefit ratio less than attractive. However, as I indicated, we have invested resources in this area.

As a short-term solution, I've attached a sample workflow that contains a macro called "Formula Transformations". Judged by the criteria above, it does not have a good user interface, since the user needs to correctly type in an R formula expression without any direct help with the names of the available variables. It does address simple, common inline transformations (abs, log, log10, exp, sqrt, ^2, and ^3), but not more complex ones such as B-splines or logarithms to an arbitrary base. While far from perfect, and a bit fragile, it should meet your needs for now. It also uses the traditional Alteryx user interface elements (not HTML5 and JavaScript), so you can open the macro and get at the underlying R code, which should be informative. You might also want to read my Knowledge Base articles on writing R-based macros.
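
For reference, this is roughly what inline transformations and an interaction look like when expanded by `model.matrix` in plain R. It is a sketch with made-up data and is not the code inside the attached macro, whose internals are not shown in this thread.

```r
# Made-up numeric fields, chosen so the transformed values are easy to check.
d <- data.frame(x1 = c(1, 2, 4, 8), x2 = c(1, 4, 9, 16))

# Inline transformations written directly in the formula. Powers need I()
# so that ^ is treated as arithmetic rather than as formula syntax.
mm <- model.matrix(
  ~ log(x1) + sqrt(x2) + I(x1^2) + log(x1):sqrt(x2),
  data = d
)

# One column per term, named after the expression that produced it.
colnames(mm)
```

The user still has to type every field name and transformation correctly, which is the interface weakness described above.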
