

Regression analysis in Alteryx

timol
8 - Asteroid

I have a dataset with two columns.

The first column contains my independent variable: a comma-separated list of company characteristics, for instance "A, C, D" in the first row and "C, A" in the second row. The second column contains my dependent variable, which is the company's revenue.

Is there a way in Alteryx to run a regression that shows how much impact each characteristic (for instance "A") has on revenue? Is there also a way to see which combinations work best (for instance "A" with "C")?

Please see the screenshot of the simplified data structure below. In the end I will have approx. 1000 data points.

Many many thanks in advance.

timol_0-1620709935528.png

6 REPLIES
apathetichell
18 - Pollux

Alteryx has R-based regression models. If you need a sample workflow, open the "Predictive tools" sample workflows under Help and look at how the data needs to be structured to go into the Linear Regression or Logistic Regression tool. Both have clear sample workflows that show how the data has to be set up and explain the output.

NOTE - don't underestimate the Score tool.

timol
8 - Asteroid

Thank you!

In the simplified example above, how would you solve the problem with Alteryx?
Right now I am not sure how to handle the issue, since I would like to calculate the impact of each characteristic (configuration).

apathetichell
18 - Pollux

The usual choice would be to have one column per characteristic, holding true/false (or 1/0) to show whether that characteristic is present. Without knowing more about your data, though, I can't judge whether that is the right approach in this case.
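To make the one-column-per-characteristic idea concrete, here is a minimal sketch in plain Python (outside Alteryx). The function names and the two sample cells are assumptions that only mirror the example rows from the question; they are not taken from any attached workflow.

```python
def parse_characteristics(cell):
    """Split a comma-separated cell such as "A,C, D" into a set of trimmed labels."""
    return {part.strip() for part in cell.split(",") if part.strip()}


def to_indicator_rows(cells):
    """Return (sorted labels, list of 0/1 rows), one row per input cell."""
    parsed = [parse_characteristics(cell) for cell in cells]
    # Collect every label that appears anywhere, so no column is typed by hand.
    labels = sorted(set().union(*parsed)) if parsed else []
    rows = [[1 if label in present else 0 for label in labels] for present in parsed]
    return labels, rows


# Hypothetical sample mirroring the two example rows in the question.
cells = ["A,C, D", "C, A"]
labels, rows = to_indicator_rows(cells)
print(labels)
for row in rows:
    print(row)
```

Because the labels are discovered from the data, the number of distinct characteristics does not have to be known in advance. The resulting 0/1 columns are the kind of input a linear regression expects.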

timol
8 - Asteroid

Unfortunately it is not possible to do that manually with my data, since I have a large number of possible combinations.

Is there no way in Alteryx to do a kind of cluster regression or Qualitative Comparative Analysis to check the impact of a combination of comma-separated attributes on a numeric value?

timol
8 - Asteroid

Does anyone have an idea how to do this in Alteryx?

Many many thanks in advance

Crydata
6 - Meteoroid

@timol 

Something like the attached workflow would work for the basic version of your question (it outputs the dummy variable flag columns). It parses the factors out of your characteristic column and creates them as binary flags (as the other user was saying).

I think you may then want to go further with combinations of these factors? I would do this by creating additional flags as multiplicative (interaction) dummy variables using the Formula tool (I have put one in the attached workflow as an example). If your problem is that there are LOTS of these multiplicative dummy combinations and it is a pain to type them all in, explore using the XML editor: create your many formulae all at once in Excel and paste the code into the Formula tool's code.

Obviously, take the usual statistical care depending on what you intend to do next (multicollinearity, overspecifying the model, etc.).
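As an illustration of the multiplicative dummy idea, the following plain-Python sketch (not the attached workflow) builds every pairwise interaction flag from a set of 0/1 columns. The function name, the "AxC" column naming, and the sample rows are assumptions made for this example.

```python
from itertools import combinations


def add_pairwise_interactions(labels, rows):
    """Append one product column per pair of 0/1 flag columns."""
    pairs = list(combinations(range(len(labels)), 2))
    # Hypothetical naming scheme: "AxC" is 1 only when both A and C are 1.
    new_labels = list(labels) + [f"{labels[i]}x{labels[j]}" for i, j in pairs]
    new_rows = [list(row) + [row[i] * row[j] for i, j in pairs] for row in rows]
    return new_labels, new_rows


# Hypothetical flag columns for characteristics A, C and D.
labels = ["A", "C", "D"]
rows = [[1, 1, 1], [1, 1, 0], [0, 1, 1]]
new_labels, new_rows = add_pairwise_interactions(labels, rows)
print(new_labels)
for row in new_rows:
    print(row)
```

The number of pairs grows quadratically with the number of characteristics, which is where the caution about multicollinearity and overspecification applies: with roughly 1000 data points, it may be worth keeping only the interactions that occur often enough to estimate.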

 

Crydata_0-1626708567849.png
