Alteryx Designer Desktop Discussions

SOLVED

Stuck on problem set - Linear Regression scoring with p correlation

JoeMarco
5 - Atom

Hello - looking for some minor help with the attached workflow. Essentially I am:

  • Removing an individual store's data from my analysis
  • Applying Pearson correlation to find the variables whose correlation meets a defined threshold
  • Using those variables as predictor variables in a linear regression model for the target variable (revenue)
  • Taking the above model and scoring the store I left out at the start
  • Calculating the % difference between the forecasted revenue and the actual revenue
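
For readers without the attached workflow, the steps above can be sketched outside Alteryx. This is only a minimal illustration in plain Python, not the workflow itself: the store IDs, all data values, the `Noise_Shr` column, the 0.5 threshold, and the helper names `pearson` and `fit_ols` are made up for the example, and the error formula (absolute difference divided by actual revenue) is an assumption.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def fit_ols(rows, ys):
    """Least-squares coefficients [intercept, b1, ...] via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * y for x, y in zip(X, ys))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Made-up data: values are illustrative only, not taken from the workflow.
stores = [
    {"Store": "S1", "Dairy_Shr": 1, "Floral_Shr": 2, "Noise_Shr": 2, "Revenue": 1300},
    {"Store": "S2", "Dairy_Shr": 2, "Floral_Shr": 1, "Noise_Shr": 4, "Revenue": 1450},
    {"Store": "S3", "Dairy_Shr": 3, "Floral_Shr": 4, "Noise_Shr": 1, "Revenue": 1800},
    {"Store": "S4", "Dairy_Shr": 4, "Floral_Shr": 3, "Noise_Shr": 5, "Revenue": 1950},
    {"Store": "S5", "Dairy_Shr": 5, "Floral_Shr": 6, "Noise_Shr": 3, "Revenue": 2300},
    {"Store": "S6", "Dairy_Shr": 6, "Floral_Shr": 5, "Noise_Shr": 2, "Revenue": 2500},
]
candidates = ["Dairy_Shr", "Floral_Shr", "Noise_Shr"]
THRESHOLD = 0.5  # hypothetical cutoff on |r|

# 1. Hold out one store.
train = [s for s in stores if s["Store"] != "S6"]
holdout = next(s for s in stores if s["Store"] == "S6")

# 2. Keep variables whose |correlation with revenue| meets the threshold.
revenue = [s["Revenue"] for s in train]
selected = [c for c in candidates
            if abs(pearson([s[c] for s in train], revenue)) >= THRESHOLD]

# 3. Fit the regression on the remaining stores, then score the held-out store.
coefs = fit_ols([[s[c] for c in selected] for s in train], revenue)
predicted = coefs[0] + sum(b * holdout[c] for b, c in zip(coefs[1:], selected))

# 4. Relative error of the forecast (assumed formula).
pct_error = abs(holdout["Revenue"] - predicted) / holdout["Revenue"]
print(selected, round(predicted, 2), round(pct_error, 4))
```

Because the variable selection in step 2 feeds directly into the model in step 3, picking a different set of predictors changes both the score and the error.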

 

The expected output can be seen at the top of the workflow (score 163643.334625 with an error of 0.0964)

 

However, my output is a forecasted revenue of 162031.049182 with an error of 0.105302956229071.

 

Can anyone help me figure out why I am off by ~1% from the expected outcome? I have been staring at this for a few hours and trying to troubleshoot, with no success.
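
As a quick sanity check on those numbers, the sketch below backs out the actual revenue implied by each score/error pair. It assumes the error is computed as (actual - predicted) / actual with the forecast below the actual; the post does not state the formula or the actual revenue, so both are assumptions, and `implied_actual` is a made-up helper name.

```python
def implied_actual(predicted, error):
    """Invert error = (actual - predicted) / actual, assuming predicted < actual."""
    return predicted / (1 - error)

# Score/error pairs quoted in the post.
expected_actual = implied_actual(163643.334625, 0.0964)
observed_actual = implied_actual(162031.049182, 0.105302956229071)

# Gap between the two reported errors, in absolute terms.
error_gap = 0.105302956229071 - 0.0964

print(round(expected_actual, 1), round(observed_actual, 1), round(error_gap, 4))
```

Under that assumption both pairs point to nearly the same actual revenue (about 181,100), which would suggest the error calculation itself is fine and the gap comes from the forecast, that is, from which predictor variables went into the model.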

 

Thanks!

 

 

3 REPLIES
AkimasaKajitani
17 - Castor

Hi @JoeMarco ,

 

I think the reason is that you selected variables with lower correlation.

 

If you use the Association Analysis tool, it will flag the appropriate variables.

We can select the variables marked with *.

AkimasaKajitani_0-1633618675318.png

Following this, I checked the variables below (Dairy_Shr through Floral_Shr).

 

AkimasaKajitani_1-1633618930550.png

 

The result is as follows.

 

AkimasaKajitani_2-1633618946796.png

 

JoeMarco
5 - Atom

Akimasa, thank you so much! This helped me get the answer I was looking for. Could I bother you with a few more questions, just so I know what went wrong in my workflow?

  • Within Alteryx, in this use case, what is the advantage of using the Association Analysis tool vs. the Pearson Correlation tool? I saw they gave different results when I went back to check, and was wondering if you could simply explain why Association Analysis was the best tool here. Was the Pearson Correlation tool I was using before giving me wrong measures, so that I was selecting the wrong variables?
  • I see what went wrong in my workflow: the Pearson Correlation tool was giving me lower-correlation variables that I was considering as model inputs, so they did not match the variables output by Association Analysis.

Any general guidance you can share on best practices here, and on why this occurred, would be great to help me avoid this in the future.

AkimasaKajitani
17 - Castor

Hi @JoeMarco ,

 

Honestly, both tools output the same results. The difference is in how they present them.

The advantage of the Association Analysis tool is that its output is easy to understand.

That is because we can set the target field in the Association Analysis tool, and it then outputs the correlation between the target variable and the predictor variables directly.

 

If you use the Pearson Correlation tool, you should pick the variables whose correlations with the target have large absolute values.
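
To illustrate the point about absolute values, here is a small sketch in plain Python rather than Alteryx. The data, the `Discount_Shr` and `Noise_Shr` columns, the 0.5 cutoff, and the `pearson` helper are all made up for the example. It ranks candidate predictors by |r| against the target, so a strongly negative correlation is kept instead of being mistaken for a weak one.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Made-up values for illustration only.
revenue = [10, 20, 30, 40, 50]
predictors = {
    "Dairy_Shr": [1, 2, 3, 5, 4],      # strong positive relationship
    "Discount_Shr": [5, 4, 3, 2, 1],   # strong negative relationship
    "Noise_Shr": [3, 1, 2, 1, 3],      # no linear relationship
}

# Correlation of each predictor with the target only (not the full matrix).
corr = {name: pearson(vals, revenue) for name, vals in predictors.items()}

# Rank by absolute value so large negative correlations are not overlooked.
ranked = sorted(corr, key=lambda name: abs(corr[name]), reverse=True)
selected = [name for name in ranked if abs(corr[name]) >= 0.5]

for name in ranked:
    print(f"{name}: r = {corr[name]:+.3f}")
```

Sorting by the raw correlation instead would push `Discount_Shr` to the bottom of the list even though it is the most strongly related variable in this toy data.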

 
