Pearson Correlation and linear regression to check linearity
I have attached the full workflow. The goal is to find the Pearson correlation between revenue and complaints for all customers. I used a lagged value for complaints because I want to check whether a complaint raised 3 months earlier has any effect on revenue. The result shows that for most customers the correlation is less than 0.7, which I take to indicate that revenue goes down. I would now like to check whether there is a linear relationship between revenue and complaints using linear regression.
I am new to linear regression. Which value exactly should I check to assess linearity?
Can you validate the workflow, especially the linear regression part, and confirm whether it is the correct way to approach this problem statement?
I have attached a screenshot of the linear regression result.
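For readers without the attached workflow, the calculation described above can be sketched outside Alteryx in plain Python. This is only an illustration under assumptions: the monthly figures are made up for one hypothetical customer, and the helper names (`pearson`, `fit_line`, `lag_pairs`) are not taken from the workflow. It pairs complaints in month t with revenue in month t + 3, then computes the Pearson correlation and a simple least-squares line.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def fit_line(x, y):
    """Least-squares fit y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot

def lag_pairs(complaints, revenue, lag):
    """Pair complaints in month t with revenue in month t + lag."""
    n = len(complaints)
    return complaints[:n - lag], revenue[lag:]

# Made-up monthly data for one hypothetical customer (assumption, not real data).
complaints = [2, 5, 1, 4, 6, 3, 7, 2, 5, 4, 6, 3]
revenue = [100, 98, 101, 96, 90, 99, 92, 88, 95, 86, 97, 91]

x, y = lag_pairs(complaints, revenue, lag=3)
r = pearson(x, y)
slope, intercept, r2 = fit_line(x, y)
print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}, R^2 = {r2:.3f}")
```

With a single predictor, R-squared equals the square of the Pearson correlation, so the regression does not measure anything about the strength of the linear relationship that the correlation has not already measured. The sign of the slope (or of r) shows the direction of the relationship, and a plot of the residuals against the lagged complaints is the usual way to see whether a straight line is a reasonable shape at all.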
The resources here may help you understand your linear regression process. https://community.alteryx.com/t5/Alteryx-Designer-Knowledge-Base/One-Stop-Shop-for-Predictive-Resour...
Hi @nidah5 ,
I'm not really sure that linear regression is the right approach. If I understand you correctly, you have a series of revenues and complaints, and complaints are highly correlated with revenues at a lag of 3 months. As there seems to be no other predictor, I assume you want to predict future revenues based on historic values, so previous values of revenue are the main predictor for future revenue, maybe with some seasonal influence. I would at least consider using ARIMA (one of the time series tools) with complaints added as a covariate, and then compare the predictions for periods with known data against the actuals. What do you think?
Best,
Roland
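The comparison suggested above (predict periods with known data, then check against the actuals, with and without complaints as a covariate) could be sketched roughly as follows. Note the substitution: this is not ARIMA and not the Alteryx time series tools. To stay self-contained it uses a much simpler stand-in, a least-squares regression of revenue on the previous month's revenue, optionally adding complaints from 3 months earlier. All data are made up, and the function names (`solve`, `least_squares`, `design`, `holdout_mae`) are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(n):
            if i != col:
                factor = m[i][col] / m[col][col]
                m[i] = [v - factor * p for v, p in zip(m[i], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def least_squares(rows, targets):
    """Coefficients minimizing squared error, via the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)

def design(revenue, complaints, lag, use_complaints):
    """Build rows [1, revenue[t-1], (complaints[t-lag])] with target revenue[t]."""
    rows, targets = [], []
    for t in range(lag, len(revenue)):
        row = [1.0, float(revenue[t - 1])]
        if use_complaints:
            row.append(float(complaints[t - lag]))
        rows.append(row)
        targets.append(float(revenue[t]))
    return rows, targets

def holdout_mae(revenue, complaints, lag, holdout, use_complaints):
    """Fit on all but the last `holdout` months; return mean absolute error on those months.

    Predictions are one-step-ahead: each uses the actual revenue of the previous month.
    """
    rows, targets = design(revenue, complaints, lag, use_complaints)
    coef = least_squares(rows[:-holdout], targets[:-holdout])
    errors = [
        abs(sum(c * v for c, v in zip(coef, row)) - target)
        for row, target in zip(rows[-holdout:], targets[-holdout:])
    ]
    return sum(errors) / holdout

# Made-up monthly data for one hypothetical customer (assumption, not real data).
complaints = [2, 5, 1, 4, 6, 3, 7, 2, 5, 4, 6, 3, 1, 7, 2, 6, 4, 5, 3, 6, 2, 4, 7, 1]
revenue = [99, 97, 100, 97, 89, 98, 93, 87, 95, 86, 95, 91,
           92, 87, 95, 97, 87, 96, 87, 93, 90, 93, 89, 95]

mae_without = holdout_mae(revenue, complaints, lag=3, holdout=6, use_complaints=False)
mae_with = holdout_mae(revenue, complaints, lag=3, holdout=6, use_complaints=True)
print(f"holdout MAE without complaints: {mae_without:.2f}")
print(f"holdout MAE with complaints:    {mae_with:.2f}")
```

If the holdout error drops noticeably once the lagged complaints are included, that is some evidence that complaints carry information beyond what past revenue already provides. A proper ARIMA model with a covariate would handle trend and seasonality, which this stand-in ignores.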
