Hello, I'm looking for some minor help with the attached workflow. Essentially, I am:
- removing one individual store's data from my analysis
- applying p correlation to find the variables whose correlation with the target meets a defined threshold
- using those variables as predictors in a linear regression model for the target variable (revenue)
- taking the above model and scoring the store I left out at the start
- calculating the % difference between the forecasted revenue and the actual revenue
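In case it helps to compare against, here is a rough sketch of the steps above in Python with NumPy. It is not the workflow itself: the data is made up, the 0.3 threshold and all variable names are hypothetical, "p correlation" is taken to mean Pearson correlation, and the error is assumed to be (actual - predicted) / actual.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: 20 stores, 4 candidate variables, revenue driven by the first two.
n_stores, n_vars = 20, 4
X = rng.normal(size=(n_stores, n_vars))
revenue = (150_000 + 20_000 * X[:, 0] - 12_000 * X[:, 1]
           + rng.normal(scale=1_000, size=n_stores))

# 1. Remove one store from the analysis (hypothetical index 0).
holdout = 0
train = np.arange(n_stores) != holdout

# 2. Pearson correlation of each variable with revenue, on the remaining stores only.
r = np.array([np.corrcoef(X[train, j], revenue[train])[0, 1] for j in range(n_vars)])
threshold = 0.3  # hypothetical threshold
selected = np.flatnonzero(np.abs(r) >= threshold)

# 3. Linear regression (ordinary least squares with an intercept) on the selected variables.
A = np.column_stack([np.ones(train.sum()), X[train][:, selected]])
coef, *_ = np.linalg.lstsq(A, revenue[train], rcond=None)

# 4. Score the store that was left out.
predicted = coef[0] + X[holdout, selected] @ coef[1:]

# 5. % difference between forecasted and actual revenue (assumed formula).
pct_error = (revenue[holdout] - predicted) / revenue[holdout]

print("selected variables:", selected)
print("forecast:", predicted, "error:", pct_error)
```

In a sketch like this, whether the correlation in step 2 is computed before or after the store is removed can change which variables pass the threshold, and therefore the forecast.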
The expected output can be seen at the top of the workflow (score 163643.334625 with an error of 0.0964).
However, my output is a forecasted revenue of 162031.049182 with an error of 0.105302956229071.
Can anyone help me figure out why I am off by ~1% from the expected outcome? I have been staring at this for a few hours and trying to troubleshoot, with no success.
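For what it's worth, here is a quick sanity check on the two results, assuming the error is computed as (actual - predicted) / actual (that formula is my assumption, and `implied_actual` is just an illustrative helper). Backing out the actual revenue from each score/error pair gives roughly the same figure (about 181,100) in both cases, which suggests the % difference step agrees and the gap comes from the forecast itself.

```python
def implied_actual(predicted: float, error: float) -> float:
    """Back out actual revenue, assuming error = (actual - predicted) / actual."""
    return predicted / (1.0 - error)

expected_score, expected_error = 163643.334625, 0.0964
my_score, my_error = 162031.049182, 0.105302956229071

expected_actual = implied_actual(expected_score, expected_error)
my_actual = implied_actual(my_score, my_error)

# Relative gap between the two forecasts themselves.
forecast_gap = (expected_score - my_score) / expected_score

print("implied actual (expected output):", expected_actual)
print("implied actual (my output):", my_actual)
print("relative forecast gap:", forecast_gap)
```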
Thanks!