Alteryx Designer Desktop Discussions

SOLVED

Guidance in forecast comparison and combination

Bart
5 - Atom


Hi,

 

I am looking for some guidance on comparing forecast errors from the Predictive Analytics tools in Alteryx.

 

Background

I have set up a forecast of the ordered units that will be cancelled out of the initial orderbook.

For that I have set up:

  1. ETS and several ARIMA models to forecast cancellations as a percentage of the initial orderbook
  2. ETS and several ARIMA models to forecast cancellations in units

 

The idea is to select the most meaningful forecast model.
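For anyone who wants to mirror this setup outside Alteryx, a minimal sketch in Python with statsmodels is below; the series, covariate, and ARIMA orders are placeholders and not taken from my actual workflow:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Placeholder series: monthly cancellations as a % of the initial orderbook
# (synthetic data -- the real series comes from the Alteryx workflow).
idx = pd.date_range("2013-01-01", periods=36, freq="MS")
rng = np.random.default_rng(0)
cancel_pct = pd.Series(0.12 + 0.03 * rng.standard_normal(36), index=idx)
orderbook = pd.Series(rng.uniform(8e6, 11e6, 36), index=idx)   # placeholder covariate

# ETS (exponential smoothing); the Alteryx ETS tool selects the error/trend/seasonal
# form automatically, here one form is picked by hand.
ets_fit = ExponentialSmoothing(cancel_pct, trend="add", seasonal=None).fit()
ets_forecast = ets_fit.forecast(4)

# ARIMA without covariates, and one ARIMA with an exogenous regressor.
arima_fit = ARIMA(cancel_pct, order=(1, 0, 1)).fit()
arima_cov_fit = ARIMA(cancel_pct, order=(1, 0, 1), exog=orderbook).fit()

# The covariate ARIMA needs future values of the regressor to forecast.
future_orderbook = pd.Series(
    [9e6, 9.5e6, 1e7, 9.2e6],
    index=pd.date_range(idx[-1], periods=5, freq="MS")[1:],
)
arima_cov_forecast = arima_cov_fit.forecast(4, exog=future_orderbook)
```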

 

 

Results

The overview below shows the output of two TS Compare tools (I have modified the results for readability). For the forecast in percentages, the ETS model scores best on all error measures; for the forecast in units, it is the ARIMA covariate 2 model. I am also looking to optimize my forecast by combining the best two models, which should reduce bias and variance. For example, here is a paper that discusses how combining forecasts improves accuracy: http://repository.upenn.edu/cgi/viewcontent.cgi?article=1005&context=marketing_papers

 

My questions:

  1. Which is the most appropriate measure for comparing the forecasts in percentages with the forecasts in units? MPE and MAPE?
  2. Would there be any objection to combining two of the nine forecast models (equal/proportional/regression weighted) after they have been transformed to a single scale, either percentages or units? (A small sketch follows below.)
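To make question 2 a bit more concrete, here is a minimal sketch of what I mean by transforming everything to one scale before combining; the orderbook figures are made up for illustration, and only the two forecast columns come from the tables below:

```python
import numpy as np

# Hypothetical initial orderbook per forecast period -- NOT from my workflow,
# only used to illustrate converting a percentage forecast into units.
orderbook = np.array([9_000_000, 7_500_000, 10_000_000, 9_200_000])

pct_forecast   = np.array([0.13973, 0.13973, 0.13973, 0.13973])        # ETS, percentage scale
units_forecast = np.array([453_601, 1_974_260, 2_076_745, 1_479_633])  # ARIMA_cov2, unit scale

pct_as_units = pct_forecast * orderbook          # now both forecasts are in units

# Equal-weight combination of the two best models (question 2).
combined = 0.5 * pct_as_units + 0.5 * units_forecast
```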

Forecast on percentage

Actual and Forecast Values:

Actual      ETS        ARIMA      ARIMA_cov1   ARIMA_cov2   ARIMA_cov3
0.121797    0.13973    0.34649    0.06387      0.06388      0.06213
0.188977    0.13973    0.14847    0.08519      0.10376      0.09116
0.164481    0.13973    0.11928    0.04991      0.05412      0.05280
0.128480    0.13973    0.11341    0.07967      0.08132      0.08341

Forecast in units

Actual and Forecast Values:

Actual       ETS        ARIMA      ARIMA_cov1   ARIMA_cov2
1,107,342    591,147     28,923      605,223      453,601
1,375,405    591,147    641,179    1,741,376    1,974,260
1,614,726    591,147    617,289    2,958,232    2,076,745
1,158,343    591,147    809,138    1,526,938    1,479,633

 

Accuracy Measures (forecast on percentage):

Model         ME        RMSE     MAE      MPE        MAPE      MASE     NA
ETS            0.0112   0.0295   0.0258     4.4043   16.1474   0.3897   NA
ARIMA         -0.0310   0.1166   0.0814   -30.9592   61.2804   1.2292   NA
ARIMA_cov1     0.0813   0.0861   0.0813    52.5318   52.5318   1.2278   NA
ARIMA_cov2     0.0752   0.0791   0.0752    49.1116   49.1116   1.1355   NA
ARIMA_cov3     0.0786   0.0831   0.0786    50.9328   50.9328   1.1868   NA

 

Accuracy Measures (forecast in units):

Model         ME          RMSE       MAE        MPE        MAPE      MASE     NA
ETS            722,806    750,156    722,806     53.9980   53.9980   3.0035   NA
ARIMA          789,821    839,479    789,821     60.6722   60.6722   3.2819   NA
ARIMA_cov1    -393,988    762,714    645,048    -24.0720   46.7443   2.6804   NA
ARIMA_cov2    -182,106    525,046    508,976    -10.2133   39.7318   2.1149   NA

 

Thanks for your help!

 

Regards,

Bart

1 REPLY
Bart
5 - Atom

Accuracy measures

I have been in contact with Alteryx and they confirmed my understanding of the error measures:

  • No single metric is better than the others,
  • ME and RMSE are commonly used in most situations,
  • MPE and MAPE support scale-independent comparison within certain limits (for example, to compare the forecast in percentages with the forecast in units),
  • MASE can be used when MAPE is not usable because the actuals contain meaningful zero values.

Additionally, Chapter 2, Section 5 of Hyndman and Athanasopoulos's online book Forecasting: Principles and Practice provides a good discussion of the measures used to assess forecast model accuracy (http://otexts.com/fpp/).
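As a rough cross-check of these definitions (this is not the TS Compare implementation, just the textbook formulas), the measures can be computed directly. The example uses the ETS column of the percentage table above; MASE additionally needs the training series, which is not included in the post:

```python
import numpy as np

def accuracy_measures(actual, forecast, insample=None, m=1):
    """Forecast accuracy measures as defined in Hyndman & Athanasopoulos.
    MASE needs the in-sample (training) series to compute its naive-forecast scale."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    e = actual - forecast                                    # forecast errors
    out = {
        "ME":   e.mean(),
        "RMSE": np.sqrt((e ** 2).mean()),
        "MAE":  np.abs(e).mean(),
        "MPE":  (100 * e / actual).mean(),                   # undefined if an actual is 0
        "MAPE": (100 * np.abs(e) / np.abs(actual)).mean(),   # undefined if an actual is 0
    }
    if insample is not None:
        insample = np.asarray(insample, dtype=float)
        scale = np.abs(insample[m:] - insample[:-m]).mean()  # in-sample (seasonal) naive MAE
        out["MASE"] = out["MAE"] / scale
    return out

# ETS forecast of cancellations in percentages vs the actuals from the post.
actual = [0.121797, 0.188977, 0.164481, 0.128480]
ets    = [0.13973, 0.13973, 0.13973, 0.13973]
print(accuracy_measures(actual, ets))
# ME ~ 0.0112, RMSE ~ 0.0295, MAE ~ 0.0258, MPE ~ 4.4, MAPE ~ 16.1
```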

 

Explaining accuracy to the audience

Based on several e-learning courses I went through, I will use the terms 'bias' and 'variance' to explain the accuracy of the forecast models to my audience, with bias indicating the average distance from the actuals and variance indicating the spread of the predictions. I think this will create a better understanding, as they have no background in statistics.

  • Bias = ME
  • Variance = MSE - (bias * bias) = (RMSE * RMSE) - (ME * ME)
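A quick numeric check of that decomposition, again using the ETS column from the percentage table (the variance here is the population variance of the holdout errors):

```python
import numpy as np

actual   = np.array([0.121797, 0.188977, 0.164481, 0.128480])
forecast = np.full(4, 0.13973)            # the flat ETS forecast from the table
e = actual - forecast                     # forecast errors

bias     = e.mean()                       # = ME
mse      = (e ** 2).mean()                # = RMSE ** 2
variance = e.var()                        # population variance of the errors

# variance == mse - bias**2, i.e. MSE decomposes into bias^2 + variance.
print(bias, variance, mse - bias ** 2)
```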

 

[Attached image: illustration of bias and variance]

 

Forecast model outcomes

Alteryx gave me feedback to have a closer look at the ETS model:

"One thing I noticed in your results is that you have a single value for all observations in the ETS forecast.  Typically this means that there's not enough "signal" in your data, so the tool is returning the average value.  Perhaps check the decomposition plots for your ETS model - see if there's excessive noise causing that static result."

 

When I asked for direction, they replied:

"As far as reducing noise goes, have you used the Data Investigation tools yet?  From there, you can get a better understanding of your data and the metadata.  And try different Model, Seasonal and Trend Types – additive, multiplicative, etc, as well as your Information criteria."

 

Something I also noticed was that my data was not sorted by date but by another field. After sorting on the date field, the forecast model became (more) meaningful and no longer produced a flat line. I had not realized that the forecasting tools do not sort automatically based on the date field provided.
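A sketch of both points in Python (file and field names are placeholders, and period=12 assumes monthly data):

```python
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Placeholder file and field names -- not from my actual workflow.
df = pd.read_csv("cancellations.csv", parse_dates=["order_date"])

# The forecasting tools do not sort for you: put the rows in date order first.
df = df.sort_values("order_date").set_index("order_date")

# Decomposition plot, as suggested by Alteryx support: check how large the
# residual (noise) component is relative to the trend and seasonal components.
result = seasonal_decompose(df["cancel_pct"], model="additive", period=12)  # period=12 assumes monthly data
result.plot()
```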

  

Combining forecasts

The paper linked in my original post provides a good starting point for combining forecasts.
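For reference, a minimal sketch of the equal-weight and regression-weighted schemes from my question, assuming both forecasts are already on the same scale and that a short validation window with actuals is available to fit the weights:

```python
import numpy as np

def combine_equal(f1, f2):
    """Simple average of two forecasts; per the combination literature this
    alone often reduces both bias and variance."""
    return 0.5 * (np.asarray(f1, dtype=float) + np.asarray(f2, dtype=float))

def combine_regression(actual, f1, f2):
    """Least-squares weights for the two forecasts, fitted on a validation
    window (no intercept, for simplicity). Returns the combined forecast and weights."""
    X = np.column_stack([f1, f2]).astype(float)
    w, *_ = np.linalg.lstsq(X, np.asarray(actual, dtype=float), rcond=None)
    return X @ w, w

# Example with the unit forecasts from the post (the 4 holdout periods as validation window).
actual = [1_107_342, 1_375_405, 1_614_726, 1_158_343]
cov1   = [605_223, 1_741_376, 2_958_232, 1_526_938]
cov2   = [453_601, 1_974_260, 2_076_745, 1_479_633]
combined, weights = combine_regression(actual, cov1, cov2)
```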
