Just getting started with Alteryx ARIMA modelling and set up the below workflow which is generating weird results -- the time series compare picture shows the ARIMA and ETS forecasts which are clearly off, even though the decomposition plot seems to pick up the seasonality in the data points -- at the very least I would have expected the ARIMA plot to follow the seasonality just with large error areas to account for the remainder noise. Any thoughts on what's going on here?
From the TS Compare it does look like something is off in the model or perhaps the configuration of the tools. Would you be able to provide the workflow and data? It is tough to determine the PDQ values in the ARIMA model without looking at the ACF plot.
Sure -- workflow is attached. I kept digging in on ARIMA modelling after posting that question and added a second ARIMA model -- this time I manually selected Model Customization > Customize the parameters used for automatic model creation > The seasonal components > Alter the degree of seasonal differencing and I at least get a forecast out of the model (the manually adjusted is ARIMA1 and the base is ARIMAAuto in the attached picture), which is nice!
So my question is now a 2 parter:
1. Now that you can see the data in my workflow, any thoughts on what's happening with my ARIMA auto?
2. If there's just an issue with the data and an auto ARIMA can't divine the seasonality, is there any way to manually alter the degree of seasonal differencing in Model Factory? I want to apply this process to multiple units at once but most of what I've read indicates that the Model Factory uses auto ARIMA which isn't helpful given my issue, so being able to adjust the seasonal differencing to at least get a forecast seems like it could be useful
To answer your first question, the ARIMA tool in Alteryx uses the R function auto.arima() when choosing the order of AR, the order of MA, and the degree of differencing.
Per the documentation for this function: "The number of seasonal differences is sometimes poorly chosen. If your data shows strong seasonality, try setting D=1 rather than relying on the automatic selection of D." auto.arima() documentation
This is actually what you did in the "ARIMA1" model when you set the level of seasonal differencing = 1!
So, when we look at the "ARIMAAuto" model, which uses default settings, the tool chooses an ARIMA(1,0,2)(0,0,0) model. The absence of any differencing doesn't bode well when modeling data as strongly seasonal as yours, which is why the "ARIMAAuto" model performs poorly.
However, when we look at the "ARIMA1" model, where you set the level of seasonal differencing to 1, we see that it lands on an ARIMA(0,1,1)(0,1,0) model. Forcing one level of seasonal differencing allows the "ARIMA1" model to account for seasonality and produces much better results.
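To make the effect of seasonal differencing concrete, here is a minimal sketch in plain Python. It is not what the Alteryx tool or auto.arima() runs internally; the quarterly series, the period of 4, and the helper names are all made up for illustration. It shows that differencing at the seasonal lag removes a repeating pattern, and contrasts a forecast that repeats the last season with a flat forecast at the series mean (roughly where an undifferenced, stationary model settles at longer horizons).

```python
# Illustrative sketch only: hypothetical data and helper names,
# not the Alteryx ARIMA tool or auto.arima() itself.

def seasonal_difference(series, period):
    """Return y[t] - y[t - period] for every t >= period (one seasonal difference, D=1)."""
    return [series[t] - series[t - period] for t in range(period, len(series))]

def seasonal_naive_forecast(series, period, horizon):
    """Forecast by repeating the last observed season."""
    last_season = series[-period:]
    return [last_season[h % period] for h in range(horizon)]

def mean_forecast(series, horizon):
    """Flat forecast at the series mean, ignoring seasonality."""
    return [sum(series) / len(series)] * horizon

# Hypothetical quarterly series: the same seasonal pattern repeated three times.
pattern = [10, 20, 40, 15]
series = pattern * 3

diffed = seasonal_difference(series, period=4)
seasonal_fc = seasonal_naive_forecast(series, period=4, horizon=6)
flat_fc = mean_forecast(series, horizon=6)

print(diffed)       # the repeating pattern cancels out
print(seasonal_fc)  # follows the seasonal shape
print(flat_fc)      # a flat line at the mean
```

With real data the seasonally differenced series would not be exactly zero, but the same idea applies: once the seasonal pattern is differenced out, the remaining AR and MA terms only have to describe what is left.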
To answer your second question, at the moment, that feature is not present in the Model Factory. So if you want to force differencing on multiple variables, you will need to bring in an ARIMA tool for each one. This seems like it would be a useful feature, as it was mentioned here as well, so feel free to request it here.
I read your reply on this post while searching for a solution to a Time Series Prediction problem that I am working on at the moment. I am currently working with a dataset that has Sales in $ for 192 Retail Stores.
At the moment, I am at a crossroads and unable to decide how to determine the right set of values (P, D and Q) for the ARIMA tool to model the seasonality and non-seasonality in the data.
For my first run, I set the Level of Seasonal Differencing to 0, and the Maximum order of AR and MA for Seasonality were set to 1 and 1. This resulted in predictions that were the same across the forecast periods for some of the Retail Stores. I did not feel comfortable with this output, since the predictions for future forecast periods for a few retail stores were exactly the same.
Later I changed the Level of Seasonal Differencing to 1, and this resulted in Sales predictions that varied in $ across forecast periods. The new outputs seemed better at first glance since they at least showed some variation across the forecast periods. However, this change increased the sum of the magnitudes of the errors for the entire set of retail stores. At an individual retail store level, prediction errors went up for some of the stores while for others they came down.
With this background, I am requesting suggestions on how to determine the right set of Seasonal and Non-Seasonal components to make predictions for a heterogeneous dataset that has time series Sales data for 192 retail stores.
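Since the errors went up for some stores and down for others, one option worth considering is to choose the setting per store on a holdout period rather than applying one setting to all 192. The plain-Python sketch below only illustrates that selection loop. The two forecasters are deliberately simple stand-ins (a flat mean forecast for "no seasonal differencing" and a repeat-last-season forecast for "seasonal differencing = 1"), not ARIMA fits, and the store names, data, and period of 4 are hypothetical.

```python
# Illustrative sketch only: stand-in forecasters and made-up data,
# not the Alteryx ARIMA tool. In practice each candidate would be a fitted model.

def mae(actual, predicted):
    """Mean absolute error between two equal-length lists."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def flat_forecast(train, horizon):
    """Stand-in for a model without seasonal differencing: flat at the training mean."""
    level = sum(train) / len(train)
    return [level] * horizon

def seasonal_naive(train, period, horizon):
    """Stand-in for a model with seasonal differencing: repeat the last season."""
    last_season = train[-period:]
    return [last_season[h % period] for h in range(horizon)]

def pick_per_store(stores, period):
    """Hold out the last season of each store and keep the candidate with lower MAE."""
    choices = {}
    for name, sales in stores.items():
        train, test = sales[:-period], sales[-period:]
        errors = {
            "D=0 stand-in": mae(test, flat_forecast(train, len(test))),
            "D=1 stand-in": mae(test, seasonal_naive(train, period, len(test))),
        }
        choices[name] = min(errors, key=errors.get)
    return choices

# Hypothetical stores: one strongly seasonal, one that is mostly noise around a level.
stores = {
    "store_A": [10, 20, 40, 15] * 3,
    "store_B": [20, 22, 19, 21, 20, 21, 22, 19, 21, 20, 19, 22],
}

choices = pick_per_store(stores, period=4)
print(choices)
```

The point is the structure rather than the stand-in models: evaluate each candidate configuration on held-out periods store by store, and compare the per-store errors instead of only the total across all stores.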