Alteryx Designer Desktop Discussions

SOLVED

How to create a dataset for a Time Series Model

AshishBhavnani
8 - Asteroid

Dear Alteryx Users,

I am working on a time series model to predict monthly sales at the Retail Store-Product level. As input, I am using the monthly historical sales data available for each retail store and product.

While working on this model, I ran into issues with the ARIMA and ETS tools. When I raised them with Alteryx Support, I was told that they are caused by two things: Null Sales values for a few Retail Store-Product combinations, and historical data that is missing altogether for some months for a few combinations.

To resolve these issues, I am thinking of introducing dummy records at the Retail Store-Product level for the missing historical months. I have been exploring the Generate Rows tool under the "Data Preparation" category for this, but I cannot figure out how to use it across multiple years, e.g. 2015 and 2016.

Can anyone please provide some guidance on how to use the Generate Rows tool to create the missing historical records for each Retail Store-Product combination, or suggest another approach that would create these records?
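
For readers who want to see the gap-filling logic spelled out, here is a minimal Python sketch of the idea, not an Alteryx workflow. It builds every month between a start and an end date for each Retail Store-Product pair and fills both missing months and Null sales with a placeholder. The function names, the sample data, and the choice of 0 as the fill value are assumptions made for illustration; whether 0 is an appropriate dummy value depends on the business context.

```python
from itertools import product


def month_range(start, end):
    """Yield (year, month) tuples from start to end, inclusive."""
    year, month = start
    while (year, month) <= end:
        yield (year, month)
        month += 1
        if month > 12:
            year, month = year + 1, 1


def fill_missing_months(records, start, end, fill_value=0.0):
    """Return one row per store, product, and month in [start, end].

    records maps (store, product, (year, month)) -> sales, where sales may be None.
    Missing months and None sales are both replaced with fill_value.
    """
    pairs = sorted({(store, prod) for store, prod, _ in records})
    months = list(month_range(start, end))
    filled = []
    for (store, prod), ym in product(pairs, months):
        sales = records.get((store, prod, ym))
        filled.append((store, prod, ym, fill_value if sales is None else sales))
    return filled


# Hypothetical sample data: store S1 / product P1 has a null and a gap.
sales = {
    ("S1", "P1", (2015, 11)): 120.0,
    ("S1", "P1", (2015, 12)): None,   # Null sales
    ("S1", "P1", (2016, 2)): 95.0,    # January 2016 is missing entirely
}
rows = fill_missing_months(sales, start=(2015, 11), end=(2016, 2))
for row in rows:
    print(row)
```

Because the month counter rolls over from December to January, the same loop covers any number of years. In Designer, the same logic could presumably be reproduced with a Generate Rows loop that advances a date by one month per iteration, followed by a join back to the actual data.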

13 REPLIES
AshishBhavnani
8 - Asteroid

Hi @BenMoss

Thanks once again for a wonderful tip; it gives real insight into how Alteryx tools process the data passing through them.

I enabled the "Show All Macro Messages" setting under Runtime. However, this produced so many messages for every record that I could not find the batch of records that was causing problems in the ARIMA and ETS Time Series tools.

Did you use some tweak to write these messages to a file, and then search that file for a specific error message to find those 81 Retail Store-Product combinations?

I would also like your thoughts on a discrepancy I observed in the model results between two runs of the same workflow. I compared the "ARIMA_Model_Forecast" and "ETS_Model_Forecast" output columns across the two runs and, for some observations, found huge differences in the predicted values of the Target variable. I am not sure which run's output is correct, or why the results differ so much. None of the ARIMA or ETS settings were altered between the two runs, and since neither the input dataset nor the workflow had changed, I was expecting a similar set of predictions.

Have you encountered scenarios where a model produces different results for the same dataset across different runs?

I am really thankful for the help and guidance you have provided on this issue.

BenMoss
ACE Emeritus

Firstly, see my previous post regarding why the issues seem to be occurring. Alteryx automatically writes the messages you get to a log file. You can find the location of the log files Alteryx generates by going to Options > User Settings > Edit User Settings. Note that you can quickly filter to just the error messages by selecting the red error indicator on the toolbar of the Results window.

As for your second question, I would almost expect there to be differences. ARIMA and ETS, whilst both time series models, have some key differences in how they predict your data, and it's worth reading up on the subject to ensure your analysis captures what you need.

It is best practice to create a hold-out sample of data (let's say the last 3 months for each store-product). You can then generate a forecast for this period with both models and compare it against your actuals. By performing some basic calculations, such as the average absolute error per model, you will then be able to identify the best model to implement.
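
As a rough illustration of the hold-out comparison described above, the following Python sketch splits one store-product series into training and hold-out parts and scores two sets of forecasts by mean absolute error. The helper names and all numbers are hypothetical; in practice the forecasts would come from the ARIMA and ETS tools after fitting on the training months only, and mean absolute error is just one of several reasonable metrics.

```python
def mean_absolute_error(actuals, forecasts):
    """Average absolute difference between paired actual and forecast values."""
    if len(actuals) != len(forecasts) or not actuals:
        raise ValueError("actuals and forecasts must be non-empty and equal in length")
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)


def split_holdout(series, holdout=3):
    """Split a chronologically ordered series into (training, holdout) parts."""
    if holdout <= 0 or holdout >= len(series):
        raise ValueError("holdout must be between 1 and len(series) - 1")
    return series[:-holdout], series[-holdout:]


# Hypothetical monthly sales for one store-product combination.
history = [100.0, 110.0, 105.0, 120.0, 125.0, 130.0]
train, actual = split_holdout(history, holdout=3)

# Hypothetical forecasts for the held-out months, standing in for the
# ARIMA_Model_Forecast and ETS_Model_Forecast columns.
arima_forecast = [118.0, 126.0, 127.0]
ets_forecast = [110.0, 112.0, 115.0]

scores = {
    "ARIMA": mean_absolute_error(actual, arima_forecast),
    "ETS": mean_absolute_error(actual, ets_forecast),
}
best = min(scores, key=scores.get)
print(scores, best)
```

Repeating this per store-product pair gives a model choice for each series rather than one choice for the whole dataset.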

 

I hope this helps.

Ben

 

 

AshishBhavnani
8 - Asteroid

Hi @BenMoss

Thanks for your reply.

Based on a fair bit of reading, I also expected differences between the ARIMA and ETS forecast models. In my workflow, however, I use a Join tool to combine the outputs of the two models, so a single run predicts the Target variable with both ARIMA and ETS for the entire dataset.

What I am observing is that the values predicted by the ARIMA model in the first run differ from the values predicted by the ARIMA model in the second run, and the same applies to the ETS model. The differences are significant, and I am currently running another iteration to see whether the results change again.
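
One way to narrow down a run-to-run discrepancy like this is to line up the two outputs by key and list only the combinations whose forecasts disagree beyond a tolerance. The Python sketch below assumes each run has been exported to a mapping from a (store, product, period) key to a forecast value; the function name, the key layout, the sample values, and the 1% tolerance are all illustrative assumptions.

```python
def compare_runs(run_a, run_b, rel_tol=0.01):
    """Return keys whose forecasts differ by more than rel_tol between two runs.

    run_a and run_b map a key such as (store, product, period) to a forecast.
    Keys present in only one run are reported with None for the missing side.
    """
    differences = {}
    for key in sorted(set(run_a) | set(run_b)):
        a, b = run_a.get(key), run_b.get(key)
        if a is None or b is None:
            differences[key] = (a, b)
            continue
        # Relative difference, guarded against division by zero.
        scale = max(abs(a), abs(b), 1e-12)
        if abs(a - b) / scale > rel_tol:
            differences[key] = (a, b)
    return differences


# Hypothetical ARIMA forecasts from two runs of the same workflow.
first_run = {("S1", "P1", 1): 100.0, ("S1", "P2", 1): 50.0}
second_run = {("S1", "P1", 1): 100.2, ("S1", "P2", 1): 80.0}

changed = compare_runs(first_run, second_run)
print(changed)
```

If the flagged keys cluster around particular stores or products, checking whether those series received identical input records in both runs would be a natural next step.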

 

As for the comparison of model outputs, I am comparing the actual values of the Target variable for a given period with the forecast values for the same period from both the ARIMA and ETS models.

 

Regards,

Ashish Bhavnani

AshishBhavnani
8 - Asteroid

Hi Fellow Users,

 

I was able to build the time series model for a large dataset comprising data for a large number of Retail Stores and Products.

However, for some Retail Stores and Products, the ARIMA and ETS tools predicted the same value for each of the 4 future periods being forecast. In light of this, I changed the Level of Seasonal Differencing for the ARIMA tool from 0 to 1, and the Seasonal Type for the ETS tool from "Auto" to "Additive".

These changes introduced variation in the forecast values, but the sum of the absolute prediction errors went up compared with the previous run. Is there a way to introduce seasonality into the predictions while also bringing down the errors at the aggregate level? And how do I decide on the right combination of non-seasonal AR, differencing, and MA components, as well as seasonal AR, differencing, and MA components, for a dataset that covers multiple Retail Stores and Products?
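
For context on what a seasonal differencing level of 1 does, here is a small Python sketch, assuming monthly data with a 12-month season. Each value has the value from one season earlier subtracted from it. The function name and the sample series are hypothetical and only illustrate the transformation, not how the ARIMA tool implements it internally.

```python
def seasonal_difference(series, period=12):
    """Return series[t] - series[t - period] for every t >= period."""
    if period <= 0:
        raise ValueError("period must be positive")
    return [series[t] - series[t - period] for t in range(period, len(series))]


# Hypothetical 24 months of sales: a repeating yearly pattern plus 5 units of growth.
pattern = [10, 12, 15, 20, 25, 30, 28, 24, 18, 14, 11, 9]
sales = pattern + [value + 5 for value in pattern]

diffed = seasonal_difference(sales, period=12)
print(len(sales), len(diffed))
print(diffed)
```

One seasonal difference uses up a full season of observations, so a series with only two years of monthly history leaves just 12 points for the rest of the model to fit. That may be one reason forcing seasonal differencing on every store-product series raises the aggregate error, and it suggests evaluating the setting per series on a hold-out period rather than applying it globally.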

 

Please respond as early as possible. Your responses and suggestions are highly appreciated.

 

 
