Dear community,
I've made a time series forecast model in Alteryx Designer using the Time Series tool from R. I have a time series dataset where every record contains a week number, a DateID and the average daily volume that has been shipped in that week. The dataset starts in week 1 of 2017 and end at the current week of the date. My organization uses the ISO week number count that is built in Excel (=ISO.WEEKNUM()). This function in Excel gives in previous years a weeknumber from 1 to 52. But for 2020 (this year) it also has a 53rd week. This week is from 28-dec-2020 to 3-jan-2020. When I cut my training data set to week 48 of 2020 and want to foreast the next 6 weeks, this will be the weeks 49, 50, 51, 52, 53, and 1. But TS forecast tool using the ARIMA tool will not give a 53rd week: it will give a table as output with the sub periods being: 49, 50, 51, 52, 1 and 2, so it does not have a 53rd week.
What are my options to solve this problem?
Is the forecast for sub period 1 actually the forecast for my 53rd week? and the sub period 2 for my 1st week of 2021?
Is it better to just remove the 53rd week from my time series dataset?
Is there a way to tell the forecast tool that there is a 53rd week in 2020?
Should my organization better be using a different week count than the ISO.WEEKNUM() from Excel?
I would like to hear your opinion and feedback. Thankyou beforehand!
Greetings, Jimmy
Solved! Go to Solution.
Hello @jimmke7
Can you please upload your workflow so that the Community can see what you have completed?
This will also help the community to troubleshoot the issues you have mentioned above.
Thanks!
TrevorS
Hi @TrevorS and community,
Trevor does have a point I could give a better explanation of my question by uploading a workflow. Because it's not allowed to share sales data from my company, I generated random sales data in excel to use or this question.
In the file Alteryx Data 53rd week question.xlsx the data that will be used can be found. This is my example time series data which has weekly values of sales. The week number of the =ISOWEEKNUM function in excel is the standard within my department. Say we only have sales from 2017 week 1 (2-jan-2017) to 2020 week 51 (14-12-2017). I would like to forecast the next 5 week using the ARIMA model tool in Alteryx. This will be done by executing the workflow attached named.
The output of the forecast tool is as followed:
Period | Sub period | forecast |
2020 | 52 | 30,287.613429 |
2021 | 1 | 30,378.760679 |
2021 | 2 | 29,549.006889 |
2021 | 3 | 29,152.584612 |
2021 | 4 | 29,761.685689 |
So it forecasted the next 5 weeks just as desired. But it's forecasting sales for the weeks 52 (2020) and week 1, 2 ,3 and 4 (2021). But according to the =ISOWEEKNUM() in excel there is a 53rd week because the 53rd week starts at the date 28-dec-2020 which means that there are 4 days of that week in 2020 (28,29,30,31) which is more than half of the week and therefor there is a 53rd week.
Because there is no forecast for a 53rd week from the ARIMA model in Alteryx my plannning tool will not have forecast available for those days. Also I am wondering if this value in the table above for sub period 1 is actually the value for the 53rd week, which means that al the sub periods are 1 week ahead (for example the forecast for sub period 4 is for ISO week number 3)?
My question: How to deal with this problem that the ISO week number DOES have a 53rd week but the ARIMA model DOESN'T.
Some thought of myself: Just use the forecast for sub period 1 for week 53 and the forecast for sub period 2 for week 1 and so on?
Say that we are in week 4 of 2021, should I delete the sales actuals for the 53rd week of 2020 in the time series? (because it will see the 53rd week as week 1 of 2021 and so on)
Is it better to use a different week number standard than the ISO built in Excel?
Also: in the ARIMA configuration, in the tab 'Other', I've selected that my time series has a series starting point which is 2017 week 1 (which it does). Do I go wrong here by entering this data? Besides that, in the tab 'Other' i've selected that my data the week format is ISO8601.
If anyone has any input please share,
Thankyou beforehand
Hi,
I am having this same issue. My data is structured as Period, Sub Period and Count
I wanted to have a more clean output from the forecast tool (ie; instead of Period 7 Sub Period 53, I wanted Period 2020 Sub Period Dec 28) so before this issue I had added a formula tool after the forecast tool and created a few new fields which I will share. I have used these derived fields to create logic which corrects this 2021 issue I am having. Just for confirmation, I am also using ISO8601 for the weekly choice in the ETS configuration (this should work for ARIMA as well).
In the attachments below you will see terms Year and Week, I used a select tool to rename Period and Sub Period respectively. I then created the Year field to change from a number to the actual year (6=2020). I then used that field to build the field Start_of_Year and then use the DateTimeAdd to correct the week. Per you post, the model does look at week 53 of 2020 as week 1 of 2021, but I pushed my forecast out 53 periods and confirmed that it was sufficient through the entire year of 2021 except for the last couple of weeks (which is what I run into this year). This means that near the end of 2021, I will need to come in and update the logic from -11 to whatever corrects the flow of the data. Prior to having this issue the logic for all of 2020 in my workflow was todate(DateTimeAdd([Start_of_Year],([Week]*7)-9,'days'))
The one downside I have found is that week 53 is being considered as Year 2021, but I will manually fix this in my output to the customer. All other subsequent weeks will reflect the correct year.
I am unable to post my workflow for proprietary reasons but will be happy to clarify anything if needed.
I hope this helps.
JS
Forgot one important detail. I then use Week_Start_Mon field to replace the Week field (which without formatting is the Sub Period field)
Hi @jimmke7 ,
You can assume the observation that it is putting out as subperiod 1 is in actuality the 53rd week and you will have to manually adjust the output and the rest of the output downstream. The R tool is simply treating the forecast as a series of observations and the next observation is mis-labeled because the tool doesn't know about the 53rd week.