Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Predicting what day you reach your goal

Shelbey
Alteryx
Alteryx

Hi Community!

 

I have been tasked with predicting the future. I know it's possible, but I have yet to use any of the predictive analysis tools I need to achieve this. I'm hoping someone will be able to help get me started on the right path. I've been testing out different tools, but I either get an error or something confusing so I'm assuming I'm not using the right tools to begin with...

 

Here's my business case...

My company moves ore daily. We are scheduled to move 37,000 tons per day. We have a target of 68,500,000 tons. If we stay on schedule, we will reach our target on 2/14/2023. We regularly compare on a daily and monthly basis how many actual tons vs scheduled tons were moved. The problem I am trying to solve is, looking at our actual tons, on what day will we meet our target?

 

Any help getting started would be much appreciated!

 

 

5 REPLIES 5
LukeG
Alteryx Alumni (Retired)

Hi @Shelbey 

 

It sounds like a Time Series forecast could be applicable here. If allowed, would you mind sharing some sample data and I can help build out a solution? If not I can help you to get moving in the right direction by helping out with time series tools in general.

 

Luke

jamielaird
14 - Magnetar

Hey @Shelbey 

 

Unless I'm oversimplifying things (always a possibility) I think you could achieve this without touching the predictive tools at all. I think a combination of Generate Rows, the Formula tool and the Running Total tool will do the trick.

 

Here's my thinking. Imagine you have a very simple three column table containing a Date column, a Quantity column and a Type column. For the sake of argument imagine we're starting this from today and you already know the actual amount of ore that was shipped today was 38,657 tons. So row 1 is:

 

DateQuantityType
2019-09-2638,657Actual

 

 

For every future date you're going to assume that 37,000 tons of ore get shipped. Use Generate Rows to create an arbitrary large number of rows (5 years perhaps) and then Multi-Row to fill down the dates using the DateTimeAdd function against the row above:

 

DateQuantityType
2019-09-2638,657Actual
2019-09-27  
2019-09-28  
and so on...  

 

You can populate Quantity and Type with 37,000 and 'Scheduled' respectively for all rows after today's date (or more accurately all rows after the last date for which you know the actual amount).

 

DateQuantityType
2019-09-2638,657Actual
2019-09-2737,000Scheduled
2019-09-2837,000Scheduled
and so on...... ...

 

Finally you use the Running Total tool to add a fourth column which sums the quantities up to each date. This will be a combination of the actual and scheduled amounts. Obviously you'll want to configure this so that your Actuals are being updated as you get that data, which you could approach in a few different ways - the best one will depend on how exactly you get that data.

 

 

DateQuantityTypeRunning Total
2019-09-2638,657Actual38,657
2019-09-2737,000Scheduled75,657
2019-09-2837,000Scheduled112,657
and so on...... ... 

 

Your target date can then be identified as whatever date row the running total exceeds 68,500,000. 

 

20XX-XX-XX37,000Scheduled68,501,421

 

Am I on the right track?

 

 

Shelbey
Alteryx
Alteryx

@LukeG  Here are some of the actual tons per day I'm using. 

Shelbey
Alteryx
Alteryx

@jamielaird  Definitely on the right track! Thank you! Looking at your solution gave me an idea...

Instead of using the 37000 tpd for all of the future dates, I might try an average of all of our actual tons and use that average for the future dates, and then doing a running total off of that. Then every day my target reached date will change because my average changes. I don't want to use the 37000 tpd for future dates because historically speaking we haven't reached that yet, and I'm thinking the date from that logic will be off.

 

I am still curious as to what @LukeG can come up with using a time series tool and comparing that date with the date from the solution using the average tpd. 

LukeG
Alteryx Alumni (Retired)

I think your idea of using an average is a good idea at this point. Because of the small sample size (54 rows) of historical data, Time Series forecasting is not necessary. When you have at least a year of historical data, then you could use a TS forecast to predict the actual tpd. You can then use the time series forecast to predict the next year and beyond and ultimately the date that your goal is reached.

 

I would then use the time series forecast to create a running total just like @jamielaird did, but for now it is probably best to use the expected value (37000) or a moving/running average to create that running total.

 

Let me know if you have any questions!

 

Luke

 

Labels