Forecasting

Naive Forecasting

The naive approach is the simplest form of forecasting: every forecast is equal to the last observed value.

The forecast can be calculated as follows:

\[\hat{y}_{t+h|t}=y_t\]
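The formula above can be sketched in a few lines of Python. This is only an illustration: the function name `naive_forecast` and the sample demand values are assumptions made for this example, not part of the product.

```python
def naive_forecast(history, horizon):
    """Return `horizon` forecasts, each equal to the last observed value."""
    if not history:
        raise ValueError("history must contain at least one observation")
    return [history[-1]] * horizon


# Hypothetical demand history, used only for illustration.
demand = [120, 135, 128, 142]
print(naive_forecast(demand, 3))  # [142, 142, 142]
```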

The picture below shows an example of naive forecasting applied in Demand Forecasting.

../../_images/Naive_final.png

Naive forecasting in Demand Forecasting

Moving Average

The moving average is another simple technique that can help smooth out variability in the data. It is called 'moving' because it is continually recomputed as new data becomes available: each step drops the earliest value and adds the latest one. For example, the moving average of six-month sales may be computed by taking the average of sales from January to June, then the average of sales from February to July, then of March to August, and so on.

The user is required to set the number of periods the average should be calculated over.

The moving average is the mean of the previous N points of the time series, where N is a parameter. It is commonly used to smooth out short-term fluctuations and to highlight long-term trends and cycles.
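As a minimal sketch of the idea, the following Python code computes the rolling means and a next-period forecast. The function names and the sample sales figures are assumptions made for this illustration.

```python
def rolling_means(history, n):
    """One mean per full window: each step drops the earliest value
    and adds the latest one."""
    if n <= 0 or n > len(history):
        raise ValueError("n must be between 1 and len(history)")
    return [sum(history[i - n:i]) / n for i in range(n, len(history) + 1)]


def moving_average_forecast(history, n):
    """Forecast the next period as the mean of the last `n` observations."""
    return rolling_means(history, n)[-1]


# Hypothetical monthly sales, used only for illustration.
sales = [10, 12, 14, 16, 18, 20, 22, 24]
print(rolling_means(sales, 6))            # [15.0, 17.0, 19.0]
print(moving_average_forecast(sales, 6))  # 19.0
```

A larger N gives a smoother but slower-reacting series, because each new observation contributes only 1/N of the mean.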

The picture below shows an example of the moving average (N=12) applied in Demand Forecasting.

../../_images/MA_final.png

Moving average (N=12) in Demand Forecasting

Exponential Smoothing

Whereas the moving average weights past observations equally, exponential smoothing assigns weights that decrease exponentially with the age of the observation.

The user is required to set the smoothing parameter alpha, which ranges between 0 and 1. A low value of alpha makes the forecast less reactive to changes in demand, and a high value makes it more reactive.

The forecast is calculated as a weighted average of historical sales. The older the data point, the smaller the weight it gets. The weights decrease exponentially according to the smoothing factor (\(0 \leq \alpha \leq 1\)). At time \(t\) let the actual demand be \(Y(t)\) and the forecast \(F(t)\); then the calculation is the following:

\[F(t) = \alpha Y(t) + (1-\alpha)F(t-1).\]
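The recursion above can be sketched as follows. The text does not say how the first value is initialised, so this example assumes \(F(0) = Y(0)\); the function name and the data are likewise illustrative.

```python
def exponential_smoothing(demand, alpha):
    """Apply F(t) = alpha * Y(t) + (1 - alpha) * F(t-1) to a demand series."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    if not demand:
        raise ValueError("demand must contain at least one observation")
    smoothed = [demand[0]]  # assumption: F(0) = Y(0)
    for y in demand[1:]:
        smoothed.append(alpha * y + (1 - alpha) * smoothed[-1])
    return smoothed


print(exponential_smoothing([10, 20, 30], 0.5))  # [10, 15.0, 22.5]
```

With `alpha=1` the result reproduces the demand series itself, and with `alpha=0` it stays at the initial value, which matches the description of alpha as a reactivity setting.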

The picture below shows an example of exponential smoothing (\(\alpha=0.5\)) applied in Demand Forecasting.

../../_images/ES_final.png

Exponential smoothing (\(\alpha=0.5\)) in Demand Forecasting

Holt Method

The Holt method, also called double exponential smoothing, is used where the history displays a trend. It is essentially exponential smoothing applied to both the level and the trend.

The user is required to set two smoothing parameters:

alpha – level smoothing factor, between 0 and 1

beta – trend smoothing factor, between 0 and 1

Again, a low value makes the model less reactive to changes.

Unlike the previous methods, the Holt method also considers the trend. It uses two smoothing factors: the level smoothing factor (\(0 \leq \alpha \leq 1\)) and the trend smoothing factor (\(0 \leq \beta \leq 1\)). At time \(t\) let the actual demand be \(Y(t)\), the level \(L(t)\), and the trend \(T(t)\). The calculation is the following:

\[ \begin{align}\begin{aligned}L(t)&=\alpha Y(t)+(1-\alpha)(L(t-1)+T(t-1)),\\T(t)&=\beta (L(t)-L(t-1))+(1-\beta)T(t-1),\end{aligned}\end{align} \]

and the forecast is

\[F(t+1)=L(t)+T(t)\]

(where t+1 is the next period and t-1 is the previous one).
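A minimal Python sketch of these update equations follows. The text does not specify starting values, so the example assumes \(L(0) = Y(0)\) and \(T(0) = Y(1) - Y(0)\); the function name and the sample data are also assumptions.

```python
def holt_forecast(demand, alpha, beta):
    """Return the next-period forecast F(t+1) = L(t) + T(t)."""
    if len(demand) < 2:
        raise ValueError("at least two observations are needed")
    level = demand[0]              # assumption: L(0) = Y(0)
    trend = demand[1] - demand[0]  # assumption: T(0) = Y(1) - Y(0)
    for y in demand[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + trend


# A perfectly linear series: the forecast continues the line.
print(holt_forecast([10, 12, 14, 16], alpha=0.5, beta=0.5))  # 18.0
```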

The picture below shows an example of the Holt method (\(\alpha=0.1, \beta=0.08\)) applied in Demand Forecasting.

../../_images/Holt_final.png

Holt method (\(\alpha=0.1, \beta=0.08\)) in Demand Forecasting

Holt-Winters Method

The three aspects of the time series behavior—level, trend, and seasonality—are expressed as three types of exponential smoothing, so Holt-Winters is called triple exponential smoothing. The model predicts a current or future value by computing the combined effects of these three influences.

The user is required to set three smoothing parameters:

alpha – level smoothing factor, between 0 and 1

beta – trend smoothing factor, between 0 and 1

gamma – seasonal smoothing factor, between 0 and 1

Again, a low value makes the model less reactive to changes.

The user must also set the 'Seasonal periods', which is the number of periods over which the seasonality is exhibited.

The Holt-Winters method considers both trend and seasonality. It uses three smoothing factors: the level smoothing factor (\(0 \leq \alpha \leq 1\)), the trend smoothing factor (\(0 \leq \beta \leq 1\)), and the seasonal change smoothing factor (\(0 \leq \gamma \leq 1\)). At time \(t\) let the level be \(L(t)\), the trend \(T(t)\), and the seasonality factor \(S(t)\). Since we use seasonality in this forecasting method, we have a period length \((p)\).

The calculation is the following:

\[ \begin{align}\begin{aligned}L(t) &= \alpha\frac{Y(t)}{S(t-p)}+(1-\alpha)(L(t-1)+T(t-1)),\\T(t) &= \beta(L(t)-L(t-1))+(1-\beta)T(t-1),\\S(t) &= \gamma\frac{Y(t)}{L(t)}+(1-\gamma)S(t-p).\end{aligned}\end{align} \]
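The three update equations can be sketched in Python as below. Several details are assumptions of this example rather than statements from the text: the level starts at the mean of the first season, the trend at the per-period difference between the first two seasonal means, the seasonal factors at each first-season value divided by that mean, and the next-period forecast is taken as \((L(t)+T(t))\,S(t+1-p)\), the usual multiplicative form.

```python
def holt_winters_forecast(demand, alpha, beta, gamma, p):
    """Multiplicative Holt-Winters sketch; returns the next-period forecast."""
    if len(demand) < 2 * p:
        raise ValueError("at least two full seasons are needed")
    first_mean = sum(demand[:p]) / p
    second_mean = sum(demand[p:2 * p]) / p
    level = first_mean                               # assumed initial level
    trend = (second_mean - first_mean) / p           # assumed initial trend
    seasonal = [y / first_mean for y in demand[:p]]  # assumed S(0)..S(p-1)
    for t in range(p, len(demand)):
        y = demand[t]
        prev_level = level
        s_old = seasonal[t - p]                      # S(t-p)
        level = alpha * (y / s_old) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal.append(gamma * (y / level) + (1 - gamma) * s_old)
    # Assumed forecast form: (L(t) + T(t)) * S(t+1-p)
    return (level + trend) * seasonal[len(demand) - p]


# Hypothetical series with a two-period season and no trend.
print(holt_winters_forecast([50, 150, 50, 150, 50, 150], 0.5, 0.5, 0.5, p=2))  # 50.0
```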

The picture below shows an example of the Holt-Winters method (\(\alpha=0.1, \beta=0.08, \gamma=0.05\)) applied in Demand Forecasting.

../../_images/HW_final.png

Holt-Winters method (\(\alpha=0.1, \beta=0.08, \gamma=0.05\)) in Demand Forecasting

Croston Method

This method is used for forecasting intermittent demand. It builds the forecast from the periods in which demand occurs and the intervals between them. The forecast is an average of demands, with consideration given to non-demand periods.

The user is required to set the smoothing parameter alpha.

It involves separate simple exponential smoothing forecasts of the size of the demand and of the time period between demands. Consider the following notation to explain the Croston method: \(z(t)\) is the demand at period \(t\), \(Z(t)\) is the forecast size of the next demand after period \(t\), \(\hat{z}\) is the average demand per period, \(p(t)\) is the time between two positive demands, \(P(t)\) is the forecast of the demand interval, \(q\) is the time interval since the last positive demand, and \(\alpha\) is the smoothing factor (between 0 and 1).

The method updates the forecasts for demand size and interval only after a positive demand occurs. That means that if the demand in a period \(t\) is zero, the method only increments the count of time periods since the last positive demand.

The procedure for the Croston method is the following: if \(z(t) = 0\) then

\[ \begin{align}\begin{aligned}Z(t) &= Z(t-1),\\P(t) &= P(t-1),\\q &= q+1.\end{aligned}\end{align} \]

Else

\[ \begin{align}\begin{aligned}Z(t) &= Z(t-1) + \alpha(z(t)-Z(t-1)),\\P(t) &= P(t-1) + \alpha(q-P(t-1)),\\q&=1.\end{aligned}\end{align} \]

Combining these forecasts results in

\[\hat{z} = Z(t)/P(t).\]
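The procedure can be sketched in Python as follows. The initialisation is an assumption of this example: \(Z\) and \(P\) start from the first positive demand and the number of periods elapsed up to it. The function name and the sample series are illustrative.

```python
def croston_forecast(demand, alpha):
    """Return the average demand per period, Z(t) / P(t)."""
    size = None      # Z(t): smoothed demand size
    interval = None  # P(t): smoothed interval between positive demands
    q = 1            # periods since the last positive demand
    for z in demand:
        if z == 0:
            q += 1   # zero demand: only the counter changes
            continue
        if size is None:
            size, interval = z, q  # assumption: start from the first demand
        else:
            size += alpha * (z - size)
            interval += alpha * (q - interval)
        q = 1
    if size is None:
        return 0.0   # no positive demand was observed at all
    return size / interval


# Hypothetical intermittent series: 6 units every third period.
print(croston_forecast([0, 0, 6, 0, 0, 6, 0, 0, 6], alpha=0.1))  # 2.0
```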

Tracking Signal

The tracking signal measures whether actual demand has stopped reflecting the assumptions the forecast makes about the level, and possibly the trend, of the demand profile. It is analogous to Statistical Process Control, which studies when a process is going out of control and needs intervention.

Similarly, the tracking signal flags a persistent tendency for actual values to be systematically higher or lower than the forecast. If the forecast is consistently lower than the actual demand quantity, there is persistent under-forecasting and the tracking signal is positive.

The tracking signal (TS) is calculated as the cumulative error divided by the mean absolute deviation (MAD). The cumulative error can be positive or negative, so the TS can be positive or negative as well.

The TS must pass a threshold test to be significant. If the TS is greater than 3.75, there is persistent under-forecasting. If it is less than -3.75, there is persistent over-forecasting.

In essence, abs(TS) > 3.75 (that is, TS < -3.75 or TS > 3.75) implies a forecast bias.
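A small Python sketch of the calculation and the threshold test is shown below. The error is taken as actual minus forecast so that under-forecasting gives a positive TS, as described above; the function names and the sample figures are assumptions of this example.

```python
def tracking_signal(actual, forecast):
    """Cumulative error divided by the mean absolute deviation (MAD)."""
    errors = [a - f for a, f in zip(actual, forecast)]
    if not errors:
        raise ValueError("at least one period is needed")
    mad = sum(abs(e) for e in errors) / len(errors)
    if mad == 0:
        return 0.0  # forecast matched demand exactly, so there is no bias
    return sum(errors) / mad


def has_bias(ts, threshold=3.75):
    """True when the tracking signal is outside the +/- threshold band."""
    return abs(ts) > threshold


# Hypothetical data: the forecast is consistently too low.
actual = [108, 112, 110, 110, 110]
forecast = [100, 100, 100, 100, 100]
ts = tracking_signal(actual, forecast)
print(ts, has_bias(ts))  # 5.0 True
```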

So, what is magical about 3.75? It is an approximation based on the relationship between a normally distributed forecast error and the mean absolute deviation.

In general, for normally distributed errors, MAD is approximately 0.8 times the forecast error measured as RMSE (one sigma).

At a 99% promised service level, a 3 sigma level is used. Expressed in MAD, 3 sigma is 3 / 0.8 = 3.75 MAD, hence 3.75 as the threshold for TS.

Trend damping factor

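The damping parameter (\(\phi\)) can be set such that \(0 \leq \phi \leq 1\).

When using the Holt method with damping, the \(h\)-step-ahead forecast from the level \(l_{t}\) (\(L(t)\) above) and the trend \(b_{t}\) (\(T(t)\) above) can be calculated as follows: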

\(\hat{y}_{t+h|t}=l_{t}+(\phi+\phi^{2}+...+\phi^{h})b_{t}\).

If \(\phi=1\), the results are identical to the Holt method's.
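The damped forecast can be sketched in Python as follows. The function name and the sample level and trend values are assumptions made for this illustration.

```python
def damped_holt_forecast(level, trend, phi, h):
    """h-step-ahead forecast: level + (phi + phi^2 + ... + phi^h) * trend."""
    if not 0 <= phi <= 1:
        raise ValueError("phi must be between 0 and 1")
    damp_sum = sum(phi ** i for i in range(1, h + 1))
    return level + damp_sum * trend


# Hypothetical level 100 and trend 10.
print(damped_holt_forecast(100, 10, phi=1, h=3))    # 130
print(damped_holt_forecast(100, 10, phi=0.5, h=2))  # 107.5
```

With `phi=1` each step adds the full trend, which reproduces the undamped Holt forecast; with `phi` below 1 each further step adds a smaller share of the trend, so the forecast flattens out over long horizons.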

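When using the Holt-Winters method with damping, the forecast can be calculated as follows, where \(s_{t}\) is the seasonal factor (\(S(t)\) above), \(m\) is the number of seasonal periods (\(p\) above), and \(k\) is the integer part of \((h-1)/m\):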

\(\hat{y}_{t+h|t}=[l_{t}+(\phi+\phi^{2}+...+\phi^{h})b_{t}]s_{t+h-m(k+1)}\).

If \(\phi=1\), the results are identical to those of the Holt-Winters method.
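The seasonal index in the damped Holt-Winters formula is the least obvious part, so the sketch below spells it out. It assumes that \(m\) is the season length, that \(k\) is the integer part of \((h-1)/m\), and that the caller passes the seasonal factors of the most recent full season; the function and argument names are hypothetical.

```python
def damped_holt_winters_forecast(level, trend, last_season, phi, h):
    """h-step-ahead damped forecast with multiplicative seasonality.

    `last_season` holds the m most recent seasonal factors,
    s_{t-m+1} .. s_t, oldest first (an assumption of this sketch).
    """
    m = len(last_season)
    k = (h - 1) // m  # integer part of (h-1)/m
    # Position of s_{t+h-m(k+1)} within last_season:
    # (t + h - m*(k+1)) - (t - m + 1) = h - 1 - m*k
    season_index = h - 1 - m * k
    damp_sum = sum(phi ** i for i in range(1, h + 1))
    return (level + damp_sum * trend) * last_season[season_index]


# Hypothetical level 100, trend 10, and a two-period season.
for h in (1, 2, 3):
    print(h, damped_holt_winters_forecast(100, 10, [0.5, 1.5], phi=1, h=h))
# 1 55.0
# 2 180.0
# 3 65.0
```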