Predict Future Sales using Forecasting Techniques

- Pentaho


Demand Forecasting is a field of predictive analytics that uses historical sales data to estimate expected customer demand. Demand Forecasting is done on a scientific basis. Overestimating demand leads to overstock, whereas underestimating it leads to stock-outs, so many valued customers may not get the products they want. Demand Forecasting is performed to ensure that the company has enough supply to meet demand.

Demand Forecasting is the key to better supply chain performance. It is the pivotal business process around which strategic & operational plans, budgeting, financial planning, sales & marketing plans, production planning, inventory management, risk assessment and mitigation plans of a company are formulated. Historical sales data is analysed using time-series analysis and deep learning models.

Qualitative forecasting methods are used when data is not available or the data present is irrelevant to the forecast. Quantitative forecasting methods are used when numerical information about the past is available and it is reasonable to assume that some aspects of the past patterns will continue in the future. Let’s have a look at the different types of qualitative and quantitative forecasting models:

Qualitative Forecasting Methods

Survey of Buyer’s Choice:

This is the most feasible method when the demand needs to be forecasted in the short run (usually a year).

Collective Opinion or Sales Force Composite Method:

It is based on the idea that salesmen are closest to the customers and will easily understand their demands. Individual estimates of future sales in their regions are aggregated to calculate the total estimated future sales. This method of short-term forecasting is particularly useful for sales of new products.

Market Experiment Method:

The demand is forecasted by conducting market studies and experiments on consumer behaviour under actual but controlled market conditions. This method is expensive and time-consuming.

Expert Opinion Method:

Experts are given a series of carefully designed questionnaires and are asked to predict demand. The opinions are shared with the other experts to arrive at a conclusion. This is a fast and cheap method, but it cannot produce estimates for individual market segments.

Quantitative Forecasting Methods

Traditional Forecasting Methods:

Traditional forecasting methods are based on time-series forecasting approaches. Forecasting is based on historical time series which is a sequence of data points measured at successive intervals of time. Time-series methods include:

Naive Method: All forecasts are set to be the value of the last observation.

Average Method: All forecasts are set to be the average of the historical data.
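The two baseline methods above are simple enough to sketch in a few lines. The function names and the sample sales figures below are made up for illustration only.

```python
def naive_forecast(history, horizon):
    """Naive method: repeat the last observation for every future period."""
    return [history[-1]] * horizon


def average_forecast(history, horizon):
    """Average method: repeat the mean of all past observations."""
    return [sum(history) / len(history)] * horizon


monthly_sales = [120, 135, 128, 150, 142]  # hypothetical units sold per month
print(naive_forecast(monthly_sales, 3))    # [142, 142, 142]
print(average_forecast(monthly_sales, 3))  # [135.0, 135.0, 135.0]
```

Baselines like these are mainly useful as a yardstick: a more complex model should at least beat them.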

Simple Exponential Smoothing (SES):

SES is used to forecast uni-variate data that has no clear pattern (no trend and no seasonality). Forecasts are weighted averages of past observations, with the weights decaying exponentially as the observations get older. SES is equivalent to an ARIMA model with one non-seasonal difference, one moving-average term and no constant term [ARIMA(0,1,1) model without constant].
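A minimal sketch of the SES recursion, assuming the level is initialised with the first observation (one common choice among several) and that the smoothing parameter alpha is supplied by the caller rather than estimated. The function name is hypothetical.

```python
def ses_forecast(history, alpha, horizon):
    """Simple exponential smoothing with a caller-supplied alpha in (0, 1]."""
    level = history[0]  # assumption: start the level at the first observation
    for y in history[1:]:
        # New level = weighted average of the newest observation and the old level.
        level = alpha * y + (1 - alpha) * level
    # SES forecasts are flat: every future period gets the final level.
    return [level] * horizon


print(ses_forecast([10, 20], alpha=0.5, horizon=2))  # [15.0, 15.0]
```

With alpha close to 1 the forecast tracks the latest observation (alpha = 1 reduces to the naive method); with alpha close to 0 it leans on older history.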

Holt’s Linear Trend Method:

Extension of SES that adds support for trends in uni-variate time-series.
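As a rough sketch, Holt's method keeps two smoothed quantities, a level and a trend, and extrapolates the trend linearly. The initialisation used here (first observation as level, first difference as trend) and the function name are assumptions made for illustration.

```python
def holt_forecast(history, alpha, beta, horizon):
    """Holt's linear trend method with caller-supplied alpha and beta."""
    if len(history) < 2:
        raise ValueError("need at least two observations")
    level = history[0]               # assumed initial level
    trend = history[1] - history[0]  # assumed initial trend
    for y in history[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    # The h-step-ahead forecast adds h times the final trend to the final level.
    return [level + h * trend for h in range(1, horizon + 1)]


print(holt_forecast([10, 12, 14, 16], alpha=0.5, beta=0.5, horizon=2))  # [18.0, 20.0]
```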

Holt Winters’ Seasonal Method:

Extension of Holt’s method that adds support for seasonality in uni-variate time series.
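The additive variant of Holt-Winters can be sketched as follows. This is an illustration under stated assumptions, not a tuned implementation: the initial level, trend and seasonal terms come from a simple heuristic over the first two seasons, and the three smoothing parameters are passed in by the caller. The function name is hypothetical.

```python
def holt_winters_additive(history, period, alpha, beta, gamma, horizon):
    """Additive Holt-Winters; needs at least two full seasons of history."""
    if len(history) < 2 * period:
        raise ValueError("need at least two full seasons")
    first, second = history[:period], history[period:2 * period]
    level = sum(first) / period                       # assumed initial level
    trend = (sum(second) - sum(first)) / period ** 2  # assumed initial trend
    seasonal = [y - level for y in first]             # assumed initial seasonal terms
    for t in range(period, len(history)):
        y, s = history[t], seasonal[t % period]
        prev_level, prev_trend = level, trend
        level = alpha * (y - s) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        seasonal[t % period] = gamma * (y - prev_level - prev_trend) + (1 - gamma) * s
    n = len(history)
    # Forecast = level + extrapolated trend + the matching seasonal term.
    return [level + h * trend + seasonal[(n + h - 1) % period]
            for h in range(1, horizon + 1)]


quarterly_like = [10, 20, 10, 20, 10, 20]  # hypothetical series with period 2
print(holt_winters_additive(quarterly_like, 2, 0.5, 0.5, 0.5, 3))  # [10.0, 20.0, 10.0]
```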

ARIMA (Auto-regressive Integrated Moving Average) Model:

ARIMA aims to describe the auto-correlations in the data. It is used for uni-variate data that is stationary, or that can be made stationary by differencing, and that does not contain anomalies. Seasonal ARIMA also takes seasonality into account. ARIMA is parametric and flexible, and it can be more accurate than simpler models when its assumptions hold. It is comparatively more difficult to configure than Simple Exponential Smoothing.
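To make the "integrated" and "auto-regressive" parts concrete, here is a deliberately small sketch: difference the series once, fit a single auto-regressive coefficient on the differences by least squares, then undo the differencing to forecast. This roughly corresponds to an ARIMA(1,1,0) without constant. It is a teaching aid with hypothetical function names, not a substitute for a proper ARIMA library, which would also handle moving-average terms and order selection.

```python
def difference(series):
    """First-order differencing: the 'I' in ARIMA with d = 1."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(series):
    """Least-squares estimate of phi in x_t = phi * x_(t-1) + error (no constant)."""
    numerator = sum(prev * cur for prev, cur in zip(series, series[1:]))
    denominator = sum(prev * prev for prev in series[:-1])
    return numerator / denominator  # assumes the differences are not all zero


def arima_110_forecast(history, horizon):
    diffs = difference(history)
    phi = fit_ar1(diffs)
    forecasts, last_value, last_diff = [], history[-1], diffs[-1]
    for _ in range(horizon):
        last_diff = phi * last_diff         # forecast the next difference
        last_value = last_value + last_diff  # integrate back to the original scale
        forecasts.append(last_value)
    return forecasts


print(arima_110_forecast([0, 8, 12, 14, 15], horizon=2))  # [15.5, 15.75]
```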


TBATS Model:

TBATS models time-series with multiple seasonalities. TBATS has the capability to deal with complex seasonalities (non-integer seasonality, non-nested seasonality and large-period seasonality). TBATS is slow to estimate in case of long time-series.



Prophet Model:

It is a decomposable model (trend + seasonality + holidays). It is a Bayesian curve-fitting method that is accurate, fast and fully automatic (finds seasonal trends). It is robust to missing data and shifts in trend, and handles outliers well. The intuitive parameters are easy to tune.

Intermittent Demand Forecasting Methods:

Intermittent demand (ID), also known as sporadic demand, is demand that occurs irregularly, with many periods of zero demand. The most influential intermittent forecasting models are:

Croston’s Method (Forecasting and Stock Control for Intermittent Demand):

It estimates the time between two demand occurrences and the average demand level when a demand occurs. On intermittent series it performs better than Simple Exponential Smoothing. Drawbacks include the lack of independent smoothing parameters for demand size and interval size. It is also a biased method.
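A minimal sketch of Croston's method, assuming a single smoothing parameter shared by demand size and interval (the drawback mentioned above) and initialising both estimates from the first non-zero demand. The function name and sample data are illustrative.

```python
def croston_forecast(demand, alpha):
    """Return the estimated demand per period for an intermittent series."""
    size = None        # smoothed size of non-zero demands
    interval = None    # smoothed number of periods between non-zero demands
    periods_since = 1  # periods elapsed since the last non-zero demand
    for d in demand:
        if d > 0:
            if size is None:
                # Assumption: initialise from the first observed demand.
                size, interval = d, periods_since
            else:
                size = alpha * d + (1 - alpha) * size
                interval = alpha * periods_since + (1 - alpha) * interval
            periods_since = 1
        else:
            periods_since += 1  # estimates are only updated when demand occurs
    if size is None:
        return 0.0  # no demand was ever observed
    return size / interval


print(croston_forecast([0, 4, 0, 0, 6], alpha=0.5))  # 2.0
```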


Hierarchical Forecasting Framework:

Hierarchical time series arise when data can be grouped at several levels, for example by geographic division. The hierarchical forecasting framework aims to provide better forecasts by reconciling the levels with either a top-down or a bottom-up approach.
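The two approaches can be illustrated with a two-level hierarchy (regions under a national total). In this sketch the top-down split uses each region's share of historical sales, which is only one of several possible proportioning rules; the function names and numbers are made up.

```python
def bottom_up(region_forecasts):
    """Bottom-up: forecast each region, then sum to get the total."""
    return sum(region_forecasts.values())


def top_down(total_forecast, region_history):
    """Top-down: forecast the total, then split it by historical share."""
    grand_total = sum(sum(sales) for sales in region_history.values())
    return {region: total_forecast * sum(sales) / grand_total
            for region, sales in region_history.items()}


history = {"north": [30, 30], "south": [10, 10]}  # hypothetical past sales
print(bottom_up({"north": 70, "south": 30}))      # 100
print(top_down(100, history))                     # {'north': 75.0, 'south': 25.0}
```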

Machine Learning/Artificial Intelligence Models:

Machine learning models use complex mathematical techniques to select variables and optimise fit in instances where there may be complicated interactions between features. ML/AI models are great for discovering non-linear and complex relationships in the data without the need to pre-select the model type or make assumptions about external factors. They are highly tunable, less sensitive to outliers and can reliably process large volumes of data. However, ML/AI models require lots of input data and are difficult to retrain once trained. The results of these models can be arduous for non-technical audiences to interpret.

Regression Model:

Models with a single independent variable are called simple regression models. Models with one dependent variable and two or more independent variables are called multiple regression models. Regression models provide demand forecasts of the dependent variable along with useful managerial information for adapting to the events that cause the dependent variable to change.
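A simple regression model can be fitted with ordinary least squares in a few lines. The example below assumes price as the single independent variable and demand as the dependent variable; the data and function name are hypothetical.

```python
def fit_simple_regression(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


price = [1, 2, 3, 4]       # hypothetical unit prices
demand = [90, 80, 70, 60]  # hypothetical units sold at each price
intercept, slope = fit_simple_regression(price, demand)
print(intercept, slope)       # 100.0 -10.0
print(intercept + slope * 5)  # forecast demand at price 5: 50.0
```

The slope is the kind of managerial information mentioned above: in this toy data, each unit of price increase is associated with ten fewer units sold.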

Random Forest:

Random forest builds multiple decision trees and merges their predictions to get a more accurate and stable prediction than any individual decision tree would give. It handles large data sets with high dimensionality. It reduces over-fitting and typically achieves high accuracy. The disadvantages of random forest are: inability to use data that is close to the present, complexity and longer processing time.
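The idea of building many trees on resampled data and averaging them can be sketched without any library. The code below is a stripped-down stand-in, not a full random forest: each "tree" is a single-split regression stump on one feature, and there is no random feature selection. All names are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """One-split regression tree: choose the threshold with the lowest squared error."""
    best = None
    for threshold in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        error = (sum((y - left_mean) ** 2 for y in left)
                 + sum((y - right_mean) ** 2 for y in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:  # every x in the sample is identical: predict the mean
        mean_y = sum(ys) / len(ys)
        return (float("inf"), mean_y, mean_y)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x < threshold else right_mean


def fit_bagged_stumps(xs, ys, n_trees=25, seed=0):
    """Fit each stump on a bootstrap sample (sampling rows with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in rows], [ys[i] for i in rows]))
    return forest


def predict_forest(forest, x):
    """Merge the trees by averaging their individual predictions."""
    return sum(predict_stump(stump, x) for stump in forest) / len(forest)


weeks = [1, 2, 3, 4, 5, 6, 7, 8]           # hypothetical week index
sales = [10, 12, 14, 16, 18, 20, 22, 24]   # hypothetical steadily growing sales
forest = fit_bagged_stumps(weeks, sales)
print(predict_forest(forest, 100) <= max(sales))  # True
```

The last line shows a well-known limitation of tree ensembles: because every prediction is an average of training targets, the forecast for a far-future week can never exceed the largest value seen in training, even when sales are clearly trending upward.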

Recurrent Neural Network (RNN):

RNN looks at the past and its decisions are influenced by insights from the past. RNN can take one or more input vectors and produce one or more output vectors (Output is influenced by weights applied on the inputs and hidden state vector). RNN captures hidden correlations and deals with short dependencies. RNNs are difficult to train.
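The role of the hidden state can be seen in a scalar toy version of an RNN. Real networks use weight matrices and learn them by training; here the weights are fixed, hypothetical numbers chosen only to show how each output depends on earlier inputs.

```python
import math


def rnn_forward(inputs, w_x, w_h, b, h0=0.0):
    """Run a one-unit RNN over a sequence and return every hidden state."""
    h = h0
    states = []
    for x in inputs:
        # The new state mixes the current input with the previous hidden state.
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states


with_spike = rnn_forward([1.0, 0.0], w_x=1.0, w_h=1.0, b=0.0)
without_spike = rnn_forward([0.0, 0.0], w_x=1.0, w_h=1.0, b=0.0)
# The second input is 0.0 in both runs, yet the second state differs,
# because the first run still "remembers" the earlier input.
print(with_spike[1] != without_spike[1])  # True
```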

Long Short-Term Memory (LSTM):

It is a variation of RNN that is capable of dealing with long dependencies. It combines short-term memory with long-term memory through gate control. An LSTM cell consists of four interacting neural network layers.
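One step of an LSTM cell can be written out with scalars to show the four interacting layers and the two kinds of memory. This is a sketch with hypothetical parameter names; real implementations use vectors and matrices and learn the weights during training.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step; params maps 'f', 'i', 'g', 'o' to (w_x, w_h, b) tuples."""
    def layer(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = layer("f", sigmoid)    # forget gate: how much old cell state to keep
    i = layer("i", sigmoid)    # input gate: how much new information to admit
    g = layer("g", math.tanh)  # candidate values for the cell state
    o = layer("o", sigmoid)    # output gate: how much of the cell state to expose
    c = f * c_prev + i * g     # long-term memory (cell state)
    h = o * math.tanh(c)       # short-term memory (hidden state)
    return h, c


zero_params = {name: (0.0, 0.0, 0.0) for name in "figo"}  # illustrative weights
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, params=zero_params)
print(c)  # 1.0, because the forget gate is 0.5 and the candidate is 0.0
```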

Convolutional Neural Network (CNN):

CNN consists of:

Convolutional layer -> extracts different features of the input -> deeper layers extract more complex features from the output of earlier layers.

Pooling layer -> combines outputs of neuron clusters at one layer into a single neuron in the next layer.

Fully-connected layers -> combines all local features into global features to calculate final result.

CNN can automatically extract and learn features. CNN works on high-dimensional data with minimal pre-processing.
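The three layer types listed above can be sketched for a one-dimensional series such as daily sales. The kernel and weights here are fixed by hand purely for illustration (a real CNN learns them), max pooling is assumed as the pooling rule, and the function names are hypothetical.

```python
def conv1d(signal, kernel):
    """Slide the kernel over the signal (cross-correlation, as in most deep learning libraries)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def relu(values):
    """Non-linearity usually applied after a convolutional layer."""
    return [max(0.0, v) for v in values]


def max_pool(values, size):
    """Summarise each non-overlapping window of `size` values by its maximum."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


def dense(values, weights, bias):
    """Fully-connected layer with one output: a weighted sum of all features."""
    return sum(v * w for v, w in zip(values, weights)) + bias


daily_sales = [1, 2, 4, 7, 11]                 # hypothetical input series
features = conv1d(daily_sales, [-1, 1])        # kernel that detects day-to-day change
pooled = max_pool(relu(features), 2)
print(features)                                # [1, 2, 3, 4]
print(pooled)                                  # [2, 4]
print(dense(pooled, [0.5, 0.5], 0.0))          # 3.0
```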


CNN is effective in load forecasting. To increase accuracy and stability, CNN is integrated with LSTM for supporting long input sequences.
The architecture involves using CNN layers for feature extraction on input data combined with LSTMs to support sequence prediction.


Both qualitative and quantitative forecasting techniques can be used to calculate future demand, for a more well-rounded perspective. Businesses should take the time to model both short-term and long-term demand. Short-term forecasts provide data for inventory planning, replenishment and procurement activities. Long-term forecasts provide data for major investment and strategic decisions. No single forecasting model is best for every case, so the model must be selected according to the data set to get the best fit.