sales prediction with time series modelingcs229.stanford.edu/proj2015/219_poster.pdfsales...

Sales Prediction with Time Series ModelingGautam Shine, Sanjib Basak

Nonlinearity induced by hidden layer

𝑦𝑗 = 𝑏𝑗 +

𝑖=1

𝑤𝑖𝑗 𝑥𝑖

Parameters 𝑏𝑗 and 𝑤𝑖𝑗 learned from data

Autoregression can be included through lagged inputs

Optimization is non-convex, averaging needed

• Order (5,2,0) chosen by minimizing Akaike information, which is a regularized maximum likelihood estimate

• Fourier terms used to introduce multiple-seasonality

• Predictions are too smooth to capture sales spikes

Neural Net

• 10 autoregression lags and 14 hidden layers, chosen by minimizing generalization error

• Averaged prediction of 100 nets initialized with random seeds

• Qualitatively captures sales spikes with lower MSE than ARIMA, but low daily accuracy

ARIMA + Regression

• Regression with indicator vectors for special days like Black Friday and Christmas greatly improves ARIMA

• Captures sales spikes while retaining daily accuracy

• Lower MSE than the neural net

Autoregression (AR)

𝑟𝑡 = 𝑐 + 𝜖𝑡 +

𝑖=1

𝜙𝑡−𝑖 𝑟𝑡−𝑖

Moving Average (MA)

𝑟𝑡 = 𝜇 + 𝜖𝑡 +

𝑖=1

𝜃𝑡−𝑖 𝜖𝑡−𝑖

Autoregressive Integrated Moving Average (ARIMA)

𝑑𝑡 = 𝑐 + 𝜖𝑡 +

𝑖=1

𝜙𝑡−𝑖 𝑑𝑡−𝑖 +

𝑖=1

𝜃𝑡−𝑖 𝜖𝑡−𝑖

Conventional Time Series Models

Feed-Forward Neural Networks

Sales forecasting is critical for retailers

• optimal stocking of products

• website stability under peak traffic

• planning, customer support, and marketing

Anomalies like Black Friday are especially difficult to capture with models

Example data from online sales of tech products:

Forecast ResultsPredicting Sales

Hidden Node Count and Autoregression Order

• Erratic behavior can be observed when the number of hidden nodes is low

• Autoregression order made little difference beyond 5 or so

• Both chosen to minimize test MSE, but nearby values could also have been reasonable

Learning Curve

• Curve is non-monotonic since data is not i.i.d. and set size affects prediction interval

• Neural nets require more data to achieve the same forecast accuracy, but can exceed ARIMA with large data sets

Model Fitting

sales prediction with time series modelingcs229.stanford.edu/proj2015/219_poster.pdfsales...

Documents

playing tetris with genetic algorithms - machine...

aladdin what can we learn from movie...

detection identification - machine...

evaluation blind audio source separation pipeline and...

calorie estimation from fast food...

pedestrian detection with r-cnn - machine...

classification of irish and scandinavian folk music by dance...

local ancestry inference in admixed populations -...

blind audio source separation pipeline and algorithm...

earthquake-induced structural damage classi cation...

eyes around the world learning to cluster and alert live...

finding poverty in satellite images - stanford...

img001 - cs229: machine...

sales prediction with time series modeling - machine...

reinforcement learning for adaptive traffic signal control...

robocop: crime classification and prediction in san...

your next personal trainer: instant evaluation of exercise...

pedestrian detection with rcnn - machine...

1 multifaceted predictive algorithms in commodity...

learning of visualization of object recognition features...