Link : Time Series Forecasting with Exogenous Features

func3_plot2

Notes on Methodology

  • The time series looks very different before January 2012, so we won’t use pre-2012 data for modeling
  • Strong correlations (both positive and negative) between ‘Number of Bicycle Hires’ and the following exogenous features - ‘Net Supply’, ‘Wind’, ‘holiday_flag’
  • Little correlation among the three exogenous features themselves, so including them in the model gives us separate pieces of information for forecasting the target
  • Choosing to forecast at weekly grain to get some smoothing; model selection / updates are also faster than with a daily-grain model
  • Weekly-grain models are also more manageable
  • We tried 4 different models - Prophet without / with exogenous features and Seasonal ARIMA (SARIMA) without / with exogenous features
  • Inspected the models visually by plotting
  • Then quantified model performance across a range of metrics - MAE / MAPE / RMSE / R-squared
  • Prophet with exogenous features was the best of the 4 models w.r.t. forecast accuracy and was also faster to train (which will help in productionization, or if we were to forecast at a finer granularity such as TFL hires in different counties within London)
  • PLEASE NOTE - Since we won’t have actuals for the exogenous features when forecasting out of sample, we will first need to forecast each exogenous feature with its own Prophet model and then use those forecasts as model inputs
  • Also, Prophet doesn’t require much manual inspection of parameters, unlike SARIMA models, where we’d need to study the ACF and PACF plots to determine the autoregressive, differencing and moving-average orders
  • Prophet also supports weekly seasonality in addition to annual seasonality, should we switch to daily grain
  • Prophet allows us to add changepoints to the time series - for example, if there are any major trend breaks in the future
  • Finally, to get reliable estimates of the model’s generalization performance, we ran a 5-fold rolling-window cross-validation and saw a mean MAPE of 12.6% and a mean RMSE of 29,752 (for reference, mean TFL hires per week is 189,643)
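
To make the evaluation steps above concrete, here is a minimal sketch of the four metrics and a rolling-window split, using only the Python standard library. It is not the code behind the numbers quoted above. The function names, the window sizes, the toy `weekly_hires` values and the naive "repeat the last training value" forecast are all assumptions for illustration; in practice the Prophet or SARIMA model would be refit on each training window.

```python
from math import sqrt
from statistics import mean


def mae(actual, predicted):
    """Mean absolute error."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actuals must be non-zero)."""
    return 100 * mean(abs(a - p) / abs(a) for a, p in zip(actual, predicted))


def rmse(actual, predicted):
    """Root mean squared error."""
    return sqrt(mean((a - p) ** 2 for a, p in zip(actual, predicted)))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def rolling_window_splits(n_obs, n_folds, train_size, horizon):
    """Yield (train_idx, test_idx) index ranges for a rolling window.

    Each fold uses a fixed-length training window followed by `horizon`
    test points; the window slides forward by `horizon` per fold.
    Folds that would run past the end of the series are skipped.
    """
    for k in range(n_folds):
        start = k * horizon
        train_end = start + train_size
        test_end = train_end + horizon
        if test_end > n_obs:
            break
        yield range(start, train_end), range(train_end, test_end)


# Toy weekly series (made-up numbers, NOT the TFL data).
weekly_hires = [180, 190, 185, 200, 210, 205, 195, 215, 220, 210, 225, 230]

fold_mape, fold_rmse = [], []
for train_idx, test_idx in rolling_window_splits(
    len(weekly_hires), n_folds=3, train_size=6, horizon=2
):
    train = [weekly_hires[i] for i in train_idx]
    test = [weekly_hires[i] for i in test_idx]
    # Stand-in model (assumption): repeat the last training value.
    forecast = [train[-1]] * len(test)
    fold_mape.append(mape(test, forecast))
    fold_rmse.append(rmse(test, forecast))

mean_mape = mean(fold_mape)
mean_rmse = mean(fold_rmse)
```

Averaging the per-fold scores, as in the last two lines, is how a single "mean MAPE" and "mean RMSE" figure is obtained. Comparing the mean RMSE with the mean weekly level of the series gives a sense of scale, which is why the reference value of mean hires per week is useful alongside it.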
