📈 Module 9

Time Series Analysis & Forecasting

⏱ 14 hours · Advanced · 6 topics

🎯 By the end: handle datetime-indexed data, decompose a series into trend/seasonality/noise, test for stationarity, forecast with ARIMA, exponential smoothing and ML lag features, and evaluate forecasts honestly with time-aware validation.

Sales, prices, traffic, sensor readings, demand — an enormous amount of real data is measured over time, and time series breaks the usual ML rules. The order matters, observations are correlated, and you can never shuffle the data. This module gives you the specialised toolkit: how to decompose a series into its parts, test whether it is forecastable, and predict the future with both the classic statistical methods (ARIMA, exponential smoothing) and the modern machine-learning approach — then evaluate the result without the time-leakage that fools so many beginners.

1. What makes time series different

A time series is data indexed by time. pandas treats a DatetimeIndex as a first-class citizen, unlocking resampling, rolling windows and time-based selection.

Datetime index & resampling

import pandas as pd

# Parse dates and set them as the index
df = pd.read_csv('sales.csv', parse_dates=['date'], index_col='date')

# Aggregate daily data to monthly totals
# ('ME' = month end, pandas >= 2.2; older versions use 'M')
monthly = df['sales'].resample('ME').sum()

# A 7-day rolling average smooths daily noise
df['smooth'] = df['sales'].rolling(window=7).mean()
print(monthly.head(3))

▶ Output

date
2023-01-31    45120
2023-02-28    41880
2023-03-31    49230
Freq: ME, Name: sales, dtype: int64
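
Time-based selection, the third capability mentioned above, can be sketched as follows. This is a minimal illustration on a synthetic daily series (the values and the variable names are assumptions, not the course's sales.csv), and it uses only standard pandas indexing.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series for 2023 (synthetic values, not the course's sales.csv)
idx = pd.date_range("2023-01-01", "2023-12-31", freq="D")
sales = pd.Series(np.arange(len(idx), dtype=float), index=idx, name="sales")

# Partial-string indexing: every row in February 2023
february = sales.loc["2023-02"]

# Label slicing on dates is inclusive at both ends
first_week = sales.loc["2023-01-01":"2023-01-07"]

# Boolean mask built from index attributes: Saturdays and Sundays only
weekends = sales[sales.index.dayofweek >= 5]

print(len(february), len(first_week), len(weekends))
```

Unlike positional slicing, date-label slices include the end date, which is easy to forget when carving out a test period.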

The three components of a series

Most series are a mix of: trend (long-term direction), seasonality (repeating cycles — weekly, yearly) and noise (the irregular rest).

An observed series is typically trend + seasonality + noise added together.

Never shuffle a time series. The whole point is that order carries information. The random train_test_split from Module 5 is forbidden here — you must split by time, training on the past and testing on the future.
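
A chronological split might look like the sketch below. The series is synthetic and the six-month horizon is an assumed choice; the point is only that the cut is made by position in time, never at random.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series (assumed data, for illustration only)
idx = pd.date_range("2020-01-01", periods=48, freq="MS")
y = pd.Series(np.linspace(100.0, 200.0, len(idx)), index=idx)

horizon = 6  # assumed: hold out the most recent 6 months
train, test = y.iloc[:-horizon], y.iloc[-horizon:]

# The same split expressed with a cutoff date instead of a row count
train_by_date = y.loc[:"2023-06-30"]
test_by_date = y.loc["2023-07-01":]

# Every training timestamp must precede every test timestamp
print(train.index.max().date(), "->", test.index.min().date())
```

A quick check that the latest training timestamp is earlier than the earliest test timestamp is a cheap guard worth keeping in any forecasting notebook.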

Key points

Use a pandas DatetimeIndex to unlock resample, rolling and time-based selection.
Series decompose into trend, seasonality and noise.
Never shuffle time-series data — order carries the signal; split by time, not randomly.

2. Decomposition & stationarity

Two diagnostics set up every forecast: decomposition separates the components so you can see them, and a stationarity test tells you whether the series' statistics are stable over time — a precondition for classic models like ARIMA.

Decompose the series

from statsmodels.tsa.seasonal import seasonal_decompose

result = seasonal_decompose(monthly, model='additive', period=12)
result.plot()      # shows observed / trend / seasonal / residual

Decomposition splits the observed series into trend, a repeating seasonal pattern, and residual noise.
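
To see what an additive decomposition does, here is a hand-rolled sketch of the classical approach: a centred moving average for the trend, then per-month averages of what is left. It runs on a synthetic series and is not the statsmodels implementation; details such as edge handling are assumptions of this sketch.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series: linear trend + yearly sine cycle + small noise
rng = np.random.default_rng(0)
idx = pd.date_range("2018-01-01", periods=72, freq="MS")
t = np.arange(len(idx))
true_seasonal = 10.0 * np.sin(2 * np.pi * t / 12)
y = pd.Series(100.0 + 0.5 * t + true_seasonal + rng.normal(0, 0.5, len(idx)), index=idx)

# Trend: centred 2x12 moving average (half weight on the two end months)
weights = np.r_[0.5, np.ones(11), 0.5] / 12.0
trend = y.rolling(window=13, center=True).apply(lambda w: np.dot(w, weights), raw=True)

# Seasonal: average detrended value per calendar month, centred to sum to zero
detrended = y - trend
month_effect = detrended.groupby(detrended.index.month).mean()
month_effect -= month_effect.mean()
seasonal = pd.Series(month_effect.loc[y.index.month].to_numpy(), index=y.index)

# Residual: whatever trend and seasonality do not explain
resid = y - trend - seasonal
print(month_effect.round(2))
```

The first and last six months have no trend estimate because the centred window runs off the ends of the data, which is why decomposition plots show gaps at the edges of the trend and residual panels.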

Test for stationarity (ADF)

from statsmodels.tsa.stattools import adfuller

stat, p = adfuller(monthly)[:2]
print(f'ADF statistic: {stat:.3f}')
print(f'p-value      : {p:.3f}')

# If non-stationary (p > 0.05), difference it to remove the trend
stat2, p2 = adfuller(monthly.diff().dropna())[:2]
print(f'After differencing p-value: {p2:.3f}')

▶ Output

ADF statistic: -1.842
p-value      : 0.361
After differencing p-value: 0.011

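p = 0.36 > 0.05, so the test cannot reject its null hypothesis of non-stationarity, and we treat the raw series as non-stationary (it trends). After one round of differencing (subtracting the previous value from each observation), p drops to 0.011, so the null is rejected and the differenced series can be treated as stationary, ready for ARIMA.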

Differencing is the d in ARIMA. It removes trend so the model sees a stable series. The Augmented Dickey-Fuller (ADF) test's null hypothesis is “non-stationary”, so a small p-value is what you want.
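
The effect of differencing is easiest to see on a series that is pure trend. The sketch below uses a synthetic straight-line series (an assumption for illustration) and also shows how a cumulative sum undoes the operation, which is how forecasts made on the differenced scale get back to the original units.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2022-01-01", periods=24, freq="MS")
# Synthetic series with a steady upward trend of 50 units per month
y = pd.Series(1000.0 + 50.0 * np.arange(len(idx)), index=idx)

# First difference: y[t] - y[t-1]; the first value is NaN and is dropped
dy = y.diff().dropna()
print(dy.unique())   # the trend collapses to a single constant step

# Differencing is reversible: start value + cumulative sum of the differences
rebuilt = y.iloc[0] + dy.cumsum()
print(np.allclose(rebuilt, y.iloc[1:]))
```

A real series would leave noise around that constant after differencing, but the trend itself is gone, which is what the ADF test picks up.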

Key points

seasonal_decompose separates a series into trend, seasonal and residual parts.
The ADF test checks stationarity; its null hypothesis is non-stationarity, so p < 0.05 is evidence that the series is stationary.
Differencing removes trend to make a series stationary — the d term in ARIMA.

3. Classic forecasting: ARIMA & exponential smoothing

Two workhorses dominate classical forecasting. ARIMA models a series from its own past values and past errors; exponential smoothing (Holt-Winters) weights recent observations more and handles trend and seasonality directly.

ARIMA(p, d, q)

from statsmodels.tsa.arima.model import ARIMA

# p=AR terms, d=differencing, q=MA terms
model = ARIMA(monthly, order=(1, 1, 1)).fit()
forecast = model.forecast(steps=6)

print('AIC:', round(model.aic, 1))
print(forecast.round(0))

▶ Output

AIC: 512.4
2024-01-31    50180.0
2024-02-29    49620.0
2024-03-31    51040.0
2024-04-30    50890.0
2024-05-31    51230.0
2024-06-30    51310.0
Freq: ME, Name: predicted_mean, dtype: float64

Holt-Winters for trend + seasonality

from statsmodels.tsa.holtwinters import ExponentialSmoothing

hw = ExponentialSmoothing(monthly, trend='add',
                          seasonal='add', seasonal_periods=12).fit()
hw_forecast = hw.forecast(6)
print('Holt-Winters AIC:', round(hw.aic, 1))
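
The phrase "weights recent observations more" comes down to one recursion. The sketch below implements simple exponential smoothing (level only, with no trend or seasonal terms, so it is a reduced version of Holt-Winters) on made-up numbers, with an assumed smoothing weight alpha = 0.5.

```python
import numpy as np
import pandas as pd

y = pd.Series([12.0, 15.0, 14.0, 18.0, 21.0, 19.0])
alpha = 0.5  # assumed smoothing weight; statsmodels would estimate it from the data

# Recursion: level[t] = alpha * y[t] + (1 - alpha) * level[t-1]
level = [y.iloc[0]]
for value in y.iloc[1:]:
    level.append(alpha * value + (1 - alpha) * level[-1])
level = pd.Series(level, index=y.index)

# pandas applies the same recursion when adjust=False
same_as_pandas = np.allclose(level, y.ewm(alpha=alpha, adjust=False).mean())

# The one-step-ahead forecast is simply the last smoothed level
print(same_as_pandas, level.iloc[-1])
```

Each older observation is multiplied by another factor of (1 - alpha), so its influence decays geometrically. Holt-Winters adds two analogous recursions, one for the trend and one for the seasonal pattern.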

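Choosing p, d, q. Use the ADF test for d, and ACF/PACF plots (or a library like pmdarima.auto_arima) to pick p and q. Compare candidate models by AIC (lower is better), but only among models of the same family fitted with the same d: AIC values are not comparable across different differencing orders, or between ARIMA and Holt-Winters. Always confirm the choice on a held-out future period.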
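
As a rough illustration of what an ACF plot displays, the sketch below simulates an AR(1) process and computes its sample autocorrelations with a small hand-written helper. The helper sample_acf is hypothetical and written for this example; in practice statsmodels provides plot_acf and plot_pacf in statsmodels.graphics.tsaplots.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag (hand-rolled helper, not a library call)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)])

# Simulate an AR(1) process: x[t] = phi * x[t-1] + noise
rng = np.random.default_rng(1)
phi, n = 0.7, 2000
noise = rng.normal(size=n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + noise[t]

acf = sample_acf(x, max_lag=5)
print(acf.round(2))  # expect values near phi**k: a geometric decay
```

A slowly decaying ACF paired with a PACF that cuts off after lag p is the usual signature of an AR(p) process; the mirror-image pattern suggests MA terms.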

Key points

ARIMA(p, d, q) forecasts from past values (AR), differencing (I) and past errors (MA).
Exponential smoothing (Holt-Winters) weights recent data and models trend + seasonality.
Pick orders with ACF/PACF or auto_arima; compare by AIC, then validate on the future.

4. Machine learning for forecasting

You can also turn forecasting into a supervised problem: engineer lag features (past values as columns) and let a model like gradient boosting learn the pattern. This shines when you have many related series or extra predictors (price, weather, promotions).

Build lag & rolling features

import pandas as pd

d = monthly.to_frame('sales')
d['lag_1']  = d['sales'].shift(1)       # last month
d['lag_12'] = d['sales'].shift(12)      # same month last year
d['roll_3'] = d['sales'].shift(1).rolling(3).mean()
d['month']  = d.index.month

d = d.dropna()
print(d.tail(3))

engineered features
date	sales	lag_1	lag_12	roll_3	month
2023-10-31	48900	47200	46100	47350	10
2023-11-30	52300	48900	49800	48000	11
2023-12-31	61500	52300	58200	49467	12

from sklearn.ensemble import GradientBoostingRegressor

X = d.drop(columns='sales')
y = d['sales']
# Time-ordered split: train on past, test on most recent
X_tr, X_te = X.iloc[:-6], X.iloc[-6:]
y_tr, y_te = y.iloc[:-6], y.iloc[-6:]

model = GradientBoostingRegressor(random_state=42).fit(X_tr, y_tr)
print('Test R2:', round(model.score(X_te, y_te), 3))

▶ Output

Test R2: 0.881

Mind the leakage. Every feature must use only information available before the point you are predicting. Note the .shift(1) on the rolling mean — without it, the window would include the current value you are trying to forecast. This subtle leak is the classic time-series mistake.
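
The difference is easy to verify on a tiny series. In this sketch (made-up values), the unshifted rolling mean on each row includes that row's own value, while the shifted version only sees earlier rows.

```python
import pandas as pd

s = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0])

leaky = s.rolling(3).mean()           # window ends AT the current row
safe = s.shift(1).rolling(3).mean()   # window ends one row BEFORE it

print(pd.DataFrame({"sales": s, "leaky": leaky, "safe": safe}))
```

On the row where sales is 40, the leaky feature is the mean of 20, 30 and 40, so the target is hiding inside its own predictor. The safe feature is the mean of 10, 20 and 30.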

Key points

Reframe forecasting as supervised learning using lag and rolling features.
ML forecasters handle extra predictors and many series well (e.g. gradient boosting).
Every feature must use only past information — shift before rolling to avoid leakage.

5. Evaluating forecasts honestly

Forecast evaluation has its own rules. You split by time, and you often backtest with a rolling origin to simulate forecasting repeatedly through history.

Always train on the past and test on the future — the chronological order must be preserved.
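
A rolling-origin backtest can be sketched with a small generator. The helper rolling_origin_splits below is hypothetical (written for this example, not a library function), the data are synthetic, and the forecaster is a naive one so the loop stays short; any model could be fitted inside it instead.

```python
import numpy as np
import pandas as pd

def rolling_origin_splits(n_obs, initial, horizon):
    """Yield (train_idx, test_idx) pairs with an expanding training window.

    Hypothetical helper written for this sketch, not a library function.
    """
    origin = initial
    while origin + horizon <= n_obs:
        yield np.arange(origin), np.arange(origin, origin + horizon)
        origin += horizon

# Synthetic monthly series (random walk with drift), for illustration only
rng = np.random.default_rng(2)
idx = pd.date_range("2019-01-01", periods=60, freq="MS")
y = pd.Series(100 + np.cumsum(rng.normal(1.0, 3.0, len(idx))), index=idx)

fold_mae = []
for train_idx, test_idx in rolling_origin_splits(len(y), initial=36, horizon=6):
    train, test = y.iloc[train_idx], y.iloc[test_idx]
    forecast = np.full(len(test), train.iloc[-1])   # naive: repeat the last value
    fold_mae.append(np.mean(np.abs(test.to_numpy() - forecast)))

print(len(fold_mae), "folds, mean MAE:", round(float(np.mean(fold_mae)), 2))
```

Averaging the error over several origins gives a steadier estimate than a single hold-out period. scikit-learn's TimeSeriesSplit offers similar expanding-window folds if you prefer a library splitter.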

The right metrics

from sklearn.metrics import mean_absolute_error, mean_absolute_percentage_error
import numpy as np

pred = model.predict(X_te)   # predict once, reuse for every metric
mae  = mean_absolute_error(y_te, pred)
mape = mean_absolute_percentage_error(y_te, pred) * 100
rmse = np.sqrt(((y_te - pred) ** 2).mean())

print(f'MAE : {mae:.0f}')
print(f'RMSE: {rmse:.0f}')
print(f'MAPE: {mape:.1f}%')

▶ Output

MAE : 2140
RMSE: 2630
MAPE: 4.3%
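
It helps to compute the three metrics once by hand to see how they differ. The numbers below are made up for the arithmetic and are unrelated to the sales example.

```python
import numpy as np

y_true = np.array([100.0, 200.0, 400.0])
y_pred = np.array([110.0, 190.0, 380.0])
err = y_true - y_pred                        # [-10, 10, 20]

mae = np.mean(np.abs(err))                   # average size of a miss, in the data's units
rmse = np.sqrt(np.mean(err ** 2))            # squares first, so large misses weigh more
mape = np.mean(np.abs(err / y_true)) * 100   # scale-free, but undefined when y_true is 0

print(f"MAE={mae:.2f}  RMSE={rmse:.2f}  MAPE={mape:.2f}%")
# prints: MAE=13.33  RMSE=14.14  MAPE=6.67%
```

RMSE is never smaller than MAE, and the gap widens when a few errors are much larger than the rest. MAPE becomes unstable for series that pass through or near zero, so treat it with care for intermittent demand.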

Always beat a naive baseline

# Naive one-step forecast: each month's prediction is the previous month's actual value
# (a seasonal-naive variant would reuse the value from the same month last year)
naive_pred = y_te.shift(1).fillna(y_tr.iloc[-1])
naive_mae  = mean_absolute_error(y_te, naive_pred)
print(f'Naive MAE: {naive_mae:.0f}  vs  Model MAE: {mae:.0f}')

▶ Output

Naive MAE: 3580  vs  Model MAE: 2140

If you can't beat “tomorrow = today”, your model adds nothing. The naive (or seasonal-naive) forecast is the bar every model must clear. Many fancy models quietly fail this test — checking it keeps you honest.
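
For seasonal data the seasonal-naive baseline is often the tougher one to beat. The sketch below compares the two on a synthetic, perfectly periodic monthly series (an idealised assumption). Unlike the one-step code above, both forecasts here are issued once, from the end of the training data, for the whole 12-month horizon.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series with a strong yearly cycle and no trend
idx = pd.date_range("2020-01-01", periods=48, freq="MS")
pattern = np.array([5, 3, 4, 6, 8, 9, 12, 11, 8, 6, 7, 14], dtype=float)
y = pd.Series(np.tile(pattern, 4), index=idx)

train, test = y.iloc[:-12], y.iloc[-12:]

# Naive: every future month equals the last observed value
naive = pd.Series(train.iloc[-1], index=test.index)

# Seasonal naive: each future month equals the same month one year earlier
seasonal_naive = pd.Series(train.iloc[-12:].to_numpy(), index=test.index)

def mae(pred):
    """Mean absolute error against the held-out test values."""
    return float((test - pred).abs().mean())

print("naive MAE:", mae(naive), " seasonal-naive MAE:", mae(seasonal_naive))
```

On real data neither baseline will be perfect, but whichever scores better is the bar a model has to clear.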

Key points

Split chronologically (train on past, test on future); backtest with a rolling origin.
Report MAE, RMSE and MAPE — MAPE gives an intuitive percentage error.
Always compare against a naive/seasonal-naive baseline; beating it is the minimum bar.

6. A practical forecasting workflow

Tying it together into a repeatable process — and meeting Prophet, a friendly library for business forecasts with holidays and changing trends.

Forecast with Prophet

from prophet import Prophet

# Prophet expects columns named 'ds' (date) and 'y' (value)
frame = monthly.reset_index()
frame.columns = ['ds', 'y']

m = Prophet(yearly_seasonality=True)
m.fit(frame)

future = m.make_future_dataframe(periods=6, freq='ME')
fc = m.predict(future)
print(fc[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail(3))

Prophet forecast with uncertainty band
ds	yhat	yhat_lower	yhat_upper
2024-04-30	50890	47710	54050
2024-05-31	51230	47980	54600
2024-06-30	51310	47820	54910

A good forecast comes with an uncertainty band — never report a single line as if it were certain.

Always show uncertainty. A forecast without a prediction interval is dangerously overconfident. Whether the range comes from ARIMA, Prophet or a quantile model, present it: decision-makers need to know how wrong you might be.
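
When a model does not supply an interval, one simple option is to build an empirical one from past forecast errors. The sketch below does this for a naive one-step forecast on synthetic data; it assumes future errors will resemble past ones, which is a strong assumption, and it is not how ARIMA or Prophet compute their bands.

```python
import numpy as np
import pandas as pd

# Synthetic monthly history (random walk), for illustration only
rng = np.random.default_rng(3)
idx = pd.date_range("2015-01-01", periods=120, freq="MS")
y = pd.Series(500 + np.cumsum(rng.normal(0.0, 10.0, len(idx))), index=idx)

# One-step naive forecast errors observed over the history
errors = (y - y.shift(1)).dropna()

# Assumption: future one-step errors are distributed like past ones
q_low, q_high = errors.quantile([0.05, 0.95])

point = y.iloc[-1]                        # naive forecast for next month
lower, upper = point + q_low, point + q_high
print(f"forecast {point:.0f}, 90% interval [{lower:.0f}, {upper:.0f}]")
```

The same idea extends to any model: collect its backtest errors per horizon and add their quantiles to the point forecast. Intervals for longer horizons should be built from errors at those horizons, since uncertainty grows the further ahead you look.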

The workflow checklist

Plot the raw series; fix gaps and outliers.
Decompose; test stationarity; difference if needed.
Try a naive baseline, a classic model (ARIMA/Holt-Winters/Prophet) and an ML model.
Validate by time (backtest), report MAE/MAPE with an uncertainty band, and beat the baseline.

Key points

Prophet offers easy business forecasting with seasonality, holidays and uncertainty bands.
Always report a forecast with its prediction interval, never a bare single line.
Repeatable workflow: plot → decompose/test → baseline + models → time-validated, uncertainty-aware report.

★ Hands-on Project — Forecast a Real Time Series

Build an end-to-end forecast on real temporal data, validated honestly and compared against a baseline.

Load a time series (e.g. airline passengers, retail sales, energy demand, or a stock's monthly close) with a DatetimeIndex.
Plot it; handle missing dates and outliers; resample to a sensible frequency (e.g. monthly).
Decompose into trend/seasonal/residual and run an ADF test; difference if it is non-stationary.
Hold out the last 6–12 periods as a time-ordered test set (never shuffle).
Fit at least three forecasters: a naive baseline, a classic model (ARIMA or Holt-Winters or Prophet), and an ML model with lag features.
Evaluate all three on the held-out future with MAE, RMSE and MAPE; confirm each beats (or fails) the naive baseline.
Produce a final forecast for the next periods WITH an uncertainty band and plot history + forecast together.
Write up which model won and why, note the leakage checks you made, and commit the notebook to your portfolio.

Ready to test yourself?

Take the module quiz. Score 70% or more to mark this module complete.
