Naive Methods | Learn Statistics Free - SkillsetMaster | Learn Data Analytics Free

What You'll Learn

Naive forecasting methods
Seasonal naive approach
Drift method
When simple methods work best
Benchmarking forecasts

Naive Forecast

Simplest method possible!

Formula: F_{t+1} = Y_t

Meaning: Tomorrow's forecast = Today's actual value

Example:

Today's sales: $1000
Forecast for tomorrow: $1000

When it works:

Random walk data
Short-term forecasts
Stable patterns
Benchmark comparison

Why Naive Methods Matter

1. Baseline for comparison: If your complex model can't beat naive, why use it?

2. Sometimes they win: For certain data, naive is hard to beat.

3. Simple to explain: Stakeholders understand them easily.

4. Fast to compute: No training needed.

5. Robust: With no parameters to fit, they can't overfit.

Naive Forecast Example

Weekly sales:

Week 1: 100
Week 2: 105
Week 3: 110
Week 4: 108

Naive forecasts:

Forecast for Week 2: 100
Forecast for Week 3: 105
Forecast for Week 4: 110
Forecast for Week 5: 108

Simple!
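The weekly example above can be reproduced in a few lines. This is a minimal sketch; the variable names are illustrative and not part of the lesson.

```python
# Weekly sales from the example, in order (Week 1 to Week 4).
weekly_sales = [100, 105, 110, 108]

# Naive rule: the forecast for week t+1 is the actual value of week t.
forecast_by_week = {
    week + 1: value for week, value in enumerate(weekly_sales, start=1)
}

for week, forecast in forecast_by_week.items():
    print(f"Forecast for Week {week}: {forecast}")
```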

Seasonal Naive

For data with seasonality:

Formula: F_{t+h} = Y_{t+h-m}

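Where m = seasonal period (the formula as written applies for horizons h ≤ m; for longer horizons, keep repeating the last observed season)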

Meaning: Forecast = Same period last season

Example (monthly, yearly seasonality): Forecast for June 2024 = June 2023 actual

When it works:

Strong seasonality
Stable seasonal patterns
Retail, tourism data

Seasonal Naive Example

Monthly sales (2 years):

Jan 2023: 100, Feb 2023: 90, Mar 2023: 110
Jan 2024: 120, Feb 2024: 108, Mar 2024: ?

Seasonal naive (m=12):

Forecast for Mar 2024 = Mar 2023 = 110

Better than simple naive: It captures the yearly pattern, whereas simple naive would just repeat the Feb 2024 value (108).
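A seasonal naive forecaster can be sketched as a small function that repeats the last observed season. The function name `seasonal_naive` and its arguments are assumptions made for illustration. The demo reuses the five-value series with m=4 from the "Comparing the Three" section of this lesson, because the monthly example lists only a few months.

```python
def seasonal_naive(history, m, h):
    """Forecast h steps ahead by repeating the last full season of length m."""
    if len(history) < m:
        raise ValueError("need at least one full season of history")
    last_season = history[-m:]
    # Step k (1-based) reuses the value at the same position in the last season,
    # wrapping around when the horizon is longer than one season.
    return [last_season[(k - 1) % m] for k in range(1, h + 1)]

series = [100, 110, 105, 115, 120]
print(seasonal_naive(series, m=4, h=1))  # next period only
print(seasonal_naive(series, m=4, h=5))  # wraps into a second season
```

Indexing from the end of the history keeps the function independent of calendar labels, so the same code works for monthly (m=12), quarterly (m=4), or daily data with a weekly cycle (m=7).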

Drift Method

Allows for trend:

Formula: F_{t+h} = Y_t + h × [(Y_t - Y_1) / (t-1)]

Meaning: Last value + drift × horizon

Drift: Average change per period

Example:

Period 1: 100
Period 2: 105
Period 3: 110
Period 4: 115

Drift = (115-100)/(4-1) = 5 per period

Forecast for Period 5 = 115 + 1×5 = 120
Forecast for Period 6 = 115 + 2×5 = 125
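The drift calculation above translates directly into code. This is a sketch under the lesson's formula; the function name `drift_forecast` is an illustrative choice.

```python
def drift_forecast(history, h):
    """Return drift-method forecasts for horizons 1..h."""
    if len(history) < 2:
        raise ValueError("need at least two observations to estimate drift")
    # Average change per period: (last value - first value) / (t - 1).
    drift = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + k * drift for k in range(1, h + 1)]

print(drift_forecast([100, 105, 110, 115], h=2))  # [120.0, 125.0]
```

Note that only the first and last observations affect the drift, so a single unusual endpoint can shift every forecast.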

Comparing the Three

Data: 100, 110, 105, 115, 120

Naive: Forecast = 120 (last value)

Seasonal Naive (m=4): Forecast = 110 (the value 4 periods before the forecast period, i.e. the 2nd observation)

Drift: Drift = (120-100)/(5-1) = 5, so Forecast = 120 + 5 = 125

Which is best? Depends on data pattern!

Mean Forecast

Another simple method:

Formula: F_{t+1} = Mean of all historical data

Example:

Historical: 100, 110, 105, 115, 120
Mean = 110
Forecast = 110 (for all future periods!)

When it works:

No trend
No seasonality
Data fluctuates around constant mean
Very stable process

Usually not recommended: Too simplistic for most real data

Residual Diagnostics

Even naive methods need checking!

Good residuals should be:

Uncorrelated (no pattern)
Zero mean
Constant variance
Normally distributed (ideally)

If the residuals show a pattern: The method has left information uncaptured, so try a different method.
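Two of the checks above (zero mean, no correlation) can be approximated numerically. The sketch below computes the residuals of the naive method and their lag-1 autocorrelation with NumPy. The helper names are illustrative, and the data is the six-value series from the Python section of this lesson. With so few points the numbers are only indicative.

```python
import numpy as np

def naive_residuals(history):
    """Residuals of the naive method: actual minus the previous actual."""
    y = np.asarray(history, dtype=float)
    return y[1:] - y[:-1]

def lag1_autocorrelation(residuals):
    """Sample autocorrelation of the residuals at lag 1."""
    r = np.asarray(residuals, dtype=float)
    centered = r - r.mean()
    denom = np.sum(centered ** 2)
    if denom == 0:
        return 0.0  # constant residuals: no variation to correlate
    return float(np.sum(centered[1:] * centered[:-1]) / denom)

data = [100, 105, 110, 108, 115, 120]
res = naive_residuals(data)
print("Residuals:", res.tolist())
print("Mean residual:", res.mean())
print("Lag-1 autocorrelation:", round(lag1_autocorrelation(res), 3))
```

A mean residual well above zero, as here, suggests the naive method is systematically under-forecasting an upward trend, which is the situation the drift method is meant for.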

Forecast Accuracy Metrics

MAE (Mean Absolute Error): Average of |Actual - Forecast|

RMSE (Root Mean Squared Error): √[Average of (Actual - Forecast)²]

MAPE (Mean Absolute Percentage Error): Average of |Actual - Forecast| / |Actual| × 100% (undefined when an actual value is 0)

Example:

Actual: 100, 105, 110
Forecast: 98, 107, 108

MAE = (2+2+2)/3 = 2
RMSE = √[(4+4+4)/3] = √4 = 2
MAPE = (2.0% + 1.9% + 1.8%)/3 ≈ 1.9%
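The three metrics can be checked with plain Python. This is a minimal sketch with illustrative function names, applied to the actual and forecast values from the example.

```python
import math

def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)

actual = [100, 105, 110]
forecast = [98, 107, 108]

print(f"MAE:  {mae(actual, forecast):.2f}")
print(f"RMSE: {rmse(actual, forecast):.2f}")
print(f"MAPE: {mape(actual, forecast):.2f}%")
```

When every error has the same magnitude, as in this example, MAE and RMSE coincide. RMSE exceeds MAE only when the errors vary in size.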

Benchmark Performance

Common practice: Compare your model to naive methods

Example:

Naive MAE: 5
Your model MAE: 4.5
Improvement: 10%

If your model is worse than naive: Something's wrong! Debug or simplify.

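MASE (Mean Absolute Scaled Error): Your model's MAE divided by the MAE of the naive forecast (in the standard definition, the one-step naive MAE on the training data). A value below 1 means your model beats naive.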
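A sketch of MASE, assuming the common definition in which the scale is the one-step naive MAE on the training data. The function name and the 4/2 train/holdout split are illustrative choices. The values come from the six-value series in the Python section of this lesson, and the model being scored is the drift method.

```python
def mase(train, actual, forecast):
    """Model MAE on the holdout, scaled by the in-sample one-step naive MAE."""
    if len(train) < 2:
        raise ValueError("need at least two training observations")
    naive_errors = [abs(curr - prev) for prev, curr in zip(train, train[1:])]
    scale = sum(naive_errors) / len(naive_errors)
    if scale == 0:
        raise ValueError("training series is constant; MASE is undefined")
    model_mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return model_mae / scale

train = [100, 105, 110, 108]   # first four observations
holdout = [115, 120]           # last two observations, kept for testing

# Drift-method forecasts for the two holdout periods.
drift = (train[-1] - train[0]) / (len(train) - 1)
drift_forecasts = [train[-1] + k * drift for k in (1, 2)]

score = mase(train, holdout, drift_forecasts)
print(f"MASE: {score:.3f}")
```

Here the score comes out above 1. That is not surprising, since two-step-ahead holdout errors are being compared with one-step training errors on a very short series.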

When Naive Methods Excel

Stock prices: Often follow random walk → Naive is hard to beat!

Stable processes: Manufacturing output → Last value is good guess

Short-term forecasts: Tomorrow's weather temp → Use today's temp

High-frequency data: Hourly traffic → Last hour informative

When to Use More Complex Methods

Strong trend: Use exponential smoothing or regression

Strong seasonality: Use seasonal methods (Holt-Winters, SARIMA)

Multiple predictors: Use regression models

Long-term forecasts: Need models that capture patterns

External factors: Use causal models

Excel Implementation

Naive:

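=B2  (entered in C3 and copied down, assuming actuals are in column B starting at B2)

Seasonal Naive (m=12):

=B2  (entered in C14 and copied down; each forecast references the actual 12 rows back)

Drift:

=LAST_VALUE + FORECAST_HORIZON * (LAST_VALUE - FIRST_VALUE)/(COUNT-1)

For example, with actuals in B2:B13 and the horizon in E1 (an assumed layout):

=$B$13 + $E$1*($B$13-$B$2)/(COUNT($B$2:$B$13)-1)

Calculate errors:

=ABS(B3-C3)   (absolute error; AVERAGE of this column gives MAE)
=(B3-C3)^2    (squared error; AVERAGE of this column gives MSE, and SQRT of that gives RMSE)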

Python Implementation

import pandas as pd
import numpy as np

# Data
data = [100, 105, 110, 108, 115, 120]
df = pd.DataFrame({'actual': data})

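# Naive forecast: each row's forecast is the previous row's actual
# (the first row has no forecast, so it is NaN)
df['naive'] = df['actual'].shift(1)

# Seasonal naive with an illustrative period of m=3:
# each row's forecast is the actual from 3 rows earlier
df['seasonal_naive'] = df['actual'].shift(3)

# Drift: average change per period over the whole history
first_value = df['actual'].iloc[0]
last_value = df['actual'].iloc[-1]
n = len(df)
drift = (last_value - first_value) / (n - 1)

# Forecast the next (not yet observed) value
naive_forecast = last_value
drift_forecast = last_value + drift

# Calculate MAE of the naive forecast
# (mean() skips the NaN in the first row)
df['naive_error'] = np.abs(df['actual'] - df['naive'])
mae = df['naive_error'].mean()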

print(f"Naive forecast: {naive_forecast}")
print(f"Drift forecast: {drift_forecast}")
print(f"MAE: {mae}")

Combination Forecasts

Ensemble approach: Average multiple methods

Example: Combined = 0.5×Naive + 0.5×Seasonal_Naive

Often better than individual: Reduces risk of one method failing

Optimal weights: Can be estimated from historical performance
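A weighted combination can be sketched as below. The function name `combine_forecasts` is illustrative. The inputs are the naive (120) and seasonal naive (110) forecasts from the "Comparing the Three" section, combined with the equal weights of the example.

```python
def combine_forecasts(forecasts, weights=None):
    """Weighted average of several point forecasts for the same period."""
    if weights is None:
        # Default to equal weights.
        weights = [1 / len(forecasts)] * len(forecasts)
    if len(weights) != len(forecasts):
        raise ValueError("need exactly one weight per forecast")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights should sum to 1")
    return sum(w * f for w, f in zip(weights, forecasts))

naive_value = 120           # last observed value
seasonal_naive_value = 110  # value one season (m=4) before the forecast period

combined = combine_forecasts([naive_value, seasonal_naive_value], [0.5, 0.5])
print(f"Combined forecast: {combined}")
```

Equal weights are a reasonable default when there is too little history to estimate weights reliably.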

Real-World Example

Daily website traffic:

Weekly pattern (e.g., weekdays differ from weekends): Use seasonal naive (m=7)

No clear weekly pattern: Use simple naive

Growing trend: Use drift method

Test all three: Choose based on MAE/RMSE

Cross-Validation for Naive Methods

Time series cross-validation:

Process:

Use data 1-10 to forecast 11
Use data 1-11 to forecast 12
Use data 1-12 to forecast 13
Calculate average error

Gives realistic performance estimate: Simulates actual forecasting scenario
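The expanding-window process above can be sketched as a loop that refits on a growing training set and scores the next observation. The function name `rolling_origin_mae`, the two forecaster helpers, and the choice of `min_train=3` are assumptions for illustration. The data is the six-value series from the Python section.

```python
def rolling_origin_mae(series, forecaster, min_train):
    """One-step-ahead MAE using an expanding training window."""
    errors = []
    for end in range(min_train, len(series)):
        train = series[:end]            # everything observed so far
        prediction = forecaster(train)  # forecast the next point
        errors.append(abs(series[end] - prediction))
    return sum(errors) / len(errors)

def naive(train):
    """Naive method: repeat the last value."""
    return train[-1]

def drift(train):
    """Drift method: last value plus the average change per period."""
    return train[-1] + (train[-1] - train[0]) / (len(train) - 1)

data = [100, 105, 110, 108, 115, 120]
naive_cv_mae = rolling_origin_mae(data, naive, min_train=3)
drift_cv_mae = rolling_origin_mae(data, drift, min_train=3)

print(f"Naive CV MAE: {naive_cv_mae:.3f}")
print(f"Drift CV MAE: {drift_cv_mae:.3f}")
```

Each forecast uses only data available before the point being predicted, which is what makes the error estimate realistic.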

Limitations of Naive Methods

Cannot capture:

Complex patterns
Multiple seasonality
Trend changes
External influences
Non-linear relationships

No built-in prediction intervals: These must be derived from the residuals (for example, with a bootstrap)

Fixed pattern assumption: Future like past
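As a sketch of the bootstrap idea mentioned above, the code below builds an approximate prediction interval for the naive forecast by resampling past one-step changes. The function name, the number of simulated paths, and the fixed seed are all illustrative assumptions. The approach also assumes that future errors resemble past ones. With only five residuals the interval is very rough.

```python
import numpy as np

def naive_bootstrap_interval(history, h=1, level=0.95, n_paths=2000, seed=0):
    """Approximate prediction interval for the h-step naive forecast."""
    y = np.asarray(history, dtype=float)
    residuals = np.diff(y)  # one-step naive errors: y[t] - y[t-1]
    rng = np.random.default_rng(seed)
    # Each simulated path adds h resampled errors to the last observed value.
    draws = rng.choice(residuals, size=(n_paths, h), replace=True)
    endpoints = y[-1] + draws.sum(axis=1)
    alpha = (1 - level) / 2
    lower, upper = np.quantile(endpoints, [alpha, 1 - alpha])
    return float(lower), float(upper)

data = [100, 105, 110, 108, 115, 120]
lower, upper = naive_bootstrap_interval(data, h=1)
print(f"Approximate 95% interval for the next value: {lower} to {upper}")
```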

When Naive Actually Wins

Research shows: For some time series, naive beats complex models!

Reasons:

Overfitting in complex models
Data is truly random walk
Sample size too small for complex models
Structural breaks favor simple methods

M-competitions: Naive methods competitive in forecasting contests!

Practice Exercise

Quarterly sales:

Q1 2023: 100
Q2 2023: 120
Q3 2023: 110
Q4 2023: 130
Q1 2024: 105

Calculate forecasts for Q2 2024:

Naive
Seasonal naive (m=4)
Drift method

Answers:

Naive: 105 (last value)
Seasonal naive: 120 (Q2 2023)
Drift: (105-100)/(5-1) = 1.25 per quarter, Forecast = 105 + 1×1.25 = 106.25
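The answers can be verified with a short script. This is a sketch in which the variable names are illustrative.

```python
# Quarterly sales from the exercise: Q1 2023 through Q1 2024.
quarterly_sales = [100, 120, 110, 130, 105]
m = 4  # seasonal period: four quarters per year

# Naive: repeat the last observed value.
naive_q2 = quarterly_sales[-1]

# Seasonal naive: the value m periods before the forecast period (Q2 2023).
seasonal_naive_q2 = quarterly_sales[-m]

# Drift: last value plus the average change per quarter.
drift_per_quarter = (quarterly_sales[-1] - quarterly_sales[0]) / (len(quarterly_sales) - 1)
drift_q2 = quarterly_sales[-1] + drift_per_quarter

print(f"Naive: {naive_q2}")
print(f"Seasonal naive: {seasonal_naive_q2}")
print(f"Drift: {drift_q2}")
```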

Forecasting Principles

From naive methods we learn:

1. Simplicity matters: Start simple, add complexity only if needed.

2. Benchmark everything: Always compare to naive.

3. Recent data is informative: The last value carries information.

4. Patterns repeat: Seasonality is real and useful.

5. Validate properly: Use out-of-sample testing.

Next Steps

Learn about ETS Framework!

Tip: Never skip naive methods - they're your forecasting benchmark!