Time Series Data
Learn to work with data that changes over time
What is Time Series?
Time series is data collected over time:
- Daily stock prices
- Monthly sales
- Hourly temperature
- Yearly population
The key: data is ordered by time.
Set Date as Index
For time series, make the date column your index:
code.py
import pandas as pd
df = pd.DataFrame({
'Date': ['2024-01-01', '2024-01-02', '2024-01-03'],
'Sales': [100, 150, 120]
})
# Convert to datetime and set as index
df['Date'] = pd.to_datetime(df['Date'])
df = df.set_index('Date')
print(df)
Output:
            Sales
Date
2024-01-01    100
2024-01-02    150
2024-01-03    120
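Dates often arrive as text in a file rather than in a hand-built DataFrame. The following is a minimal sketch, assuming a small in-memory CSV (the `csv_text` contents are made up for illustration), of how `read_csv` can parse the dates and set the index in one step. It also sorts the index, since the rows here are deliberately out of order.

```python
import io
import pandas as pd

# Hypothetical CSV content; rows are deliberately out of order
csv_text = """Date,Sales
2024-01-03,120
2024-01-01,100
2024-01-02,150
"""

# parse_dates converts the column to datetime; index_col makes it the index
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], index_col="Date")

# Put the rows in chronological order
df = df.sort_index()

print(type(df.index).__name__)           # DatetimeIndex
print(df.index.is_monotonic_increasing)  # True
print(df)
```

Checking that the index is a `DatetimeIndex` is a quick way to confirm the conversion worked before using date-based selection.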
Select by Date
code.py
# Get specific date
print(df.loc['2024-01-02'])
# Get date range
print(df.loc['2024-01-01':'2024-01-02'])
# Get specific month
print(df.loc['2024-01']) # All of January
Resample - Change Time Frequency
Convert daily data to weekly or monthly:
code.py
# Daily sales data
df = pd.DataFrame({
'Date': pd.date_range('2024-01-01', periods=30),
'Sales': range(100, 130)
})
df = df.set_index('Date')
# Convert to weekly sum
weekly = df.resample('W').sum()
print(weekly)
# Convert to monthly average
# ('ME' = month end in pandas 2.2+; on older versions use 'M')
monthly = df.resample('ME').mean()
print(monthly)
Common Resample Frequencies
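| Code | Meaning |
|---|---|
| D | Day |
| W | Week (ending Sunday) |
| M (ME in pandas 2.2+) | Month end |
| MS | Month start |
| Q (QE in pandas 2.2+) | Quarter end |
| Y (YE in pandas 2.2+) | Year end |
| H (h in pandas 2.2+) | Hour |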
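To see what a frequency code does to the labels, it helps to resample a series small enough to check by hand. This is a sketch with made-up sales figures; the `daily` and `weekly` names are only for illustration. It applies two aggregations at once with `agg`.

```python
import pandas as pd

# 14 days of made-up daily sales, starting on Monday 2024-01-01
daily = pd.DataFrame(
    {"Sales": range(100, 114)},
    index=pd.date_range("2024-01-01", periods=14, freq="D"),
)

# 'W' groups into weeks ending on Sunday; each label is the week's last day
weekly = daily["Sales"].resample("W").agg(["sum", "mean"])
print(weekly)
```

The result has two rows, labelled 2024-01-07 and 2024-01-14, with sums of 721 and 770. Each label marks the end of its week, not the start.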
Rolling Average (Moving Average)
A rolling average smooths out day-to-day fluctuations:
code.py
df = pd.DataFrame({
'Date': pd.date_range('2024-01-01', periods=10),
'Sales': [100, 120, 90, 150, 130, 140, 160, 110, 180, 170]
})
df = df.set_index('Date')
# 3-day rolling average (the first two rows are NaN: fewer than 3 values so far)
df['Rolling_Avg'] = df['Sales'].rolling(3).mean()
print(df)
Shift Data (Lag/Lead)
Use shift() to compare each value with the previous or next period:
code.py
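# Yesterday's sales (values move down one row; the first row becomes NaN)
df['Yesterday'] = df['Sales'].shift(1)
# Tomorrow's sales (values move up one row; the last row becomes NaN)
df['Tomorrow'] = df['Sales'].shift(-1)
# Change from yesterday (same result as df['Sales'].diff())
df['Change'] = df['Sales'] - df['Sales'].shift(1)
Percent Change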
code.py
# Daily percent change (the first row is NaN: there is no previous day)
df['Pct_Change'] = df['Sales'].pct_change() * 100
print(df)
Create Date Range
code.py
# Create 7 days
dates = pd.date_range('2024-01-01', periods=7)
# Create January
jan_dates = pd.date_range('2024-01-01', '2024-01-31')
# Create weekly dates ('W' means weeks ending Sunday, so the first date is 2024-01-07)
weekly_dates = pd.date_range('2024-01-01', periods=12, freq='W')
# Create monthly dates (first day of each month)
monthly_dates = pd.date_range('2024-01-01', periods=12, freq='MS')
Key Points
- Time series data is ordered by time; use the date column as the index
- resample() changes frequency (daily→weekly)
- rolling() calculates moving averages
- shift() gets previous/next values
- pct_change() calculates growth rate
- date_range() creates date sequences
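The rolling, shift, and percent-change methods can be checked together on a series small enough to verify by hand. This is a minimal sketch with made-up values. The `summary` table and the `Rolling_Min1` column are illustrative names, and `min_periods=1` is an optional `rolling()` argument that averages whatever values are available instead of returning NaN at the start.

```python
import pandas as pd

# Four days of made-up sales
sales = pd.Series(
    [100, 120, 90, 150],
    index=pd.date_range("2024-01-01", periods=4, freq="D"),
    name="Sales",
)

summary = pd.DataFrame({
    "Sales": sales,
    # NaN until three values are available
    "Rolling_Avg": sales.rolling(3).mean(),
    # Averages over however many values exist so far (up to three)
    "Rolling_Min1": sales.rolling(3, min_periods=1).mean(),
    # Difference from the previous day
    "Change": sales - sales.shift(1),
    # Percent difference from the previous day
    "Pct_Change": sales.pct_change() * 100,
})
print(summary.round(2))
```

On the last day the 3-day average is (120 + 90 + 150) / 3 = 120, the change is 150 - 90 = 60, and the percent change is 60 / 90, about 66.67%.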
What's Next?
Learn to find and remove duplicate rows in your data.