#1 Data Analytics Program in India
₹2,499₹1,499Enroll Now
6 min read min read

Pivot Tables

Learn to create summary tables like in Excel with pivot_table()

Pivot Tables

What is a Pivot Table?

A pivot table is like a summary report.

Imagine you have thousands of sales records. A pivot table shows you:

  • Total sales by store (rows)
  • Broken down by product (columns)

Just like Excel pivot tables!

Basic Pivot Table

code.py
import pandas as pd

sales = pd.DataFrame({
    'Store': ['NYC', 'NYC', 'LA', 'LA'],
    'Product': ['Apple', 'Banana', 'Apple', 'Banana'],
    'Sales': [100, 150, 120, 180]
})

print("Raw Data:")
print(sales)

# Create pivot table
pivot = pd.pivot_table(
    sales,
    values='Sales',      # What to calculate
    index='Store',       # Rows
    columns='Product'    # Columns
)
print("\nPivot Table:")
print(pivot)

Output:

Raw Data: Store Product Sales 0 NYC Apple 100 1 NYC Banana 150 2 LA Apple 120 3 LA Banana 180 Pivot Table: Product Apple Banana Store LA 120 180 NYC 100 150

Easy to read summary!

Choose the Calculation (aggfunc)

By default, pivot_table calculates the mean. Change it:

code.py
sales = pd.DataFrame({
    'Store': ['NYC', 'NYC', 'LA', 'LA', 'LA'],
    'Product': ['Apple', 'Apple', 'Apple', 'Banana', 'Banana'],
    'Sales': [100, 120, 150, 180, 200]
})

# Sum instead of mean
pivot = pd.pivot_table(
    sales,
    values='Sales',
    index='Store',
    columns='Product',
    aggfunc='sum'
)
print(pivot)

Output:

Product Apple Banana Store LA 150 380 NYC 220 NaN

Common aggfunc Options

  • 'sum' - Total
  • 'mean' - Average (default)
  • 'count' - How many
  • 'min' - Smallest
  • 'max' - Largest

Fill Missing Values

code.py
pivot = pd.pivot_table(
    sales,
    values='Sales',
    index='Store',
    columns='Product',
    aggfunc='sum',
    fill_value=0    # Fill NaN with 0
)
print(pivot)

Output:

Product Apple Banana Store LA 150 380 NYC 220 0

Add Row and Column Totals

code.py
pivot = pd.pivot_table(
    sales,
    values='Sales',
    index='Store',
    columns='Product',
    aggfunc='sum',
    fill_value=0,
    margins=True,           # Add totals
    margins_name='Total'    # Name for totals
)
print(pivot)

Output:

Product Apple Banana Total Store LA 150 380 530 NYC 220 0 220 Total 370 380 750

Grand total is 750.

Multiple Values

Calculate sum AND count:

code.py
pivot = pd.pivot_table(
    sales,
    values='Sales',
    index='Store',
    columns='Product',
    aggfunc=['sum', 'count']
)
print(pivot)

Multiple Index Rows

code.py
data = pd.DataFrame({
    'Year': [2023, 2023, 2024, 2024],
    'Store': ['NYC', 'LA', 'NYC', 'LA'],
    'Sales': [100, 150, 120, 180]
})

pivot = pd.pivot_table(
    data,
    values='Sales',
    index=['Year', 'Store'],    # Two row levels
    aggfunc='sum'
)
print(pivot)

Output:

Sales Year Store 2023 LA 150 NYC 100 2024 LA 180 NYC 120

Practice Example

code.py
import pandas as pd

# Monthly sales data
orders = pd.DataFrame({
    'Month': ['Jan', 'Jan', 'Jan', 'Feb', 'Feb', 'Feb'],
    'Region': ['East', 'West', 'East', 'East', 'West', 'West'],
    'Product': ['A', 'A', 'B', 'A', 'B', 'A'],
    'Sales': [100, 150, 200, 120, 180, 160]
})

print("Order Data:")
print(orders)

# Sales by Region and Product
pivot1 = pd.pivot_table(
    orders,
    values='Sales',
    index='Region',
    columns='Product',
    aggfunc='sum',
    fill_value=0,
    margins=True
)
print("\nSales by Region and Product:")
print(pivot1)

# Sales by Month and Region
pivot2 = pd.pivot_table(
    orders,
    values='Sales',
    index='Month',
    columns='Region',
    aggfunc='sum',
    fill_value=0
)
print("\nSales by Month and Region:")
print(pivot2)

Pivot Table vs GroupBy

Both can summarize data, but:

FeaturePivot TableGroupBy
Output2D tableSeries or DataFrame
ColumnsAuto-createdNeed unstack()
TotalsEasy with marginsManual
Excel-likeYesNo
code.py
# GroupBy way
grouped = sales.groupby(['Store', 'Product'])['Sales'].sum().unstack(fill_value=0)

# Pivot Table way
pivot = pd.pivot_table(sales, values='Sales', index='Store',
                        columns='Product', aggfunc='sum', fill_value=0)

# Both give same result!

Key Points

  • pivot_table() creates summary tables
  • values = what to calculate
  • index = row labels
  • columns = column labels
  • aggfunc = how to calculate (sum, mean, count)
  • fill_value=0 fills missing data
  • margins=True adds totals

Common Mistakes

Mistake 1: Forgetting aggfunc defaults to mean

code.py
# This calculates MEAN, not sum
pd.pivot_table(data, values='Sales', index='Store')

# For sum, specify it
pd.pivot_table(data, values='Sales', index='Store', aggfunc='sum')

Mistake 2: Confusing index and columns

code.py
# index = rows, columns = columns
# Think: index goes DOWN, columns go ACROSS

What's Next?

You learned pivot tables. Next, you'll learn Cross Tabulation - counting combinations of categories.

SkillsetMaster - AI, Web Development & Data Analytics Courses