Apply and Map Functions

Why Use Apply and Map?

Sometimes you need a transformation that pandas doesn't provide as a built-in method.

Examples:

Convert temperature from Celsius to Fahrenheit
Clean messy text in a specific way
Calculate custom scores

Apply and map let you run any function on your data.
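
As a quick illustration of the first example in the list above, here is a minimal sketch of a Celsius to Fahrenheit conversion. The city names, the readings, and the to_fahrenheit helper are made up for this example:

```python
import pandas as pd

# Hypothetical temperature readings in Celsius
temps = pd.DataFrame({
    'City': ['Oslo', 'Cairo', 'Lima'],
    'Celsius': [0, 35, 20]
})

def to_fahrenheit(celsius):
    """Convert a single Celsius value to Fahrenheit."""
    return celsius * 9 / 5 + 32

# Run the custom function on every value in the column
temps['Fahrenheit'] = temps['Celsius'].apply(to_fahrenheit)
print(temps)
```

The same pattern works for any function that takes one value and returns one value.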

Map - Transform Single Column

Map replaces each value in a column with its match from a dictionary:

code.py

import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike'],
    'Grade': ['A', 'B', 'A']
})

# Create a mapping dictionary
grade_points = {'A': 4.0, 'B': 3.0, 'C': 2.0}

# Map grades to points
df['Points'] = df['Grade'].map(grade_points)
print(df)

Output:

    Name Grade  Points
0   John     A     4.0
1  Sarah     B     3.0
2   Mike     A     4.0

Map with a Function

code.py

df = pd.DataFrame({
    'Name': ['john', 'sarah', 'mike'],
    'Age': [25, 30, 28]
})

# Convert names to uppercase using a function
df['Name'] = df['Name'].map(str.upper)
print(df)

Output:

    Name  Age
0   JOHN   25
1  SARAH   30
2   MIKE   28
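
If the column contains missing values, a function like str.upper fails on them. One possible way around this, sketched here with a made-up Series, is the na_action='ignore' option of map, which skips missing values instead of passing them to the function:

```python
import pandas as pd

# Hypothetical data with one missing name
names = pd.Series(['john', None, 'mike'])

# str.upper(None) would raise a TypeError, so skip missing values
upper_names = names.map(str.upper, na_action='ignore')
print(upper_names)
```

The missing entry stays missing, and the other names are converted as before.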

Apply - Run Function on Column

Apply runs a function on every value in a column. It can also work on whole rows or columns of a DataFrame, as later sections show:

code.py

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike'],
    'Salary': [50000, 60000, 55000]
})

# Custom function
def add_bonus(salary):
    return salary * 1.10  # 10% bonus

# Apply to column
df['With_Bonus'] = df['Salary'].apply(add_bonus)
print(df)

Output:

    Name  Salary  With_Bonus
0   John   50000     55000.0
1  Sarah   60000     66000.0
2   Mike   55000     60500.0
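
Apply can also forward extra arguments to your function. The sketch below assumes a variation of add_bonus with a rate parameter (not part of the original function), so the bonus size can be chosen at call time:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike'],
    'Salary': [50000, 60000, 55000]
})

def add_bonus(salary, rate=0.10):
    """Return the salary increased by `rate` (hypothetical parameter)."""
    return salary * (1 + rate)

# Extra keyword arguments are passed through to add_bonus
df['With_Bonus'] = df['Salary'].apply(add_bonus, rate=0.5)
print(df)
```

This avoids writing a separate function for every bonus percentage.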

Apply with Lambda

A lambda is a short way to write a simple function inline:

code.py

# Same as above, but shorter
df['With_Bonus'] = df['Salary'].apply(lambda x: x * 1.10)

More examples:

code.py

# Double the value
df['Double'] = df['Salary'].apply(lambda x: x * 2)

# Check if above average
avg = df['Salary'].mean()
df['Above_Avg'] = df['Salary'].apply(lambda x: 'Yes' if x > avg else 'No')

print(df)

Apply to Entire DataFrame

Apply a function to each column:

code.py

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6]
})

# Get max of each column
print(df.apply(max))

Output:

A    3
B    6
dtype: int64

Apply a function to each row:

code.py

# Sum of each row
df['Row_Sum'] = df.apply(sum, axis=1)
print(df)

Output:

   A  B  Row_Sum
0  1  4        5
1  2  5        7
2  3  6        9
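
To see the difference between the two directions side by side, here is a small sketch using a custom function. The value_range helper is made up for this example; it receives a whole column or a whole row as a Series:

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6]
})

def value_range(values):
    """Spread between the largest and smallest value (illustrative helper)."""
    return values.max() - values.min()

# axis=0 (default): one result per column, indexed by column name
per_column = df.apply(value_range)

# axis=1: one result per row, indexed by row label
per_row = df.apply(value_range, axis=1)

print(per_column)
print(per_row)
```

The result has one entry per column in the first call and one entry per row in the second.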

Apply with Multiple Columns

Access multiple columns in one function:

code.py

df = pd.DataFrame({
    'Price': [100, 200, 150],
    'Quantity': [5, 3, 4]
})

# Calculate total using both columns
df['Total'] = df.apply(lambda row: row['Price'] * row['Quantity'], axis=1)
print(df)

Output:

   Price  Quantity  Total
0    100         5    500
1    200         3    600
2    150         4    600
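
When the row logic needs more than one expression, a named function is usually easier to read than a lambda. The sketch below assumes a made-up rule: orders with fewer than 4 items pay a flat shipping fee of 20.

```python
import pandas as pd

df = pd.DataFrame({
    'Price': [100, 200, 150],
    'Quantity': [5, 3, 4]
})

def order_total(row):
    """Total for one row, plus a hypothetical shipping fee for small orders."""
    total = row['Price'] * row['Quantity']
    if row['Quantity'] < 4:
        total += 20  # assumed flat fee, for illustration only
    return total

# axis=1 passes each row to order_total as a Series
df['Total'] = df.apply(order_total, axis=1)
print(df)
```

Only the second row has fewer than 4 items, so only that row gets the fee added.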

Applymap - Apply to Every Cell

Apply a function to every single cell:

code.py

df = pd.DataFrame({
    'A': [1.234, 2.567],
    'B': [3.891, 4.123]
})

# Round every number
df_rounded = df.applymap(lambda x: round(x, 1))
print(df_rounded)

Output:

     A    B
0  1.2  3.9
1  2.6  4.1

Note: applymap() is deprecated as of pandas 2.1. In pandas 2.1 and later, use df.map() instead.
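
If your code has to run on both older and newer pandas versions, one possible approach is to check which method is available. This is only a sketch; the round_one helper is made up for this example:

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1.234, 2.567],
    'B': [3.891, 4.123]
})

def round_one(x):
    """Round a single value to one decimal place."""
    return round(x, 1)

# Use DataFrame.map where it exists, otherwise fall back to applymap
if hasattr(pd.DataFrame, 'map'):
    df_rounded = df.map(round_one)
else:
    df_rounded = df.applymap(round_one)

print(df_rounded)
```

Both branches call round_one on every cell, so the result is the same either way.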

Practice Example

code.py

import pandas as pd

# Employee data
employees = pd.DataFrame({
    'Name': ['john doe', 'sarah smith', 'mike johnson'],
    'Department': ['sales', 'it', 'sales'],
    'Salary': [50000, 70000, 55000],
    'Years': [3, 5, 2]
})

print("Original data:")
print(employees)

# 1. Title-case names (capitalize each word)
employees['Name'] = employees['Name'].apply(str.title)

# 2. Map departments to codes
dept_codes = {'sales': 'S', 'it': 'I', 'hr': 'H'}
employees['Dept_Code'] = employees['Department'].map(dept_codes)

# 3. Calculate raise based on years
def calculate_raise(years):
    if years >= 5:
        return 0.15  # 15% raise
    elif years >= 3:
        return 0.10  # 10% raise
    else:
        return 0.05  # 5% raise

employees['Raise_Pct'] = employees['Years'].apply(calculate_raise)

# 4. Calculate new salary
employees['New_Salary'] = employees.apply(
    lambda row: row['Salary'] * (1 + row['Raise_Pct']),
    axis=1
)

print("\nProcessed data:")
print(employees)

Map vs Apply vs Applymap

Method	Works On	Best For
map	Series (column)	Replacing values with a dictionary or function
apply	Series or DataFrame	Custom functions on values, rows, or columns
applymap (DataFrame.map in pandas 2.1+)	DataFrame	Every cell

Key Points

map(): Replace values using dictionary or function
apply(): Run any function on a column, or on each row or column of a DataFrame
lambda: Short way to write simple functions
axis=1: Apply function to each row
axis=0: Apply function to each column (default)

Common Mistakes

Mistake 1: Forgetting axis for row operations

code.py

# This applies to columns (default)
df.apply(sum)

# For rows, add axis=1
df.apply(sum, axis=1)

Mistake 2: Map with values missing from the dictionary

code.py

mapping = {'A': 1, 'B': 2}
df['Grade'].map(mapping)  # Any grade not in the dictionary (e.g. 'C') becomes NaN

# Replace the resulting NaN with a default value
df['Grade'].map(mapping).fillna(0)
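
Before filling the gaps with a default, it can help to see which values the dictionary does not cover. A minimal sketch, using a made-up grades Series:

```python
import pandas as pd

# Hypothetical grades, including one that is not in the mapping
grades = pd.Series(['A', 'B', 'C', 'A'])
mapping = {'A': 1, 'B': 2}

# Find values that have no entry in the dictionary
unmapped = grades[~grades.isin(list(mapping))].unique()
print(unmapped)

# Mapping leaves NaN wherever the value was not found
points = grades.map(mapping)
print(points)
```

Checking first lets you decide whether a missing key is a typo in the data or a gap in the dictionary.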

Mistake 3: Using apply when vectorized operation exists

code.py

# Slow
df['Double'] = df['Value'].apply(lambda x: x * 2)

# Fast
df['Double'] = df['Value'] * 2
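
Both versions give the same result, which you can check on a small made-up DataFrame:

```python
import pandas as pd

df = pd.DataFrame({'Value': [1, 2, 3]})

# Calls the lambda once per element
slow = df['Value'].apply(lambda x: x * 2)

# One vectorized operation on the whole column
fast = df['Value'] * 2

print(slow.equals(fast))
```

The vectorized version avoids a Python function call for every element, which is why it tends to be faster on large columns.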

What's Next?

You learned apply and map. Next, you'll learn Window Functions for calculating moving averages and rankings.