Dropping Missing Data

Learn to remove rows or columns with missing values

When to Drop Missing Data?

Drop missing data when:

  • Only a few rows have missing values
  • The values are missing at random (no pattern behind the gaps)
  • You have enough data left after dropping

Don't drop when:

  • Too many rows would be removed
  • Missing data follows a pattern (important information!)
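Before choosing, it can help to measure how much data a drop would cost. The sketch below is one possible check, not a fixed rule: the DataFrame is illustrative, and what counts as "too many" rows is a judgment call for your own data.

```python
import pandas as pd

# Illustrative data (same shape as the examples in this lesson)
df = pd.DataFrame({
    'Name': ['John', 'Sarah', None, 'Mike'],
    'Age': [25, None, 30, 28],
    'City': ['NYC', 'LA', 'Chicago', None]
})

# Count missing values in each column
missing_per_column = df.isna().sum()
print(missing_per_column)

# Count rows that contain at least one missing value
rows_with_missing = int(df.isna().any(axis=1).sum())

# Share of rows that a plain dropna() would remove
share_lost = rows_with_missing / len(df)
print(f"{rows_with_missing} of {len(df)} rows ({share_lost:.0%}) would be dropped")
```

Here 3 of 4 rows would be lost, which is a strong hint that dropping is the wrong tool for this data.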

Drop Rows with Any Missing Value

code.py
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Name': ['John', 'Sarah', None, 'Mike'],
    'Age': [25, None, 30, 28],
    'City': ['NYC', 'LA', 'Chicago', None]
})

# Drop rows with ANY missing value
clean_df = df.dropna()
print(clean_df)

Output:

   Name   Age City
0  John  25.0  NYC

Only 1 row left! (Others had at least one missing value)

Drop Rows Where ALL Values are Missing

code.py
df = pd.DataFrame({
    'A': [1, None, None],
    'B': [2, None, 5],
    'C': [3, None, 6]
})

# Drop only if ALL columns are missing
clean_df = df.dropna(how='all')
print(clean_df)

Output:

     A    B    C
0  1.0  2.0  3.0
2  NaN  5.0  6.0

Row 1 was completely empty, so it's removed. Row 2 stays because only column A is missing.

Drop Based on Specific Columns

code.py
df = pd.DataFrame({
    'Name': ['John', 'Sarah', None],
    'Age': [25, None, 30],
    'Email': ['j@mail.com', None, 'x@mail.com']
})

# Only drop if Name is missing
clean_df = df.dropna(subset=['Name'])
print(clean_df)

Output:

    Name   Age       Email
0   John  25.0  j@mail.com
1  Sarah   NaN        None

Sarah stays even though her Age and Email are missing, because only Name is checked.
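subset also accepts several columns and can be combined with how. The sketch below uses an illustrative DataFrame (the fourth row and Sarah's email are made up for this example) to show the difference between "any of these columns missing" and "all of these columns missing".

```python
import pandas as pd

# Illustrative data, not from the lesson's dataset
df = pd.DataFrame({
    'Name': ['John', 'Sarah', None, 'Mike'],
    'Age': [25, None, 30, None],
    'Email': ['j@mail.com', 's@mail.com', 'x@mail.com', None]
})

# Drop a row if Age OR Email is missing (how='any' is the default)
either_missing = df.dropna(subset=['Age', 'Email'])
print(either_missing.index.tolist())  # [0, 2]

# Drop a row only if Age AND Email are both missing
both_missing = df.dropna(subset=['Age', 'Email'], how='all')
print(both_missing.index.tolist())  # [0, 1, 2]
```

Row 2 survives both calls even though Name is missing, because Name is not in subset.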

Drop Columns with Missing Values

code.py
# Drop every column (not row) that contains a missing value
clean_df = df.dropna(axis=1)
print(clean_df)

axis=1 means columns, axis=0 means rows (default)
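Note that in the df from the previous example every column contains a missing value, so axis=1 would drop all three columns there. The sketch below uses a slightly different, illustrative DataFrame in which one column is complete, so the effect is easier to see.

```python
import pandas as pd

# Illustrative data: 'Name' is complete, 'Age' and 'City' each have a gap
df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike'],
    'Age': [25, None, 30],
    'City': ['NYC', 'LA', None]
})

# Drop every column that contains at least one missing value
clean_df = df.dropna(axis=1)
print(clean_df.columns.tolist())  # ['Name']

# All rows are kept; only columns were removed
print(len(clean_df))  # 3
```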

Keep Rows with Minimum Values

code.py
df = pd.DataFrame({
    'A': [1, None, None, 4],
    'B': [None, 2, None, 5],
    'C': [3, 3, None, 6]
})

# Keep rows with at least 2 non-missing values
clean_df = df.dropna(thresh=2)
print(clean_df)

Output:

     A    B    C
0  1.0  NaN  3.0
1  NaN  2.0  3.0
3  4.0  5.0  6.0

Row 2 has 0 non-missing values, fewer than the required 2, so it's dropped.
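To see why each row was kept or dropped, you can count the non-missing values per row yourself. This is a sketch for checking your understanding of thresh, rebuilt by hand with notna(); in practice dropna(thresh=2) is the simpler call.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, None, None, 4],
    'B': [None, 2, None, 5],
    'C': [3, 3, None, 6]
})

# Number of non-missing values in each row
non_missing = df.notna().sum(axis=1)
print(non_missing.tolist())  # [2, 2, 0, 3]

# Keep rows with at least 2 non-missing values, built by hand
kept = df[non_missing >= 2]

# Matches the result of thresh=2
print(kept.equals(df.dropna(thresh=2)))  # True
```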

Important: dropna() Returns a New DataFrame

code.py
# This does NOT change original df
df.dropna()

# To change original, reassign or use inplace
df = df.dropna()
# OR
df.dropna(inplace=True)
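A quick way to convince yourself of this is to compare row counts before and after. The tiny DataFrame below is illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, None, 3]})

# The result is thrown away, so df still has all 3 rows
df.dropna()
rows_after_bare_call = len(df)
print(rows_after_bare_call)  # 3

# Storing the result under a new name keeps both versions
clean_df = df.dropna()
print(len(df), len(clean_df))  # 3 2
```

Saving to a new name such as clean_df is often the safer habit while exploring, since the original data stays available for comparison.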

Key Points

  • dropna() removes rows with any missing value (the default)
  • how='all' removes a row only if ALL its values are missing
  • subset=['col'] checks only specific columns
  • axis=1 drops columns instead of rows
  • thresh=n keeps rows with at least n non-missing values
  • Original data is unchanged unless you reassign or use inplace=True

What's Next?

Sometimes dropping data removes too much. Next, learn to fill missing values instead.
