Module 7

Multiple Regression

Predict using multiple variables at once

What You'll Learn

  • Using multiple predictors
  • Interpreting coefficients
  • Adjusted R-squared
  • Multicollinearity problem

What is Multiple Regression?

Multiple regression predicts one outcome from two or more predictor variables at the same time.

Multiple Regression Equation

Simple regression: Price = 50000 + 100(Square Feet)

Multiple regression: Price = 30000 + 100(Sqft) + 5000(Bedrooms) - 500(Age)

Why? Real outcomes usually depend on several factors, so adding relevant predictors often gives more accurate predictions!
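
The two equations above can be evaluated directly. Here is a minimal Python sketch. The function names and the sample house (1,500 sq ft, 3 bedrooms, 10 years old) are made up for illustration and are not part of the lesson's data.

```python
def predict_simple(sqft):
    # Simple regression: one predictor
    return 50000 + 100 * sqft


def predict_multiple(sqft, bedrooms, age):
    # Multiple regression: three predictors, each with its own coefficient
    return 30000 + 100 * sqft + 5000 * bedrooms - 500 * age


# Hypothetical house: 1,500 sq ft, 3 bedrooms, 10 years old
print(predict_simple(1500))           # 200000
print(predict_multiple(1500, 3, 10))  # 190000
```

The two models give different prices for the same house because the multiple regression also accounts for bedrooms and age.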

Interpreting Coefficients

Example: Salary = 40000 + 3000(Experience) + 5000(Education)

Meaning:

  • Each additional year of experience → +$3,000 (holding education constant)
  • Each additional education level → +$5,000 (holding experience constant)

Key phrase: "Holding other variables constant"
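
One way to see what "holding other variables constant" means is to change a single input and compare predictions. This is a small sketch using the salary equation above. The function name and the sample values (5 years of experience, education level 2) are assumptions chosen for illustration.

```python
def predict_salary(experience, education):
    # Salary = 40000 + 3000(Experience) + 5000(Education)
    return 40000 + 3000 * experience + 5000 * education


base = predict_salary(experience=5, education=2)

# One more year of experience, same education level
more_experience = predict_salary(experience=6, education=2)
print(more_experience - base)  # 3000

# One more education level, same experience
more_education = predict_salary(experience=5, education=3)
print(more_education - base)  # 5000
```

Each difference equals the coefficient of the variable that changed, because the other variable stayed fixed.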

Adjusted R-Squared

Problem: Regular R² never decreases when you add a variable, even a useless one

Solution: Adjusted R² penalizes adding useless predictors

Use it to compare models:

  • Model A: 2 predictors, Adj R² = 0.75
  • Model B: 5 predictors, Adj R² = 0.72 → Model A is better!
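
The standard formula is Adj R² = 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of observations and p is the number of predictors. The sketch below applies it. The plain R² values and n = 20 are hypothetical, chosen only so the results round to the numbers in the comparison above.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for a model with p predictors fit on n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Hypothetical inputs: Model B has the higher plain R-squared but more predictors
model_a = adjusted_r2(r2=0.78, n=20, p=2)
model_b = adjusted_r2(r2=0.795, n=20, p=5)

print(round(model_a, 2))  # 0.75
print(round(model_b, 2))  # 0.72
```

In this made-up case Model B wins on plain R² but loses on Adjusted R², which is the penalty for extra predictors at work.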

Multicollinearity

Multicollinearity Problem

What: When predictors are highly correlated with each other

Example: Using both "Height in inches" and "Height in cm"

Problem:

  • Unstable coefficients (small changes in the data can change them a lot)
  • Hard to interpret each predictor's separate effect

Detection:

  • Check correlation between predictors
  • Use VIF (Variance Inflation Factor)
  • VIF > 10 = Problem! (a common rule of thumb; some analysts use 5)

Fix: Remove one of the correlated variables
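
When a model has exactly two predictors, VIF reduces to 1 / (1 - r²), where r is the correlation between them. With more predictors, each VIF comes from regressing that predictor on all the others, which this sketch does not cover. The function names and sample data below are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def vif_two_predictors(xs, ys):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(xs, ys)
    denom = 1 - r ** 2
    # Perfectly correlated predictors leave nothing (or only rounding error) here
    return math.inf if denom <= 0 else 1 / denom


# Made-up predictors with correlation 0.8
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
print(vif_two_predictors(x1, x2))  # about 2.78, below 10

# The same height recorded in inches and in centimetres
inches = [60, 65, 70, 72]
cm = [h * 2.54 for h in inches]
print(vif_two_predictors(inches, cm) > 10)  # True
```

The inches and centimetres columns carry the same information, so the VIF is enormous and one of them should be dropped.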

Categorical Variables

Regression needs numbers, so you can't use text categories directly!

Solution: Dummy variables (0 or 1)

Example: Gender: Male = 0, Female = 1

Multiple categories: Region (North, South, West) → Create 2 dummies (one fewer than the number of categories):

  • North: 1 if North, 0 otherwise
  • South: 1 if South, 0 otherwise
  • West: both dummies = 0 (reference group)
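
The Region encoding above can be sketched in plain Python. The function name `encode_region` is hypothetical. In practice a data library would usually do this step for you.

```python
def encode_region(region):
    """Turn a region name into two 0/1 dummies, with West as the reference group."""
    if region not in ("North", "South", "West"):
        raise ValueError(f"unknown region: {region}")
    return {
        "North": 1 if region == "North" else 0,
        "South": 1 if region == "South" else 0,
    }


for region in ["North", "South", "West"]:
    print(region, encode_region(region))
# North {'North': 1, 'South': 0}
# South {'North': 0, 'South': 1}
# West {'North': 0, 'South': 0}
```

West needs no column of its own: a row with both dummies at 0 can only be West.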

Real Example

Predicting house prices:

Price = 50000 + 100(Sqft) + 10000(Bedrooms) - 1000(Age) + 20000(Urban)

Interpretation (each holding the other variables constant):

  • +100 sqft → +$10,000 (100 × $100)
  • +1 bedroom → +$10,000
  • +1 year of age → -$1,000
  • Urban location → +$20,000 vs rural
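
The house price equation can also be written as a table of coefficients, which makes the Urban dummy easy to check. This is a sketch. The dictionary layout, the function name, and the two sample houses are assumptions, not part of the lesson's data.

```python
INTERCEPT = 50000
COEFFICIENTS = {"sqft": 100, "bedrooms": 10000, "age": -1000, "urban": 20000}


def predict_price(house):
    """Intercept plus coefficient * value for every predictor."""
    return INTERCEPT + sum(COEFFICIENTS[name] * house[name] for name in COEFFICIENTS)


# Hypothetical houses that differ only in location (urban = 1, rural = 0)
rural = {"sqft": 1200, "bedrooms": 2, "age": 5, "urban": 0}
urban = dict(rural, urban=1)

print(predict_price(rural))                         # 185000
print(predict_price(urban) - predict_price(rural))  # 20000
```

Because the two houses match on everything except the dummy, the price gap equals the Urban coefficient.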

Practice Exercise

Dataset: Predict test scores using:

  • Study hours (0-10)
  • Sleep hours (4-10)
  • Previous test score (0-100)

Questions:

  1. Write the regression equation
  2. Interpret each coefficient
  3. What if study hours and previous score are highly correlated?
  4. How would you check model quality?

Next Steps

Learn about Interpreting Results!

Tip: Start simple, add complexity only when needed!
