Selecting Rows with loc

Learn to select rows using label-based indexing with loc

What is loc?

loc selects rows by label (index name).

Think of it as selecting by the row name, not position.

code.py
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike', 'Emma'],
    'Age': [25, 30, 28, 32],
    'City': ['NYC', 'LA', 'Chicago', 'Miami']
})

print(df)

Output:

    Name  Age     City
0   John   25      NYC
1  Sarah   30       LA
2   Mike   28  Chicago
3   Emma   32    Miami

Select Single Row

code.py
row = df.loc[0]
print(row)

Output:

Name    John
Age       25
City     NYC
Name: 0, dtype: object

Selecting a single row by its label returns a Series whose index is the column names.
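The difference in return type can be checked directly. The following is a minimal sketch using the same data; the variable names row_series and row_frame are illustrative only. Passing a one-element list keeps the result as a DataFrame.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike', 'Emma'],
    'Age': [25, 30, 28, 32],
    'City': ['NYC', 'LA', 'Chicago', 'Miami']
})

row_series = df.loc[0]    # scalar label -> Series
row_frame = df.loc[[0]]   # one-element list -> DataFrame with one row

print(type(row_series).__name__)  # Series
print(type(row_frame).__name__)   # DataFrame
print(row_frame.shape)            # (1, 3)
```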

Select Multiple Rows

code.py
rows = df.loc[[0, 2]]
print(rows)

Output:

   Name  Age     City
0  John   25      NYC
2  Mike   28  Chicago

Passing a list of labels returns a DataFrame.

Slicing Rows

code.py
subset = df.loc[0:2]
print(subset)

Important: Includes end point! Gets rows 0, 1, AND 2.

This is different from Python list slicing, where the end point is excluded!
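A side-by-side comparison may make the difference concrete. This is a small sketch; the list called names and the variables loc_slice and list_slice are only there for illustration.

```python
import pandas as pd

names = ['John', 'Sarah', 'Mike', 'Emma']
df = pd.DataFrame({'Name': names})

loc_slice = df.loc[0:2]   # label slice: rows 0, 1, and 2
list_slice = names[0:2]   # position slice: items 0 and 1 only

print(len(loc_slice))   # 3
print(len(list_slice))  # 2
```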

Select Rows and Columns

code.py
result = df.loc[0, 'Name']
print(result)

Output:

John

Multiple rows, multiple columns:

code.py
subset = df.loc[[0, 1], ['Name', 'City']]
print(subset)

Output:

    Name  City
0   John   NYC
1  Sarah    LA

All Rows, Specific Columns

code.py
names_ages = df.loc[:, ['Name', 'Age']]
print(names_ages)

A bare : in the row position means all rows.

Specific Rows, All Columns

code.py
first_two = df.loc[0:1, :]
print(first_two)

You can usually omit the column part (the second :) entirely:

code.py
first_two = df.loc[0:1]

Boolean Indexing

Most powerful feature of loc!

code.py
adults = df.loc[df['Age'] > 28]
print(adults)

What this does:

  1. df['Age'] > 28 creates True/False for each row
  2. loc uses these to select rows

Output:

    Name  Age   City
1  Sarah   30     LA
3   Emma   32  Miami
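The two steps listed above can be made explicit by storing the condition in a variable before passing it to loc. This is a minimal sketch on the same data; the name mask is just an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike', 'Emma'],
    'Age': [25, 30, 28, 32],
    'City': ['NYC', 'LA', 'Chicago', 'Miami']
})

# Step 1: the comparison produces one True/False value per row
mask = df['Age'] > 28
print(mask.tolist())  # [False, True, False, True]

# Step 2: loc keeps only the rows where the mask is True
adults = df.loc[mask]
print(adults['Name'].tolist())  # ['Sarah', 'Emma']
```

Naming the mask can also make longer filters easier to read, since each condition gets its own line.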

Multiple Conditions

AND condition (&):

code.py
result = df.loc[(df['Age'] > 25) & (df['City'] == 'LA')]
print(result)

OR condition (|):

code.py
result = df.loc[(df['Age'] < 27) | (df['City'] == 'Miami')]
print(result)

NOT condition (~):

code.py
not_nyc = df.loc[~(df['City'] == 'NYC')]
print(not_nyc)

Important: Use parentheses around each condition!

Custom Index

code.py
df_custom = pd.DataFrame({
    'Product': ['Laptop', 'Phone', 'Tablet'],
    'Price': [999, 599, 399]
}, index=['A', 'B', 'C'])

print(df_custom)

Output:

  Product  Price
A  Laptop    999
B   Phone    599
C  Tablet    399

Select by custom index:

code.py
row = df_custom.loc['B']
print(row)

Slice by labels:

code.py
subset = df_custom.loc['A':'C']
print(subset)
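Label slices on a custom index include the end label too, just as with the default integer index. The sketch below rebuilds the same product table to show this; the variable names first_two and phone_price are illustrative.

```python
import pandas as pd

df_custom = pd.DataFrame({
    'Product': ['Laptop', 'Phone', 'Tablet'],
    'Price': [999, 599, 399]
}, index=['A', 'B', 'C'])

# 'A':'B' includes row 'B' because loc slices are end-inclusive
first_two = df_custom.loc['A':'B']
print(first_two.index.tolist())  # ['A', 'B']

# A row label and a column label together select a single value
phone_price = df_custom.loc['B', 'Price']
print(phone_price)  # 599
```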

Practice Example

The scenario: Analyze employee records.

code.py
import pandas as pd

employees = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike', 'Emma', 'David', 'Lisa'],
    'Department': ['Sales', 'IT', 'Sales', 'HR', 'IT', 'Sales'],
    'Salary': [50000, 75000, 55000, 60000, 80000, 52000],
    'Years': [3, 7, 4, 5, 9, 2],
    'Remote': [False, True, False, True, True, False]
})

print("All employees:")
print(employees)
print()

print("Single employee (row 0):")
print(employees.loc[0])
print()

print("First three employees:")
print(employees.loc[0:2])
print()

print("Names and salaries only:")
print(employees.loc[:, ['Name', 'Salary']])
print()

print("High earners (salary > 60000):")
high_earners = employees.loc[employees['Salary'] > 60000]
print(high_earners)
print()

print("IT department:")
it_dept = employees.loc[employees['Department'] == 'IT']
print(it_dept)
print()

print("Remote workers with 5+ years:")
experienced_remote = employees.loc[
    employees['Remote'] & (employees['Years'] >= 5)  # 'Remote' is already True/False
]
print(experienced_remote)
print()

print("Sales OR high salary:")
sales_or_high = employees.loc[
    (employees['Department'] == 'Sales') | (employees['Salary'] > 70000)
]
print(sales_or_high)

String Methods

code.py
df = pd.DataFrame({
    'Name': ['John Doe', 'Sarah Smith', 'Mike Jones'],
    'Email': ['john@email.com', 'sarah@email.com', 'mike@email.com']
})

# Keep rows whose Email contains the text 'email'
email_users = df.loc[df['Email'].str.contains('email')]
print(email_users)

Other string methods:

code.py
starts_with_j = df.loc[df['Name'].str.startswith('J')]
ends_with_e = df.loc[df['Email'].str.endswith('.com')]
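Real data often has missing values or mixed case, which these string filters need to handle. The sketch below is one way to deal with both, using the case and na parameters of str.contains; the contacts table is made-up sample data.

```python
import pandas as pd

contacts = pd.DataFrame({
    'Name': ['John Doe', 'Sarah Smith', 'Mike Jones'],
    'Email': ['john@email.com', None, 'MIKE@EMAIL.COM']
})

# case=False ignores upper/lower case;
# na=False treats a missing email as "no match" instead of a missing result
has_email = contacts['Email'].str.contains('email', case=False, na=False)
matches = contacts.loc[has_email]

print(matches['Name'].tolist())  # ['John Doe', 'Mike Jones']
```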

isin() Method

code.py
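# Note: df here is the original Name/Age/City DataFrame from the start
# of the lesson, not the Name/Email one from the String Methods section.
cities_to_find = ['NYC', 'LA']
result = df.loc[df['City'].isin(cities_to_find)]
print(result)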

What this does: Select rows where City is NYC or LA.
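The same check can be inverted with ~ to select everything outside the list. This is a minimal sketch on the lesson's original data; the variable names in_cities and other_cities are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike', 'Emma'],
    'Age': [25, 30, 28, 32],
    'City': ['NYC', 'LA', 'Chicago', 'Miami']
})

cities_to_find = ['NYC', 'LA']

in_cities = df.loc[df['City'].isin(cities_to_find)]      # City is NYC or LA
other_cities = df.loc[~df['City'].isin(cities_to_find)]  # every other row

print(in_cities['Name'].tolist())     # ['John', 'Sarah']
print(other_cities['Name'].tolist())  # ['Mike', 'Emma']
```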

between() Method

code.py
mid_age = df.loc[df['Age'].between(26, 30)]
print(mid_age)

Includes both endpoints by default.
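The endpoint behavior can be changed with the inclusive parameter. The sketch below assumes pandas 1.3 or later, where inclusive accepts 'both', 'neither', 'left', or 'right'; the range 28 to 32 is chosen only so that both endpoints appear in the data.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike', 'Emma'],
    'Age': [25, 30, 28, 32]
})

# Default: both 28 and 32 count as inside the range
both = df.loc[df['Age'].between(28, 32)]
print(both['Name'].tolist())  # ['Sarah', 'Mike', 'Emma']

# 'neither' excludes both endpoints (assumes pandas >= 1.3)
strict = df.loc[df['Age'].between(28, 32, inclusive='neither')]
print(strict['Name'].tolist())  # ['Sarah']
```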

Combining loc with Columns

code.py
result = df.loc[df['Age'] > 28, 'Name']
print(result)

Returns Series of names where age > 28.
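The column part decides the return type: a single column label gives a Series, while a list of labels gives a DataFrame. A short sketch on the original data, with illustrative variable names:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike', 'Emma'],
    'Age': [25, 30, 28, 32],
    'City': ['NYC', 'LA', 'Chicago', 'Miami']
})

older = df['Age'] > 28

names_only = df.loc[older, 'Name']              # single label -> Series
names_cities = df.loc[older, ['Name', 'City']]  # list of labels -> DataFrame

print(type(names_only).__name__)    # Series
print(type(names_cities).__name__)  # DataFrame
print(names_cities.shape)           # (2, 2)
```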

Setting Values with loc

code.py
df.loc[0, 'Age'] = 26
print(df)

Multiple values:

code.py
df.loc[0:1, 'City'] = 'Boston'
print(df)

Conditional update:

code.py
df.loc[df['Age'] < 30, 'Category'] = 'Young'
df.loc[df['Age'] >= 30, 'Category'] = 'Senior'
print(df)
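Assigning to a column that does not exist yet creates it, which is how the Category column above comes into being. The sketch below repeats the conditional update on a fresh copy of the original data (the name people is illustrative), so the ages are the unmodified ones.

```python
import pandas as pd

people = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike', 'Emma'],
    'Age': [25, 30, 28, 32]
})

# The first assignment creates 'Category' and fills only the matching rows
people.loc[people['Age'] < 30, 'Category'] = 'Young'
# The second assignment fills in the remaining rows
people.loc[people['Age'] >= 30, 'Category'] = 'Senior'

print(people['Category'].tolist())  # ['Young', 'Senior', 'Young', 'Senior']
```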

Key Points to Remember

loc selects by label (index name), not position.

The format is loc[row, column]. Each part can be a single label, a list of labels, a slice, or a Boolean condition.

Slicing with loc includes the endpoint: loc[0:2] gets 0, 1, AND 2.

Boolean indexing is powerful: loc[df['Age'] > 30] filters rows.

Multiple conditions need parentheses and & (AND) or | (OR).

Common Mistakes

Mistake 1: Forgetting parentheses

code.py
df.loc[df['Age'] > 25 & df['City'] == 'NYC']  # Error! & is evaluated before > and ==
df.loc[(df['Age'] > 25) & (df['City'] == 'NYC')]  # Correct

Mistake 2: Using 'and' instead of &

code.py
df.loc[(df['Age'] > 25) and (df['City'] == 'NYC')]  # Error!
df.loc[(df['Age'] > 25) & (df['City'] == 'NYC')]  # Correct

Mistake 3: Single = instead of ==

code.py
df.loc[df['City'] = 'NYC']  # Error!
df.loc[df['City'] == 'NYC']  # Correct

Mistake 4: Assuming Python slicing behavior

code.py
df.loc[0:2]  # Gets 0, 1, 2 (includes endpoint!)
# Different from list[0:2] which gets 0, 1

Mistake 5: Chained assignment

code.py
df[df['Age'] > 30]['Name'] = 'Senior'  # May not work! Can assign to a temporary copy
df.loc[df['Age'] > 30, 'Name'] = 'Senior'  # Correct
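The reason is that filtering can hand back a separate object, so a change made to it never reaches the original. The sketch below makes that explicit with .copy() rather than running the chained form itself, then shows the single loc call updating df; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike', 'Emma'],
    'Age': [25, 30, 28, 32]
})

# Editing a filtered copy leaves the original DataFrame untouched
seniors = df[df['Age'] > 30].copy()
seniors['Name'] = 'Senior'
name_after_copy_edit = df.loc[3, 'Name']
print(name_after_copy_edit)  # Emma

# One loc call selects rows and column together and writes into df itself
df.loc[df['Age'] > 30, 'Name'] = 'Senior'
name_after_loc_edit = df.loc[3, 'Name']
print(name_after_loc_edit)  # Senior
```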

What's Next?

You now know how to use loc for label-based selection. Next, you'll learn about iloc - selecting rows by position (numeric index).
