
Selecting Rows with loc

Learn to select rows using label-based indexing with loc

What is loc?

loc selects rows by label (index name).

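Think of it as selecting by the row's name, not its position. With the default index (0, 1, 2, ...), labels and positions happen to be the same numbers, which is why the first examples look positional.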
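To make the label/position distinction concrete, here is a minimal sketch. The `scores` DataFrame and its index labels (10, 20, 30) are hypothetical and not part of this lesson's data; they are chosen only so that labels and positions differ.

```python
import pandas as pd

# Hypothetical data: the index labels are 10, 20, 30 (not 0, 1, 2)
scores = pd.DataFrame({'Score': [88, 92, 75]}, index=[10, 20, 30])

# loc looks up the row labelled 10, which happens to be the first row
first = scores.loc[10]
print(first['Score'])

# No row is labelled 0, so loc raises KeyError instead of returning the first row
try:
    scores.loc[0]
    found_label_zero = True
except KeyError:
    found_label_zero = False
print(found_label_zero)
```

Here `scores.loc[10]` succeeds while `scores.loc[0]` fails, because loc only ever matches against the index labels.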

code.pyPython
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike', 'Emma'],
    'Age': [25, 30, 28, 32],
    'City': ['NYC', 'LA', 'Chicago', 'Miami']
})

print(df)

Output:

    Name  Age     City
0   John   25      NYC
1  Sarah   30       LA
2   Mike   28  Chicago
3   Emma   32    Miami

Select Single Row

code.pyPython
row = df.loc[0]
print(row)

Output:

Name    John
Age       25
City     NYC
Name: 0, dtype: object

Selecting a single row returns a Series.

Select Multiple Rows

code.pyPython
rows = df.loc[[0, 2]]
print(rows)

Output:

   Name  Age     City
0  John   25      NYC
2  Mike   28  Chicago

Selecting multiple rows returns a DataFrame.

Slicing Rows

code.pyPython
subset = df.loc[0:2]
print(subset)

Important: a loc slice includes the end label, so this returns rows 0, 1, AND 2.

Standard Python slicing, by contrast, excludes the stop value.
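The difference is easiest to see side by side. This is a small sketch that rebuilds a trimmed version of the lesson's DataFrame and compares a loc slice with an ordinary list slice; the `names` list is added here only for the comparison.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike', 'Emma'],
    'Age': [25, 30, 28, 32],
})

# Label-based slice: both the start and end labels are included
loc_slice = df.loc[0:2]
print(loc_slice['Name'].tolist())   # ['John', 'Sarah', 'Mike']

# Plain Python list slice: the stop position is excluded
names = ['John', 'Sarah', 'Mike', 'Emma']
list_slice = names[0:2]
print(list_slice)                   # ['John', 'Sarah']
```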

Select Rows and Columns

code.pyPython
result = df.loc[0, 'Name']
print(result)

Output:

John

Multiple rows, multiple columns:

code.pyPython
subset = df.loc[[0, 1], ['Name', 'City']]
print(subset)

Output:

    Name City
0   John  NYC
1  Sarah   LA

All Rows, Specific Columns

code.pyPython
names_ages = df.loc[:, ['Name', 'Age']]
print(names_ages)

A bare : in the row position means all rows.

Specific Rows, All Columns

code.pyPython
first_two = df.loc[0:1, :]
print(first_two)

Usually you can omit the trailing : for the columns:

code.pyPython
first_two = df.loc[0:1]

Boolean Indexing

Boolean indexing is the most powerful feature of loc!

code.pyPython
adults = df.loc[df['Age'] > 28]
print(adults)

What this does:

  1. df['Age'] > 28 creates True/False for each row
  2. loc uses these to select rows

Output:

    Name  Age   City
1  Sarah   30     LA
3   Emma   32  Miami
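It can help to look at the True/False values on their own before handing them to loc. The sketch below stores them in a variable called `mask` (a name chosen here for illustration) and then uses it to filter the rows.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike', 'Emma'],
    'Age': [25, 30, 28, 32],
    'City': ['NYC', 'LA', 'Chicago', 'Miami']
})

# Step 1: the comparison produces one True/False value per row
mask = df['Age'] > 28
print(mask.tolist())             # [False, True, False, True]

# Step 2: loc keeps only the rows where the mask is True
adults = df.loc[mask]
print(adults['Name'].tolist())   # ['Sarah', 'Emma']
```

Naming the mask also makes longer filters easier to read and reuse.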

Multiple Conditions

AND condition (&):

code.pyPython
result = df.loc[(df['Age'] > 25) & (df['City'] == 'LA')]
print(result)

OR condition (|):

code.pyPython
result = df.loc[(df['Age'] < 27) | (df['City'] == 'Miami')]
print(result)

NOT condition (~):

code.pyPython
not_nyc = df.loc[~(df['City'] == 'NYC')]
print(not_nyc)

Important: Use parentheses around each condition!
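The parentheses are needed because &, | and ~ bind more tightly than comparison operators such as > and ==. As a quick check, this sketch runs the three filters above on the lesson's DataFrame and prints which names each one keeps; the variable names are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike', 'Emma'],
    'Age': [25, 30, 28, 32],
    'City': ['NYC', 'LA', 'Chicago', 'Miami']
})

# AND: both conditions must hold for a row to be kept
and_rows = df.loc[(df['Age'] > 25) & (df['City'] == 'LA')]
print(and_rows['Name'].tolist())   # ['Sarah']

# OR: a row is kept if either condition holds
or_rows = df.loc[(df['Age'] < 27) | (df['City'] == 'Miami')]
print(or_rows['Name'].tolist())    # ['John', 'Emma']

# NOT: ~ flips every True/False value in the mask
not_rows = df.loc[~(df['City'] == 'NYC')]
print(not_rows['Name'].tolist())   # ['Sarah', 'Mike', 'Emma']
```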

Custom Index

code.pyPython
df_custom = pd.DataFrame({
    'Product': ['Laptop', 'Phone', 'Tablet'],
    'Price': [999, 599, 399]
}, index=['A', 'B', 'C'])

print(df_custom)

Output:

  Product  Price
A  Laptop    999
B   Phone    599
C  Tablet    399

Select by custom index:

code.pyPython
row = df_custom.loc['B']
print(row)

Slice by labels:

code.pyPython
subset = df_custom.loc['A':'C']
print(subset)
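Label slices on a custom index include the end label too. The following sketch rebuilds `df_custom` and checks what a narrower slice and a single-cell lookup return; `first_two` and `phone_price` are names introduced here for illustration.

```python
import pandas as pd

df_custom = pd.DataFrame({
    'Product': ['Laptop', 'Phone', 'Tablet'],
    'Price': [999, 599, 399]
}, index=['A', 'B', 'C'])

# Slice from label 'A' through label 'B': both ends are included
first_two = df_custom.loc['A':'B']
print(first_two['Product'].tolist())   # ['Laptop', 'Phone']

# Row label plus column label returns a single value
phone_price = df_custom.loc['B', 'Price']
print(phone_price)                     # 599
```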

Practice Example

The scenario: Analyze employee records.

code.pyPython
import pandas as pd

employees = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike', 'Emma', 'David', 'Lisa'],
    'Department': ['Sales', 'IT', 'Sales', 'HR', 'IT', 'Sales'],
    'Salary': [50000, 75000, 55000, 60000, 80000, 52000],
    'Years': [3, 7, 4, 5, 9, 2],
    'Remote': [False, True, False, True, True, False]
})

print("All employees:")
print(employees)
print()

print("Single employee (row 0):")
print(employees.loc[0])
print()

print("First three employees:")
print(employees.loc[0:2])
print()

print("Names and salaries only:")
print(employees.loc[:, ['Name', 'Salary']])
print()

print("High earners (salary > 60000):")
high_earners = employees.loc[employees['Salary'] > 60000]
print(high_earners)
print()

print("IT department:")
it_dept = employees.loc[employees['Department'] == 'IT']
print(it_dept)
print()

print("Remote workers with 5+ years:")
experienced_remote = employees.loc[
    employees['Remote'] & (employees['Years'] >= 5)  # 'Remote' is already True/False
]
print(experienced_remote)
print()

print("Sales OR high salary:")
sales_or_high = employees.loc[
    (employees['Department'] == 'Sales') | (employees['Salary'] > 70000)
]
print(sales_or_high)

String Methods

code.pyPython
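# Use a separate name so the earlier df (Name, Age, City) stays available
contacts = pd.DataFrame({
    'Name': ['John Doe', 'Sarah Smith', 'Mike Jones'],
    'Email': ['john@email.com', 'sarah@email.com', 'mike@email.com']
})

# Keep rows whose Email contains the text 'email' (all three rows here)
email_users = contacts.loc[contacts['Email'].str.contains('email')]
print(email_users)

Other string methods:

code.pyPython
starts_with_j = contacts.loc[contacts['Name'].str.startswith('J')]
ends_with_com = contacts.loc[contacts['Email'].str.endswith('.com')]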

isin() Method

code.pyPython
# df is the Name/Age/City DataFrame from the start of the lesson
cities_to_find = ['NYC', 'LA']
result = df.loc[df['City'].isin(cities_to_find)]
print(result)

What this does: Select rows where City is NYC or LA.
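A self-contained sketch of the same idea, including the negated form with ~. The `people` DataFrame simply repeats the lesson's names and cities; the variable names are chosen here for illustration.

```python
import pandas as pd

people = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike', 'Emma'],
    'City': ['NYC', 'LA', 'Chicago', 'Miami']
})

cities_to_find = ['NYC', 'LA']

# Rows whose City appears in the list
in_cities = people.loc[people['City'].isin(cities_to_find)]
print(in_cities['Name'].tolist())    # ['John', 'Sarah']

# Rows whose City does NOT appear in the list
not_in_cities = people.loc[~people['City'].isin(cities_to_find)]
print(not_in_cities['Name'].tolist())   # ['Mike', 'Emma']
```

isin() is usually easier to read than chaining several == comparisons with |.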

between() Method

code.pyPython
mid_age = df.loc[df['Age'].between(26, 30)]
print(mid_age)

Includes both endpoints by default.
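If the endpoints should be excluded, between() accepts an `inclusive` argument. The sketch below assumes pandas 1.3 or later, where `inclusive` takes the strings 'both', 'neither', 'left' or 'right'; the `people` DataFrame repeats the lesson's names and ages.

```python
import pandas as pd

people = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike', 'Emma'],
    'Age': [25, 30, 28, 32]
})

# Default: 25 and 30 both count as inside the range
with_ends = people.loc[people['Age'].between(25, 30)]
print(with_ends['Name'].tolist())      # ['John', 'Sarah', 'Mike']

# inclusive='neither' drops rows that sit exactly on 25 or 30
# (string values for inclusive assume pandas 1.3+)
without_ends = people.loc[people['Age'].between(25, 30, inclusive='neither')]
print(without_ends['Name'].tolist())   # ['Mike']
```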

Combining loc with Columns

code.pyPython
result = df.loc[df['Age'] > 28, 'Name']
print(result)

Returns a Series of names where Age > 28.
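Whether the result is a Series or a DataFrame depends on how the column is written. A short sketch, using a trimmed copy of the lesson's data, comparing a single column label with a one-item list:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike', 'Emma'],
    'Age': [25, 30, 28, 32]
})

# A single column label gives a Series
names_series = df.loc[df['Age'] > 28, 'Name']
print(type(names_series).__name__)   # Series

# A list of column labels (even with one item) gives a DataFrame
names_frame = df.loc[df['Age'] > 28, ['Name']]
print(type(names_frame).__name__)    # DataFrame
```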

Setting Values with loc

code.pyPython
df.loc[0, 'Age'] = 26
print(df)

Multiple values:

code.pyPython
df.loc[0:1, 'City'] = 'Boston'
print(df)

Conditional update:

code.pyPython
df.loc[df['Age'] < 30, 'Category'] = 'Young'
df.loc[df['Age'] >= 30, 'Category'] = 'Senior'
print(df)
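To see what the conditional update does step by step, this sketch starts from a fresh copy of the lesson's names and ages. It also checks which rows are still empty after the first assignment, since rows that do not match the condition are left as missing values until the second line fills them.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike', 'Emma'],
    'Age': [25, 30, 28, 32]
})

# First assignment creates the 'Category' column; unmatched rows are missing
df.loc[df['Age'] < 30, 'Category'] = 'Young'
missing_after_first = df['Category'].isna().tolist()
print(missing_after_first)         # [False, True, False, True]

# Second assignment fills the remaining rows
df.loc[df['Age'] >= 30, 'Category'] = 'Senior'
print(df['Category'].tolist())     # ['Young', 'Senior', 'Young', 'Senior']
```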

Key Points to Remember

loc selects by label (index name), not position.

The format is loc[rows, columns]. Each part can be a single label, a list of labels, a label slice, or a boolean condition.

Slicing with loc includes the endpoint: loc[0:2] gets 0, 1, AND 2.

Boolean indexing is powerful: loc[df['Age'] > 30] filters rows.

Multiple conditions need parentheses and & (AND) or | (OR).

Common Mistakes

Mistake 1: Forgetting parentheses

code.pyPython
df.loc[df['Age'] > 25 & df['City'] == 'NYC']  # Error! & is evaluated before > and ==
df.loc[(df['Age'] > 25) & (df['City'] == 'NYC')]  # Correct

Mistake 2: Using 'and' instead of &

code.pyPython
df.loc[(df['Age'] > 25) and (df['City'] == 'NYC')]  # Error! 'and' cannot compare Series row by row
df.loc[(df['Age'] > 25) & (df['City'] == 'NYC')]  # Correct
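The first two mistakes fail at runtime rather than returning wrong rows. The sketch below triggers both on purpose and records the exception type in a dictionary named `errors` (a name chosen here). The exact exception raised for the missing-parentheses case may vary between pandas versions, so that branch catches any exception.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike', 'Emma'],
    'Age': [25, 30, 28, 32],
    'City': ['NYC', 'LA', 'Chicago', 'Miami']
})

errors = {}

# Mistake 1: without parentheses, Python evaluates 25 & df['City'] first
try:
    df.loc[df['Age'] > 25 & df['City'] == 'NYC']
except Exception as exc:  # exact type may differ by pandas version
    errors['missing parentheses'] = type(exc).__name__

# Mistake 2: 'and' needs a single True/False, but a Series holds many values
try:
    df.loc[(df['Age'] > 25) and (df['City'] == 'NYC')]
except ValueError as exc:
    errors['and instead of &'] = type(exc).__name__

print(errors)

# The correct form runs without error
correct = df.loc[(df['Age'] > 25) & (df['City'] == 'NYC')]
print(len(correct))   # 0, because John (NYC) is exactly 25
```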

Mistake 3: Single = instead of ==

code.pyPython
df.loc[df['City'] = 'NYC']  # SyntaxError!
df.loc[df['City'] == 'NYC']  # Correct

Mistake 4: Assuming Python slicing behavior

code.pyPython
df.loc[0:2]  # Gets 0, 1, 2 (includes endpoint!)
# Different from list[0:2] which gets 0, 1

Mistake 5: Chained assignment

code.pyPython
df[df['Age'] > 30]['Name'] = 'Senior'  # Writes to a temporary copy; df is unchanged
df.loc[df['Age'] > 30, 'Name'] = 'Senior'  # Correct
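The sketch below shows the effect of each form on a fresh copy of the lesson's names and ages. It assumes pandas only emits a warning for the chained form (the default behaviour), which is silenced here so the output stays readable.

```python
import warnings

import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike', 'Emma'],
    'Age': [25, 30, 28, 32]
})

# Chained form: df[...] builds a filtered copy, and the assignment lands on that copy
with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # hide the chained-assignment warning
    df[df['Age'] > 30]['Name'] = 'Senior'
after_chained = df['Name'].tolist()
print(after_chained)   # ['John', 'Sarah', 'Mike', 'Emma']

# loc form: one indexing step, so the assignment reaches df itself
df.loc[df['Age'] > 30, 'Name'] = 'Senior'
after_loc = df['Name'].tolist()
print(after_loc)       # ['John', 'Sarah', 'Mike', 'Senior']
```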

What's Next?

You now know how to use loc for label-based selection. Next, you'll learn about iloc, which selects rows by integer position.