
String Methods and Operations

Learn to split, extract, and search text in pandas

Split Text into Parts

code.py
import pandas as pd

df = pd.DataFrame({
    'Full_Name': ['John Doe', 'Sarah Smith', 'Mike Johnson']
})

# Split by space
df[['First', 'Last']] = df['Full_Name'].str.split(' ', expand=True)
print(df)

Output:

      Full_Name  First     Last
0      John Doe   John      Doe
1   Sarah Smith  Sarah    Smith
2  Mike Johnson   Mike  Johnson
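Assigning to two columns works here because every name contains exactly one space. As a minimal sketch, assuming some names also carry a middle name (the sample names and the `Rest` column are made up for illustration), `n=1` limits the split to the first space so the result always has two columns:

```python
import pandas as pd

# Hypothetical data: one name has three parts
names = pd.DataFrame({
    'Full_Name': ['John Doe', 'Mary Ann Smith']
})

# n=1 splits only at the first space, so there are always two columns
names[['First', 'Rest']] = names['Full_Name'].str.split(' ', n=1, expand=True)
print(names)
```

Without `n=1`, the second name would produce three parts and the two-column assignment would fail.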

Get Part of Split

code.py
# Get only first name
df['First'] = df['Full_Name'].str.split(' ').str[0]

# Get only last name
df['Last'] = df['Full_Name'].str.split(' ').str[1]
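Position 1 is the last name only when a name has exactly two parts. A small sketch with made-up names shows two things that may help with messier data: `.str[-1]` always takes the final part, and asking for a position that does not exist gives NaN instead of an error.

```python
import pandas as pd

# Hypothetical data: names with three parts and with one part
people = pd.DataFrame({
    'Full_Name': ['John Doe', 'Mary Ann Smith', 'Cher']
})

parts = people['Full_Name'].str.split(' ')

# -1 always picks the final part, however many parts there are
people['Last'] = parts.str[-1]

# A position that does not exist gives NaN, not an error
people['Second'] = parts.str[1]
print(people)
```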

Extract Part of Text

Get specific characters:

code.py
df = pd.DataFrame({
    'Code': ['ABC-123', 'DEF-456', 'GHI-789']
})

# Get first 3 characters
df['Letters'] = df['Code'].str[:3]

# Get last 3 characters
df['Numbers'] = df['Code'].str[-3:]

print(df)

Output:

      Code Letters Numbers
0  ABC-123     ABC     123
1  DEF-456     DEF     456
2  GHI-789     GHI     789
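Fixed positions work here because every code is three letters, a hyphen, and three digits. If the lengths vary, splitting on the hyphen is usually safer. A minimal sketch, assuming made-up codes of different lengths:

```python
import pandas as pd

# Hypothetical codes whose parts have different lengths
codes = pd.DataFrame({
    'Code': ['AB-12', 'CDEF-3456']
})

# Split on the hyphen instead of counting characters
codes[['Letters', 'Numbers']] = codes['Code'].str.split('-', expand=True)
print(codes)
```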

Check Start and End

code.py
df = pd.DataFrame({
    'Email': ['john@gmail.com', 'sarah@yahoo.com', 'test@gmail.com']
})

# Check if ends with '@gmail.com' (including '@' avoids matching e.g. 'notgmail.com')
df['Is_Gmail'] = df['Email'].str.endswith('@gmail.com')

# Check if starts with 'john'
df['Is_John'] = df['Email'].str.startswith('john')

print(df)

Output:

             Email  Is_Gmail  Is_John
0   john@gmail.com      True     True
1  sarah@yahoo.com     False    False
2   test@gmail.com      True    False
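The True/False results can also be used to filter rows. A minimal sketch, assuming a made-up column that contains a missing email: `na=False` tells pandas to treat the missing value as "no match", so the result is a clean True/False column that can be used as a filter.

```python
import pandas as pd

# Hypothetical data with one missing email
emails = pd.DataFrame({
    'Email': ['john@gmail.com', 'sarah@yahoo.com', None]
})

# na=False makes the missing value count as "no match"
is_gmail = emails['Email'].str.endswith('@gmail.com', na=False)

# Keep only the rows where the check is True
gmail_rows = emails[is_gmail]
print(gmail_rows)
```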

Find Position of Text

code.py
df = pd.DataFrame({
    'Text': ['Hello World', 'Python is great', 'Data Science']
})

# Find the position of the first 'o' (positions are counted from 0)
df['Position'] = df['Text'].str.find('o')
print(df)

Output:

              Text  Position
0      Hello World         4
1  Python is great         4
2     Data Science        -1   <- not found

-1 means not found.
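When only presence matters and not the position, the -1 result can be turned into a True/False column. The sketch below reuses the same sample text; the column names `Has_o` and `Has_o_contains` are made up for illustration. It also shows `str.contains`, which answers the same question directly.

```python
import pandas as pd

texts = pd.DataFrame({
    'Text': ['Hello World', 'Python is great', 'Data Science']
})

# find() gives -1 when the text is missing, so compare against -1
texts['Has_o'] = texts['Text'].str.find('o') != -1

# contains() asks the same question directly; regex=False means plain text
texts['Has_o_contains'] = texts['Text'].str.contains('o', regex=False)
print(texts)
```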

Count How Many Times Text Appears

code.py
df = pd.DataFrame({
    'Text': ['banana', 'apple', 'pineapple']
})

# Count letter 'a'
df['A_Count'] = df['Text'].str.count('a')
print(df)

Output:

        Text  A_Count
0     banana        3
1      apple        1
2  pineapple        1
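One thing to watch: `str.count` treats its argument as a regular expression pattern, so characters such as `.` have a special meaning. A minimal sketch with made-up file names shows the difference, using Python's `re.escape` to count a literal dot:

```python
import re

import pandas as pd

# Hypothetical file names
files = pd.DataFrame({
    'Name': ['report.final.csv', 'data.csv']
})

# '.' as a pattern matches any character, so this counts every character
files['Any_Char'] = files['Name'].str.count('.')

# re.escape() turns '.' into a literal dot
files['Dots'] = files['Name'].str.count(re.escape('.'))
print(files)
```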

Pad Text with Zeros

Useful for codes that need a fixed length:

code.py
df = pd.DataFrame({
    'ID': ['1', '23', '456']
})

# Make all IDs 5 digits with leading zeros
df['ID_Padded'] = df['ID'].str.zfill(5)
print(df)

Output:

    ID ID_Padded
0    1     00001
1   23     00023
2  456     00456
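`zfill` is a string method, so it only works when the column holds text. A minimal sketch, assuming the IDs arrive as numbers instead (made-up data), converts them with `astype(str)` first:

```python
import pandas as pd

# Hypothetical data: IDs stored as integers, not text
orders = pd.DataFrame({
    'ID': [1, 23, 456]
})

# .str only works on text, so convert the numbers to strings first
orders['ID_Padded'] = orders['ID'].astype(str).str.zfill(5)
print(orders)
```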

Quick Reference

Method                  What It Does
.str.split('x')         Split text by 'x'
.str[:3]                First 3 characters
.str[-3:]               Last 3 characters
.str.startswith('x')    Starts with 'x'?
.str.endswith('x')      Ends with 'x'?
.str.find('x')          Position of 'x'
.str.count('x')         How many 'x'?
.str.zfill(5)           Pad with zeros

Key Points

  • split() breaks text into parts
  • str[start:end] extracts part of text
  • startswith() and endswith() check how text begins or ends
  • find() returns position (-1 if not found)
  • count() counts occurrences

What's Next?

For complex text patterns, learn Regular Expressions - a powerful search language.
