Sets and Set Operations

Master sets and powerful operations for working with unique collections

What is a Set?

Think of a set like a bag of unique items. If you try to put the same item twice, it only keeps one copy.

A set in Python is a collection that stores unique values. No duplicates allowed. Sets are also unordered, meaning items don't have positions.

Why use sets?

  • Remove duplicates from data automatically
  • Check whether an item exists very quickly
  • Find common items between groups
  • Find differences between groups
  • Perform mathematical operations like union and intersection
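
As a quick illustration of the first two points, here is a minimal sketch. The names `visits` and `unique_visitors` are made up for this example.

```python
# Hypothetical data: the same visitor can appear more than once.
visits = ["ana", "raj", "ana", "li", "raj"]

# Converting the list to a set drops the repeats.
unique_visitors = set(visits)
print(len(unique_visitors))  # 3

# Membership is checked with the `in` operator.
print("li" in unique_visitors)   # True
print("tom" in unique_visitors)  # False
```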

Creating a Set

You create a set using curly braces with values, or using the set() function.

code.py
fruits = {"apple", "banana", "orange"}
print(fruits)

What happens: Creates a set with 3 unique fruits.

Different ways to create sets:

  1. From a list (removes duplicates)
code.py
numbers = [1, 2, 2, 3, 3, 3]
unique = set(numbers)
print(unique)

Result: {1, 2, 3} - duplicates removed

  2. Empty set (must use set())
code.py
empty = set()

Important: You cannot use {} for an empty set because that creates an empty dictionary.
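
If you are ever unsure which one you have, type() will tell you. A small check, with variable names chosen only for this example:

```python
empty_dict = {}    # curly braces alone create a dictionary
empty_set = set()  # set() creates an empty set

print(type(empty_dict))  # <class 'dict'>
print(type(empty_set))   # <class 'set'>
```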

  3. From a string (unique characters)
code.py
letters = set("hello")
print(letters)

Result: {'h', 'e', 'l', 'o'} - only unique letters (the order you see may differ)

Key Properties of Sets

No Duplicates

code.py
numbers = {1, 2, 2, 3, 3, 3}
print(numbers)

Result: {1, 2, 3}

Python automatically removes duplicates.

No Order

code.py
fruits = {"apple", "banana", "orange"}
print(fruits)

Result might be: {'banana', 'orange', 'apple'}

The order can change. Sets don't remember insertion order. You cannot use index numbers.
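
When you need a predictable order, for example when printing a report, one common approach is to pass the set to sorted(), which returns a new sorted list and leaves the set unchanged. A minimal sketch:

```python
fruits = {"apple", "banana", "orange"}

# sorted() returns a list, so the result has positions and can be indexed.
ordered = sorted(fruits)
print(ordered)     # ['apple', 'banana', 'orange']
print(ordered[0])  # apple
```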

No Indexing

code.py
fruits = {"apple", "banana", "orange"}
print(fruits[0])  # TypeError: 'set' object is not subscriptable

Adding Items

Using add() - Add One Item

code.py
fruits = {"apple", "banana"}
fruits.add("orange")
print(fruits)

Result: {'apple', 'banana', 'orange'}

If the item already exists:

code.py
fruits.add("apple")
print(fruits)

Nothing changes - the set still has the same items.

Using update() - Add Multiple Items

code.py
fruits = {"apple", "banana"}
fruits.update(["orange", "grape", "apple"])
print(fruits)

Result: {'apple', 'banana', 'orange', 'grape'}

The duplicate "apple" is ignored.
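
One thing to watch for: update() treats its argument as a collection of items, so passing a plain string adds each character separately. The sketch below contrasts it with add(); the variable names are only for illustration.

```python
fruits = {"apple"}
fruits.add("kiwi")      # adds the whole string as one item
print(fruits == {"apple", "kiwi"})  # True

letters = set()
letters.update("kiwi")  # adds each character: 'k', 'i', 'w'
print(letters == {"k", "i", "w"})   # True
```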

Removing Items

Using remove() - Remove Item (Error if Not Found)

code.py
fruits = {"apple", "banana", "orange"}
fruits.remove("banana")
print(fruits)

Result: {'apple', 'orange'}

Warning: If the item doesn't exist, remove() raises a KeyError.
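
If you want to use remove() but still handle a missing item yourself, you can catch the KeyError. A minimal sketch:

```python
fruits = {"apple", "banana"}

try:
    fruits.remove("grape")  # "grape" is not in the set
except KeyError:
    print("grape is not in the set")

print(fruits == {"apple", "banana"})  # True, the set is unchanged
```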

Using discard() - Remove Item (No Error)

code.py
fruits = {"apple", "banana", "orange"}
fruits.discard("grape")
print(fruits)

What happens: Nothing breaks. discard() is safe - it won't raise an error if the item doesn't exist.

Using pop() - Remove an Arbitrary Item

code.py
fruits = {"apple", "banana", "orange"}
removed = fruits.pop()
print(removed)
print(fruits)

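What this does: Removes and returns one item. You cannot predict which one because sets are unordered, but the choice is arbitrary rather than truly random. Calling pop() on an empty set raises a KeyError.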
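
Because an empty set counts as false in a condition, a while loop is a simple way to process items one at a time with pop() until none are left. This is only a sketch, and the names `tasks` and `done` are made up for the example.

```python
tasks = {"email", "report", "backup"}
done = []

# The loop stops when the set is empty, so pop() never runs on an empty set.
while tasks:
    done.append(tasks.pop())

print(sorted(done))  # ['backup', 'email', 'report']
print(tasks)         # set()
```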

Using clear() - Remove Everything

code.py
fruits = {"apple", "banana", "orange"}
fruits.clear()
print(fruits)

Result: set() (empty set)

Set Operations

Sets have powerful operations for comparing and combining groups.

Union - Combine All Items

Union gives you all items from both sets (no duplicates).

code.py
set1 = {1, 2, 3}
set2 = {3, 4, 5}
combined = set1 | set2
print(combined)

Result: {1, 2, 3, 4, 5}

Or use union() method:

code.py
combined = set1.union(set2)

Real-world example: All students in Math class OR Science class.
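
The operator and the method are not fully interchangeable: the | operator needs a set on both sides, while union() accepts any iterable, such as a list. A small sketch of the difference (the name `more` is just for illustration):

```python
set1 = {1, 2, 3}
more = [3, 4, 5]  # a list, not a set

combined = set1.union(more)  # works: union() accepts any iterable
print(combined)              # {1, 2, 3, 4, 5}

try:
    set1 | more              # the operator needs a set on both sides
except TypeError:
    print("| only works between sets")
```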

Intersection - Find Common Items

Intersection gives you items that exist in both sets.

code.py
set1 = {1, 2, 3}
set2 = {3, 4, 5}
common = set1 & set2
print(common)

Result: {3}

Or use intersection() method:

code.py
common = set1.intersection(set2)

Real-world example: Students in both Math AND Science class.
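
The intersection() method can also take more than one argument, which helps when you compare several groups at once. The class rosters below are invented for this sketch.

```python
math_class = {"Asha", "Ben", "Chen"}
science_class = {"Ben", "Chen", "Dina"}
art_class = {"Chen", "Dina", "Eli"}

# Keeps only the students who appear in all three sets.
in_all_three = math_class.intersection(science_class, art_class)
print(in_all_three)  # {'Chen'}
```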

Difference - Items in First But Not Second

Difference gives you items in the first set but not in the second set.

code.py
set1 = {1, 2, 3}
set2 = {3, 4, 5}
diff = set1 - set2
print(diff)

Result: {1, 2}

Or use difference() method:

code.py
diff = set1.difference(set2)

Real-world example: Students in Math class but NOT in Science class.
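
Unlike union and intersection, difference depends on which set comes first. Swapping the two sets gives a different answer:

```python
set1 = {1, 2, 3}
set2 = {3, 4, 5}

print(set1 - set2)  # {1, 2}
print(set2 - set1)  # {4, 5}
```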

Symmetric Difference - Items in Either But Not Both

Symmetric difference gives you items that are in one set or the other, but not in both.

code.py
set1 = {1, 2, 3}
set2 = {3, 4, 5}
sym_diff = set1 ^ set2
print(sym_diff)

Result: {1, 2, 4, 5}

Or use symmetric_difference() method:

code.py
sym_diff = set1.symmetric_difference(set2)

Real-world example: Students in Math OR Science but not both.
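
One way to think about symmetric difference is "everything in the union, minus the intersection". The sketch below checks that both approaches give the same set.

```python
set1 = {1, 2, 3}
set2 = {3, 4, 5}

via_operator = set1 ^ set2
via_union = (set1 | set2) - (set1 & set2)

print(via_operator == via_union)  # True
```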

Checking Relationships

Subset - Is First Contained in Second?

code.py
set1 = {1, 2}
set2 = {1, 2, 3, 4}
print(set1.issubset(set2))

Shows: True (all items from set1 exist in set2)

Superset - Does First Contain All of Second?

code.py
set1 = {1, 2, 3, 4}
set2 = {1, 2}
print(set1.issuperset(set2))

Shows: True (set1 contains all items from set2)

Disjoint - No Common Items?

code.py
set1 = {1, 2, 3}
set2 = {4, 5, 6}
print(set1.isdisjoint(set2))

Shows: True (no items in common)
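
The subset and superset checks also have operator forms: <= and >= behave like issubset() and issuperset() when both sides are sets, and < checks for a proper subset (a subset that is not equal to the other set). A short sketch, with `small` and `big` as illustrative names:

```python
small = {1, 2}
big = {1, 2, 3, 4}

print(small <= big)    # True, same as small.issubset(big)
print(big >= small)    # True, same as big.issuperset(small)
print(small < big)     # True, small is a proper subset of big
print(small < small)   # False, a set is not a proper subset of itself
```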

Practice Example

The scenario: You're analyzing customer purchases from two different stores to find patterns.

code.py
store1_customers = {"John", "Sarah", "Mike", "Emma", "David"}
store2_customers = {"Sarah", "David", "Lisa", "Tom", "Emma"}

all_customers = store1_customers | store2_customers
print("Total unique customers:", len(all_customers))

both_stores = store1_customers & store2_customers
print("Customers at both stores:", both_stores)

only_store1 = store1_customers - store2_customers
print("Only store 1:", only_store1)

either_not_both = store1_customers ^ store2_customers
print("Either store but not both:", either_not_both)

if "John" in all_customers:
    print("John is a customer")

new_customers = {"Alex", "Chris"}
all_customers.update(new_customers)
print("After adding new:", len(all_customers))

What this program does:

  1. Creates sets for customers at each store
  2. Finds all unique customers (union)
  3. Finds customers who visit both stores (intersection)
  4. Finds customers only at store 1 (difference)
  5. Finds customers at one store but not both (symmetric difference)
  6. Checks if John is a customer
  7. Adds new customers and shows updated count

Key Points to Remember

Sets store only unique values and automatically remove duplicates. They are unordered, so you cannot use index positions.

Use add() for one item, update() for multiple items. Both ignore duplicates automatically.

remove() raises a KeyError if the item doesn't exist, discard() is safe and won't raise an error, and pop() removes an arbitrary item.

Union combines all items, intersection finds common items, difference finds items in first but not second.

Symmetric difference finds items in either set but not both. Use subset/superset to check if one set contains another.

Common Mistakes

Mistake 1: Trying to create empty set with {}

code.py
empty = {}  # This is a dictionary, not a set!
empty = set()  # Correct way

Mistake 2: Trying to use indexing

code.py
fruits = {"apple", "banana"}
print(fruits[0])  # TypeError: 'set' object is not subscriptable

Mistake 3: Adding a list to a set

code.py
items = {1, 2, 3}
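items.add([4, 5])  # TypeError: unhashable type: 'list'

Sets can only contain hashable items. In practice that means immutable values such as numbers, strings, and tuples whose own contents are hashable. Lists, dictionaries, and other sets are not allowed.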
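
If you need to store a group of values as a single set item, a tuple or a frozenset (an immutable version of a set) can be used instead of a list. Note that a tuple only works if everything inside it is hashable too. A minimal sketch:

```python
items = {1, 2, 3}

items.add((4, 5))              # a tuple of numbers is fine
items.add(frozenset([6, 7]))   # a frozenset can be stored inside a set
print((4, 5) in items)         # True

try:
    items.add((8, [9]))        # this tuple contains a list
except TypeError:
    print("a tuple containing a list cannot be added")
```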

What's Next?

Now you understand sets and their operations. Next, you'll learn about list comprehensions - a powerful and clean way to create lists using loops in a single line.
