Sets and Set Operations

Master sets and powerful operations for working with unique collections

What is a Set?

Think of a set like a bag of unique items. If you try to put the same item twice, it only keeps one copy.

A set in Python is a collection that stores unique values. No duplicates allowed. Sets are also unordered, meaning items don't have positions.

Why use sets?

  • Remove duplicates from data automatically
  • Check whether an item exists very quickly, even in large collections
  • Find common items between groups
  • Find differences between groups
  • Mathematical operations like union and intersection
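
The first two uses above can be sketched in a few lines. The email addresses are made-up sample data, not part of the lesson:

```python
# Hypothetical sample data: a list of email addresses with one repeat.
emails = ["a@example.com", "b@example.com", "a@example.com", "c@example.com"]

# Converting to a set collapses the duplicate into a single entry.
unique_emails = set(emails)
print(len(emails), len(unique_emails))  # 4 3

# The `in` operator checks membership.
print("b@example.com" in unique_emails)  # True
print("z@example.com" in unique_emails)  # False
```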

Creating a Set

You create a set using curly braces with values, or using the set() function.

code.pyPython
fruits = {"apple", "banana", "orange"}
print(fruits)

What happens: Creates a set with 3 unique fruits.

Different ways to create sets:

  1. From a list (removes duplicates)
code.pyPython
numbers = [1, 2, 2, 3, 3, 3]
unique = set(numbers)
print(unique)

Result: {1, 2, 3} - duplicates removed

  2. Empty set (must use set())
code.pyPython
empty = set()

Important: You cannot use {} for an empty set because that creates an empty dictionary.

  3. From a string (unique characters)
code.pyPython
letters = set("hello")
print(letters)

Result: something like {'h', 'e', 'l', 'o'} - only unique letters, and the order may differ on your machine

Key Properties of Sets

No Duplicates

code.pyPython
numbers = {1, 2, 2, 3, 3, 3}
print(numbers)

Result: {1, 2, 3}

Python automatically removes duplicates.

No Order

code.pyPython
fruits = {"apple", "banana", "orange"}
print(fruits)

Result might be: {'banana', 'orange', 'apple'}

The order can change. Sets don't remember insertion order. You cannot use index numbers.
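
A short sketch of what "no order" means in practice: two sets with the same items are equal no matter how they were written, and sorted() gives you a predictable list when you need one for display. The variable names here are illustrative only:

```python
a = {"apple", "banana", "orange"}
b = {"orange", "apple", "banana"}

# Order plays no part in set equality.
print(a == b)  # True

# sorted() returns a new list in a predictable order.
print(sorted(a))  # ['apple', 'banana', 'orange']
```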

No Indexing

code.pyPython
fruits = {"apple", "banana", "orange"}
print(fruits[0])  # TypeError: 'set' object is not subscriptable

Adding Items

Using add() - Add One Item

code.pyPython
fruits = {"apple", "banana"}
fruits.add("orange")
print(fruits)

Result: {'apple', 'banana', 'orange'} (the order may vary)

If the item already exists:

code.pyPython
fruits.add("apple")
print(fruits)

Nothing changes - the set still has the same items.

Using update() - Add Multiple Items

code.pyPython
fruits = {"apple", "banana"}
fruits.update(["orange", "grape", "apple"])
print(fruits)

Result: {'apple', 'banana', 'orange', 'grape'} (the order may vary)

The duplicate "apple" is ignored.
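
One detail worth seeing in code: update() loops over whatever you pass in, so a plain string is added one character at a time. The sketch below uses made-up values to show that, and to show that update() accepts more than one collection at once:

```python
fruits = {"apple", "banana"}

# update() accepts several iterables in one call.
fruits.update(["orange"], ("grape", "kiwi"))
print(sorted(fruits))  # ['apple', 'banana', 'grape', 'kiwi', 'orange']

# A string is an iterable of characters, so update() adds each letter.
letters = set()
letters.update("fig")
print(sorted(letters))  # ['f', 'g', 'i']

# add() treats the string as a single item.
words = set()
words.add("fig")
print(words)  # {'fig'}
```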

Removing Items

Using remove() - Remove Item (Error if Not Found)

code.pyPython
fruits = {"apple", "banana", "orange"}
fruits.remove("banana")
print(fruits)

Result: {'apple', 'orange'} (the order may vary)

Warning: If the item doesn't exist, remove() raises a KeyError.
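
A minimal sketch of two common ways to handle a missing item with remove(): catch the KeyError, or check with `in` first. Which one you prefer is a matter of style:

```python
fruits = {"apple", "banana", "orange"}

# Option 1: catch the error that remove() raises for a missing item.
try:
    fruits.remove("grape")
except KeyError:
    print("grape is not in the set")

# Option 2: check membership before removing.
if "grape" in fruits:
    fruits.remove("grape")

print(sorted(fruits))  # ['apple', 'banana', 'orange']
```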

Using discard() - Remove Item (No Error)

code.pyPython
fruits = {"apple", "banana", "orange"}
fruits.discard("grape")
print(fruits)

What happens: The set stays the same and no error is raised. discard() is safe - it simply does nothing if the item doesn't exist.

Using pop() - Remove an Arbitrary Item

code.pyPython
fruits = {"apple", "banana", "orange"}
removed = fruits.pop()
print(removed)
print(fruits)

What this does: Removes and returns one item. You cannot predict which one because sets are unordered; the choice is arbitrary, not truly random. Calling pop() on an empty set raises a KeyError.
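
One place pop() fits well is working through a set until nothing is left. The sketch below uses a made-up set of tasks and also shows what happens when the set is already empty:

```python
tasks = {"wash", "dry", "fold"}
processed = []

# An empty set counts as False, so the loop stops when nothing is left.
while tasks:
    processed.append(tasks.pop())

print(sorted(processed))  # ['dry', 'fold', 'wash']
print(tasks)              # set()

# pop() on an empty set raises KeyError.
try:
    tasks.pop()
except KeyError:
    print("Cannot pop from an empty set")
```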

Using clear() - Remove Everything

code.pyPython
fruits = {"apple", "banana", "orange"}
fruits.clear()
print(fruits)

Result: set() (empty set)

Set Operations

Sets have powerful operations for comparing and combining groups.

Union - Combine All Items

Union gives you all items from both sets (no duplicates).

code.pyPython
set1 = {1, 2, 3}
set2 = {3, 4, 5}
combined = set1 | set2
print(combined)

Result: {1, 2, 3, 4, 5}

Or use the union() method:

code.pyPython
combined = set1.union(set2)

Real-world example: All students in Math class OR Science class.
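
The class example can be sketched as follows, with made-up student names. It also shows one practical difference: the union() method accepts any iterable (such as a list) and several arguments at once, while the | operator needs a set on both sides.

```python
math_students = {"Asha", "Ben", "Chen"}
science_students = {"Ben", "Dev"}
art_students = ["Chen", "Ela"]  # a list, not a set

# union() accepts several iterables, including lists.
everyone = math_students.union(science_students, art_students)
print(sorted(everyone))  # ['Asha', 'Ben', 'Chen', 'Dev', 'Ela']

# The | operator only works between sets.
try:
    math_students | art_students
except TypeError:
    print("| needs a set on both sides")
```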

Intersection - Find Common Items

Intersection gives you items that exist in both sets.

code.pyPython
set1 = {1, 2, 3}
set2 = {3, 4, 5}
common = set1 & set2
print(common)

Result: {3}

Or use the intersection() method:

code.pyPython
common = set1.intersection(set2)

Real-world example: Students in both Math AND Science class.
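
Here is that example as a short sketch with made-up names. The intersection() method can also take more than one collection, which keeps only the items found in all of them:

```python
math_students = {"Asha", "Ben", "Chen"}
science_students = {"Ben", "Chen", "Dev"}
art_students = {"Chen", "Ela"}

# Students enrolled in both Math and Science.
both = math_students & science_students
print(sorted(both))  # ['Ben', 'Chen']

# Students enrolled in all three classes.
all_three = math_students.intersection(science_students, art_students)
print(all_three)  # {'Chen'}
```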

Difference - Items in First But Not Second

Difference gives you the items that are in the first set but not in the second set. The order of the two sets matters.

code.pyPython
set1 = {1, 2, 3}
set2 = {3, 4, 5}
diff = set1 - set2
print(diff)

Result: {1, 2}

Or use the difference() method:

code.pyPython
diff = set1.difference(set2)

Real-world example: Students in Math class but NOT in Science class.
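
Unlike union and intersection, difference depends on which set comes first. A quick sketch using the same two sets:

```python
set1 = {1, 2, 3}
set2 = {3, 4, 5}

# Items in set1 that are not in set2.
print(sorted(set1 - set2))  # [1, 2]

# Swapping the sets gives a different answer.
print(sorted(set2 - set1))  # [4, 5]
```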

Symmetric Difference - Items in Either But Not Both

Symmetric difference gives you items that are in one set or the other, but not in both.

code.pyPython
set1 = {1, 2, 3}
set2 = {3, 4, 5}
sym_diff = set1 ^ set2
print(sym_diff)

Result: {1, 2, 4, 5}

Or use the symmetric_difference() method:

code.pyPython
sym_diff = set1.symmetric_difference(set2)

Real-world example: Students in Math OR Science but not both.
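
If the ^ operator feels unfamiliar, it may help to see that symmetric difference can be built from the operations covered earlier. This sketch checks two equivalent ways of writing it:

```python
set1 = {1, 2, 3}
set2 = {3, 4, 5}

sym_diff = set1 ^ set2

# Everything in either set, minus what they share.
by_union = (set1 | set2) - (set1 & set2)

# What is only in set1, plus what is only in set2.
by_difference = (set1 - set2) | (set2 - set1)

print(sym_diff == by_union == by_difference)  # True
```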

Checking Relationships

Subset - Is First Contained in Second?

code.pyPython
set1 = {1, 2}
set2 = {1, 2, 3, 4}
print(set1.issubset(set2))

Shows: True (all items from set1 exist in set2)

Superset - Does First Contain All of Second?

code.pyPython
set1 = {1, 2, 3, 4}
set2 = {1, 2}
print(set1.issuperset(set2))

Shows: True (set1 contains all items from set2)
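
Python also offers comparison operators for these checks. The sketch below shows <= and >= as operator forms of issubset() and issuperset(), plus < for a "proper" subset, which must be strictly smaller:

```python
small = {1, 2}
big = {1, 2, 3, 4}

print(small <= big)  # True, same as small.issubset(big)
print(big >= small)  # True, same as big.issuperset(small)

# A set is a subset of itself, but not a proper subset of itself.
print(big <= big)    # True
print(big < big)     # False
print(small < big)   # True
```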

Disjoint - No Common Items?

code.pyPython
set1 = {1, 2, 3}
set2 = {4, 5, 6}
print(set1.isdisjoint(set2))

Shows: True (no items in common)
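
Put another way, two sets are disjoint exactly when their intersection is empty. A small sketch, where set3 is an extra set added for illustration:

```python
set1 = {1, 2, 3}
set2 = {4, 5, 6}
set3 = {3, 4}  # shares 3 with set1

print(set1.isdisjoint(set2))  # True
print(set1.isdisjoint(set3))  # False

# Same check written with intersection.
print(len(set1 & set2) == 0)  # True
```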

Practice Example

The scenario: You're analyzing customer purchases from two different stores to find patterns.

code.pyPython
store1_customers = {"John", "Sarah", "Mike", "Emma", "David"}
store2_customers = {"Sarah", "David", "Lisa", "Tom", "Emma"}

all_customers = store1_customers | store2_customers
print("Total unique customers:", len(all_customers))

both_stores = store1_customers & store2_customers
print("Customers at both stores:", both_stores)

only_store1 = store1_customers - store2_customers
print("Only store 1:", only_store1)

either_not_both = store1_customers ^ store2_customers
print("Either store but not both:", either_not_both)

if "John" in all_customers:
    print("John is a customer")

new_customers = {"Alex", "Chris"}
all_customers.update(new_customers)
print("After adding new:", len(all_customers))

What this program does:

  1. Creates sets for customers at each store
  2. Finds all unique customers (union)
  3. Finds customers who visit both stores (intersection)
  4. Finds customers only at store 1 (difference)
  5. Finds customers at one store but not both (symmetric difference)
  6. Checks if John is a customer
  7. Adds new customers and shows updated count
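
If you need the same comparison for other pairs of stores, the steps above could be wrapped in a function. This is only a sketch: compare_stores is a hypothetical helper name, not something built into Python.

```python
def compare_stores(store_a, store_b):
    """Hypothetical helper: summarize how two customer sets overlap."""
    return {
        "all": store_a | store_b,
        "both": store_a & store_b,
        "only_a": store_a - store_b,
        "only_b": store_b - store_a,
    }

store1_customers = {"John", "Sarah", "Mike", "Emma", "David"}
store2_customers = {"Sarah", "David", "Lisa", "Tom", "Emma"}

report = compare_stores(store1_customers, store2_customers)
for label, names in report.items():
    # sorted() gives a predictable order for display.
    print(label, len(names), sorted(names))
```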

Key Points to Remember

Sets store only unique values and automatically remove duplicates. They are unordered, so you cannot use index positions.

Use add() for one item, update() for multiple items. Both ignore duplicates automatically.

remove() raises a KeyError if the item doesn't exist, discard() is safe and won't raise an error, and pop() removes an arbitrary item.

Union combines all items, intersection finds common items, difference finds items in first but not second.

Symmetric difference finds items in either set but not both. Use subset/superset to check if one set contains another.

Common Mistakes

Mistake 1: Trying to create an empty set with {}

code.pyPython
empty = {}  # This is a dictionary, not a set!
empty = set()  # Correct way
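
You can confirm the difference yourself with type(). A quick check:

```python
empty_dict = {}
empty_set = set()

print(type(empty_dict))  # <class 'dict'>
print(type(empty_set))   # <class 'set'>
print(len(empty_set))    # 0
```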

Mistake 2: Trying to use indexing

code.pyPython
fruits = {"apple", "banana"}
print(fruits[0])  # TypeError: 'set' object is not subscriptable

Mistake 3: Adding a list to a set

code.pyPython
items = {1, 2, 3}
items.add([4, 5])  # TypeError: unhashable type: 'list'

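Sets can only contain hashable items, which in practice means immutable values such as numbers, strings, and tuples (as long as the tuple itself holds only hashable values). Lists, dictionaries, and other sets cannot be added. To add the values inside a list, use update() instead of add().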
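
A short sketch of what works and what doesn't. The points set is made-up sample data:

```python
# Tuples of numbers are hashable, so they can be set items.
points = {(0, 0), (1, 2)}
points.add((1, 2))  # already present, so nothing changes
print(len(points))  # 2

items = {1, 2, 3}

# A list is not hashable, so add() raises TypeError.
try:
    items.add([4, 5])
except TypeError as err:
    print("TypeError:", err)

# update() adds the values from the list one by one.
items.update([4, 5])
print(sorted(items))  # [1, 2, 3, 4, 5]
```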

What's Next?

Now you understand sets and their operations. Next, you'll learn about list comprehensions - a powerful and clean way to create lists using loops in a single line.