Introduction to NumPy
Learn what NumPy is and why it is essential for data analysis
What is NumPy?
NumPy stands for Numerical Python. It's a powerful library for working with numbers and doing math operations in Python.
Think of NumPy as a supercharged calculator that can handle millions of numbers at once.
Why NumPy is special:
- Very fast (often 10-100x faster than plain Python loops for numerical work)
- Handles large amounts of data easily
- Built for scientific computing
- Used by data scientists everywhere
Why Use NumPy?
Regular Python lists need an explicit loop, which is slow for large data:
numbers = [1, 2, 3, 4, 5]
doubled = []
for n in numbers:
    doubled.append(n * 2)
print(doubled)
NumPy is much faster:
import numpy as np
numbers = np.array([1, 2, 3, 4, 5])
doubled = numbers * 2
print(doubled)
What makes it faster: The multiplication is written once for the whole array, and the loop over the elements runs inside NumPy's compiled code instead of in Python.
Where NumPy is Used
Data analysis:
- Analyzing sales data
- Processing survey results
- Financial calculations
Machine Learning:
- Training AI models
- Image processing
- Pattern recognition
Science and Engineering:
- Physics simulations
- Statistics
- Signal processing
Installing NumPy
Before using NumPy, you need to install it.
pip install numpy
What this does: Downloads and installs NumPy on your computer.
Importing NumPy
Always import NumPy with the nickname "np".
import numpy as np
Why use np: It's shorter to type. Everyone uses this convention.
Check NumPy version:
import numpy as np
print(np.__version__)
NumPy Arrays vs Python Lists
Python Lists
my_list = [1, 2, 3, 4, 5]
print(type(my_list))
Lists are flexible:
- Can mix different types
- Easy to add/remove items
- Slower for math operations
NumPy Arrays
import numpy as np
my_array = np.array([1, 2, 3, 4, 5])
print(type(my_array))
Arrays are specialized:
- All items must be the same type
- Fixed size (adding or removing items creates a new array)
- Much faster for math
- Less memory usage
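To make the difference concrete, here is a small sketch (the variable names are only for illustration). It shows that the same * 2 means different things for a list and for an array, and that an array keeps a single data type:

```python
import numpy as np

my_list = [1, 2, 3]
my_array = np.array([1, 2, 3])

# For a list, * repeats the list; for an array, * multiplies each element.
list_result = my_list * 2
array_result = my_array * 2
print(list_result)   # [1, 2, 3, 1, 2, 3]
print(array_result)  # [2 4 6]

# A list can hold mixed types; an array converts its items to one type.
mixed_list = [1, "two", 3.0]
mixed_array = np.array([1, 2, 3.0])
print([type(x).__name__ for x in mixed_list])  # ['int', 'str', 'float']
print(mixed_array.dtype)                       # float64
```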
Your First NumPy Array
import numpy as np
numbers = np.array([10, 20, 30, 40, 50])
print(numbers)
print("Type:", type(numbers))
Output:
[10 20 30 40 50]
Type: <class 'numpy.ndarray'>
What ndarray means: N-dimensional array, NumPy's core data structure. An array can have one dimension (like the one above), two dimensions (like a table), or more.
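As a quick illustration of the "N-dimensional" part, this sketch builds a one-dimensional and a two-dimensional array and checks them with the ndim and shape attributes. The arrays themselves are made up for the example:

```python
import numpy as np

one_d = np.array([10, 20, 30, 40, 50])
two_d = np.array([[1, 2, 3],
                  [4, 5, 6]])

# ndim is the number of dimensions; shape is the size along each one.
print(one_d.ndim, one_d.shape)  # 1 (5,)
print(two_d.ndim, two_d.shape)  # 2 (2, 3)
```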
Basic Operations
Multiply All Numbers
import numpy as np
prices = np.array([10, 20, 30])
doubled = prices * 2
print(doubled)
Output: [20 40 60]
What happens: Every number gets multiplied by 2 at once.
Add to All Numbers
import numpy as np
scores = np.array([80, 85, 90])
curved = scores + 5
print(curved)
Output: [85 90 95]
Math Between Arrays
import numpy as np
prices = np.array([100, 200, 300])
taxes = np.array([10, 20, 30])
total = prices + taxes
print(total)
Output: [110 220 330]
What this does: Adds matching positions: 100+10, 200+20, 300+30.
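The same position-by-position rule applies to the other operators. The sketch below multiplies prices by some made-up quantities, and also shows what happens when two arrays have different lengths (NumPy raises a ValueError instead of guessing):

```python
import numpy as np

prices = np.array([100, 200, 300])
quantities = np.array([2, 1, 3])  # hypothetical quantities for this example

# Multiplies matching positions: 100*2, 200*1, 300*3.
line_totals = prices * quantities
print(line_totals)  # [200 200 900]

# Arrays of length 3 and length 2 cannot be matched up.
too_short = np.array([1, 2])
mismatch_detected = False
try:
    prices + too_short
except ValueError as err:
    mismatch_detected = True
    print("Shapes do not match:", err)
```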
Why NumPy is Faster
Python list way (slow):
import time
numbers = list(range(1000000))
start = time.perf_counter()  # high-resolution timer, better for short measurements
doubled = []
for n in numbers:
    doubled.append(n * 2)
end = time.perf_counter()
print("Time:", end - start, "seconds")
NumPy way (fast):
import numpy as np
import time
numbers = np.arange(1000000)  # builds the array directly, without going through a Python range
start = time.perf_counter()
doubled = numbers * 2
end = time.perf_counter()
print("Time:", end - start, "seconds")
Why NumPy wins:
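- Core loops are written in C (a fast compiled language)
- Operations are vectorized: one call processes the whole array in compiled code, with no per-element Python overhead
- Elements sit in one contiguous block of memory with a single data type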
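The memory point can be checked roughly with the sketch below. The list figure is only an approximation (it adds up the list itself and each Python int object), and the array dtype is set to int64 here so that its size is predictable:

```python
import sys
import numpy as np

count = 1_000_000
as_list = list(range(count))
as_array = np.arange(count, dtype=np.int64)  # dtype fixed for a predictable size

# A list stores pointers to separate Python int objects.
list_bytes = sys.getsizeof(as_list) + sum(sys.getsizeof(n) for n in as_list)
# An array stores the raw 8-byte values next to each other.
array_bytes = as_array.nbytes

print("List (approx.):", list_bytes, "bytes")
print("Array data:", array_bytes, "bytes")       # 8000000
print("Bytes per element:", as_array.itemsize)   # 8
```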
NumPy Data Types
NumPy stores every element of an array with the same data type (dtype), which is part of what makes it fast.
import numpy as np
integers = np.array([1, 2, 3])
print("Type:", integers.dtype)
floats = np.array([1.5, 2.5, 3.5])
print("Type:", floats.dtype)
mixed = np.array([1, 2.5, 3])
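print("Type:", mixed.dtype)
Output:
Type: int64
Type: float64
Type: float64
Note: int64 is the default integer type on most platforms. On Windows with NumPy versions before 2.0, the first line shows int32 instead.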
What happens with mixed: NumPy converts everything to float to fit all values.
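If the automatic choice is not what you want, you can set the type yourself. This is a minimal sketch using the dtype argument and the astype method; the values are made up for illustration:

```python
import numpy as np

# Ask for a specific type when creating the array.
as_floats = np.array([1, 2, 3], dtype=np.float64)
print(as_floats.dtype)  # float64

# Convert an existing array. Going from float to integer drops the fraction.
measurements = np.array([1.9, 2.5, 3.1])
as_ints = measurements.astype(np.int64)
print(as_ints)  # [1 2 3]
```

Note that astype does not round: 1.9 becomes 1, not 2.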
Practice Example
The scenario: You run a store and want to calculate total prices with tax.
import numpy as np
product_prices = np.array([99.99, 49.99, 149.99, 29.99])
tax_rate = 0.08
print("Original prices:")
print(product_prices)
taxes = product_prices * tax_rate
print("Tax amounts:")
print(taxes)
total_prices = product_prices + taxes
print("Prices with tax:")
print(total_prices)
total_revenue = np.sum(total_prices)
print("Total revenue:", total_revenue)
average_price = np.mean(total_prices)
print("Average price:", average_price)
What this program does:
- Creates array of product prices
- Calculates 8 percent tax for each
- Adds tax to get final prices
- Sums all prices for total revenue
- Calculates average price
Each step is a single array operation, so the same code stays fast even with thousands of products.
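Prices with tax usually end up with more than two decimal places. One possible follow-up step, shown here as a sketch rather than as part of the original program, is to round the whole array to cents with np.round:

```python
import numpy as np

product_prices = np.array([99.99, 49.99, 149.99, 29.99])
tax_rate = 0.08

total_prices = product_prices + product_prices * tax_rate

# Round every price to 2 decimal places in one call.
rounded_prices = np.round(total_prices, 2)
print(rounded_prices)
```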
Common NumPy Functions
import numpy as np
numbers = np.array([10, 20, 30, 40, 50])
print("Sum:", np.sum(numbers))
print("Average:", np.mean(numbers))
print("Maximum:", np.max(numbers))
print("Minimum:", np.min(numbers))
print("Standard deviation:", np.std(numbers))
What these do:
- sum: Adds all numbers
- mean: Calculates average
- max: Finds largest number
- min: Finds smallest number
- std: Measures spread of numbers
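To see what std actually measures, the sketch below repeats the calculation by hand for the same numbers. By default np.std divides by the number of items (population standard deviation); passing ddof=1 divides by one less (sample standard deviation):

```python
import numpy as np

numbers = np.array([10, 20, 30, 40, 50])

mean = numbers.sum() / numbers.size    # 30.0
squared_diffs = (numbers - mean) ** 2  # 400, 100, 0, 100, 400
manual_std = np.sqrt(squared_diffs.sum() / numbers.size)
sample_std = np.std(numbers, ddof=1)

print(manual_std)       # about 14.142
print(np.std(numbers))  # same value as manual_std
print(sample_std)       # about 15.811
```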
Key Points to Remember
NumPy is a library for fast numerical computing in Python. Install with pip install numpy, import as np.
NumPy arrays are faster than Python lists for math operations. All elements must be the same type.
You can do math on entire arrays at once. Operations happen element by element automatically.
NumPy's core is written in C, which often makes it 10-100x faster than plain Python loops for numerical work. The exact speedup depends on the operation and the array size.
NumPy is used widely in data science, machine learning, and scientific computing.
Common Mistakes
Mistake 1: Forgetting to import
numbers = np.array([1, 2, 3])  # NameError: name 'np' is not defined
Import first: import numpy as np
Mistake 2: Using wrong import name
import numpy
numbers = np.array([1, 2, 3])  # NameError: np is not defined. Use numpy.array(...) or import numpy as np
Mistake 3: Mixing types without understanding
arr = np.array([1, 2, 3.5])
print(arr)  # All become floats: [1.  2.  3.5]
Mistake 4: Treating arrays like lists
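arr = np.array([1, 2, 3])
arr.append(4)  # AttributeError: 'numpy.ndarray' object has no attribute 'append'
Use np.append() instead. It returns a new array rather than changing arr in place, so it is slow inside a loop; it is better to create the array at the right size initially.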
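Here is a short sketch of both approaches. The squares array is just an example of creating the right size up front with np.zeros and then filling it in:

```python
import numpy as np

arr = np.array([1, 2, 3])

# np.append returns a new array; the original is unchanged.
longer = np.append(arr, 4)
print(arr)     # [1 2 3]
print(longer)  # [1 2 3 4]

# If the final size is known, create the array once and fill it in.
squares = np.zeros(5, dtype=np.int64)
for i in range(5):
    squares[i] = i * i
print(squares)  # [ 0  1  4  9 16]
```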
What's Next?
You now understand what NumPy is and why it's important. Next, you'll learn about creating NumPy arrays in different ways - from lists, ranges, zeros, ones, and more.