What Makes Window Functions Different?
Regular aggregate functions (GROUP BY + SUM) collapse rows into groups — you lose the individual row detail. Window functions compute aggregates across a set of rows (the "window") while keeping every row in the result.
-- GROUP BY: collapses rows, loses detail
SELECT city, AVG(amount) AS city_avg
FROM orders
GROUP BY city;
-- Result: one row per city
-- Window function: keeps all rows, adds city average to each
SELECT
order_id,
city,
amount,
AVG(amount) OVER (PARTITION BY city) AS city_avg
FROM orders;
-- Result: all orders, each showing its city's average alongsideThe key syntax: function() OVER (PARTITION BY ... ORDER BY ...)
GROUP BY is like calculating the class average and announcing it — you lose each student's individual score. A window function is like printing each student's score AND the class average on the same report card. Every student's row stays; the aggregate gets added alongside.
ROW_NUMBER, RANK, DENSE_RANK
These three functions assign a number to each row within a window. The difference is in how they handle ties.
SELECT
customer_id,
city,
total_orders,
ROW_NUMBER() OVER (ORDER BY total_orders DESC) AS row_num,
RANK() OVER (ORDER BY total_orders DESC) AS rank_num,
DENSE_RANK() OVER (ORDER BY total_orders DESC) AS dense_rank_num
FROM customer_summary;| total_orders | ROW_NUMBER | RANK | DENSE_RANK | | ------------ | ---------- | ---- | ---------- | | 100 | 1 | 1 | 1 | | 95 | 2 | 2 | 2 | | 95 | 3 | 2 | 2 | | 80 | 4 | 4 | 3 |
- ROW_NUMBER: unique sequential number — ties get different numbers
- RANK: ties get the same rank, then skips (1, 2, 2, 4)
- DENSE_RANK: ties get the same rank, no skipping (1, 2, 2, 3)
Using PARTITION BY for Per-Group Rankings
-- Rank orders within each city by amount
SELECT
order_id,
city,
amount,
RANK() OVER (PARTITION BY city ORDER BY amount DESC) AS rank_in_city
FROM orders;PARTITION BY city resets the ranking for each city — so each city has its own #1, #2, #3.
Interview pattern: "Get the top N rows per group" is one of the most common SQL interview questions. Answer: use ROW_NUMBER() with PARTITION BY, then filter WHERE row_num ≤ N in a CTE or subquery.
⚠️ CheckpointQuiz error: Missing or invalid options array
LAG and LEAD — Looking at Adjacent Rows
LAG and LEAD access values from previous or following rows in the result set — without a self-join. This is essential for time-series analysis.
-- Month-over-month revenue comparison
SELECT
month,
revenue,
LAG(revenue, 1) OVER (ORDER BY month) AS prev_month_revenue,
revenue - LAG(revenue, 1) OVER (ORDER BY month) AS revenue_change,
ROUND(
(revenue - LAG(revenue, 1) OVER (ORDER BY month))
/ LAG(revenue, 1) OVER (ORDER BY month) * 100,
2
) AS growth_pct
FROM monthly_revenue
ORDER BY month;LAG(revenue, 1) returns the revenue from 1 row before (previous month). LEAD(revenue, 1) would return the next month's revenue.
First Order and Last Order per Customer
SELECT
customer_id,
order_id,
order_date,
FIRST_VALUE(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) AS first_order_date,
LAST_VALUE(order_date) OVER (PARTITION BY customer_id ORDER BY order_date
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_order_date
FROM orders;Running Totals and Moving Averages
Window functions excel at cumulative calculations:
-- Running total of revenue by date
SELECT
order_date,
daily_revenue,
SUM(daily_revenue) OVER (ORDER BY order_date) AS cumulative_revenue,
AVG(daily_revenue) OVER (ORDER BY order_date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS rolling_7day_avg
FROM daily_sales
ORDER BY order_date;The ROWS BETWEEN 6 PRECEDING AND CURRENT ROW frame means "this row and the 6 rows before it" — a 7-day rolling window.
⚠️ FinalQuiz error: Missing or invalid questions array
⚠️ SummarySection error: Missing or invalid items array
Received: {"hasItems":false,"isArray":false}