Topic 20 of

SQL Subqueries and CTEs — With Real-World Examples

When a single query isn't enough, you need to nest queries inside each other. Subqueries and CTEs are how you do it — and CTEs make the code actually readable.

📚Intermediate
⏱️12 min
6 quizzes
📦

What Is a Subquery?

A subquery is a query nested inside another query. The inner query runs first, its result is used by the outer query. Think of it as using the answer to one question as the input to another question.

Subquery in WHERE

query.sqlSQL
-- Find all orders above the average order value
SELECT order_id, amount
FROM orders
WHERE amount > (
  SELECT AVG(amount) FROM orders
);

The inner query SELECT AVG(amount) FROM orders runs first and returns a single number (say, 450). The outer query then uses that number: WHERE amount > 450.

Subquery in FROM (Derived Table)

query.sqlSQL
-- Find cities where average order value > 500
SELECT city, avg_value
FROM (
  SELECT city, AVG(amount) AS avg_value
  FROM orders
  GROUP BY city
) AS city_stats
WHERE avg_value > 500;

The subquery in FROM creates a temporary table (city_stats) that the outer query treats like a regular table.

Think of it this way...

A subquery is like asking a junior analyst to prepare a summary table, then you use that summary to answer your own question. The junior analyst's work (inner query) feeds directly into yours (outer query).

⚠️ CheckpointQuiz error: Missing or invalid options array

📖

CTEs — The Readable Alternative

A CTE (Common Table Expression) is defined using the WITH clause. It does the same job as a subquery but is written before the main query — making complex logic much easier to read and debug.

query.sqlSQL
-- Same query as above, rewritten as a CTE
WITH city_stats AS (
  SELECT
    city,
    AVG(amount) AS avg_value,
    COUNT(*) AS order_count
  FROM orders
  GROUP BY city
)
SELECT city, avg_value, order_count
FROM city_stats
WHERE avg_value > 500
ORDER BY avg_value DESC;

Chaining Multiple CTEs

You can define multiple CTEs and reference earlier ones in later ones:

query.sqlSQL
WITH
monthly_revenue AS (
  SELECT
    DATE_TRUNC('month', order_date) AS month,
    SUM(amount) AS revenue
  FROM orders
  GROUP BY 1
),
revenue_with_growth AS (
  SELECT
    month,
    revenue,
    LAG(revenue) OVER (ORDER BY month) AS prev_month_revenue,
    ROUND((revenue - LAG(revenue) OVER (ORDER BY month)) / LAG(revenue) OVER (ORDER BY month) * 100, 2) AS growth_pct
  FROM monthly_revenue
)
SELECT *
FROM revenue_with_growth
WHERE month >= '2026-01-01';
Info

Key Takeaway: CTEs don't make queries faster than subqueries — their value is in readability and maintainability. When a query needs to reference the same subquery multiple times, a CTE is also more efficient since it's evaluated once and reused.

🔄

Correlated Subqueries

A correlated subquery references a column from the outer query — meaning it runs once for each row in the outer query. They're powerful but can be slow on large tables.

query.sqlSQL
-- For each customer, find their highest order
SELECT
  c.customer_id,
  c.name,
  o.order_id,
  o.amount
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
WHERE o.amount = (
  SELECT MAX(amount)
  FROM orders o2
  WHERE o2.customer_id = c.customer_id  -- references outer query's c.customer_id
);

The inner SELECT MAX(amount) runs separately for each customer, finding their maximum order. This is a correlated subquery. The same result is often achievable with window functions (see Topic 21) which tend to be more efficient.

⚠️ FinalQuiz error: Missing or invalid questions array

⚠️ SummarySection error: Missing or invalid items array

Received: {"hasItems":false,"isArray":false}