On this page
Subqueries and CTEs
Subqueries and CTEs
As queries grow more complex, you need tools for building queries incrementally — using the results of one query as the input to another. PostgreSQL provides two primary mechanisms: subqueries (queries nested inside another query) and CTEs (Common Table Expressions, introduced with the WITH keyword).
Both solve the same fundamental problem: how to work with derived data sets without creating permanent database objects. The choice between them is mostly about readability and reuse.
Subqueries
A subquery is a SELECT statement nested inside another SQL statement. Subqueries can appear in three positions:
- In the
SELECTlist (scalar subquery) - In the
FROMclause (derived table or inline view) - In the
WHEREclause (filtering subquery)
Scalar Subqueries
A scalar subquery returns exactly one row and one column. It can be used anywhere a single value is expected:
-- Show each product's price and how much it deviates from the global average
SELECT
name,
price,
ROUND(AVG(price) OVER (), 2) AS global_avg, -- window function alternative
ROUND(price - (SELECT AVG(price) FROM products), 2) AS deviation
FROM products
ORDER BY deviation DESC;If a scalar subquery returns more than one row, PostgreSQL throws a runtime error. If it returns zero rows, the result is NULL.
Subqueries in FROM — Derived Tables
A subquery in the FROM clause creates a temporary result set that the outer query treats as a table:
-- Find categories where the average price is above the overall average
SELECT category, avg_price
FROM (
SELECT
category,
ROUND(AVG(price), 2) AS avg_price
FROM products
GROUP BY category
) AS category_stats
WHERE avg_price > (SELECT AVG(price) FROM products)
ORDER BY avg_price DESC;The subquery inside FROM must always have an alias (AS category_stats). This is called a derived table or inline view.
Subqueries in WHERE
Subqueries in WHERE are used for filtering. They come in three forms:
Comparison Subqueries
-- Products priced above the overall average
SELECT name, price
FROM products
WHERE price > (SELECT AVG(price) FROM products)
ORDER BY price DESC;IN Subqueries
-- Customers who have placed at least one completed order
SELECT full_name, email
FROM customers
WHERE customer_id IN (
SELECT DISTINCT customer_id
FROM orders
WHERE status = 'completed'
);EXISTS Subqueries
EXISTS is often more efficient than IN because it short-circuits as soon as it finds one matching row:
-- Customers who have at least one order (using EXISTS)
SELECT c.full_name, c.email
FROM customers c
WHERE EXISTS (
SELECT 1
FROM orders o
WHERE o.customer_id = c.customer_id
);
-- Customers who have NEVER ordered (NOT EXISTS anti-join)
SELECT c.full_name, c.email
FROM customers c
WHERE NOT EXISTS (
SELECT 1
FROM orders o
WHERE o.customer_id = c.customer_id
);The SELECT 1 inside EXISTS is conventional — EXISTS only checks whether any row is returned, not what the row contains.
Correlated Subqueries
A correlated subquery references a column from the outer query. It is re-evaluated for every outer row:
-- Products priced above their own category's average
SELECT name, category, price
FROM products p
WHERE price > (
SELECT AVG(price)
FROM products
WHERE category = p.category -- 'p.category' references the outer query
)
ORDER BY category, price DESC;While powerful, correlated subqueries can be slow because they execute once per row in the outer query. For large tables, rewrite them as JOINs against aggregated subqueries or use window functions.
Common Table Expressions (CTEs)
A CTE (Common Table Expression) is a named, temporary result set defined before the main query with the WITH keyword:
WITH cte_name AS (
SELECT ...
)
SELECT ... FROM cte_name;CTEs make complex queries readable by naming intermediate results. They are especially valuable when the same intermediate result is referenced multiple times.
Multiple CTEs
WITH
high_value_orders AS (
SELECT order_id, customer_id, total_amount
FROM orders
WHERE total_amount > 500 AND status = 'completed'
),
top_customers AS (
SELECT customer_id, COUNT(*) AS order_count, SUM(total_amount) AS lifetime_value
FROM high_value_orders
GROUP BY customer_id
HAVING COUNT(*) >= 3
),
customer_details AS (
SELECT c.customer_id, c.full_name, c.email, tc.lifetime_value
FROM customers c
JOIN top_customers tc ON tc.customer_id = c.customer_id
)
SELECT
full_name,
email,
ROUND(lifetime_value, 2) AS lifetime_value,
RANK() OVER (ORDER BY lifetime_value DESC) AS rank
FROM customer_details
ORDER BY lifetime_value DESC;Each CTE builds on the previous ones, making the query read almost like a recipe.
Writeable CTEs
CTEs can include INSERT, UPDATE, or DELETE statements, making complex multi-step data modifications possible:
WITH moved_orders AS (
DELETE FROM orders_temp
WHERE ordered_at < NOW() - INTERVAL '1 year'
RETURNING *
)
INSERT INTO orders_archive
SELECT * FROM moved_orders;This atomically moves old orders from a temporary table to an archive table in a single statement.
Recursive CTEs
A recursive CTE references itself, enabling queries that traverse hierarchical or graph structures — category trees, organizational charts, bill of materials, comment threads.
WITH RECURSIVE org_chart AS (
-- Anchor member: the CEO (no manager)
SELECT
employee_id,
full_name,
manager_id,
title,
1 AS level,
ARRAY[employee_id] AS path
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive member: direct reports of already-found employees
SELECT
e.employee_id,
e.full_name,
e.manager_id,
e.title,
oc.level + 1,
oc.path || e.employee_id
FROM employees e
JOIN org_chart oc ON e.manager_id = oc.employee_id
WHERE NOT e.employee_id = ANY(oc.path) -- cycle guard
)
SELECT
REPEAT(' ', level - 1) || full_name AS indented_name,
title,
level
FROM org_chart
ORDER BY path;The structure of every recursive CTE is:
- Anchor member: the base case (non-recursive part), connected by
UNION ALL - Recursive member: references the CTE name itself, adding one more level per iteration
- Termination: recursion stops when the recursive member returns no more rows
The path array and NOT e.employee_id = ANY(oc.path) cycle guard prevent infinite loops if the data has cycles.
CTE vs Subquery — When to Use Which
| Scenario | Use |
|---|---|
| Single use, simple condition | Inline subquery |
| Intermediate result used multiple times | CTE |
| Complex query needing step-by-step logic | Multiple CTEs |
| Hierarchical / recursive data | Recursive CTE |
| Performance optimization needed | Test both — CTEs are optimization fences in older PostgreSQL versions |
In PostgreSQL 12+, CTEs are "inlined" by default (treated like subqueries by the planner), removing the old performance concern. For the rare cases where you want the old materialization behavior (forcing the CTE to run independently), use WITH MATERIALIZED.
Subqueries and CTEs are the building blocks for expressing any query logic, no matter how complex. Combined with the JOINs from the previous lesson, they give you the full power of relational algebra. Next, we look at views and PL/pgSQL functions for encapsulating this logic permanently in the database.
-- Scalar subquery: returns a single value
SELECT name, price,
price - (SELECT AVG(price) FROM products) AS diff_from_avg
FROM products
ORDER BY diff_from_avg DESC;
-- Subquery in WHERE: products priced above category average
SELECT name, category, price
FROM products p
WHERE price > (
SELECT AVG(price)
FROM products
WHERE category = p.category -- correlated: references outer query
)
ORDER BY category, price DESC;
-- Subquery in FROM (derived table)
SELECT category, avg_price
FROM (
SELECT category, ROUND(AVG(price), 2) AS avg_price
FROM products
GROUP BY category
) AS category_averages
WHERE avg_price > 50.00;
-- CTE: WITH clause — reusable named subquery
WITH revenue_per_customer AS (
SELECT
c.customer_id,
c.full_name,
ROUND(SUM(oi.quantity * oi.unit_price), 2) AS total_spent
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id
JOIN order_items oi ON oi.order_id = o.order_id
WHERE o.status = 'completed'
GROUP BY c.customer_id, c.full_name
)
SELECT
full_name,
total_spent,
CASE
WHEN total_spent >= 1000 THEN 'VIP'
WHEN total_spent >= 500 THEN 'Regular'
ELSE 'New'
END AS customer_tier
FROM revenue_per_customer
ORDER BY total_spent DESC;
-- Recursive CTE: traverse a category hierarchy
WITH RECURSIVE category_tree AS (
-- Anchor: top-level categories
SELECT category_id, name, parent_id, 0 AS depth
FROM categories
WHERE parent_id IS NULL
UNION ALL
-- Recursive: children of each found row
SELECT c.category_id, c.name, c.parent_id, ct.depth + 1
FROM categories c
JOIN category_tree ct ON c.parent_id = ct.category_id
)
SELECT depth, REPEAT(' ', depth) || name AS indented_name
FROM category_tree
ORDER BY depth, name;
Sign in to track your progress