On this page

Aggregate Functions

14 min read TextCh. 2 — Basic Queries

Aggregate Functions

Aggregate functions are SQL operations that take a set of rows and compute a single summary value. They are the engine behind analytics, reporting, and business intelligence queries — letting you answer questions like "What is the average order value?", "Which category generates the most revenue?", or "How many users signed up this month?"

PostgreSQL 18 provides a rich set of aggregate functions, and when combined with GROUP BY, HAVING, and conditional filtering, they become extraordinarily powerful.

The Core Aggregate Functions

COUNT

COUNT is the most fundamental aggregate. It counts the number of rows:

-- COUNT(*): counts ALL rows, including NULLs
SELECT COUNT(*) FROM orders;

-- COUNT(column): counts rows where column IS NOT NULL
SELECT COUNT(shipped_at) FROM orders;  -- only shipped orders

-- COUNT(DISTINCT column): counts unique non-NULL values
SELECT COUNT(DISTINCT customer_id) FROM orders;  -- unique customers who ordered

SUM

SUM adds up all values in a column (ignoring NULLs):

SELECT
    SUM(total_amount)         AS total_revenue,
    SUM(quantity)             AS total_units_sold
FROM orders
WHERE status = 'completed';

AVG

AVG computes the arithmetic mean (NULLs are excluded):

SELECT
    AVG(price)               AS avg_product_price,
    ROUND(AVG(price), 2)     AS avg_price_2dp
FROM products;

Because AVG can return many decimal places, always ROUND the result to a meaningful precision.

MIN and MAX

SELECT
    MIN(price)               AS cheapest_product,
    MAX(price)               AS most_expensive_product,
    MIN(created_at)          AS first_order,
    MAX(created_at)          AS latest_order
FROM products;

MIN and MAX work on any data type that supports ordering: numbers, strings, dates, and timestamps.

GROUP BY — Aggregating by Category

GROUP BY divides rows into groups and applies the aggregate function to each group separately:

SELECT
    category,
    COUNT(*)            AS product_count,
    AVG(price)          AS avg_price,
    SUM(stock * price)  AS inventory_value
FROM products
GROUP BY category
ORDER BY inventory_value DESC;

Result:

    category    | product_count | avg_price | inventory_value
----------------+---------------+-----------+-----------------
 electronics    |             5 |    294.99 |       89497.00
 furniture      |             2 |    424.99 |       19249.55
 stationery     |             3 |      6.32 |        4797.00

Important rule: any column in the SELECT list that is not inside an aggregate function must appear in the GROUP BY clause. If you write SELECT category, name, COUNT(*) FROM products GROUP BY category, PostgreSQL will throw an error because name is ambiguous — which name should be returned for the group?

GROUP BY on Multiple Columns

-- Revenue by year and month
SELECT
    EXTRACT(YEAR  FROM ordered_at) AS year,
    EXTRACT(MONTH FROM ordered_at) AS month,
    COUNT(*)                       AS order_count,
    SUM(total_amount)              AS monthly_revenue
FROM orders
WHERE status = 'completed'
GROUP BY
    EXTRACT(YEAR  FROM ordered_at),
    EXTRACT(MONTH FROM ordered_at)
ORDER BY year, month;

HAVING — Filtering Groups

WHERE filters rows before grouping. HAVING filters groups after aggregation. This is a critical distinction:

-- Find categories with more than 2 products AND an average price above $50
SELECT
    category,
    COUNT(*) AS product_count,
    AVG(price) AS avg_price
FROM products
GROUP BY category
HAVING COUNT(*) > 2
   AND AVG(price) > 50.00
ORDER BY avg_price DESC;

You can combine WHERE and HAVING in the same query. WHERE reduces the input rows before grouping; HAVING reduces the number of output groups:

-- Among active products, find categories averaging over $100
SELECT
    category,
    COUNT(*) AS product_count,
    ROUND(AVG(price), 2) AS avg_price
FROM products
WHERE is_active = TRUE          -- filters rows before grouping
GROUP BY category
HAVING AVG(price) > 100.00      -- filters groups after aggregation
ORDER BY avg_price DESC;

CASE WHEN — Conditional Logic Inside Queries

CASE WHEN is SQL's if-else expression. It can appear anywhere an expression is valid:

-- Classify products by price tier
SELECT
    name,
    price,
    CASE
        WHEN price < 20.00  THEN 'budget'
        WHEN price < 100.00 THEN 'mid-range'
        WHEN price < 500.00 THEN 'premium'
        ELSE 'luxury'
    END AS price_tier
FROM products
ORDER BY price;

CASE WHEN Inside Aggregates — Pivot Tables

Combining CASE WHEN with aggregates lets you pivot row data into columns:

-- Count products per tier in a single row
SELECT
    COUNT(CASE WHEN price < 20.00  THEN 1 END) AS budget_count,
    COUNT(CASE WHEN price BETWEEN 20 AND 99.99 THEN 1 END) AS mid_count,
    COUNT(CASE WHEN price >= 100.00 THEN 1 END) AS premium_count
FROM products;

PostgreSQL also offers the cleaner FILTER syntax for this pattern:

SELECT
    COUNT(*) FILTER (WHERE price < 20.00)   AS budget_count,
    COUNT(*) FILTER (WHERE price < 100.00
                     AND   price >= 20.00)  AS mid_count,
    COUNT(*) FILTER (WHERE price >= 100.00) AS premium_count
FROM products;

String Aggregation

PostgreSQL provides STRING_AGG to concatenate values from multiple rows into a single string:

-- Get all product names per category as a comma-separated list
SELECT
    category,
    STRING_AGG(name, ', ' ORDER BY name) AS products
FROM products
GROUP BY category;

Result:

    category    |                     products
----------------+---------------------------------------------------
 electronics    | Laptop Pro 15, Mechanical Keyboard, USB-C Hub, Wireless Mouse
 furniture      | Ergonomic Chair, Standing Desk
 stationery     | Ballpoint Pens, Notebook A5, Sticky Notes

ROLLUP and CUBE for Subtotals

ROLLUP generates subtotals along a hierarchy:

-- Revenue by category and year, plus category subtotals and grand total
SELECT
    COALESCE(category, 'Grand Total')               AS category,
    COALESCE(CAST(EXTRACT(YEAR FROM ordered_at) AS TEXT), 'All Years') AS year,
    ROUND(SUM(total_amount), 2)                     AS revenue
FROM orders
JOIN order_items USING (order_id)
JOIN products    USING (product_id)
GROUP BY ROLLUP(category, EXTRACT(YEAR FROM ordered_at))
ORDER BY category NULLS LAST, year NULLS LAST;

ROLLUP(a, b) produces groupings (a, b), (a), and () — the last two being the subtotal per a and the grand total, respectively.

Practical Example — Sales Dashboard Query

Putting it all together: a real-world analytics query for a sales dashboard:

SELECT
    p.category,
    COUNT(DISTINCT o.customer_id)             AS unique_buyers,
    COUNT(DISTINCT o.order_id)                AS total_orders,
    SUM(oi.quantity)                          AS units_sold,
    ROUND(SUM(oi.quantity * oi.unit_price), 2) AS gross_revenue,
    ROUND(AVG(oi.unit_price), 2)              AS avg_selling_price
FROM   orders      o
JOIN   order_items oi ON oi.order_id  = o.order_id
JOIN   products    p  ON p.product_id = oi.product_id
WHERE  o.status = 'completed'
  AND  o.ordered_at >= NOW() - INTERVAL '30 days'
GROUP BY p.category
HAVING SUM(oi.quantity * oi.unit_price) > 1000.00
ORDER BY gross_revenue DESC;

This query joins three tables, filters by status and recency, groups by category, filters groups by revenue threshold, and sorts the results — all in a single, readable SQL statement.

Aggregate functions and GROUP BY are the tools that turn raw transactional data into business insights. The next lesson teaches JOINs, which are essential for combining data across tables in exactly these kinds of analytical queries.

COUNT(*) vs COUNT(column)
COUNT(*) counts all rows including those with NULL values. COUNT(column_name) counts only rows where that column is NOT NULL. This distinction matters when checking completeness: COUNT(*) vs COUNT(email) tells you how many rows have a missing email address.
Use FILTER Instead of CASE WHEN
PostgreSQL supports the FILTER clause for conditional aggregation: COUNT(*) FILTER (WHERE status = 'active'). This is cleaner and often faster than the traditional CASE WHEN workaround. Both produce the same result, but FILTER is more readable.
WHERE vs HAVING
WHERE filters individual rows BEFORE grouping. HAVING filters groups AFTER aggregation. You cannot use aggregate functions in WHERE. If you write WHERE COUNT(*) > 5, PostgreSQL will return an error. Always put conditions on aggregated values in HAVING.
sql
-- COUNT: number of rows
SELECT COUNT(*)                  AS total_products,
       COUNT(description)        AS with_description,
       COUNT(DISTINCT category)  AS unique_categories
FROM products;

-- SUM and AVG
SELECT
    SUM(stock)             AS total_stock,
    AVG(price)             AS avg_price,
    ROUND(AVG(price), 2)   AS avg_price_rounded,
    MIN(price)             AS cheapest,
    MAX(price)             AS most_expensive
FROM products;

-- GROUP BY: aggregate per group
SELECT
    category,
    COUNT(*)               AS num_products,
    ROUND(AVG(price), 2)   AS avg_price,
    SUM(stock)             AS total_stock,
    MIN(price)             AS min_price,
    MAX(price)             AS max_price
FROM products
GROUP BY category
ORDER BY avg_price DESC;

-- HAVING: filter groups (not individual rows)
SELECT
    category,
    COUNT(*) AS num_products,
    AVG(price) AS avg_price
FROM products
GROUP BY category
HAVING COUNT(*) >= 3         -- only categories with 3+ products
   AND AVG(price) > 50.00
ORDER BY avg_price DESC;

-- CASE WHEN inside aggregate: conditional counting
SELECT
    COUNT(*) FILTER (WHERE price < 50)    AS budget_products,
    COUNT(*) FILTER (WHERE price BETWEEN 50 AND 200) AS mid_range,
    COUNT(*) FILTER (WHERE price > 200)   AS premium_products
FROM products;

-- GROUP BY with ROLLUP: subtotals + grand total
SELECT
    COALESCE(category, 'ALL CATEGORIES') AS category,
    COUNT(*)                             AS num_products,
    ROUND(SUM(price * stock), 2)         AS inventory_value
FROM products
GROUP BY ROLLUP(category)
ORDER BY category NULLS LAST;