July 18, 2026

5 min read

Reporting Analytics

Understanding Window Functions in BigQuery

Window functions enhance SQL by allowing calculations across rows without collapsing data, unlike GROUP BY. They enable detailed analytics such as running totals and rankings, while preserving row integrity.

Most people learn GROUP BY early, and it works well right up until it doesn’t. The moment you want to keep your individual rows while also adding a calculation across a group, like a running total, a rank, or a comparison to the previous row, GROUP BY runs out of road. It collapses your rows into summaries. Window functions solve exactly this problem. They let you perform calculations across a set of rows while keeping every row intact.

10 code along for practice: https://datalad.co.uk/window-functions-in-bigquery-10-code-along-examples/

A free cheatsheet: https://datalad.co.uk/window-functions-in-bigquery-cheatsheet/

What a Window Function Does

A window function performs a calculation across a set of table rows and returns a single value for each row in that set. The crucial difference from GROUP BY is that it does not collapse your rows. You keep the full detail of your data and gain an extra column with the calculated value attached to each row. This makes window functions ideal for aggregations, rankings, and comparisons that would otherwise force you to lose the row-level detail.

The Basic Syntax

Every window function follows the same pattern: a function, followed by an OVER clause that defines the “window” of rows to operate on.

			
function_name(expression) OVER (
  PARTITION BY column_name
  ORDER BY column_name
  ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
)

		

There are three pieces inside the OVER clause. PARTITION BY divides the data into subsets, or windows, so the calculation restarts for each group. ORDER BY defines the order of rows inside each window, which matters for things like running totals and rankings. And the ROWS or RANGE clause specifies precisely which rows to include in the window frame, though the default behaviour is often all you need. All three are optional, and you will frequently use only the ones a given calculation requires.

The Three Families of Window Functions

Window functions fall into three broad categories. Aggregations like SUM(), AVG(), COUNT(), MAX(), and MIN() perform the familiar calculations, but per row rather than collapsing the data. Ranking functions like ROW_NUMBER(), RANK(), and DENSE_RANK() assign positions to rows. And value accessors like LAG(), LEAD(), FIRST_VALUE(), and LAST_VALUE() let one row reach out and read values from other rows in the window.

Running Totals with Aggregations

The classic example is a running total. Here we calculate cumulative sales over time:

			
SELECT
  date,
  sales,
  SUM(sales) OVER (ORDER BY date) AS running_total
FROM `dataset.daily_sales`

		

Because there is no PARTITION BY, the whole table is one window, and the ORDER BY date tells the function to accumulate sales in date order. Each row shows the cumulative total of all sales up to and including that date, while still showing that day’s individual sales figure. A plain GROUP BY could give you the grand total, but never this row-by-row accumulation.

Restarting the Calculation per Group

Add PARTITION BY and the running total restarts for each group. Here we compute a separate running total for every category:

			
SELECT
  category,
  date,
  sales,
  SUM(sales) OVER (
    PARTITION BY category
    ORDER BY date
  ) AS running_total_by_category
FROM `dataset.daily_sales`

		

Now the cumulative total resets each time the category changes, so you get an independent running total per category, all in the same query and without losing any rows. PARTITION BY is the window function equivalent of grouping, except it slices the data into windows rather than collapsing it.

Ranking Rows

Ranking functions assign a position to each row based on an ordering you define. Here we rank customers by how much they have spent:

			
SELECT
  customer_id,
  total_spent,
  RANK() OVER (ORDER BY total_spent DESC) AS revenue_rank
FROM (
  SELECT customer_id, SUM(amount) AS total_spent
  FROM `dataset.orders`
  GROUP BY customer_id
)

		

The inner query aggregates total spend per customer, then RANK() assigns each customer a rank from highest to lowest. With RANK(), customers with identical totals receive the same rank, and the next rank skips accordingly. This is the kind of leaderboard logic that GROUP BY simply cannot produce on its own.

Top N per Group with ROW_NUMBER()

ROW_NUMBER() is the go-to function for finding the top few records within each group. Unlike RANK(), it always assigns a unique sequential number, even when values tie, which makes it perfect for “top N per group” problems. Here we get each customer’s three largest orders:

			
SELECT *
FROM (
  SELECT
    customer_id,
    order_id,
    amount,
    ROW_NUMBER() OVER (
      PARTITION BY customer_id
      ORDER BY amount DESC
    ) AS rank_by_amount
  FROM `dataset.orders`
)
WHERE rank_by_amount <= 3

		

The window numbers each customer’s orders from largest to smallest, restarting at one for each customer. The outer query then keeps only the rows numbered three or below. This pattern, numbering inside a subquery and filtering outside it, is one of the most useful techniques in all of SQL.

Comparing Rows with LAG and LEAD

Value accessors let a row read data from other rows. LAG() reaches backwards to a previous row, and LEAD() reaches forwards. This makes period-over-period comparisons easy:

			
SELECT
  date,
  sales,
  LAG(sales) OVER (ORDER BY date) AS yesterday_sales,
  sales - LAG(sales) OVER (ORDER BY date) AS daily_change
FROM `dataset.daily_sales`

		

LAG(sales) pulls in the previous day’s sales figure onto the current row, and subtracting it from today’s sales gives the day-on-day change. Achieving this without window functions would require an awkward self-join. Here it is a single, readable line.

First and Last Values in a Window

FIRST_VALUE() and LAST_VALUE() return the first or last value within a window, which is useful for things like a customer’s first and most recent purchase:

			
SELECT
  customer_id,
  date,
  sales,
  FIRST_VALUE(sales) OVER (
    PARTITION BY customer_id ORDER BY date
  ) AS first_sale,
  LAST_VALUE(sales) OVER (
    PARTITION BY customer_id ORDER BY date
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) AS last_sale
FROM `dataset.customer_sales`

		

There is an important subtlety here. FIRST_VALUE() works as expected with the default frame, but LAST_VALUE() does not. By default, the window frame only extends to the current row, so LAST_VALUE() would just return the current row’s value rather than the true last value in the partition. That is why the explicit ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING is added: it widens the frame to cover every row in the window, so LAST_VALUE() genuinely returns the final value. This is one of the most common window function gotchas, so it is worth remembering.

Window Functions Versus GROUP BY

It helps to be clear about when each tool applies. GROUP BY aggregates by collapsing rows into one summary row per group, which is exactly what you want when you only need the totals. Window functions aggregate per row while keeping the full row count, which is what you want when you need the detail and the calculation side by side.

Beyond aggregation, window functions do things GROUP BY cannot do directly at all: ranking and ordering within groups, and comparative analytics like comparing a row to the one before it. And critically, window functions retain the underlying detail of your data, where GROUP BY discards it in favour of the summary. The short version: reach for GROUP BY when you want summaries, and window functions when you want per-row calculations that preserve the original rows.

Keeping Window Functions Efficient

Because window functions operate over sets of rows, a few habits keep them performant in BigQuery. Partition thoughtfully, since well-chosen partitions limit how much data each window has to process. Filter early to reduce the volume of rows the functions work across. Avoid unnecessarily large windows by using partitions to keep each window focused. And select only the columns you actually need, since scanning fewer columns directly lowers your query cost.

Common Pitfalls

A few mistakes come up repeatedly. Forgetting PARTITION BY when you meant to calculate per group means the function runs across the entire table instead, so add the partition to scope it correctly. An incorrect ORDER BY produces the wrong running totals or rankings, so check that the ordering logic matches your intent. And unintended ROWS or RANGEbehaviour, like the LAST_VALUE() frame issue above, can give surprising results, so either rely on the default deliberately or define the frame explicitly when it matters.

The Takeaway

Window functions let you calculate across a set of rows without collapsing them, filling the gap that GROUP BY leaves behind. Use the OVER clause to define the window, PARTITION BY to split it into groups, and ORDER BY to sequence rows for running totals and rankings. Aggregations give you running calculations, ranking functions power leaderboards and top-N queries, and value accessors like LAG() and LEAD() make period-over-period comparisons trivial. Keep an eye on the LAST_VALUE() frame quirk, partition thoughtfully for performance, and you have one of the most powerful tools in BigQuery for analytical SQL.

See you soon.

Reporting Analytics

Andrei

July 18, 2026

5 min read

View Comments (7)

Recommended for You

Speeding Up BigQuery with Materialized Views

Materialized views in BigQuery cache precomputed query results, enhancing performance and reducing costs by avoiding repetitive data scans. They refresh automatically, making them ideal for frequent aggregations in dashboards.

July 11, 2026

6 min read

Flattening Nested Data in BigQuery with UNNEST()

BigQuery's UNNEST() function flattens nested data structures like arrays and structs into standard rows, enabling easier querying and analysis of complex datasets often encountered in real-world applications.

July 29, 2026

6 min read

Python Packaging Cheatsheet

Turning Your Python Code into a Real Package: 10 Code-Along Examples

Martech Engineer Roadmap

CSS Display Property Cheatsheet

Understanding Window Functions in BigQuery

What a Window Function Does

The Basic Syntax

The Three Families of Window Functions

Running Totals with Aggregations

Restarting the Calculation per Group

Ranking Rows

Top N per Group with ROW_NUMBER()

Comparing Rows with LAG and LEAD

First and Last Values in a Window

Window Functions Versus GROUP BY

Keeping Window Functions Efficient

Common Pitfalls

The Takeaway

Related

Leave a ReplyCancel reply

Recommended for You

Speeding Up BigQuery with Materialized Views

Flattening Nested Data in BigQuery with UNNEST()

Python Packaging Cheatsheet

Turning Your Python Code into a Real Package: 10 Code-Along Examples

Martech Engineer Roadmap

CSS Display Property Cheatsheet

Understanding Window Functions in BigQuery

What a Window Function Does

The Basic Syntax

The Three Families of Window Functions

Running Totals with Aggregations

Restarting the Calculation per Group

Ranking Rows

Top N per Group with ROW_NUMBER()

Comparing Rows with LAG and LEAD

First and Last Values in a Window

Window Functions Versus GROUP BY

Keeping Window Functions Efficient

Common Pitfalls

The Takeaway

Related

Leave a ReplyCancel reply

Subscribe to My Newsletter

Recommended for You

Speeding Up BigQuery with Materialized Views

Flattening Nested Data in BigQuery with UNNEST()

Discover more from Discuss Data Science, Machine Learning and Analytics