User-Defined Functions in BigQuery: 10 Code-Along Examples

Learn BigQuery UDFs by running them. Ten copy-and-run examples covering scalar SQL functions, parameters, CASE logic, JavaScript UDFs, JSON parsing, applying functions across tables, persistent functions, and a parameterised table function.

The UDFs guide explains the idea: when the same calculation keeps reappearing across your queries, wrap it in a function and call it by name. This workbook builds that skill from the ground up. You will write scalar SQL functions, add parameters and conditional logic, switch to JavaScript for the jobs SQL handles awkwardly, apply a function across a table, save a persistent function for your team, and finish with a table function that returns rows. Run each query, read the result, then change a value and run it again.

1. Your first SQL UDF

A scalar SQL UDF takes typed inputs, returns a typed value, and wraps a single expression in parentheses. CREATE TEMP FUNCTION makes it available for the current query or session only, which is the safest way to experiment.

CREATE TEMP FUNCTION double_value(x FLOAT64)
RETURNS FLOAT64
AS (x * 2);
SELECT double_value(21) AS result; -- 42.0

The function declares one parameter, x FLOAT64, and a return type, FLOAT64, then the body is just the expression x * 2. Declaring types explicitly is a best practice the guide stresses: it documents intent and lets BigQuery catch misuse early.

2. Multiple parameters

UDFs can take several inputs. Here apply_discount accepts a price and a discount rate and returns the reduced price, rounded to two decimal places.

CREATE TEMP FUNCTION apply_discount(price FLOAT64, discount FLOAT64)
RETURNS FLOAT64
AS (ROUND(price * (1 - discount), 2));
SELECT apply_discount(100, 0.2) AS final_price; -- 80.0

You can call any built-in function inside a UDF body, as ROUND shows here. Wrapping this small piece of pricing logic in one named function means every query that needs a discounted price stays consistent and reads clearly.

3. Conditional logic with CASE

A SQL UDF body can be a full CASE expression, which is how you encode categorisation rules once and reuse them everywhere. This categorize_order function turns an amount into a size label.

CREATE TEMP FUNCTION categorize_order(amount FLOAT64)
RETURNS STRING
AS (
CASE
WHEN amount >= 200 THEN 'large'
WHEN amount >= 50 THEN 'medium'
ELSE 'small'
END
);
SELECT
categorize_order(250) AS a, -- large
categorize_order(75) AS b, -- medium
categorize_order(10) AS c; -- small

The thresholds now live in exactly one place. If the business changes what counts as a large order, you edit the function rather than hunting down every CASE scattered across your queries.

4. A SQL UDF built from built-ins

SQL UDFs can do more than arithmetic. This one derives a person’s initials by splitting their name, taking the first letter of each part, and joining them, using a subquery over UNNEST inside the function body.

CREATE TEMP FUNCTION initials(full_name STRING)
RETURNS STRING
AS (
(SELECT STRING_AGG(UPPER(SUBSTR(part, 1, 1)), '')
FROM UNNEST(SPLIT(full_name, ' ')) AS part)
);
SELECT initials('Ada Lovelace') AS inits; -- AL

The parentheses around the inner SELECT make it a scalar subquery, which a SQL UDF body allows. This shows how much you can express in pure SQL before reaching for another language, which the guide recommends you do whenever possible.

5. Your first JavaScript UDF

Some logic is simpler in a procedural language. Adding LANGUAGE js lets you write the body in JavaScript, passed as a string. Here reverse_text reverses a string, which is fiddly in SQL but trivial in JavaScript.

CREATE TEMP FUNCTION reverse_text(s STRING)
RETURNS STRING
LANGUAGE js
AS r"""
return s.split('').reverse().join('');
""";
SELECT reverse_text('BigQuery') AS reversed; -- yreuQgiB

The r"""...""" is a raw triple-quoted string, which keeps backslashes and quotes intact, ideal for JavaScript bodies. The SQL parameter s STRING arrives in the JavaScript as a normal variable, and whatever you return becomes the function’s result.

6. Parsing JSON in JavaScript

JavaScript UDFs shine at JSON. When a column holds a JSON string, JSON.parse turns it into an object you can navigate with dot notation to pull out a nested value.

CREATE TEMP FUNCTION get_city(payload STRING)
RETURNS STRING
LANGUAGE js
AS r"""
const data = JSON.parse(payload);
return data.address.city;
""";
SELECT get_city('{"name":"Ava","address":{"city":"London"}}') AS city; -- London

Reaching data.address.city in JavaScript is far cleaner than chaining SQL JSON functions for deeply nested fields. This is exactly the kind of procedural task where stepping outside SQL pays off.

7. String logic in JavaScript

JavaScript’s string and regex methods make text transformations easy. This title_case function capitalises the first letter of every word, something that would take several nested SQL calls.

CREATE TEMP FUNCTION title_case(s STRING)
RETURNS STRING
LANGUAGE js
AS r"""
return s.replace(/\w\S*/g, function (word) {
return word.charAt(0).toUpperCase() + word.substr(1).toLowerCase();
});
""";
SELECT title_case('hello there general kenobi') AS titled;
-- Hello There General Kenobi

The regular expression matches each word, and the callback upper-cases its first character and lower-cases the rest. Keep in mind the guide’s advice though: JavaScript UDFs are slower than SQL ones because each call crosses into a JavaScript runtime, so prefer SQL unless the logic genuinely needs JavaScript.

8. Applying a UDF across a table

A UDF is most useful run over many rows. Define the function, then call it in the SELECT against a table. Here we reuse categorize_order from Example 3 over a small set of sample orders.

CREATE TEMP FUNCTION categorize_order(amount FLOAT64)
RETURNS STRING
AS (
CASE
WHEN amount >= 200 THEN 'large'
WHEN amount >= 50 THEN 'medium'
ELSE 'small'
END
);
WITH orders AS (
SELECT * FROM UNNEST([
STRUCT(1 AS order_id, 250.0 AS amount),
STRUCT(2, 75.0),
STRUCT(3, 30.0)
])
)
SELECT order_id, amount, categorize_order(amount) AS size
FROM orders
ORDER BY order_id;

Each row gets its label from the same function, so order 1 is large, order 2 medium, and order 3 small. Calling a UDF in a query reads just like calling a built-in, which is the whole point: your custom logic becomes a first-class part of the SQL.

9. A persistent function for reuse

Temporary functions vanish when the query ends. To share a function across queries and across your team, create a persistent one, which is stored in a dataset. Change scratch to any dataset you can write to.

-- Stored permanently in a dataset; reusable by anyone with access
CREATE OR REPLACE FUNCTION scratch.apply_vat(net FLOAT64)
RETURNS FLOAT64
AS (ROUND(net * 1.20, 2));
-- Call it like any built-in, qualified by its dataset
SELECT scratch.apply_vat(100) AS gross; -- 120.0

Once created, scratch.apply_vat is callable from any query until you drop it, with no need to redefine it each time. The guide recommends persistent functions precisely for team-wide reuse: a single canonical definition of a calculation that everyone calls the same way.

10. A table function that returns rows

Scalar UDFs return one value; a table function (a TVF) returns a whole table. It takes scalar arguments and its body is a query, so you can parameterise a reusable result set. This one returns orders at or above a threshold you pass in.

CREATE TEMP TABLE FUNCTION orders_above(min_amount FLOAT64) AS (
SELECT order_id, amount
FROM UNNEST([
STRUCT(1 AS order_id, 40.0 AS amount),
STRUCT(2, 120.0),
STRUCT(3, 250.0)
])
WHERE amount >= min_amount
);
SELECT * FROM orders_above(100) ORDER BY amount; -- orders 2 and 3

You call a table function in the FROM clause, exactly where a table would go, passing the argument in parentheses. This lets you package a parameterised query once and reuse it like a virtual, configurable table.

Work through these and you will have covered the whole article: scalar SQL UDFs with one or many parameters, conditional logic, JavaScript UDFs for strings and JSON, applying functions across tables, persistent functions for sharing, and table functions for returning rows. The guidelines worth keeping are the ones the guide closes on: keep each function small and focused, declare your parameter and return types explicitly, prefer SQL over JavaScript unless the task demands otherwise, and promote a function to persistent once the whole team relies on it. Used that way, UDFs turn repeated logic into a small, reliable vocabulary your queries can share.

See you soon.

View Comments (2)

Leave a Reply

Prev Next

Subscribe to My Newsletter

Subscribe to my email newsletter to get the latest posts delivered right to your email. Pure inspiration, zero spam.

Discover more from Discuss Data Science, Machine Learning and Analytics

Subscribe now to keep reading and get access to the full archive.

Continue reading