Imagine you make the same sandwich every single day. Two slices of bread, butter, cheese, a pinch of salt, cut in half. You could describe those five steps out loud every morning, but after a while it would be quicker to just say “make my usual” and have someone know exactly what that means.
A user-defined function in BigQuery is “make my usual” for a calculation. It is a little instruction you write once, give a name, and then call by that name whenever you need it. Instead of typing out the same maths in query after query, you teach BigQuery the recipe once and then just say the recipe’s name.
Here is the simplest possible example. Suppose you keep needing to double a number. You could write x * 2 everywhere, but you can also save it as a function called double_value.
CREATE TEMP FUNCTION double_value(x FLOAT64)RETURNS FLOAT64AS (x * 2);SELECT double_value(21); -- gives 42
Why bother? Two reasons, and they are the same two reasons you would write “make my usual” instead of listing the steps each time.
The first is that you stop repeating yourself. If the same calculation shows up in ten queries and you copy it ten times, then the day the rule changes you have to find and fix all ten. With a function, the rule lives in one place. You change the recipe once and every query that uses it updates automatically.
The second is that your queries become easier to read. A line that says categorize_order(amount) tells you what is happening at a glance. A line with a tangle of nested CASE statements makes you stop and decode it. A good function name is a label on a drawer: you do not have to open it to know what is inside.
There are a few flavours of this idea, but they are all the same trick. Some functions are written in plain SQL, which is the usual choice. Some are written in a bit of JavaScript, which is handy when the job is awkward in SQL, like reversing text or digging a value out of a lump of JSON. Some functions are temporary, meaning they only exist for the one query you are running. Others are permanent, saved into a dataset so your whole team can call the same recipe by the same name forever.
Full article here: https://datalad.co.uk/user-defined-functions-in-bigquery/
10 Code along here: https://datalad.co.uk/user-defined-functions-in-bigquery-10-code-along-examples/
Cheatsheet here: https://datalad.co.uk/bigquery-udfs-cheatsheet/
That is really all a UDF is. Not a clever piece of machinery, just a named recipe. You write the steps once, give them a sensible name, and from then on you ask for the result by name instead of spelling out the steps every time. If you have ever saved a formula in a spreadsheet so you did not have to retype it, you already understand exactly what a BigQuery function is for.
See you soon.