I would like to programmatically handle NULL values in a database.
If I add or multiply a known value to a NULL, the result is still NULL. This is an appropriate result as a NULL value in a database schema means “unknown” or “nonexistent”, so mathematic functions against known values and NULL should produce NULL.
That said, if I want a sum of line-items I could filter out NULL values, e.g.,
SQL> SELECT SUM(col) FROM foo WHERE col IS NOT NULL;
However, this could get difficult to manage in larger queries.
In Oracle, we can use NVL to get the above result
SELECT SUM(NVL(col,0)) FROM foo;
In MySQL and SQLite, we can use IFNULL
SELECT SUM(IFNULL(col,0)) FROM foo;
In Postgres we can use NULLIF
SELECT SUM(NULLIF(col,0)) FROM foo;
And in Microsoft SQL Server we can use ISNULL
SELECT SUM(ISNULL(col,0)) FROM foo;
While I prefer NVL for it’s conciseness, I recommend COALESCE as it works in each of the above database engines. COALESCE accepts two or more parameters and simply returns the first non-null parameter, e.g.,
SELECT SUM(COALESCE(col,0)) FROM foo;
Accepting multiple parameters can be very useful in cases where we’re aggregating otherwise mutually exclusive columns, e.g.,
SELECT SUM(COALESCE(eggs.weight, spam.weight, 0)) AS lonely_weight FROM eggs FULL OUTER JOIN spam ON eggs.id = spam.id WHERE eggs.id IS NULL OR spam.id IS NULL;
In the above case, it’s the sum weight of all eggs without spam, and spam without eggs.