„Ternary boolean logic“ in SQL – making NULL useful
Typically, SQL's NULL is used as „unknown value“ or „Not
Applicable“.
For example, you have a table of users who may provide their e-mail, but if
they don't, you store NULL to the e-mail column. That's
„uknown“.
Other usage example is a column with last user's forum post. But before his first post, you store NULL there to mark there's no post yet.
That's the basic NULL usage. But it can provide you much more – mainly
for complex SELECTs used for searching or for reports.
NULL features and performance
On most enterprise-ready RDBMS's, NULL can be indexed.
On some, it also has some special features like MySQL's PRIMARY KEY
AUTO_INCREMENT or TIMESTAMP's auto-updating feature.
Storage space for a NULLable column is usually one bit, but
these bits are grouped in a byte-aligned field per row, which means, 1 to
8 null columns will occupy 1 byte per row, 9–16 will take
2 bytes, etc.
SELECTing the rows (not) having a NULL value is
usually very quick, because this information is part of the index; more, the
database system does not have to sort them internally in a tree index, like it
has with concrete values (e.g. numeric ID's).
Basic NULL constructs in SQL
SELECT a IS NULLSELECT a IS NOT NULLSELECT ISNULL(a)SELECT IFNULL(a, 'default')returns'default'ifaisNULL.SELECT COALESCE(a, b, c)returns the first non-NULLvalue.
NULL semantics
Null has these characteristics:
Most operations with null result in a NULL.
'' + NULL⇒NULL0 + NULL⇒NULLNOT NULL⇒NULL
Comparison
Comparison always returns NULL, even for NULL =
NULL. You have to use IS NULL.
'' = NULL⇒NULL0 = NULL⇒NULLNULL = NULL⇒NULL
There's also a null-safe comparison operator, <=>, which
behaves like this:
1 <=> 1⇒1NULL <=> NULL⇒11 <=> NULL⇒0
There are some NULL-related functions, which provide you a
convenient way when you need to work with multiple nullable values:
COALESCE()IFNULL()NULLIF()
Boolean logic
Boolean logic is specific. Here, NULL is considered as meaning „Unknown value“.
NULL AND 1 ⇒ NULL NULL AND 0 ⇒
0 NULL OR 1 ⇒ 1 NULL OR 0
⇒ NULL NULL XOR 1 ⇒ NULL NULL
XOR 0 ⇒ NULL
This makes sence, because with AND, you can tell the expression
will be false if one of operands is false. However, if one operand is true,
it's the other one which determines the result; so if it's unknown, the result
is unknown as well.
Similarly it works with OR. For XOR, it's always
NULL, as both operands are always needed.
Using NULL features to your advantage
Without learning the basics above, the NULL behavior might seem strange to
you and can lead to an unexpected behavior. But if you keep that in mind, you
can make your complex SELECTs less complex by using some of the
following principles.
To be written later.