Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

NULL isn't a Boolean value in ternary logic any more than 3.2 or 'Hello' or December 12, 2018, have Boolean values. It's UNKNOWN. UNKNOWN is related to NULL, but they don't work identically.

NULL is a value that any column data type can potentially have. NULL is what comparison and evaluation operators work with. UNKNOWN is a ternary Boolean type, and the Boolean type is what Boolean operators work with (AND, OR, NOT) and nothing else. This Boolean type in an RDBMS is unavailable to the user and is for internal evaluation purposes only. RDBMSs that support a "bool" type are not implementing the same thing. You can never say UPDATE MyTable SET Col = Value1 AND Value2. That's not going to work. Many RDBMSs have a documentation page that explains this difference, like this one[0] from Microsoft SQL Server.

Notably, NULL + 3 and NULL * 5 are both NULL. Any mathematic operation on NULL is NULL. But UNKNOWN AND FALSE is FALSE, and UNKNOWN OR TRUE is TRUE.

[0]: https://docs.microsoft.com/en-us/sql/t-sql/language-elements...



NULL is an alias for UNKNOWN on many systems (like MySQL.) Other DBs don't even have UNKNOWN.

UPDATE table set col=value1 and value2 works fine IF value1 and value2 are booleans.


That's a great example of MySQL creating a proprietary extension of ANSI SQL that does little more than deliberately mislead users.


According to https://en.wikipedia.org/wiki/Null_%28SQL%29#BOOLEAN_data_ty... NULL is the same as UNKNOWN. The standard also asserts that NULL and UNKNOWN "may be used interchangeably to mean exactly the same thing"

In 20+ years of DB work, I have NEVER seen anyone use UNKNOWN. It is always NULL. Always.


Alright, I will withdraw my criticism of MySQL on this issue.

However....

> In 20+ years of DB work, I have NEVER seen anyone use UNKNOWN.

I mean, I've already shown where Microsoft does just that [0]. Oracle pretty clearly does the same [1] [2]. People don't use it because you can almost never refer to it directly. The language intentionally hides it. About the only place I know that you can is PostgreSQL [3], which supports the "boolean_expression IS UNKNOWN" predicate.

> The standard also asserts

I assume you've got the 2003 draft standard that's around [4]. I will use that because I don't see any more recent version of 9075-2 that's freely available.

Yes, the standard does say under 4.5 Boolean types:

> This specification does not make a distinction between the null value of the boolean data type and the truth value Unknown that is the result of an SQL <predicate>, <search condition>, or <boolean value expression>; they may be used interchangeably to mean exactly the same thing.

However, that's in the context of describing the Boolean user data type, a.k.a., BOOLEAN. You can tell because 4.2 describes character strings (CHAR, VARCHAR, etc), 4.3 describes binary strings, 4.4 describes the numeric data type, 4.6 describes DATETIME, and 4.7 describes user-defined types.

The standard is not saying that UNKNOWN and NULL are the same. It's saying that the Boolean user data type can use NULL to represent UNKNOWN. It's saying that if you choose implement a BOOLEAN user data type, you can use NULL to represent UNKNOWN. If you choose to assign a boolean expression to a column, that is. Nevertheless, an SQL <predicate>, <search condition>, or <boolean value expression> has a value of True, False, or Unknown. This shown by looking at 6.34 <boolean value expression>:

  <truth value> ::=
      TRUE
      | FALSE
      | UNKNOWN
Or by searching section 8 and seeing where every time they talk about one of the value expressions being the null value, then the predicate "is Unknown".

[0]: https://docs.microsoft.com/en-us/sql/t-sql/language-elements...

[1]: https://docs.oracle.com/cd/B19306_01/server.102/b14200/condi...

[2]: https://docs.oracle.com/cd/B19306_01/server.102/b14200/sql_e...

[3]: https://www.postgresql.org/docs/11/functions-comparison.html

[4]: http://www.wiscorp.com/sql_2003_standard.zip

Edit: Bit of cleanup.


Ok, I will concede you are technically correct! However, I never seen a developer use "is unknown", even with Postgres. (I have been working with Postgres for over 15 years.) They always use "is null", which is, for all intents and purposes, the same thing from a developer perspective.


I've only seen it once that I can think of, and I don't remember where. It might've been an example when they added or explained that predicate. I recall something like (Column1 = Column2) IS NOT UNKNOWN, but I don't know why you wouldn't use Column1 IS NOT NULL AND Column2 IS NOT NULL instead. I guess it might save a bit of rewriting, but it still seems pretty narrow.

It's really not useful unless you're talking about the value of a boolean expression or the underlying concepts of SQL, and most RDBMSs don't let you manipulate that directly with DML (MySQL is the first one I've seen that let you do it, and you just taught me that was the case). It's somewhat hidden because of that.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: