Are the feature comparison tables in this article supposed to reflect the state of SQL databases as of several years ago?
I've been using Postgres very consistently for the past half decade (as a dev, not a DBA) and it's had a `GROUP BY` and `WHERE` clause for at least as long.
I'm not totally sure what "base table `UNIQUE`" is supposed to mean in this context, but if it's just the bare `UNIQUE` constraint, Postgres has also had that for a long, long while.
The table is about dependency checking for those constructs.
E.g.:

SELECT countries.code, countries.name, count(*)
FROM countries
JOIN countries_languages AS co_lang
  ON co_lang.country_code = countries.code
GROUP BY countries.code

This isn't supported by Postgres, because countries.name is not in the GROUP BY clause. On the other hand, it does make sense: code is unique in the countries table, so there will always be only one name for a given code.
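To make the distinction concrete, here's a minimal sketch (schema hypothetical): PostgreSQL only recognizes functional dependencies on a table's primary key, so a column that is merely `UNIQUE NOT NULL` doesn't help, even though the dependency logically holds:

```sql
-- Hypothetical schema: code is UNIQUE NOT NULL, but not the primary key.
CREATE TABLE countries (
    id   int PRIMARY KEY,
    code char(2) UNIQUE NOT NULL,
    name text NOT NULL
);

-- PostgreSQL rejects this with:
--   ERROR:  column "countries.name" must appear in the GROUP BY clause
--   or be used in an aggregate function
-- even though the UNIQUE constraint on code guarantees one name per code.
SELECT code, name, count(*)
FROM countries
GROUP BY code;
```

Grouping by the primary key (here `id`) would be accepted, which is presumably the "base table `UNIQUE`" vs primary-key distinction the matrix is drawing.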
And as the linked article you reference specifically notes, functional dependencies are a feature of SQL-99.
This aspect of the original article is really about how MySQL decided to implement a more up-to-date version of the spec than many competitors did when it finally addressed its deficiency (as well it should; finally becoming SQL-92 compliant in the 2010s wouldn't really be anything to boast about).
That's a fairly obtuse example. Is there a standard you're drawing from, for this? Because this example works fine:
postgres=# create table func_test ( id int primary key, data text );
CREATE TABLE
postgres=# insert into func_test values ( 1, 'josh' ), ( 2, 'markus' );
INSERT 0 2
postgres=# select id, data from func_test group by id;
id | data
----+--------
2 | markus
1 | josh
> 4.24.13 Known functional dependencies in the result of a <group by clause>
Note that this is about the result of a <group by clause>.
edit: The matrix I have in the article is not complete. I've just picked a few examples from the standard and checked them to get an overview. I guess the fact that PostgreSQL supports some of them but MySQL supports more of them is well represented in the matrix.
The full text is below. I'm happy to correct it if I'm interpreting this wrong (which is perfectly possible).
> Let T1 be the table that is the operand of the <group by clause>, and let R be the result of the <group by clause>. Let G be the set of columns specified by the <grouping column reference list> of the <group by clause>, after applying all syntactic transformations to eliminate ROLLUP, CUBE, and GROUPING SETS.
> The columns of R are the columns of G, with an additional column CI, whose value in any particular row of R somehow denotes the subset of rows of T1 that is associated with the combined value of the columns of G in that row.
> If every element of G is a column reference to a known not null column, then G is a BUC-set of R. If G is a subset of a BPK-set of columns of T1, then G is a BPK-set of R.
> G ↦ CI is a known functional dependency in R.
Hmmm. PostgreSQL treats output rowsets as their own thing, so this works differently in a subquery than in a top-level query. Not that that's desirable; it's just a consequence of an implementation that gives a lot more flexibility in other areas (table functions, for example, where Postgres goes well beyond the standard). So looking at the standard, I agree that having your example work is superior to not having it work. I'd appreciate a footnote for that particular item for Postgres, though, since the most common forms of GROUP BY dependencies do work.
I’m constantly flabbergasted by what’s considered novel in the world of MySQL.
But then I remember this is the database that lied (lies?) about ACID transactions, and had no problem storing non-existent dates in a DATETIME column because its creators considered ensuring data integrity and consistency “the job of the application layer” (which back then was often PHP).
> I’m constantly flabbergasted by what’s considered novel in the world of MySQL.
What's that in reference to? The article explains that MySQL implemented group by using a newer standard than many competitors did, by looking at functional dependencies. Not exactly novel, but good for MySQL, they decided to go beyond the competition when finally catching up on this particular feature.
There are, of course, many other problems with MySQL (as you allude to), such as silently ignoring CHECK constraints, but that doesn't seem to make sense as what you were referring to in context of what they consider novel.
> this is the database that lied (lies?) about ACID transactions
Time moves on. The InnoDB storage engine has had ACID transactions for over a decade.
> no problems storing non-existent dates in a DateTime-column
PostgreSQL checks date formats, but updating (or not updating) timezone databases would affect validity. So it's not as simple an issue as you make it out to be.
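For what it's worth, the calendar-validity part is easy to check (the PostgreSQL output is what a recent version prints; MySQL's behaviour depends on sql_mode):

```sql
-- PostgreSQL rejects impossible calendar dates outright:
SELECT '2021-02-30'::date;
-- ERROR:  date/time field value out of range: "2021-02-30"

-- Historic MySQL (without a strict sql_mode) would silently coerce
-- such values, e.g. storing the zero date '0000-00-00' in a
-- DATETIME column instead of raising an error.
```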
Yeah, later on in the article it has an "X" for PostgreSQL for "JSON_OBJECTAGG" and "JSON_ARRAYAGG" and I'm 100% sure postgres has the former (I used it like 10 minutes ago), and like 99% sure it has the latter.
The table is very strict, it requires that the actual standard be supported (possibly incompletely). Postgres has ways to aggregate into a json object or array, but they're non-standard. If you write standard SQL making use of those features, it will not work.
I haven't done any exact test, but it looks identical to how the json_object_agg function in Postgres works, just with a different syntax.
Is it really that different to not even warrant a note in the table?
Clearly you know better than I do, but the big red X surprised me as the fantastic JSON support is one of the reasons why I tend to reach for Postgres so often over other DB engines.
A big red X doesn't mean the database doesn't have other, similar but proprietary functionality. It just means: it doesn't support this function in a standard-conforming way yet.
There is nothing against using non-standard functionality in that case. In this particular case, PostgreSQL got JSON support (~2012?) long before it was added to the standard (2016). Obviously, their JSON functions differ from the standard (as they didn't lobby their functions into the standard).
However, slowly but surely databases will add the standard functionality, which makes life easier.
That said, the standard JSON_OBJECTAGG is more powerful, as it gives you control over null handling (NULL ON NULL vs. ABSENT ON NULL) and allows checking for duplicate keys ((WITH|WITHOUT) UNIQUE KEYS).
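A rough side-by-side, assuming a hypothetical table t(k, v) (the first call is the long-standing proprietary Postgres aggregate; the second is the SQL:2016 syntax):

```sql
-- PostgreSQL proprietary aggregate (available since ~9.4):
-- takes just a key and a value expression, no extra clauses.
SELECT json_object_agg(k, v) FROM t;

-- SQL:2016 standard aggregate, with the clauses the proprietary
-- function lacks: drop NULL values instead of emitting them, and
-- raise an error on duplicate keys rather than silently keeping one.
SELECT JSON_OBJECTAGG(KEY k VALUE v ABSENT ON NULL WITH UNIQUE KEYS)
FROM t;
```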
Maybe add some little footnotes to the X? It took me a few minutes of thinking "of course Postgres has JSON object aggregation, it's working perfectly and it's documented" before I realised that sure, it exists, but it's not the standard version at all. That's probably why it's marked as a no: it's not Troels' page, and Markus really, really cares about the standard.
That part of the article doesn't make much sense. It sounds like MySQL finally enabled rational GROUP BYs instead of randomly choosing a row when the grouped rows aren't distinct. This is something at least PostgreSQL has done for ages.
Nope, the old MySQL was extremely stupid, but the Postgres behaviour wasn't that smart either, because it refused anything not in the GROUP BY clause (even when constraints on the table ensure that everything in your SELECT clause can in fact be determined from the columns used to group).
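Concretely, with a hypothetical table where dept does not determine name, the behaviours differ like this (error text abbreviated):

```sql
CREATE TABLE emp (
    id   int PRIMARY KEY,
    dept text NOT NULL,
    name text NOT NULL
);

-- Old MySQL (without ONLY_FULL_GROUP_BY): accepted, silently
-- returning an arbitrary name for each dept.
-- PostgreSQL and modern MySQL: rejected, since name is neither
-- grouped nor functionally dependent on dept.
SELECT dept, name FROM emp GROUP BY dept;

-- Grouping by the primary key is the case both accept, because
-- id determines every other column of the row:
SELECT id, name FROM emp GROUP BY id;
```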