Hacker News new | past | comments | ask | show | jobs | submit login

I'm heading in the direction which is engines/tooling and I'm amazed at the number of corners of SQL that I have discovered that I don't know properly. You can use it fine for decades but when you need to know precisely what is acceptable to the GROUP BY/HAVING clauseS... then you suddenly realise you don't. It's quite surprising.

it would be useful to have that standard. Thanks, will have a hunt.




Are you at-least familiar with the concept of relational-division?


Very. Why?


In that case I'm curious what it is about GROUP BY and HAVING that you recently learned - everyone I know that groks division knows their RDBMS-of-choice's SQL dialect inside out - it's like it's impossible to learn one without the other.


I use both in SQL regularly and without any problem, done so for decades. The problem is if you're actually implementing something then those rules you informally understand suddenly start to look less obvious when you try to write them down. Even SQLite got it wrong:

.

6. Aggregate Queries Can Contain Non-Aggregate Result Columns That Are Not In The GROUP BY Clause

For example to find the highest paid employee:

SELECT max(salary), first_name, last_name FROM employee;

.

https://www.sqlite.org/quirks.html


ISO SQL permits aggregate queries returning (“plain ol’ data”) non-aggregate/non-grouping-key columns - but only when the engine can prove a functional-dependency from the grouping-keys - this isn’t exactly the same thing as what SQLite supports (and MySQL 4-5’s weird behaviour was disabled by default in v8) - but Postgres supports it now too.

I don’t know exactly where in the spec it’s defined, though - but I’ve seen people say it was introduced in SQL99.


Oh, that's very interesting about the functional dependencies. I'd certainly no idea about that. Interesting discussion, thanks!


SQLite would describe this as a feature, because it's a PITA do that type of query otherwise - window functions with rank() or row_number(), or a self-join. (Obviously this specific query could be done with ORDER BY salary DESC LIMIT 1 but if you want the highest paid employee per team or department etc...)


> but if you want the highest paid employee per team or department

That's straightforward, no?

   max(salary) over(partition by emp, dept order by salary desc) as top_emp
   ... 
   where top_emp = 1
(You need to put the window in a subquery)




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: