Hacker News new | past | comments | ask | show | jobs | submit login

"you cannot do everything related to data in SQL. In some cases, you need a traditional programming language to work with the data"

Them's fighting words :)

But point taken regarding the benefit of modern language tools applied to query-based work.




I inherited a graphql project that was a work of art in that respect. All access control etc was done in the database and triggers were effectively used the works.

The resulting application was a complete performance failure. Databases are slow and generally involve latency. They are a source of truth and an engine for querying.

They are not meant to be your applications working data model.


Especially since its literally untrue. SQL, with the addition of recursive queries (supported by all since the late 90s AND used in an example within this very article), is Turing complete. There's literally no operation on data that can't in principle be expressed as an SQL query (albeit the construction might be painful and tedious).


Do a dynamic pivot (I'll wait.)

But in all seriousness, DataFrame centric operations are a superset of SQL (you can always do a df.sql("...") if you want to ) and have a lot more efficient implementation of both OLTP/ORM requirements and OLAP/DS/BI requirements.

They also encourage composability, modularization, reuse, unit and data testing ...

So it's ironic I feel like SQL's replacement is another declarative language - its original inspiration - plain English.

Just a natural language transpiler (like Palantir's Ontology plus Looker's Malloy (reverse disclaimer: I do not work for or enjoy either of these products but these underlying concepts are correct)) with some fancy domain heuristics and light AI (I suspect a Pareto like model that supports 80% of use cases only needs a semantic graph with a few thousand nodes and vertices)


> There's literally no operation on data that can't in principle be expressed as an SQL query (albeit the construction might be painful and tedious).

There are some problems though:

- Single query can't write to multiple tables.

- Single query can't return multiple resultsets.

You can write a stored procedure, but this is no longer "an SQL query", strictly speaking. And once you start writing stored procedures, you are no longer using just SQL, but whatever Ada-inspired procedural extension to SQL was implemented by the database vendor. In other words, you are using "a traditional programming language to work with the data".


A single query can write to multiple tables, using CTE's in PostgreSQL for example.

You could compose a SQL query that allows you to map multiple resultsets to 1 resultset, although that feels a bit awkward.

    WITH a AS (
        insert into a (k, v) values ('a', 1.0) returning *
    ), b AS (
        insert into b (k, v) values ('b', 2.0) returning *
    )
    SELECT
        row_to_json(a)
    FROM
        a
    UNION ALL
    SELECT
        row_to_json(b)
    FROM
        b;
Returns:

        row_to_json        
    --------------------------
    {"a_id":1,"k":"a","v":1}
    {"b_id":1,"k":"b","v":2}
    (2 rows)


That is eye-opening but - if it actually works and I find it hard to believe - is no way ...

ah, the insert is a CTE because it produces a value ('returning' I guess). Hmm. This is very odd. Doesn't seem to work in mssql.

Well thanks for the can of worms...


mssql CTE support is very very basic, to the point of being not very useful.


What the hell are you talking about?


You can only select, you can't nest cte, the query planner has no understanding of joins that cross cte boundaries so they are totally unoptimized, you can't use distinct or group by or etcetc. Really the CTE implementation in SqlServer is basically a parser level hack.


> You can only select

   with x as (...)
   update x set ...
> you can't nest cte

ok but you can linearise them

   with x as (...), y as (...)
> the query planner has no understanding of joins that cross cte boundaries so they are totally unoptimized

utter, reeking garbage.

> you can't use distinct or group

more garbage. I have. Show me an example of it not working.

(Edited for less rudeness)


so you can do this?

  WITH t AS (
      DELETE FROM foo 
  )
  DELETE FROM bar;
Or can you only use SELECT for WITH queries? Did you not realize that other databases and the SQL standard allow you to do this?

Yes on the final query of the CTE you can do all sorts of things, but that's way less useful if you can't do them in all the component queries.


Show me any sql dialect that will allow that CTE you give here (edit; and what on earth is that supposed to actually mean)

> Or can you only use SELECT for WITH queries

you can only use a select inside a cte (or should be able to) because the 'e' stands for 'expression'. It seems postgres does allow an insert with and output which sort of makes sense but I doubt it's in the standard.

Your last sentence makes no sense to me. Give an example.

Also you failed to give an example that group by/distiinct weren't allowed in ctes.


Good to know, I wasn't aware PostgreSQL supports this.

I'm currently on SQL Server and it doesn't support INSERT as a CTE (and I think most DBMSes out there still don't). It would definitely make my life easier if it did...


It's not the only DB or way of doing it either. Snowflake supports multi-table inserts for example:

https://docs.snowflake.com/en/sql-reference/sql/insert-multi...


> There's literally no operation on data that can't in principle be expressed as an SQL query

That may be technically true (or not, I don't know), but in many cases some complex data manipulation (especially when it's done in multiple passes) practically needs a lot of ram and a programming language to be more time efficient.



Technically true, but sql has the ability to induce discipline - every single time I’ve seen folks given the opportunity to do the analysis off a database or warehouse and use pandas/spark, they almost always end up writing bad code, processing one row or cell at a time. Almost always, they could have done it with sql itself and gotten much better performance but this “extra flexibility” allowed them to be lazier.

Sql can also have the same issue with feature creep (snowflakes array columns often invites people to write pathetic sql trying to join on this col for example) but with the right design you can force your users to interact with your data more efficiently.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: