Ask HN: Why GraphQL APIs but no Datalog APIs?
162 points by networked on Dec 8, 2019 | 107 comments
Why doesn't anyone use Datalog (or another limited-by-design logic programming language) as the query language in a public Web API? The adoption of GraphQL suggests demand for letting the API user write more sophisticated queries even at a performance cost per query. Datalog would let the user do more, likely saving many HTTP round-trips. A lot more research exists on optimizing Datalog than GraphQL. It is unclear that the total impact on performance would be significantly negative compared to the typical GraphQL API. If this line of thinking is valid, where are the Datalog API experiments?



For context, I'm a frontend development lead who oversees a number of projects of different shapes and sizes. My personal and very subjective opinion is that it's simply hype. If you say "Datalog" to a frontend developer, they will either hear "obsolete" or just not know what you mean. If you say "GraphQL", they hear "+5 CV points". Obviously that is slightly tongue in cheek, but the marketing side of it is a very real factor.

Facebook currently holds one of the largest levers in frontend development — React. If they say the officially recommended way to do X with React is Y, you can be sure that Y will get at least moderate traction.

I'm sure they exist out in the wild, but personally I haven't seen a project where GraphQL really shines through. In my experience, HTTP2 and reasonably well designed RESTful endpoints are the right default to go with. There is the argument that building good APIs is hard, but I believe someone who has reasonable experience with a stable technology will outperform someone using a new tool for the job. If you're not familiar with it, you won't know what to look out for.


> I'm sure they exist out in the wild, but personally I haven't seen a project where GraphQL really shines through. In my experience, HTTP2 and reasonably well designed RESTful endpoints are the right default to go with. There is the argument that building good APIs is hard, but I believe someone who has reasonable experience with a stable technology will outperform someone using a new tool for the job. If you're not familiar with it, you won't know what to look out for.

GraphQL doesn't make a ton of sense to me, unless you're expecting many clients with significantly different query pattern requirements and prefer to take on performance and (maybe) security uncertainty & complexity to accommodate that more easily.

If you only expect one client (or a few very similar ones) and you want GraphQL-like division of labor just have your frontend and backend folks actually talk to each other and maybe have the backend folks also write a client library for their services (my preference, most of the time).

[EDIT] but you bet your ass I'm not putting up much of a fight against it if I'm in anything lower than a development lead or architect position—no use fighting that fight from poor footing when, as you mention, it's +5 CV points anyway, so who cares if it's maybe not the best solution :-)


> "GraphQL doesn't make a ton of sense to me, unless you're expecting many clients with significantly different query pattern requirements and prefer to take on performance and (maybe) security uncertainty & complexity to accommodate that more easily."

My impression of it is that it's an "API" that's masquerading as a dumb relay, essentially giving almost datastore-level access to the frontend client (even though they don't market it that way). And this is done so that "Frontend devs" can just do "Frontend" things with access to all the data they might end up using. A bit snarky, but I do see it as a way to just "offload" the problem down the line, instead of thinking about the API-usage and what it means for potential frontend clients. That's the "old way" of doing things of course.

It fits into the rest of the "movement" or trend currently happening: thick (web) clients that do everything and the kitchen sink (including auth; see JWT in arbitrary HTTP headers instead of cookies), SPAs so that there is no server-side page rendering (this must all be done in the thick web client, you see), document databases instead of SQL so the "frontend" devs don't have to worry about schemas, SPAs so that we break half the bloody internet because we now need to retrofit browser navigation using hashtags, everything has to be an "API" now because we can't convey data using server rendering, etc.

Honestly, just that navigation thing alone is nuts and I can't believe we ever allowed it to be imposed on us. At this point, I'm afraid to touch those buttons on a new website without first trying them to learn their specific behavior on that site.


>My impression of it is that it's an "API" that's masquerading as a dumb relay, essentially giving almost datastore-level access to the frontend client (even though they don't market it that way). And this is done so that "Frontend devs" can just do "Frontend" things with access to all the data they might end up using. A bit snarky, but I do see it as a way to just "offload" the problem down the line, instead of thinking about the API-usage and what it means for potential frontend clients. That's the "old way" of doing things of course.

Or to take into account that you have no idea what people actually use your API for. Lately I had to write a simple reporting site (simple, as I work with backend and GIS mostly) using an existing API.

I need a fraction of the data sent by the API, but in bulk. It doesn't support that, nor can I change it.

Currently I have to make bulk requests that include filters on the query, then individually request each record, as the bulk API does not return the bloody two extra fields I need.

It turns a single query into N queries, which is slow and annoying, as the server cannot handle being hammered by thousands of REST API calls. Not to mention that I literally need 4 fields out of like 40.

If the app implemented GraphQL, it would be a breeze to just fetch what I need instead of waiting minutes(!).


You can handle proper URLs in a SPA if you know what you're doing. It's not something your average frontend dev would pay attention to, though, sadly. Server-side "isomorphic" rendering of pages is also possible, but resolving all the async requests that have to run to render the page makes it a lot more complicated.


Unless you use GraphQL, of course!


>My impression of it is that it's an "API" that's masquerading as a dumb relay, essentially giving almost datastore-level access to the frontend client (even though they don't market it that way)

Well, that's exactly what GraphQL doesn't do. Each GraphQL endpoint is effectively as powerful as a REST endpoint that takes in a JSON object as a parameter (in other words, the QL in the name is a lie). The primary difference is that you can send a batch of GraphQL queries at once and that you can filter the attributes that are sent back by the endpoint.
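As an illustration of that point (a hedged sketch; the endpoint and schema here are made up), the transport view of a GraphQL call is just an HTTP POST with a JSON body, much like a parameterized REST endpoint:

    // A GraphQL request is an HTTP POST whose JSON body carries the
    // query text and variables. Endpoint and field names are hypothetical.
    async function fetchUserName(): Promise<unknown> {
      const res = await fetch("/graphql", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({
          query: "query($id: ID!) { user(id: $id) { name } }",
          variables: { id: "1" },
        }),
      });
      return (await res.json()).data;
    }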

I will admit though, that there are a lot of GraphQL implementations that are just a thin wrapper over a database but REST had these for a long time as well.


A few weeks ago, I was pondering the pain involved in designing and implementing a REST API for a simple backend.

After much deliberation, it was clear to me that I don't understand the HATEOAS part of REST and that I have been using the URL as a filter and GET or POST parameters as variables for calling some functions on the backend server. And often, for an SPA, I would have to do multiple queries in quick succession.

Then I realised: why can't I simply group those multiple query calls into one giant POST request, with custom codes for querying or mutation, send a POST request with a body that is a DSL I have thought up, and let my backend decode that DSL and send the replies? No longer would I be limited by URL structure, no longer would I have to make multiple query calls to get a complex nested entity.

The only downside is that I have to document it more thoroughly and that my API does not have easy discoverability.

But since I have only one client, the frontend that I program, this is not an issue.

After pondering over the DSL a bit, I realised I am actually conceptualizing GraphQL.
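To make that concrete, here is a minimal sketch (in TypeScript; the endpoint, custom codes, and field lists are all made up) of the kind of single-POST DSL described above; it is, in essence, what GraphQL standardizes:

    // One POST carrying several coded requests; the backend decodes
    // the body and returns all replies at once. Everything here is
    // hypothetical.
    async function runBatch(): Promise<unknown[]> {
      const body = {
        requests: [
          { code: "QUERY", entity: "user", args: { id: 42 }, fields: ["name", "email"] },
          { code: "MUTATE", entity: "order", args: { id: 7, status: "paid" } },
        ],
      };
      const res = await fetch("/api/batch", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(body),
      });
      return res.json(); // one reply per request, in order
    }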


The complexity is just moved somewhere else; instead of being explicit in an API, it's implicit in a behaviour. If you want to permission those bits you're doing CRUD ops on, refactor how they're stored, change their normalization in the database, split them out of a monolith into separate services, or ensure you're not permitting the UI team (or worse, customers writing directly to your API) to take dependencies on stuff you want to deprecate, well, the complexity just comes back; it's just in the code handling those custom codes, and all you've invented is a multiplexer that obscures the complexity.

I think there's value to it though, to GraphQL, but mostly on the read side rather than CUD side. Often all sorts of widgets need to knit together data from different sources, and if your data is relational under the hood, with REST endpoints per relation, the knitting is work better left to a library. I'm less convinced for updates.


After years of programming, I am convinced that complexity cannot be overcome. For any non-trivial application, complexity is inevitable.

In my view, we, as developers, should take up complexity, so that the user is unburdened from it. Ultimately, that is the value addition of software. Remove complexity and improve productivity.

Fundamentally, data is relational and hierarchical, no matter how we store it (documents / SQL). The value added by GraphQL, and often touted in its documentation, is that the client has control over the shape of the data received from the server. Though packaged in a slightly easier-to-read syntax, GraphQL is essentially SQL. Sending a GQL query is fundamentally sending an SQL query set, with joins and where clauses, and having a layer of abstraction that understands the query and converts it into lower-level calls to data storage systems.

In my view, this is a more powerful syntax and more flexible than using a REST API. To achieve the same with REST calls, you would have to do something akin to multiple simpler SQL queries, with manual linking of relational data.
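As a rough illustration of that claim (schema, field, and table names are all made up), here is a nested GraphQL query next to the kind of relational query a server might run to satisfy it:

    // A nested GraphQL query...
    const gqlQuery = `
      {
        user(id: 1) {
          name
          posts { title }
        }
      }`;

    // ...and roughly the join a resolver layer might translate it
    // into under the hood:
    const sqlEquivalent = `
      SELECT u.name, p.title
      FROM users u
      JOIN posts p ON p.user_id = u.id
      WHERE u.id = 1;`;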

I am sure there will be more innovation in this area, particularly with pairing GQL with document databases.


The options are not only GraphQL vs REST. In a lot of cases, you can also render server-side and push just plain HTML / components to the client instead.


Aren't you just moving the client to the backend here? As I understand it we are talking about how whoever needs to render the data gets the data. GraphQL vs. REST vs. whatever is still as relevant in the backend-rendered case.


Yes, exactly true if you see it from that angle.

From your angle, I guess one other alternative to GraphQL and REST would be RPC. ;)


It absolutely is. It is, in fact, one of my preferred ways to design systems when there's tight coupling between backend and frontend, such as in a micro-frontend-style vertical. It's very effective when the needs of the frontend dictate the needs of the backend.


GraphQL actually makes a fantastic frontend for a microservices architecture. We have several gRPC services that drive a suite of applications; we wrap those in a GraphQL schema, and we've now made consuming those services much easier while maintaining the original underlying services.


I think the issue is more with framing.

GraphQL is excellent when you're implementing an external-facing API or a data-focused API.

Internal APIs, where you also own the client or have operation-heavy (side-effect-heavy) clients, would IMO be far better suited to REST.


Interesting, I was thinking the complete opposite: GraphQL for an internal API or single-client API, and a REST API for external consumers.

An internal GraphQL API can be iterated on and broken faster, adapting it for internal use as the app grows, while an external REST API would be more carefully crafted and thoroughly documented.


In addition to my other reply, it serves well to finely granularize an external-facing API so that consumers can mix and match to their requirements. This granularity is best accomplished by a REST API.


Sounds more like SQL, if you ask me.

Either that or something more like OData.


Or datalog.


The big win of using GraphQL, at least for me, is not the performance, but the developer experience. I can craft a query in GraphiQL, paste it into my application, and from there generate Flow types for the query response. Make a breaking change to my GraphQL API? Flow types will fail CI tests until I fix any dependent client code. All of this works pretty nicely out of the box, and it's leaps and bounds better than my past experience with homegrown REST APIs.
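A sketch of that workflow, using TypeScript instead of Flow (the mechanics are the same). The query, schema, and generated type are all hypothetical, and the generation step is assumed to be done by a codegen tool:

    // The query the developer crafts in GraphiQL and pastes in:
    const USER_QUERY = `
      query UserCard($id: ID!) {
        user(id: $id) { name email }
      }`;

    // What a codegen step might emit for it. If the server drops or
    // renames a field, regenerating this type breaks dependent code
    // at compile time rather than at runtime.
    type UserCardQuery = {
      user: { name: string; email: string | null } | null;
    };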


You mean the frontend developer experience. The backend developers I interact with are not too enthusiastic about moving the complexity from the frontend to the backend when using GQL.


Personally, as a full stack developer I prefer having the complexity on the back end, as I find it easier to manage there.


From my experience the backend also becomes simpler because it standardizes a lot of patterns, especially regarding subscriptions. I have a feeling any backend developer that thinks GraphQL is a significant increase in complexity is just reluctant to learn a new technology.


It's a shift from supporting some defined, limited set of behavior to supporting a much broader set of possible behavior, to the point that it likely can't practically be fully enumerated. If that's somehow not the case, then I'm not sure what you're using GraphQL for, since that's its whole thing.

Of course that's more complex.


It's a great thing to have! We do the same with JSON Schema converted to TypeScript.


Many years ago I worked at a company that had built a janky version of GraphQL in-house. The model fit our use case perfectly: our customers were dealing with a complicated ontology of objects and it allowed us to build new UIs on top of that data without implementing new REST apis. Of course our implementation wasn't the best but for a certain class of applications the graph-query model is a godsend.


GraphQL makes a ton of sense when your queries use data from multiple different and possibly interdependent source APIs. It allows you to hide all manners of ugly hacks behind a single, neat, cacheable endpoint that can even act as a normalisation layer.


What you've said is true of any aggregating intermediate API, graphql or not. If you don't want to make n calls, you make a single API endpoint that makes two calls for your one, just like when you compose functions in code. You don't need GraphQL for this.
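A minimal sketch of that alternative (in TypeScript with Express; the route and internal service URLs are made up): one aggregating endpoint fans out to two internal calls and merges the results, so the client still makes a single round-trip:

    import express from "express";

    const app = express();

    // Hypothetical aggregating endpoint: two internal calls, one response.
    app.get("/api/profile/:id", async (req, res) => {
      const [user, posts] = await Promise.all([
        fetch(`http://users-svc/users/${req.params.id}`).then((r) => r.json()),
        fetch(`http://posts-svc/posts?author=${req.params.id}`).then((r) => r.json()),
      ]);
      res.json({ user, posts }); // the client gets both in one round-trip
    });

    app.listen(3000);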

GraphQL only really brings a novel way of defining what data you want. That's it. GraphQL fixes none of your data fetching problems if you're responsible for the actual end to end solution - if anything it gives you more.

It's marketed at front end devs, who in my experience, really love _easy_ solutions, rather than simple.

As far as I can see, there is precisely one scenario where GraphQL makes sense - and that's where you are trying to build a lot of separate frontend clients which will consume the same large, complicated data model. And unless both of those things are true, GraphQL is going to cost you more than the value it adds.


> As far as I can see, there is precisely one scenario where GraphQL makes sense - and that's where you are building an unbounded number of front end clients which will consume a complex data model. And unless both of those things are true, GraphQL is going to cost you more than the value it adds.

That matches my impression. Frankly, I don't think it's very interesting. But damned if 3/4 of the webtech-focused job postings I see don't list "GraphQL experience" under the nice-to-haves.


> As far as I can see, there is precisely one scenario where GraphQL makes sense - and that's where you are trying to build a lot of separate frontend clients which will consume the same large, complicated data model. And unless both of those things are true, GraphQL is going to cost you more than the value it adds.

Exactly


I don’t get why GraphQL would be better at that particular use case than anything else, really. Capable of it, sure, but shifting the query-crafting to the client doesn’t really have any effect on that problem.

[edit] I just mean it seems unrelated to that problem, as far as I can tell.


> If you only expect one client (or a few very similar ones) and you want GraphQL-like division of labor just have your frontend and backend folks actually talk to each other and maybe have the backend folks also write a client library for their services (my preference, most of the time).

Yeah, but the same result is much easier and faster with GraphQL. When the frontend developer can in most cases just slightly tweak the query, it takes less time than having to talk to the backend developer and wait until they even start working on it. (I am assuming the backend developer is normally occupied and not sitting around bored until the frontend developer needs a query.)

And also, it avoids friction when backend developer is one of those superstars who are difficult to talk with.


GraphQL turns a social problem (getting your frontend and backend folks to coordinate with each other) into a technical problem that someone else already solved 75% of for you. That's a big win.

GraphQL doesn't do anything that you can't do with sufficient coordination, but coordination is actually one of the hardest parts of getting things done at a large company.


I sort of agree. GraphQL just seems like a solution for a small set of niche problems that's being promoted as a general REST replacement for any and all web APIs.

Seems like we (as in, our industry) went through this before with everyone ditching relational databases and glomming onto NoSQL (particularly MongoDB) - only to realize later it was a mistake.


For what it's worth, if I saw Datalog on a CV I would think "has studied the ways of the old masters" and when I see GraphQL I think "blub". But I don't claim to be good at hiring.


Kinda unfair since it's 1000x easier to get started with GraphQL than Datalog.

> I'm sure they exist out in the wild, but personally I haven't seen a project where GraphQL really shines through.

Look into how Gatsby.js uses GraphQL. You can query data from markdown files, configuration files (JSON) and resize images by specifying it in a query.


GitHub's API [1] makes sense to me: there are lots and lots of query patterns, and the previous REST API results in quite a bit of link chasing and bandwidth burning.

It would actually be interesting to see if someone could design a very good REST API for GitHub-like usage.

1: https://developer.github.com/v4/


8 closing braces in their first example query.

    query {
      repository(owner: "octocat", name: "Hello-World") {
        issues(last: 20, states: CLOSED) {
          edges {
            node {
              title
              url
              labels(first: 5) {
                edges {
                  node {
                    name
                  }
                }
              }
            }
          }
        }
      }
    }

https://developer.github.com/v4/guides/forming-calls/


And some people say that Lisp-like languages use too many parentheses...


This assumes the backend teams are even ready to offer HTTP/2 for us. For some reason or another, we're stuck at HTTP/1.1 since we are fronted by a certain CDN provider and our process for change management in that tool is skittish, since the rules grew organically and making changes is now tough.


If you are trying to attach a lot of different APIs together in a consistent way, a GraphQL API may be what you need.


It's all about the data manipulation capabilities that our current programming languages and libraries provide. Datalog and GraphQL are both graph query languages, but they differ significantly in the data structures they return.

GraphQL queries describe tree unfoldings of graphs, and thus return trees.

Datalog describes recursive conjunctive queries on hypergraphs (the relational model) with no or limited negation, and thus returns a set or bag of hypergraph edges.

The reason GraphQL is so successful is that it fits well with React's data model (trees) and the way it performs its efficient delta updates (tree walking). Furthermore, it's somewhat easier to implement (albeit not simpler). Consider that there are very few actual GraphQL query engines for actual data (e.g. DGraph DB); instead, GraphQL backends implement resolver functions which compute the "graph"/tree on the fly, based on side effects. Resolvers are what you'd call a computable in Prolog or Datalog knowledge bases, and even though I work full time on incremental query evaluation in said knowledge bases, I don't have a clue how to make those efficient without resorting to the Big Cannon of Differential Dataflow.

Datalog queries don't return trees; they return relations, a.k.a. hypergraph edges, and that simply doesn't map well to the data structures (maps/dicts and lists/vectors/arrays, JSON and other tree description formalisms) that we have in basically every mainstream programming language that isn't Prolog or some variant of logic programming. So the query results are hard to work with, and you'd want a LINQ-style embeddable Datalog query engine in your language of choice to work with the returned data, which is a much bigger undertaking.

So to recap: Datalog is great if you have it everywhere and your language is built around it and hypergraphs, but alas, most languages we use today are built around trees, and GraphQL is a tree language.
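To make the data-structure mismatch concrete, a hedged sketch (with a hypothetical users-and-friends dataset) of what the two result shapes look like from a JavaScript/TypeScript point of view:

    // A GraphQL result is a tree that maps directly onto JS objects:
    const graphqlResult = {
      user: {
        name: "alice",
        friends: [{ name: "bob" }, { name: "carol" }],
      },
    };

    // The analogous Datalog result is a relation: a set of tuples,
    // e.g. rows of friend(X, Y). There is no richer native shape,
    // so the client has to reassemble the tree itself:
    const datalogResult: [string, string][] = [
      ["alice", "bob"],
      ["alice", "carol"],
    ];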


That's a very justified question!

In my opinion, Datalog or — more generally — Prolog is where many querying APIs will eventually arrive, because this syntax has many nice advantages: It is very convenient, expressive and readable, programs and queries can be easily parsed and analyzed with Prolog's built-in mechanisms, there is an ISO standard for it etc.

When semantic web formalisms were discussed, Prolog was sometimes mentioned. However, it is so far not very widely used as a querying or modeling language even for semantic web applications. Recent advances in Prolog implementations may increase the uptake of this syntax.

Also, public perception of Datalog and Prolog lags somewhat behind recent developments in these areas and what these technologies can actually do or what they even are. For example, here on HN, many posted articles that claim to be about Datalog often show snippets and examples that use completely different notations which are no longer a subset of ISO Prolog and therefore also not Datalog. Thus, they also do not immediately benefit from recent advances in Prolog systems in that they cannot directly be parsed and interpreted by them. It may take some time to advertise these improvements, and increase interest in these formalisms.


This was (is?) what the semantic web dream was all about, and I find it funny that it took a corporation (FB) developing a clone before others discovered the concept. Semantic web data would be published as RDF data and clients would consume this data using a SPARQL query engine to get whichever shape of data that they desired.


To clarify: I mean that Prolog was under consideration to be used for what SPARQL was then developed, as a more readable and simpler query and also modeling language.

Several researchers working in this area considered the development of SPARQL to be a missed opportunity for better formalisms, since Prolog could have been used instead.

And, yes, to many who have worked with RDF, SPARQL and the various notations that have been developed for semantic web technologies, Prolog seems like a dream that shows what these technologies could have been, and can become in the future, i.e., easily processable, with a uniform and simple syntax that already has an ISO standard, and amenable to formal reasoning based on well-known logical rules.


> easily processable

> formal reasoning based on well known logical rules

Oof, I do not know any serious researcher of logic who would support that claim.

Prolog is, due to its age and heritage, inherently side-effectful and undecidable/Turing-complete.

SPARQL has big fat warts, for sure: design by committee, bulky syntax, way too long a spec, way too many capabilities, and questionable decidability (looking at you, MAYBE clause). BUT at its heart it's non-recursive conjunctive queries over regular path expressions.

CQs over path regexes are a really elegant way to describe arbitrarily nested data structures with flat queries, and are a lot more intuitive and concise than the corresponding Datalog query.

Edit: Not saying that Prolog doesn't have its place, btw; it's a neat language with awesome capabilities, but it is NOT a query language.


That's right: Prolog is not a query language.

But the claim I believe is that _Datalog_ is a subset of Prolog that is a query language. That claim is true by design.

Also, given bottom-up evaluation, Datalog programs are guaranteed to terminate and, restricted to definite clauses with no function symbols (of arity more than 0) they are also amenable to formal proofs of correctness, perhaps more so than Prolog itself.
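To illustrate why bottom-up evaluation terminates, here is a toy sketch in TypeScript (not a real Datalog engine) of naive fixpoint evaluation for the classic transitive-closure program path(X,Y) :- edge(X,Y). path(X,Z) :- path(X,Y), edge(Y,Z). With no function symbols, the set of derivable facts is finite, so the loop must reach a fixpoint:

    // Naive bottom-up evaluation of transitive closure. Facts are
    // encoded as "x,y" strings for simplicity.
    function transitiveClosure(edges: [string, string][]): Set<string> {
      // Seed with path(X,Y) :- edge(X,Y).
      const path = new Set(edges.map(([x, y]) => `${x},${y}`));
      let changed = true;
      while (changed) {
        changed = false;
        for (const p of Array.from(path)) {
          const [x, y] = p.split(",");
          for (const [a, b] of edges) {
            const derived = `${x},${b}`;
            if (a === y && !path.has(derived)) {
              path.add(derived); // path(X,Z) :- path(X,Y), edge(Y,Z).
              changed = true;
            }
          }
        }
      }
      return path; // finite fact space => guaranteed termination
    }

    console.log(transitiveClosure([["a", "b"], ["b", "c"], ["c", "d"]]));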

EDIT: Oh. I assume that by "formal reasoning" you and triska are discussing formal proofs of correctness; otherwise I don't understand the disagreement. Resolution is a well-known logical rule (an inference rule) and it's amenable to formal reasoning, for sure.


Sure Datalog is a query language, but Prolog isn't.

I object to Datalog being viewed as a subset of Prolog, other than for historical reasons. It's just not useful, as both languages differ significantly in philosophy, properties and implementation.

Datalog program equality is undecidable, btw, so while it's certainly more amenable to correctness proofs than Prolog (which isn't at all, with its undecidability and implementation-definedness everywhere), you'd still want more restricted logics for that.

Which isn't to say that's a bad thing, often times you want that power, and you don't need the correctness proof.

Fun fact, DLLite can tell you that the shin bone is a bone, and it can do so 10 billion times a second, and verifiably correctly so ;)

Edit: Some systems that arguably implement a superset of Datalog, but for which their Datalog subset is not a subset of Prolog are LogicBlox, Datomic, Datascript and TriQ-Lite.


>> Edit: Some systems that arguably implement a superset of Datalog, but for which their Datalog subset is not a subset of Prolog are LogicBlox, Datomic, Datascript and TriQ-Lite.

That's interesting, I didn't know about that. I might have a look now that you mention it.

Regarding correctness proofs, I think triska knows this subject better than me, but my understanding has always been that if you stay within the pure subset of Prolog (i.e. no destructive update of terms, no reads and writes to the database, and basically nothing with side effects) then you can actually fully reason about your programs, at least in principle.

I see this in practice every time I write an extensive piece of Prolog code (for the record, I do that a lot, for my PhD) and I end up using the assert/1-retract/1 mechanism to update the program database as the program runs. The problem with that is that it kicks you right out of the cozy world of immutability, right back to the world of programming with mutable state. Once your program reaches a certain degree of complexity it becomes impossible to predict its state at any given point in time. [Edit: the reason is that if something goes wrong and your program fails unexpectedly, the program database is left in an unpredictable state - and you have to understand each such state with great precision or it all goes to hell in a handcart. You're back to doing memory allocation by hand, basically.]

The hurt I feel every time I use the assert/retract mechanism, or rather, the difference in the level of pain between using assert/retract and not using them, is, for me, a good measure of the extent to which pure Prolog code is predictable. I'd even say amenable to formal proofs of correctness, but I've never actually tried that, to be fair.

As to decidability, my understanding is that you can either have Turing completeness, or decidability, but not both. So to slightly correct my comment from above, Datalog itself is decidable only given a finite constant and predicate signature (the set of constants and predicate symbols). But finiteness means incompleteness.


Out of those examples, I would take a look at the LogicBlox stuff; they gave join-algorithm research a big push with their leapfrog triejoin.

The following papers ascend in difficulty and build on each other, but offer a fascinating glimpse into the connection between DPLL SAT solving and conjunctive query evaluation.

https://openproceedings.org/2014/conf/icdt/Veldhuizen14.pdf

https://arxiv.org/abs/1310.3314

https://arxiv.org/abs/1404.0703

As for reasoning and decidability, I think we need to clarify the decidability we're talking about. Decision problems are decidable or undecidable, and when one talks about decidability for a logic, one usually means the decision problem of whether a given formula of the logic is valid or not.

Everything finite is always decidable, because you can just enumerate all possibilities. And logics without negation (like Datalog without stratified negation) are generally (?) decidable because you can just construct a Herbrand interpretation accordingly.

However when we talk about correctness and formal reasoning there are other decision problems which are relevant, most of them revolve around stuff like "does this program implement this specification?", "does this program terminate?", "is this program keeping these invariants?", "are these two programs equal?"

Note that even if these problems are undecidable, it doesn't mean that they're always undecidable, just that they're not decidable in general, e.g. the halting problem is undecidable, but I can give you infinitely many turing-machines which always halt.

The pure, side-effect-free fragment you describe is still at least as (?) powerful as FOL, even HOL if you include the meta-predicates like call, thanks to built-in predicates like forall. Heck, I guess the built-in arithmetic alone will cost you decidability (of validity).

Datalog is even more restricted than what you describe and equality of two Datalog programs is not decidable according to https://core.ac.uk/download/pdf/82609701.pdf

So the question is, how much reasoning a.k.a. how many decision problems can you actually do/solve on Datalog and Prolog. For Datalog, I'd say quite a bit, but not everything, and for Prolog I'd argue very little. But both give you a lot of power in return and that's the price you have to pay.

Edit: And yeah, I don't envy you having to work with side-effectful Prolog code. At some point it's just writing C with extra steps, ground through the Warren Abstract Sausage Machine. ^^'


Thanks for the papers you suggest! I'll try and have a quick look. I'm actually interested in the subject because my research does touch on Datalog (exactly because of the decidability guarantees).

Agreed about the definitions of decidability and formal correctness. Agreed also about the expressivity of the pure subset of Prolog. Oh, I remember now about Datalog and deciding equality. Yes. Well. You hit these kinds of walls any time you try to do anything interesting with logic above the propositional level. The history of logic programming is a history of one negative result after another and the compromises that must be made to avoid them. C'est la vie.

But we persist, because there is a pot of gold at the end of the rainbow, and what better way is there to spend your life than chasing rainbows? :0

>> So the question is, how much reasoning a.k.a. how many decision problems can you actually do/solve on Datalog and Prolog. For Datalog, I'd say quite a bit, but not everything, and for Prolog I'd argue very little. But both give you a lot of power in return and that's the price you have to pay.

Yes, I agree absolutely about that.

For example, my code doesn't _need_ the impure constructs, but it's a lot more efficient with them. So it's a trade-off, and a trade-off very common in Prolog, which, in my opinion, makes many pragmatic choices that sacrifice purity for the benefit of having a usable language.

That goes for reasoning about correctness, also. The fact that you can sometimes reason about program correctness, given that you accept some restrictions on what kind of programs you can reason about, is already miles ahead of what is possible with anything that isn't a logic programming language. That has to count for something.

>> Edit: And yeah, I don't envy you, on having to work with side-effectful prolog code. At some point it's just writing C with extra steps, ground through the Warren Abstract Sausage Machine. ^^'

Aw, I enjoy writing Prolog. So much so that I left my lucrative career in the industry to do research that would let me write a lot of it :)


> notations which are no longer a subset of ISO Prolog and therefore also not Datalog.

This is false. Datalog is not (or no longer) about Prolog-like syntax. Datalog, as used nowadays in research, describes the semantics of a logic that includes conjunctive queries and recursion, often denoted in an abstract syntax consisting of rules with a body and a head, with no ASCII syntax attached.

> Thus, they also do not immediately benefit from recent advances in Prolog systems in that they cannot directly be parsed and interpreted by them.

This is like saying that JavaScript can benefit from updates to the Java virtual machine. Prolog and Datalog require radically different evaluation strategies and implementations.


My last and only experience with Datalog (or a variation thereof) was (I believe) Logicblox, which (at least at the time) had zero support for pagination. In spite of all the supposed benefits of the system, that makes it fundamentally unsuitable imho.


What's the best example of modern Datalog?


A good way to try Datalog is to use a state-of-the-art Prolog system that supports SLG resolution (tabling).

SLG resolution is an evaluation strategy for Prolog programs that provides favourable termination properties compared to SLDNF resolution, which is Prolog's default execution strategy.

A great example for this is XSB Prolog, since it has pioneered and refined many of these techniques and serves as an example and benchmark for many other Prolog systems in this area:

https://xsb.com/xsb-prolog

The nice thing is that this approach allows you to use Datalog, and at the same time gives you a Turing-complete programming language that is comparatively easy to grasp once you understand Datalog, while sharing many of the same advantages and being syntactically only a slight extension of it.


A Turing-complete query language sounds like a recipe for DoS attacks.


It is only Turing complete if you allow new rules. Usually a client would only be allowed to formulate query goals.

In that sense, GraphQL is Turing complete as well, as the underlying implementation could just simulate a Turing machine.


What I read from your post here is that GraphQL is safe by default and would require effort to make it dangerous while Datalog is dangerous by default and would require configuration to make it safe. I know which one I'd rather put on a server.


You have read that incorrectly. A Datalog/Prolog query does not define new rules. That's like calling arbitrary code on remote machines. You would never allow that.

But IF you were allowed to define new rules, THEN you could simulate a Turing machine or create infinite reductions.


> A great example for this is XSB Prolog,

I'm really sorry but it honestly cannot be. I was curious because Datalog/Prolog based technologies are intriguing so I clicked on that link.

https://xsb.com/xsb-prolog seems to have no download but links to

http://xsb.sourceforge.net/ which claims version 3.8 (October 2017) is current and links to

http://xsb.sourceforge.net/downloads/downloads.html which does not specify a version but offers a tar.gz that contains version 3.7 from July 2016.

Going back to sourceforge and heading to the SVN https://sourceforge.net/p/xsb/src/HEAD/tree/trunk/XSB/ states version 3.9, November 2019 and looks pretty active commit-wise.

Okay, let's do that. Checking out trunk, I see Perl and C scrolling by. While I have no problem with Perl, I know quite a few developers - especially in the web context - that find it rather off-putting. I get a README that tells me to read the manual in order to install. Fair enough. Opening that ( http://xsb.sourceforge.net/manual1/manual1.pdf ) gives me a 622-page PDF from October 2017. Chapter 2:

"Make sure that after you have obtained XSB, you have uncompressed it by following the instructions found in the file README." There were no such instructions. But well, configure and make are in build, that's fine and the rest of the instructions are clear.

I was confused why configure looked for mysql and make did something with Java but whatever, both worked great out of the box. Let me check the binary. I get a REPL, claiming to be version 3.8 from October 2017 and despite having 7 lines of output I have no idea how to quit out of that process, so I turn to Google which gives me http://xsb.sourceforge.net/shadow_site/manual1/node14.html

Neither Ctrl-C nor Ctrl-D works, and halt + return is ignored. I go ahead, kill the process and delete the directory.

Alternatives to the above? The original site tells me to contact some organisation, and "xsb prolog" on Google turns up ancient results.

Now compare that with https://graphql.org/graphql-js/

Just like many of these rather research centric languages, it does not matter how cool their tech is if that's the first experience people get. In my view experiences like the above are why they fail to gain adoption in today's world, not some technical aspect or technique. At least unlike much of the linked data and SPARQL world this thing compiled out of the box and I would have been up and running in about 10 minutes if I had any intention to continue.


That sucks. XSB is old code and I believe it was meant mostly as a proof of concept even at the time it was written. I doubt many people use it, even in research, nowadays.

I believe what triska meant about XSB is that it demonstrated the benefits of SLG resolution, by being a proof of concept like I say. So yes, XSB would be more of academic interest.

If you want a modern Prolog to play around with you should try Swi-Prolog. From the conclusion of your comment I'm assuming you haven't tried it because you wouldn't have trouble "getting up and running" in 10 minutes. Or 5 or so. Seriously, Swi's maintainers (triska is a contributor) have gone out of their way to make it useable.

Anyway, if you put in all this work to get XSB working I think you should definitely give Swi-Prolog a try. Sunk cost fallacy, innit.

Here's the link:

https://www.swi-prolog.org/Download.html

On windows, there's an installer. On Linux you can yum-install or apt-get etc an earlier version and try it out, then if you like it you can follow the Download page's links to figure out how to install a newer version (that does take a bit more work).

If you're stuck with Prolog, there's a discourse group, here:

https://swi-prolog.discourse.group/

People there are always happy to help newcomers (and will never send you to RTFM).

But be warned that learning Prolog is not easy. I help with markings and labs for Prolog courses at my university and students always have a hard time with it, until some things start to click.

EDIT: "halt." should work at any Prolog prompt. The "." is a statement terminator. You probably didn't get anything after "halt return" because you missed the dot.

EDIT II: Oh, I forget. You can try Swi-Prolog on Swish (Prolog notebooks):

https://swish.swi-prolog.org

So no need to install anything but keep in mind that some stuff is limited for security reasons (you can do a lot of damage with a language that lets you rewrite it on the fly).


Much appreciated. I kind of got what triska meant (didn't realize they were a contributor to that project); I just wanted to highlight that this is a real barrier to adoption of these alternative stacks. I find this interesting enough to spend more than those 10 minutes when I have time, sure. It's just that a random dev on the search for some component to solve something in their stack likely won't.

Just gave the SWI implementation a try and that was honestly a much better experience and I could immediately jump on their getting started guides to get a feel. Thanks for the pointers as well, I've got a few colleagues that actively use prolog but good to see that there's an active community out there.

> EDIT: "halt." should work at any Prolog prompt. The "." is a statement terminator. You probably didn't get anything after "halt return" because you missed the dot.

Whoops. Yup, I think I might have, I was under the impression that dot in the documentation was the sentence delimiter.

(And btw, my bad about the Perl comment, I just realized when setting up the SWI one that they seem to share a file extension with Perl.)


I agree, it's very bad to have to go through so much hassle when you just want to have a quick look to evaluate a possible tool for a problem you have right now. It's a difficult situation to resolve: usability won't improve until there are more users, and until usability improves there won't be more users.

I'm glad you like Swi. It's not the fastest implementation, but it's certainly the one with the largest community and the most quality-of-life features: documentation server, unit test library, package manager, etc. I love it :)

>> (And btw, my bad about the Perl comment, I just realized when setting up the SWI one that they seem to share a file extension with Perl.)

Oh yes, I totally forgot about that (can't say I use much perl!). It can look funny if you don't expect it :)

Anyway I hope you have time to look into the language more in the future. Like I say, it's hard to learn but it's worth the pain.


>Just like many of these rather research centric languages, it does not matter how cool their tech is if that's the first experience people get. In my view experiences like the above are why they fail to gain adoption in today's world, not some technical aspect or technique. At least unlike much of the linked data and SPARQL world this thing compiled out of the box and I would have been up and running in about 10 minutes if I had any intention to continue.

Just what I was thinking as I read the parent. This is the real answer to the OP's question.


Datomic. Also interesting:

https://github.com/sixthnormal/clj-3df


Datomic provides a fantastic range of query features, but the underlying Datalog engine does have drawbacks in how it evaluates results eagerly and requires intermediate result sets to fit in memory [0].

For comparison, I work on https://opencrux.com which uses lazy Datalog evaluation and can spill to disk when necessary. Naturally there are downsides to this approach also, including a dependence on seek-heavy use of local KV indexes.

clj-3df is definitely interesting although not strictly comparable :)

[0] https://docs.datomic.com/on-prem/query.html#memory-usage


Oh hello! I hadn't realised JUXT had their own products now, will definitely check this out.


You might want to check out Grakn (https://github.com/graknlabs/grakn/) - its query language is an implementation of recursive datalog with negation and comes with lazy execution. I am one of its developers.


Certainly not the best, but worth mentioning: the Python implementation pyDatalog.

https://sites.google.com/site/pydatalog/home


Prolog (Datalog) has always been the gold standard for query languages. GraphQL is merely the latest attempt to reinvent it by people who obviously don't know what came before (or don't care). This might be because CS programs are no longer teaching Prolog, but I think it's more likely because one doesn't get career advancement points for using something "old" unless one reinvents it and renames it so it seems "new."


I know Prolog (from college) and SPARQL (from hobbyist dinking around with SemWeb). I love DSLs and weird languages. I should be almost the ideal Datalog user, so how did it stay completely off my radar until this post?


Because it's a niche tangent of an already niche language.

I'm only aware of it because a databases course taught it (relational algebra to Datalog to SQL) and I thought it was superb, and a couple of times since I've tried to find out if I can use it, but it seems to suffer from the combination of age + academia + proprietariness that makes such things hard to discover and use. (Compared to GraphQL and the plethora of open-source libs on GitHub, etc.)


You are asking for a rational explanation for why something better is less popular than something that is worse. The answer almost certainly is "just because", and more specifically "because Facebook made GraphQL, not Datalog".

GraphQL had a lot of hype around it when it was first announced. It so happens that I had just learned Prolog at the time and was personally bemoaning why SQL was so popular when a Prolog-like solution would be so much better. A month or so later came all the podcasts and articles on HN from Facebook devs about GraphQL.

Datalog is a really great technology, and the idea that you can guarantee that each statement terminates is very powerful.

GraphQL seems to be a great technology when you have extremely specific requirements. I personally would certainly use it if I had those requirements, but IMO it seems totally unjustified for your average tech company.


Hyperfiddle (my startup) is a datalog API experiment, wrapped up in a low-code tool. We vertically integrate Datomic, a datalog database for AWS. Can you imagine GraphQL ever being low-code?

http://www.hyperfiddle.net/

https://gist.github.com/dustingetz/654e502340070280ab9744723...


That is really cool. I’ll have to check this out this week.

FYI, your HTTPS cert is tied to CloudFront and not configured correctly.


Ultimately GraphQL is popular because the query ergonomics fit well with the component architecture everyone is using these days, and once you're familiar with it you can be extremely productive.


>GraphQL is popular because the query ergonomics fit well with the component architecture everyone is using these days

Were they using that architecture prior to moving to GraphQL, or did the tail wag the dog in most cases?


I can't speak for everyone, but in my case it's "yes". I'd also be fairly comfortable claiming that there are a lot more people using component architectures (React, Vue, Elm, Ember and Angular) than are using GraphQL.


jQuery, for its time, fit its architecture well, and those familiar with it were extremely productive. That doesn't necessarily mean something is a great paradigm.


Do you feel GraphQL is better than Datalog here?


Sorry, not a frontend dev; where can I get basic info about 'component architecture'?


Modern SPA frameworks are usually built on components. You can combine components to define a tree of elements that represents your UI. This pattern is also often referred to as a Declarative UI pattern.

GraphQL works well with declarative UIs because you can define fragments and combine them to create queries that best match the data the current set of components you are loading requires. In this way, it's flexible in a way REST (at least typically) isn't, and it syncs well with declarative UIs.
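For instance (a hedged sketch; the component, fragment, and field names are all made up), each component can declare a fragment for the fields it needs, and the parent composes them into one query:

    // Each component owns a fragment describing its own data needs:
    const AvatarFragment = `
      fragment AvatarFields on User {
        avatarUrl
      }`;
    const ProfileFragment = `
      fragment ProfileFields on User {
        name
        bio
      }`;

    // The parent composes the fragments into a single page query:
    const pageQuery = `
      query ProfilePage($id: ID!) {
        user(id: $id) {
          ...AvatarFields
          ...ProfileFields
        }
      }
      ${AvatarFragment}
      ${ProfileFragment}`;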

I can't speak to datalog though.


He's talking about the recommended use of things like React or Elm. React will probably have the basic info that ties it to GraphQL the closest, given the creators.


This is actually fairly common in Clojure and ClojureScript, sometimes using the Datomic DB as a backend for it.

This is an older presentation, but it's all about doing Datalog from the client to the DB.

See: https://www.youtube.com/watch?v=aI0zVzzoK_E

But if you look through the Clojure world, there is Datalog all over the place, both client and server. Check out DataScript:

https://github.com/tonsky/datascript

This full-stack framework: http://book.fulcrologic.com/


> This is actually fairly common in Clojure and ClojureScript, sometimes using the Datomic DB as a backend for it.

What do you mean by "this" which is fairly common? DataScript is a Datalog implementation; however, you can't just bring your query from the backend to the frontend, not only because of incompatibility but also because of security issues. And DataScript is not that popular among ClojureScript apps.

Fulcro is definitely the go-to framework. It makes use of EQL, which is a superior alternative to GraphQL, for server-client communication.


It's not clear to me that GraphQL and Datalog solve the same problems. GraphQL, to me, is fundamentally about designing the interface between a client and server, decoupled from the logic that satisfies requests. I think it's actually not a query language at all, any more than HTTP is a query language. Datalog is very much a query engine. And it's not super clear to me that the services exposed by a server map cleanly to a logic model. GraphQL requires no such purity.


That: GraphQL is a mechanism for clients to specify the data document they want, and for servers to specify the documents they can serve. Datalog, or SQL for that matter, is a mechanism to expose data within a database. The GraphQL data model must be geared towards the end-client point of view, e.g. some internal IDs can be hidden. Datalog, on the other hand, exposes the internal model of a data store, which is domain-driven, not client-driven.

Also, you do not want end-client devices to have the full power of Datalog or SQL over your database, or else be ready to constantly watch for those crazy queries that bring your server down to a crawl. GraphQL at least provides an indirection point where you can apply some form of resource control.


I will say something that is commonly known: a lot of frontend software development revolves around hype. If React is the in thing and FB devs are promoting GraphQL, then everyone will go towards that. I don't know much about Datalog, but the JSON:API spec meets most of the things that people tout GraphQL for. Plus, if you're trying to stay employed, you have to follow the trends.


React came out in 2013. That's six years and counting of "hype".


We are currently looking for a powerful query language for a distributed event-sourcing system. It would have to be embeddable in Rust.

Is there a site where I can see more complex Datalog queries and how they compare to other query languages? All the examples I have found so far were relatively trivial and basically just assumed that you knew that Datalog was super powerful.

A comparison of how to solve problems X, Y, Z in Datalog, GraphQL, and SQL would go a long way to convince people that it is a superior alternative.


As a company that runs both Django REST Framework and GraphQL backends, there is one phenomenal argument in favor of GQL: Apollo Client. When you just want a simple, normalized reactive data store, it's really amazing (even though the Apollo kids seem to break it with every minor release). Loads of Redux boilerplate go right out the window.


Per the last comment, please check out our new official Redux Toolkit package. It includes utilities to simplify several common Redux use cases, including store setup, defining reducers, immutable update logic, and even creating entire "slices" of state at once without writing any action creators or action types by hand:

https://redux-toolkit.js.org

Not saying it's a replacement for Apollo (since Redux itself doesn't provide anything around data fetching), but we specifically created RTK to eliminate the concerns around "Redux boilerplate".


Still requires boilerplate for thunks, which in my case is most Redux actions. Also, I think you still don't support specifying the same reducer for many actions (for example as an array: https://github.com/piotrwitek/typesafe-actions). The immutable update logic, however, is amazing, and I am glad I got introduced to it by you!


Not sure what you mean by "boilerplate for thunks". You write thunks the same way you always have.

And yes, `createSlice` explicitly supports responding to actions that were defined elsewhere in the app, using the `extraReducers` argument:

https://redux-toolkit.js.org/api/createSlice#extrareducers

Although... are you saying that you want one reducer function to handle many different actions in one slice? Just define the reducer outside the slice itself, and pass it as the handler for each action:

    createSlice({
        name: "someSlice",
        initialState,
        reducers: {
            action1: someCaseReducer,
            action2: someCaseReducer,    
        }
    })
Did something similar in my own app recently at work, where multiple slices all needed to handle loading data similarly. So, I wrote a couple generic reducers, and passed them in to each of the slices.


TerminusDB does use a Datalog-like language as the query language for its public web API. Datalog has long seemed to me to be the obvious "next step" for query languages as it enables richer manipulations than those provided by vanilla SQL-like languages in a clean fashion.

However, as j-pb noted in another comment:

"GraphQL queries describe Tree unfoldings of graphs, and thus return trees.

Datalog describes recursive conjunctive queries on hypergraphs (relational model) without or limited negation, and thus return a set or bag of hypergraph edges."

TerminusDB allows you to use both approaches. Due to the way we have designed our graphs, with a strongly typed schema langage, you can extract sub-graphs as unfolded trees using a special predicate exposed in our datalog language.

This approach (hopefully) gives the best of both worlds, allowing graph traversal and extraction of objects/documents in a single framework, all communicating information via JSON-LD.


Sounds reasonable to me :-)


Well, there's EQL and various implementations that use Datalog under the hood.

https://edn-query-language.org/eql/1.0.0/what-is-eql.html


EQL is surely a great alternative to GraphQL because of namespaced keywords and rich data structures (in contrast to GraphQL's string-based queries). It is also inspired by Datomic's Pull API. However, it's a specification, and the most popular implementation (Pathom) has nothing to do with Datalog.


The original question probably made this mistake because the poster didn't know it, but Pathom is the actual alternative to GraphQL-based APIs. Datalog solves the subtly different problem of querying the data, while Pathom/GraphQL/REST is about exposing the data as an API (interfacing?).

It's more like:

Pathom vs GraphQL-based API (vs REST)

and

Datalog vs SQL (vs NoSQL)

The mainstream is in the process of migrating from REST to GraphQL, but I think Pathom could be the dark horse in this race. Just watch the last part of Wilker Silva's The Maximal Graph talk to see the potential: https://youtu.be/IS3i3DTUnAI?t=2080


> It is unclear that the total impact on performance would be significantly negative compared to the typical GraphQL API.

Some of the problems also apply to GraphQL; people just need to constrain the query complexity in the end.

But yeah, it would bite Datalog more if not designed well enough, and it's harder to tackle Datalog than GraphQL.

Other thoughts on this topic:

I love GraphQL, but I would love Datalog more if the toolchain were as good as GraphQL's.

There's a spectrum of simplicity and power in APIs. On the left, there is HTTP/REST, which could be seen as a primitive/naive way of describing an API; it's not consumer-driven at all. On the right, there's Datalog, which is a powerful way of describing an API; it can maximize the flexibility of both the server and the client.

Things like 'map' and 'filter' became well known to average programmers not very long ago, though elite programmers already knew them well half a century ago. Datalog is definitely a very good idea for APIs, but I'm afraid it wouldn't be as popular as GraphQL even if Facebook had chosen this route instead.

That being said, the industry always adopts things in a very passive manner, usually because the current solution is too painful to use, or almost can't solve the problem at the new problem scale. There's a documentary [0] about GraphQL: the creators had had enough of REST and ended up inventing GraphQL. It was originally for solving their own data-fetching problems, but in the meantime it would solve others' too. Meanwhile, it pushed the average programmer's common understanding of APIs from left to right on the spectrum. If it gets close enough to the right, more and more people will discover Datalog and buzz other people to use it.

Just like with type systems, it's hard to expect the industry to have a deep understanding of the topic itself. Type systems similar to Java's dominated the industry for a long time, though ML already existed almost half a century ago. Recently, languages like TypeScript took off because people needed to type existing JavaScript. Then people discovered that the type systems they were used to are far from enough to describe existing code designs. It's like a Pandora's box: new type operators come out every release, until people realize there's a spectrum of type systems (the lambda cube [1]), and Dependent Types would be on the same spot as Datalog on that spectrum.

[0] https://youtu.be/783ccP__No8?t=645

[1] https://en.wikipedia.org/wiki/Lambda_cube


How do I implement Datalog in my application? Crickets...


Depending on the stack, you can use some open-source libraries. DataScript is quite robust and usable from JavaScript/ClojureScript: https://github.com/tonsky/datascript
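A quick hedged sketch of what that looks like (entry points as shown in DataScript's README; the data here is made up):

    // Datalog from plain JavaScript/TypeScript via DataScript.
    const d = require("datascript");

    const conn = d.create_conn({});
    d.transact(conn, [
      [":db/add", 1, "name", "alice"],
      [":db/add", 2, "name", "bob"],
      [":db/add", 1, "friend", 2],
    ]);

    // Datalog query: find the names of alice's friends.
    const result = d.q(
      '[:find ?fn :where [?e "name" "alice"] [?e "friend" ?f] [?f "name" ?fn]]',
      d.db(conn)
    );
    console.log(result); // => [["bob"]]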


"An immutable in-memory database and Datalog query engine"

Most uses of GraphQL are to expose existing data from a relational database through an API, so although this looks like an interesting project, it doesn't look like it would be a contender against any GraphQL implementation.

I'd argue there are two faulty assumptions in the OP: 1. "GraphQL serves the same use case as Datalog.", and 2. "Datalog is better at that use case."

GraphQL is not a general-purpose query language! There are practically no query operations at all built into the specification; it's all about field selection and parameter passing. At heart it's more like an RPC protocol geared towards requesting tree-like data. You would evaluate GraphQL against options like plain REST, OpenAPI, or gRPC, not against Datalog.

A more relevant question to ask is why Datalog never took over from SQL, but then it would be harder to pretend that SQL's dominance is due to it being the current passing fad.


> Crickets...

Why? There are libraries that you can use in most programming languages.


I've got this idea that you can chuck JMESPath into the Range header on a standard JSON endpoint for ghetto GraphQL.
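A hedged sketch of that idea, using the jmespath npm package (the comment suggests abusing the Range header; a made-up X-JMESPath header is shown here instead to keep it simple, and the endpoint and data are hypothetical):

    import express from "express";
    import jmespath from "jmespath";

    const app = express();

    app.get("/api/users", (req, res) => {
      const data = {
        users: [{ name: "alice", age: 30 }, { name: "bob", age: 25 }],
      };
      // e.g. X-JMESPath: users[?age > `26`].name
      const expr = req.header("X-JMESPath");
      res.json(expr ? jmespath.search(data, expr) : data);
    });

    app.listen(3000);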


For me the most important thing about graphql is composability.

Child components can communicate their data requirements to the parent, and the parent can include them in the query and feed the data to the children as it arrives.

Don't know if Datalog has anything for that.



