This was (is?) what the semantic web dream was all about, and I find it funny that it took a corporation (FB) developing a clone before others discovered the concept. Semantic web data would be published as RDF, and clients would consume it using a SPARQL query engine to get whichever shape of data they desired.
To clarify: I mean that Prolog was under consideration for the purpose for which SPARQL was eventually developed, as a more readable and simpler query and modeling language.
Several researchers working in this area considered the development of SPARQL to be a missed opportunity for better formalisms, since Prolog could have been used instead.
And, yes, to many who have worked with RDF, SPARQL and the various notations that have been developed for semantic web technologies, Prolog seems like a dream that shows what these technologies could have been, and can become in the future, i.e., easily processable, with a uniform and simple syntax that already has an ISO standard, and amenable to formal reasoning based on well-known logical rules.
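As a tiny (made-up) sketch of the idea: RDF triples map directly onto Prolog facts, and a query is just an ordinary rule over them - no separate query language needed:

```prolog
% Made-up example data: RDF triples as rdf(Subject, Predicate, Object) facts.
rdf(alice, knows, bob).
rdf(bob,   knows, carol).
rdf(bob,   name,  "Bob").

% A "query" is an ordinary rule over those facts.
knows_name_of(X, Name) :-
    rdf(X, knows, Y),
    rdf(Y, name, Name).

% ?- knows_name_of(alice, Name).
% Name = "Bob".
```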
> formal reasoning based on well known logical rules
Oof, I do not know any serious researcher of logic who would support that claim.
Prolog is, due to its age and heritage, inherently side-effectful, and it is Turing-complete, hence undecidable.
SPARQL has big fat warts, for sure: design by committee, bulky syntax, a way-too-long spec, way too many capabilities, and questionable decidability (looking at you, OPTIONAL clause).
BUT at its heart it's non-recursive conjunctive queries over regular path expressions.
Conjunctive queries over path regexes are a really elegant way to describe arbitrarily nested data structures with flat queries, and they are a lot more intuitive and concise than the corresponding Datalog query.
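Concretely (with a made-up knows/2 relation): a path regex packs transitive reachability into one flat pattern, where Datalog needs an explicitly recursive helper predicate:

```prolog
% Datalog: reachability over knows/2 needs a recursive auxiliary predicate.
reachable(X, Y) :- knows(X, Y).
reachable(X, Y) :- knows(X, Z), reachable(Z, Y).

% The SPARQL property-path version is a single flat triple pattern:
%   SELECT ?x ?y WHERE { ?x foaf:knows+ ?y }
```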
Edit:
Not saying that Prolog doesn't have its place btw, it's a neat language with awesome capabilities, but it is NOT a query language.
But the claim, I believe, is that _Datalog_ is a subset of Prolog that is a query language. That claim is true by design.
Also, given bottom-up evaluation, Datalog programs are guaranteed to terminate, and, restricted to definite clauses with no function symbols of arity greater than 0, they are also amenable to formal proofs of correctness, perhaps more so than Prolog itself.
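As an illustration (made-up edge/2 facts): bottom-up evaluation just saturates the rules over the facts derived so far, and with no function symbols every derivable fact is built from the program's finitely many constants, so saturation must reach a fixpoint:

```prolog
% A finite base relation.
edge(a, b).
edge(b, c).
edge(c, a).

% Transitive closure.
path(X, Y) :- edge(X, Y).
path(X, Y) :- edge(X, Z), path(Z, Y).

% Bottom-up, every derivable path/2 fact lies in the finite set
% {a,b,c} x {a,b,c}, so the iteration terminates - here with all
% nine path/2 facts, since the edges form a cycle. (The same
% program run top-down in Prolog can backtrack forever.)
```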
EDIT: Oh. I assume that by "formal reasoning" you and triska are discussing formal proofs of correctness- otherwise I don't understand the disagreement. Resolution is a well-known logical rule (an inference rule) and it's amenable to formal reasoning, for sure.
Sure, Datalog is a query language, but Prolog isn't.
I object to Datalog being viewed as a subset of Prolog, other than for historical reasons. It's just not useful, as the two languages differ significantly in philosophy, properties and implementation.
Datalog program equivalence is undecidable, btw, so while it's certainly more amenable to correctness proofs than Prolog (which isn't at all, with its undecidability and implementation-definedness everywhere), you'd still want more restricted logics for that.
Which isn't to say that's a bad thing; oftentimes you want that power and don't need the correctness proof.
Fun fact: DL-Lite can tell you that the shin bone is a bone, and it can do so 10 billion times a second, and verifiably correctly so ;)
Edit:
Some systems that arguably implement a superset of Datalog, but whose Datalog subset is not a subset of Prolog, are LogicBlox, Datomic, Datascript and TriQ-Lite.
>> Edit: Some systems that arguably implement a superset of Datalog, but for which their Datalog subset is not a subset of Prolog are LogicBlox, Datomic, Datascript and TriQ-Lite.
That's interesting, I didn't know about that. I might have a look now that you mention it.
Regarding correctness proofs, I think triska knows this subject better than me, but my understanding has always been that if you stay within the pure subset of Prolog (i.e. no destructive update of terms, no reads and writes to the database, and basically nothing with side effects) then you can actually fully reason about your programs, at least in principle.
I see this in practice every time I write an extensive piece of Prolog code (for the record, I do that a lot, for my PhD) and I end up using the assert/1-retract/1 mechanism to update the program database as the program runs. The problem with that is that it kicks you right out of the cozy world of immutability, right back to the world of programming with mutable state. Once your program reaches a certain degree of complexity it becomes impossible to predict its state at any given point in time. [Edit: the reason is that if something goes wrong and your program fails unexpectedly, the program database is left in an unpredictable state - and you have to understand each such state with great precision or it all goes to hell in a handcart. You're back to doing memory allocation by hand, basically.]
The hurt I feel every time I use the assert/retract mechanism - or rather, the difference in the level of pain between using assert/retract and not using it - is, for me, a great measure of the extent to which pure Prolog code is predictable. I'd even say amenable to formal proofs of correctness, but I've never actually tried that, to be fair.
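A minimal sketch of that failure mode (predicate names invented for illustration): if a goal fails after the retract/assert pair has run, backtracking does not undo the update, and the database keeps the new state:

```prolog
:- dynamic counter/1.
counter(0).

% Impure update: retract the old value, assert the new one.
increment :-
    retract(counter(N)),
    N1 is N + 1,
    assert(counter(N1)).

some_check(N) :- N =< 3.   % hypothetical validation predicate

% If some_check/1 fails, the overall goal fails - but counter/1
% has already been updated, and the assert/retract are not undone
% on backtracking. The database is now out of sync with the
% program's logical control flow.
checked_increment :-
    increment,
    counter(N),
    some_check(N).
```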
As to decidability, my understanding is that you can either have Turing completeness or decidability, but not both. So, to slightly correct my comment from above: Datalog itself is decidable only given a finite constant and predicate signature (the set of constants and predicate symbols). But finiteness means incompleteness.
Out of the examples I would take a look at the LogicBlox stuff; they gave join-algorithm research a big push with their Leapfrog Triejoin.
The following papers ascend in difficulty, and are based on each other, but offer a fascinating glimpse into the connection of DPLL SAT solving and Conjunctive Query evaluation.
As for reasoning and decidability, I think we need to clarify which decidability we're talking about. Decision problems are decidable or undecidable, and when one talks about decidability for a logic one usually means the decision problem of whether a given formula of the logic is valid or not.
Everything finite is always decidable, because you can just enumerate all possibilities.
And logics without negation (like plain Datalog, without stratified negation) are generally (?) decidable, because you can just construct a (minimal) Herbrand model accordingly.
However, when we talk about correctness and formal reasoning, there are other decision problems which are relevant; most of them revolve around stuff like "does this program implement this specification?", "does this program terminate?", "is this program keeping these invariants?", "are these two programs equivalent?"
Note that even if these problems are undecidable, that doesn't mean every instance is undecidable, just that they're not decidable in general; e.g. the halting problem is undecidable, but I can give you infinitely many Turing machines which always halt.
The pure, side-effect-free fragment you describe is still at least as (?) powerful as FOL, even HOL if you include meta-predicates like call, thanks to built-in predicates like forall. Heck, I guess the built-in arithmetic alone will cost you decidability (of validity).
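For instance (a small sketch): forall/2 quantifies over all solutions of a goal, and with call/N the goal itself becomes an argument, which is where the higher-order flavour comes from:

```prolog
even(X) :- 0 is X mod 2.

% Universal statement over a finite list: every element is even.
all_even(List) :-
    forall(member(X, List), even(X)).

% Higher-order version: the test predicate is itself an argument.
all_satisfy(Goal, List) :-
    forall(member(X, List), call(Goal, X)).

% ?- all_even([2,4,6]).           true.
% ?- all_satisfy(even, [2,4,7]).  fails.
```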
So the question is, how much reasoning a.k.a. how many decision problems can you actually do/solve on Datalog and Prolog. For Datalog, I'd say quite a bit, but not everything, and for Prolog I'd argue very little. But both give you a lot of power in return and that's the price you have to pay.
Edit:
And yeah, I don't envy you, on having to work with side-effectful prolog code. At some point it's just writing C with extra steps, ground through the Warren Abstract Sausage Machine. ^^'
Thanks for the papers you suggest! I'll try and have a quick look. I'm actually interested in the subject because my research does touch on Datalog (exactly because of the decidability guarantees).

Agreed about the definitions of decidability and formal correctness. Agreed also about the expressivity of the pure subset of Prolog. Oh, I remember now about Datalog and deciding equivalence. Yes. Well. You hit these kinds of walls any time you try to do anything interesting with logic above the propositional level. The history of logic programming is a history of one negative result after another, and of the compromises that must be made to avoid them. C'est la vie.

But we persist, because there is a pot of gold at the end of the rainbow, and what better way is there to spend your life than chasing rainbows? :)
>> So the question is, how much reasoning a.k.a. how many decision problems can you actually do/solve on Datalog and Prolog. For Datalog, I'd say quite a bit, but not everything, and for Prolog I'd argue very little. But both give you a lot of power in return and that's the price you have to pay.
Yes, I agree absolutely about that.
For example, my code doesn't _need_ the impure constructs, but it's a lot more efficient with them. So it's a trade-off, and a trade-off very common in Prolog, which, in my opinion, makes many pragmatic choices that sacrifice purity for the benefit of having a usable language.
That goes for reasoning about correctness, also. The fact that you can sometimes reason about program correctness, provided you accept some restrictions on what kinds of programs you can reason about, is already miles ahead of what is possible with anything that isn't a logic programming language. That has to count for something.
>> Edit: And yeah, I don't envy you, on having to work with side-effectful prolog code. At some point it's just writing C with extra steps, ground through the Warren Abstract Sausage Machine. ^^'
Aw, I enjoy writing Prolog. So much so that I left my lucrative career in industry to do research that would let me write a lot of it :)