Dafny is a verification-aware programming language

hawkice · 2024-04-23T23:44:54.000000Z

I think compact verification aware languages will be of particular interest to anyone wanting to make machine learning models write code at scale. Verification, and test generation, seem critical to the process of getting feedback from the program itself and improving the automatic code generation process. Not quite as powerful as self-play was for two player game playing, but it's got to be close.

mofosyne · 2024-04-24T01:49:18.000000Z

Doesn't ADA have some method of checking if the input/output of a function is within bound? Can't remember if its a runtime or compile check however.

But the concept of function contract verification would be interesting.

carlmr · 2024-04-24T07:21:03.000000Z

The pre/postconditions in Ada are runtime checks [1]. GNATprove can be used to statically check these though in the SPARK subset of Ada [2].

[1] https://www.adacore.com/gems/gem-31 [2] https://docs.adacore.com/spark2014-docs/html/ug/en/source/as...

littlestymaar · 2024-04-24T06:14:35.000000Z

Yeah “Reinforced learning with compiler feedback” in addition to RLHF.

For that reason I suspect that Rust could do pretty well in terms of its writeablility by an LLM, but there's the problem of the relatively small training codebase in the wild which make LLMs perform quite poorly in Rust at the moment (Llama3 70B being far worse than even ChatGPT 3.5 on that, but even ChatGPT 4 can't write nontrivial Rust code right now). So it seems hard to even bootstrap the process.

yosefk · 2024-04-24T10:32:15.000000Z

OTOH when it writes buggy code, you tend to get decent error messages when trying to use it. In this sense Rust + LLM is way better than C++ + LLM

littlestymaar · 2024-04-24T11:27:51.000000Z

True, but having to fix all the stupid mistakes they make is kind of ruining the point of using an LLM in the first place.

ComputerGuru · 2024-04-24T03:15:19.000000Z

fyi Dafny is written by Microsoft and used by Amazon: https://www.amazon.science/blog/how-we-built-cedar-with-auto...

mintplant · 2024-04-24T03:37:33.000000Z

Recently replaced by Lean, though. At least in Cedar.

https://github.com/cedar-policy/cedar-spec

https://lean-lang.org

pjmlp · 2024-04-24T06:24:36.000000Z

Which is also a Microsoft created language,

https://en.wikipedia.org/wiki/Lean_(proof_assistant)

mintplant · 2024-04-24T21:59:47.000000Z

Yes, Microsoft Research is huge in this space. Z3, Boogie, Corral, Dafny, F*, Lean...

naasking · 2024-04-26T03:52:42.000000Z

F* seemed promising at one point, but haven't heard much about it in a long time, what happened to it?

ComputerGuru · 2024-04-24T15:28:26.000000Z

Thanks for that. Any idea why? Couldn't find anything by searching their github repo.

IshKebab · 2024-04-24T09:22:36.000000Z

I've tried a few times to get into this but honestly it's very very difficult. You pretty much need to have a degree in formal verification to be able to actually verify anything except toy examples. I kept hitting cases where things wouldn't verify, and the reason is due to something deep in the implementation that really only the authors would know about.

Also the language is huge, and while it's quite well documented, the level of documentation you want for this is far more than for a normal language.

On the plus side, the authors are really helpful on Github, and the IDE support in VSCode is great.

bjackman · 2024-04-24T10:38:59.000000Z

This doesn't surprise me too much. I've written a fair bit of BPF code and getting programs to verify is very difficult and inevitably requires reading the verifier code.

It's interesting to compare this to the Rust borrow checker. As a Rust novice I've found out difficult, but not impossible, to translate my idea into a borrow-safe program. But the borrow checker is surely much much simpler than a full theorem prover, and nonetheless I believe it has taken _huge_ amounts of work to get it to its error messages to their present state of usability.

So I think this is likely to be a major challenge in any high-level language with a verification aspect.

kldx · 2024-04-24T20:43:44.000000Z

My takeaway with formal verification tools is that they are leaky abstractions. Sure, I can write a contract to prove a function's implementation but the moment Z3 takes too long to run, I need to peek into the tool is implemented and if there are any bugs in my triggers for quantifier instantiations.

I agree with parent. In an ideal world, I should not have to take graduate level coursework to prove a function's correctness. Sometimes, the programs I'm trying to verify are not proof-friendly, so I end up redoing my data structures just to make the proofs go through.

naasking · 2024-04-24T13:41:07.000000Z

> I kept hitting cases where things wouldn't verify, and the reason is due to something deep in the implementation that really only the authors would know about.

Can you recall a specific example? I'm not only curious how this comes about, but how Dafny expects you to resolve it. You must have to declare some kind of invariant for that property somewhere. This was the case when I played with C#'s code contracts, and sometimes this was challenging to ensure.

IshKebab · 2024-04-25T06:17:14.000000Z

It's been a while but there was something about Dafny only loop unrolling or maybe inlining functions a certain number of times which meant my proof worked for like 1 and 2 byte numbers but not 3 or more.

naasking · 2024-04-26T03:55:31.000000Z

Would that call for replacing that function with some kind of inductive abstraction to help it see that it generalizes to N byte numbers?

naasking · 2024-04-24T01:49:01.000000Z

Looks like a cornucopia of nice programming language features, including one that's in Ada that almost no other language provides: integer subset types/range types. Definitely promising.

The documentation is very jittery/laggy while scrolling on a phone though. I don't know why that's about, but it's distracting.

carlmr · 2024-04-24T08:42:44.000000Z

>including one that's in Ada that almost no other language provides: integer subset types/range types.

This is one of these features that seems so simple and makes life so much easier in embedded programming where you often have exact ranges for things, and also delta types for fixed-point math.

pjmlp · 2024-04-25T09:54:38.000000Z

As usual, people keep referring to Ada integer subset types/range types, when Ada was influenced by their use in Pascal and Modula-2.

anonzzzies · 2024-04-24T08:12:12.000000Z

> integer subset types/range types

You mean like Pascal or more elaborate like that?

Avshalom · 2024-04-24T09:12:19.000000Z

It comes (like a lot of Ada) from Pascal I don't know what the current state of Pascal is to be able to compare but probably a bit more elaborate.

sn9 · 2024-04-24T20:35:41.000000Z

There's a recent textbook for Dafny aimed at undergraduates with a bit of programming experience. [0]

[0] https://mitpress.mit.edu/9780262546232/program-proofs/

xpe · 2024-04-24T01:33:46.000000Z

Looks like it has been around since 2008:

https://www.microsoft.com/en-us/research/project/dafny-a-lan...

> K. Rustan M. Leino. Dafny: An Automatic Program Verifier for Functional Correctness. In LPAR-16, volume 6355 of LNCS, pages 348-370. Springer, 2010.

> K. Rustan M. Leino. Specification and verification of object-oriented software. In Engineering Methods and Tools for Software Safety and Security, volume 22 of NATO Science for Peace and Security Series D: Information and Communication Security, pages 231-266. IOS Press, 2009.

junon · 2024-04-24T08:12:00.000000Z

Very neat. Needs Rust compilation support!

syrak · 2024-04-24T12:14:17.000000Z

For verified Rust there is also https://github.com/creusot-rs/creusot

junon · 2024-04-24T15:00:34.000000Z

Cool, thanks! Will have to look into this

MikaelMayer · 2024-04-24T17:08:14.000000Z

I'm working on it! Feel free to launch a discussion in the GitHub project if you are interested, with examples in mind, if have any questions.

r9295 · 2024-04-24T09:16:21.000000Z

Probably never gonna happen due to how unstable rustc is and due to the fact that a lot of projects require nightly.

junon · 2024-04-24T09:27:07.000000Z

It actually appears they have a rust backend, it just isn't listed in the readme.

It's also a myth that you can't generate stable rust.

_flux · 2024-04-24T10:09:54.000000Z

Surely generating Rust is not a moving target, if you choose to target stable?

r9295 · 2024-04-24T10:13:36.000000Z

It depends on how feasible stable Rust is for popular projects. Many projects mandate nightly. Even if it is a fixed nightly version, the discrepancies across nightly versions add up since they don't just add APIs, they remove them too.

_flux · 2024-04-24T16:37:29.000000Z

Unstable rustc can compile stable code. For code you'd write with Dafny I highly doubt it would be an issue. Perhaps you have a proof of the opposite?

I've only ever needed nightly when doing something a bit off the beaten path, like targeting microcontrollers. The Linux kernel is another big project that needs unstable. It's much more rare a need than you make it out to be.

junon · 2024-04-24T14:59:32.000000Z

Stable rust is inherently compatible with nightly. Especially if they're in their own crate.