F* – An ML-like functional programming language aimed at program verification (fstar-lang.org)
264 points by philonoist on Oct 30, 2017 | 95 comments



This post describes how Mozilla is incorporating F* verified cryptographic code into Firefox: https://blog.mozilla.org/security/2017/09/13/verified-crypto...


A nice project using F* is Project Everest: https://project-everest.github.io . They implement HTTPS and all of its crypto, completely verified.


"Current status - We generate a C library, but the verification is not complete."


One can always look at the empty part of the glass, but the quote does continue with a big full part:

"The TLS 1.3 handshake verification is work in progress and still relies on the OCaml extraction mechanism of F* ; thus, the C library still encapsulates the OCaml runtime system.

We have completed verification of the TLS 1.3 record layer; it currently extracts to C.

The AES and SHA2 cryptographic assembly routines are verified and extract to assembly via Vale.

HaCl* provides verified C code for multiple other primitives such as Curve25519, Chacha20, Poly1305 or HMAC.

Our test client implements TLS 1.2 + TLS 1.3 Draft 18 and successfully connects to TLS 1.3 test servers. We have a prototype integration of miTLS within libcurl that one can use to git clone a remote repository."


Note however that the page does list a bunch of useful, completely[1] verified components just below that sentence.

[1] Modulo things like the correctness of their C semantics, assembly semantics, various extraction procedures.


Vs the pile that is OpenSSL.


One thing that amuses me is the behemoth support from Microsoft Research. With that behind it, why is it still so dormant? https://www.microsoft.com/en-us/research/project/the-f-proje... https://en.wikipedia.org/wiki/F*_(programming_language)


Microsoft Research is also behind Dafny and Lean and a whole bunch of more specialized tools: https://rise4fun.com/

They are very wisely not claiming that any of these is the verified programming language to rule them all. There is still a lot to learn.


Much of this uses Z3 under the hood, and Z3 is made of pure wizardry.


Dafny is pretty damn cool, though. And it addresses the major problem of other verification environments (SPARK, Frama-C) by using garbage collection.


Maybe they're looking to improve the developer experience? Seems like F# took a long time before entering the spotlight as well.


One cool thing with F* is that it is starting to bring real contributions, such as the HACL* formally verified cryptography in Firefox. https://blog.mozilla.org/security/2017/09/13/verified-crypto... Realistically, even if F* is not fully foundational, it seems to be a very good compromise, in terms of proof effort, for writing security-sensitive applications...


Why was it not named FML?


Phrasing?

I wish they at least named it Fasterisk


"Faster and with less risk!"


Can't wait to see someone write some kind of checking program with the name F*ck.


Is it easier to write a program that behaves the way you want and a verification to prove the program behaves the way you want, or to just write the program? What if you goofed writing the verification, and now you have a program that is guaranteed to misbehave? How do you verify the verification?


Validation vs verification. Validation: "is this the right spec", verification: "does this meet the spec".

Turns out once you have a formal specification, you can do all sorts of things with it that you couldn't do with just the code sitting alone. You can prove properties about your specification. For example, seL4. Even the abstract specification is quite big and can be hard to reason about in complex cases. How do you know that spec is the one you want? Well, they wrote down some security properties they'd like it to have, and then proved that the abstract spec has those properties (eg, you can configure it to act like a separation kernel; it enforces isolation given an isolated initial condition; it enforces integrity).

How do you know those are the right properties? Well, that's the challenge of software validation isn't it :) Ultimately formal methods doesn't make that problem much harder. You just need to figure out a way to formally phrase the properties you want. A lot of software has very fuzzy validation criteria, and isn't necessarily suited for formal verification for that reason.


In addition to other good answers, there is also the fact that verification doesn't necessarily have to be "deep": "shallow" verification of only some security properties can also be very useful. For example, it may be hard to verify the "correctness" of a web server since the specification of what a web server should do is quite complex.

But (for example, for a web server written in Ada or C) you can also do without a specification and just apply verification tools to only check things like "all memory accesses are valid", proving that you have no buffer overflows that would cause denial of service or security holes.

Of course someone still has to write the verification tool correctly to ensure that it actually gives the safety guarantees you want, but at least it's not you who has to worry about that ;-) And specifying the absence of buffer overflows is comparatively simple.
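
In a refinement-typed language like F*, the same kind of shallow property can be pushed into the types. A minimal sketch, written from memory and untested, assuming FStar.Seq's usual length/index interface:

    module BoundsDemo
    open FStar.Seq
    (* illustrative only: index demands a proof that the position is
       in bounds, so an out-of-bounds access is rejected at
       verification time instead of being caught (or missed) at run time *)
    let head (#a:Type) (s:seq a{length s > 0}) : a = index s 0
The refinement on s plays the role of the "all memory accesses are valid" check, without ever specifying what head actually computes.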


cmrx64 addressed the validation well. I'll add that there are formal specification tools for requirements (the What), such as Software Cost Reduction and Abstract State Machines, that have helped there. The trick with those is that they help you be precise about the environment, inputs, and outputs in a way human reviewers can understand. So the reviewers think about and discuss the spec, sharing their thoughts in English and in the formalism, to spot problems or even improvement opportunities. The main gotcha there is ensuring the English and the formal specs don't diverge.

As far as verified stacks go, the throwaway linked the basic concept, but here are two works (old and new) on verified provers so you can see it in action:

https://www.cs.princeton.edu/~appel/papers/flit.pdf

https://www.cs.utexas.edu/users/jared/milawa/Web/

The former was early work by Appel et al on getting the TCB of provers down to a few thousand lines of code. The other uses composable logics and a small implementation that itself uses the verified tooling of Myreen et al, whose research covers a lot of angles:

https://www.cl.cam.ac.uk/~mom22/research.html

Most of their work is done in Isabelle/HOL, which HOL/Light is aiming to verify in a lightweight way. The idea is that you just have to trust the code of the kernel (hundreds to thousands of lines) that checks proof terms, plus the specs that are fed into it. They're doing everything they can in the proof assistants so the minimal checkers will be able to confirm the operations.


> How do you verify the verification?

https://mathoverflow.net/a/18436


Why didn't refinement types find their way into mainstream languages yet? It seems like it would have a tremendous impact on how programs are written and tested.


I find it even more frustrating that some mainstream languages accept all the complexity of dependent typing while restricting the benefits by hiding it deep within an ad-hoc construction (C++ and Scala are both guilty of this).


Genuinely curious; can you elaborate which benefits the C++ implementation is missing out on?


C++ and Scala don't really have dependent typing, but rather a single feature that is typical of dependent typing. That is: type-level values of any type.
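
For a taste of the real thing, here is the length-indexed vector from the F* tutorial, reproduced from memory, so treat it as an unchecked sketch:

    type vector (a:Type) : nat -> Type =
      | Nil  : vector a 0
      | Cons : hd:a -> #n:nat -> tl:vector a n -> vector a (n + 1)
    (* the length lives in the type: this only checks as a vector int 2 *)
    let two : vector int 2 = Cons 1 (Cons 2 Nil)
Here the index is an ordinary value, and the compiler (together with the SMT solver) checks the arithmetic on it.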


Both Scala and C++ have a set of type-system features that, when combined, allow you to create functions whose return type can depend on the value of the passed argument. For a good step-by-step of how to do this in Scala, see here[1]; a (unfortunately less in-depth) C++ example is here[2].

[1]: http://gigiigig.github.io/tlp-step-by-step/dependent-types.h...

[2]: https://www.reddit.com/r/cpp/comments/2tilfw/dependent_typin...


I am not familiar enough with Scala, but the C++ example certainly does not seem like dependent types.

Which function in there do you consider to have a return type that depends on the value of the input?


Doesn't C++ only have type-level integers (as template parameters)? Are you referring to something else?


I haven't used C++ in years. That may be true.

In that case, it may be even more puzzling to claim that C++ has dependent types.


Statically checking refinements still only works so-so, that is, what do you do when the SMT solver times out? Writing manual proofs (or proof fragments) is great for special-purpose languages used by highly trained people, but it's not quite ready for the mainstream yet.

Checking refinements at run time is considered "too expensive" by many people.


I use the Refined library for scala and it works incredibly well (a few bumps here or there, and some occasional boilerplate for library interop). On my app server, it basically becomes my validation layer for all request data, after which I only worry about object ownership/authorization.

The big win with refined types comes from compiler-level support, though. With relatively simple checking occurring mostly at IO boundaries, you can eliminate or make obsolete massive portions of testing, type checking, and manual optimization, and enable massive amounts of automated optimization. In other words, the "too expensive" refinement checking at runtime becomes a net performance and engineering benefit over the whole product lifecycle.


Runtime refinements are actually useful for places where you'd have checks anyway. E.g. in Haskell, this library http://nikita-volkov.github.io/refined/ — can be integrated with JSON: https://github.com/myfreeweb/magicbane/blob/7c417504590fc40e... — and you get input validation specified right in the types of your data structures.


Because major organizations only want replaceable developers, as cogs in the enterprise machine.

Better software quality is irrelevant unless they get fined, as happens in the medicine or aeronautics industries.


If anything, I would say refinement types make developers more replaceable. If Developer-1 documents all assumptions in the code using refinements, then, when they are replaced with Developer-2, the new developer can be productive more quickly.


I agree with you, but most managers don't.

They only care about cost centers, Excel reports and who delivers at lower cost per seat.


To an extent. My feeling is that developers themselves, in general, don't want to learn anything more complicated than they can pick up from a three-month bootcamp and certainly don't want to take on the responsibility for their errors.


Those "major organizations", with the exception of Google and Apple, aren't spearheading language design.

Their point of view has no impact on the evolution of language paradigms, and if you told them they could save money on QA and increase the reliability of their product for virtually no overhead with new paradigms, I don't see why many people would reject the idea.


Google, as in Go?

"Type system tyranny", page 8 - https://talks.golang.org/2009/go_talk-20091030.pdf

"Programmers working at Google are early in their careers and are most familiar with procedural languages, particularly from the C family. The need to get programmers productive quickly in a new language means that the language cannot be too radical."

https://talks.golang.org/2012/splash.article


This is a convenient explanation and you will find a lot of evidence for it, but I think it's superficial.

The real question is, why can they afford to take this approach without being outperformed by companies that use better languages and better developers?

For those of us who enjoy exploring more powerful/innovative languages, there is a convenient explanation and a not so convenient explanation.

The convenient one is that the success of these "Blub companies" is largely determined by factors unrelated to software development.

The inconvenient one is that all these interesting and powerful language features do not significantly improve software development outcomes.

I think it is remarkable that some of the companies that are on the Blub side when it comes to software development are among the most cutting edge, research oriented companies in other regards, such as AI.

Also, companies like Google can not only choose among the best candidates, they also have a lot of influence on what people learn before they even apply for a job there. So I simply don't buy the familiarity excuse, at least not for the entire workforce.

I also don't buy the large existing codebase excuse as that would apply equally to new Blub languages and new PL research based languages.

There is simply very little faith in any positive impact of PL research innovations on software development outcomes. And this lack of faith cannot be explained away by claiming that all those lacking faith are simply incompetent.

The likes of Google and Microsoft can and do hire competent PL researchers, just as they hire competent AI researchers. And they do have a massive incentive to improve software development quality and productivity.

I think the truth is that PL researchers simply haven't made the case for the effectiveness of their work.

In part because it's extremely difficult to make that case empirically. All the empirical studies I have seen suffer from huge and mostly unfixable confounders.

But I think another reason is that PL researchers seem to largely ignore cognitive science and sociology. I haven't seen much discussion about the impact of particular PL features on developers' state of mind in any realistic context.

Nor have I seen much debate about programming languages as group communication tools or about the different roles in which people interact with code as part of a software development process. All of that is apparently considered out of scope.

And that, I think, is the reason why it is so easy for many practitioners to dismiss PL research out of hand. Even where PL research is being picked up decades later it appears more fashion driven than evidence based.


If we look at blub engineering across mankind's history, impressive pieces of work were built by resorting to thousands of workers, each doing his or her little part over several years, even generations.

Yet nowadays we would most likely use heavy construction machines.

Niklaus Wirth admits in one of his papers that he expected developers to pick up Oberon because they cared about tools for quality in software engineering, but he was wrong about it.

Hoare has a similar remark on his Turing award speech.

So from my point of view it is a mix of sociology, fashion driven development and office politics.


I may misunderstand your statement, you seem to be saying Google and Apple ARE spearheading language design?

If so, you must be very confused about what they are doing with Go, Swift and Dart if you still consider that a thing.


I should have expanded a bit more. They are spearheading the design of a language or a few languages, and thus have skin in the game. I didn't mean to imply that they are spearheading language design in general.


Doesn't Ada have something like this?


AFAIK Ada's "refinement types" are essentially ranged integer types, i.e., you can define

    type Int is range 0 .. 100;
and it will be impossible to store any number outside that range in a variable of this type. But you can't define more general properties than this.
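
For comparison, refinement types in the F* sense let the predicate be an arbitrary proposition rather than just a range; a rough, unchecked sketch:

    type pct = x:int{0 <= x /\ x <= 100}
    (* the predicate can be anything Z3 can discharge, not only an interval *)
    type ordered_pair = p:(int * int){fst p <= snd p}
    let clamp (x:int) : pct = if x < 0 then 0 else if x > 100 then 100 else x
The obligation that clamp's result really lands in 0 .. 100 goes to the SMT solver instead of becoming a run-time check.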


Pascal also has this, but i remember some Ada-related post from a few months ago here that had code along the lines of (imaginary syntax, i don't remember the exact code):

    type Int is range 0 .. 100 check(v)
      if (v = 10) or (v = 50) then error;
    end;
That would allow all values of v between 0 and 100, except 10 and 50. Note that everything after (and including) "check" is something i came up with (and is more Pascal-ish), not what i am talking about. I also remember that this could be used with objects so that you could have something like (this one is Object Pascal-ish, i do not know Ada):

    type Foo = object
      Enabled : Boolean; Magnitude : 0..1000;
      check 
        if (Enabled and (Magnitude=0)) or
           (not Enabled and (Magnitude > 0)) then error;
      end;
      procedure Bar;
    end;
The code in check would be called whenever the object changes externally. So "Bar" can change Enabled and Magnitude, with the check being invoked only at the exit of Bar to ensure object consistency, not whenever Bar does an assignment (so it can do Enabled:=False; Magnitude:=0; and not fail at the first assignment). But assigning those values directly from outside the object would trigger it (so having a var f: Foo; and then writing f.Enabled:=True; would trigger the error).

Or at least that is what i remember understanding.

Maybe i remember some other Pascal-styled language, but the only other such language i know of is Oberon, and that goes against the "put everything imaginable, plus extra more" mentality that Ada seems to have :-P.


You might be thinking of type predicates, added in Ada 2012: http://sworthodoxy.blogspot.fr/2015/08/ada-2012-type-invaria... Some of them are statically provable in SPARK: http://www.spark-2014.org/entries/detail/spark-2014-rational... (if interested, there's a link down there with a 'rationale' pdf).


In general Ada, those are only checked at runtime, IIRC.


Types are checked by name, even if you have the same underlying range of values. But some checks can only be done at runtime.


Kind of, you can achieve something similar with SPARK.


I've always dreamed about an ML language with verification support and dependent types. Does it have something like Haskell's typeclasses or OCaml's modular implicits? Does it support ad-hoc polymorphism in any way?


It's not intended to be a general purpose language, and as such it doesn't have a lot of features that make software engineering easier. In fact, it has a phenomenally slow compiler more or less by design, because the compiler is solving, or attempting to solve, various NP-complete problems simultaneously.

Think of it as a language that will help build tools, analyzers, and possibly linkable libraries for small scale but highly security-critical components, but not necessarily the language you want to use to build an app server.


Although, if you don't use any refinement types in your program, I don't see why you couldn't write general purpose programs and extract code to OCaml and F#.


Lean is quite a nice dependently typed language https://leanprover.github.io/


What do you think of Idris?


Idris certainly is nice but I doubt anyone uses it for real stuff.


How does this compare with other verification tools such as TLA+ or Coq?


Compared with Coq, F* makes it easier to write programs with side effects, i.e., actually mutate variables and memory. There is a language fragment which amounts to programming in a subset of C (and this is used for generating C code for the verified crypto stuff in Project Everest mentioned in other comments).

Also compared with Coq, proofs rely heavily on SMT solving with Z3 instead of lots of manual tactics (yay!). On the other hand, where SMT solving fails, you can not resort to manual tactics because there is no tactic language. You have to write proof terms without interactive support. (This may have changed in the last few months, but I'm almost sure it hasn't.)
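
To give a flavor of that style, the kind of toy example below (written from memory, so treat it as an untested sketch) is typically discharged by Z3 with no proof script at all:

    val max : x:int -> y:int -> r:int{r >= x /\ r >= y /\ (r = x \/ r = y)}
    let max x y = if x >= y then x else y
    (* no tactics: the solver unfolds max and does the case split itself *)
    val max_comm : x:int -> y:int -> Lemma (max x y = max y x)
    let max_comm x y = ()
In Coq, the analogous lemma would usually need at least a short tactic script.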

I wrote a bit about my experiences trying to prove things in F* two years ago: https://news.ycombinator.com/item?id=10984306

Don't rely on any of that though, things will have changed in the meantime!

I'm optimistic about F*, but at the moment you have to be quite determined to try to use it for fully verified programming. The same is true for Coq, though.


F* now has tactics. It is very new, but no doubt it will evolve a lot... https://github.com/FStarLang/FStar/tree/master/examples/tact...


Cool! Can you develop them interactively in Emacs, like in Coq, observing how individual tactics change the proof state?


Sadly not yet. You can see each individual intermediate state, but the thing runs to completion in one go. We certainly plan to add such a feature though!

We've been mostly studying hammer-like tactics to, say, apply some theorems at specific places in the goal (which the SMT might be bad at), and not for chaining small rewriting or logical steps (which the SMT is good at).


Which language, do you think, is most practical for writing formally verified programs?


Probably the four most well-known pieces of software with end-to-end verification of functional correctness are the CertiKOS microkernel from Yale (verified in Coq), the seL4 microkernel (verified in Isabelle), the CompCert C compiler (verified in Coq), and the CakeML ML compiler (verified in Isabelle).

Stopping short of full end-to-end functional correctness, there are a lot more notable examples, such as the Verve system at MSR (verified type/memory safety using Boogie), the Bedrock system at MIT (functional correctness down to a fairly low-level IR, done in Coq), the miTLS/Everest TLS stack (protocol correctness verified in F-star, most of the way down to a C-like implementation), and the Iris project (a toolkit for verifying higher-order concurrent programs, currently being used to formalize Rust including unsafe, in Coq).

In machine-checked proof, the three biggest successes are the verification of the Odd Order Theorem (in Coq), the Four Colour Theorem (in Coq), and the Flyspeck project verifying the proof of the Kepler conjecture (in HOL).

In hardware verification, there has been a huge amount of work using ACL2 and HOL, but I don't know this space as well.

Basically, Coq and Isabelle have the most actual successes, but other things like F-star, Boogie and HOL have all worked as well.


CakeML was verified in HOL (https://hol-theorem-prover.org/), not Isabelle.


Yes, you're right. Thanks for the correction!


Magnificent answer. Thank you.


In addition to the great sibling comment by neel_k: It depends. I don't think that currently there is the one system that is uniformly best.

For example, I wouldn't want to try to write the kind of fiddling with byte arrays involved in cryptography in Coq. F* is certainly better for that. I imagine Dafny is better for it, too. Possibly Why3 as well. Or, if that is really what you want, you might even go for "real" programming languages (ones that exist independently of verification frameworks) such as Ada with SPARK/Ada or C with Frama-C.

If you want to verify purely functional programs, Coq is pretty good, but proving is often boringly manual. Also, the language is more influenced by the OCaml branch of the ML family than by Haskell; for Haskellers there are Idris and Agda. Then there is Isabelle/HOL, which has a lot of proof automation but a bizarre logic.

Oh, I haven't even mentioned Lean. Well. There is really no easy answer: You have to try to get acquainted with several of the frameworks, ideally with an idea for what your application will be. And whichever you choose, it will be hard.


Having played with dependently typed languages (ATS, Idris) some, and with verification tools (SPARK/Ada, Frama-C/C), I'd like to go a little farther and say that the verification tools are much more like "normal programming" rather than building a proof with the side effect of creating a program.

https://maniagnosis.crsr.net/tags/applied%20formal%20logic.h...


In what way is HOL bizarre?


First, just to clarify, I wrote "Isabelle/HOL", and I'm not claiming anything about any of the many many other systems that have "HOL" in their name. I don't know those.

That said, I think the thing that annoyed me most was the different language levels: "assume P show Q" and "P ==> Q" are both implications, but somehow different. I never quite understood why and how, but there seem to be cases where only one can be proved but not the other. Similarly for different kinds of universal quantification. Your interpretation of "bizarre" may very well vary, but I found these unintuitive and not explained well in the documentation that I found.

(Don't ask for anything much more concrete than this. It's been a while, and I've forgotten the details.)


Fair enough :)

FWIW the differences are largely syntactic. Some of them are due to implication or quantification being at the meta-level (Pure, ⋀/⟹) vs the object-level (HOL, ∀/⟶). Transporting between them (for HOL) is easy and the automation tends to normalize object-level quantification and implication into meta-level.

"assume P show Q" is just Isar-syntax for a structured proof of "P ==> Q" (which is a term).


For small modules, SPARK with this book, going by the limited feedback I have:

https://www.amazon.com/Building-High-Integrity-Applications-...

Verification of larger programs is hard for a lot of reasons. The easiest method is to do Design-by-Contract in a memory-safe language. Anything easy to specify in contracts goes there, while stuff that's easier to test for is handled that way. If your language doesn't support it, you can do it with asserts, conditionals in functions, or constructors/destructors in OOP.

http://www.eiffel.com/developers/design_by_contract_in_detai...

Then you hit the app with tests generated from the contracts and with fuzz testing, both running while the contract conditions have runtime checks in the code. That means a failure will take you right to the precondition, invariant, or postcondition that is broken.
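
For what it's worth, the same kind of contract can also be pushed to verification time in a language like F*. A rough sketch from memory, with a made-up withdraw example:

    (* hypothetical example: pre/postconditions as a machine-checked contract *)
    val withdraw : balance:nat -> amount:nat -> Pure nat
      (requires (amount <= balance))
      (ensures (fun remaining -> remaining = balance - amount))
    let withdraw balance amount = balance - amount
Callers that can't prove the precondition get rejected statically instead of tripping the runtime check.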

For analyzing distributed stuff, TLA+ is your best bet with an accessible tutorial for a subset available here:

https://learntla.com/introduction/


TLA+ is a specification language, not a verification language. There is currently no way of establishing correspondence between a TLA+ spec and the code you (presumably) write to implement that spec. There are benefits to this, like using TLA+ to spec the difficult parts of your system without having to buy the whole formal verification farm.


Not just that. F* is convenient for code-level specification, much like Ada SPARK, JML/OpenJML for Java, and ACSL/Frama-C for C, but it's quite hard to specify global correctness conditions in it, like "the database is always consistent", or "no data is ever lost", which is where TLA+ shines.

The gap between local, code-level correctness properties and global ones is a hard one to cross. I am not aware of any tool that can do both (i.e. end-to-end verification) in a way that isn't excruciatingly hard.


And not just _that_, but TLA+ is really good at expressing liveness conditions, like "the database is EVENTUALLY consistent" or "the algorithm will eventually terminate." Most other formal methods don't cover that.


Right; not out of the box, anyway (many proof assistants are powerful enough for deep embedding of arbitrary logics, but, of course, using them is beyond the affordable reach of nearly all software shops). Many can express termination out of the box, which, in theory, can suffice to express most (all?) liveness properties, certainly many useful ones, but again, this is expressed at the function level, and hard to do for global liveness properties.


F* is more geared towards writing verified but otherwise general purpose code directly. Coq is first and foremost a proof assistant, though there are a number of tools that can help you wed it to other programming languages.


One difference with Coq is that in Coq everything is constructive: a proof of a precondition is an object and has to be constructed. In F* you could use classical logic; things are proven by a solver. (I might be wrong.)


You can add the law of excluded middle in Coq like you can in F*, but it would put you at odds with almost all Coq code.


In terms of buzzword bingo, F* is as close to my ideal programming language as I can think of, but I do wonder if it'll ever get the tooling support to make it more attractive as a general purpose language.


What is it missing that something like SML has? Why couldn't I just switch from another language to this today, assuming I don't care about IDE support for it?


If you're interested, and you don't care much about tooling, go for it:

https://github.com/FStarLang/FStar/blob/master/INSTALL.md


...it's unfortunate the name kinda looks like a censored bad word


It's compatible with Coq


What is it with this trend of giving languages unsearchable names?


This actually is a real problem. Searching for "C#", "F#" or ".NET" breaks a lot of search engines and you get nothing useful. If I had to name something now I would go for a relatively unique name like "Clojure", "TypeScript" or "Haskell".


Searching for "f*" gives the correct link as the first result on Google. Do you use Yahoo or some other search engine?


I am not the other person, but when I try to find a JSON decoding module for F* with "f* json", I get almost completely unrelated results. So either there is no JSON codec for F* (I guess they could use the F# one) or Google can't find it.


you would just type "fstar json" instead, just like you'd type "fsharp" instead of "f#"


DDG (which I use) doesn't.


It's only a problem if you use it.


I agree, I used to use R and found it hard to google for.


http://rseek.org/ is a specialized R search engine, only necessary due to the name


Making another new language with such a short un-search-enginable name is not a good idea imo.


Well, type "f*" into Google and it's the second result ... Why Facebook is first, I have no idea.


I believe that Google might have created a rule for that, because when you type this in Yahoo [1] (or I believe even Solr, ElasticSearch or any of the other full text based database search engines) you won't find this as a result.

1. https://search.yahoo.com/search?ei=UTF-8&fr=crmas&p=f*



