
> It does, actually, if the integer is large enough: (snippet)

Wow, this is even more broken than I previously thought.

> Lisp is not Haskell.

Haskell is broken too, only in a different way: Since when does it make sense to have a value that's simultaneously a number, a string, a function and pretty much anything else imaginable? (`undefined`)
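For readers unfamiliar with the complaint, a minimal sketch (the value names here are illustrative): `undefined` type-checks at every type, and the error only surfaces when something forces the value.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- `undefined` inhabits every type: number, string, function, anything.
n :: Integer
n = undefined

s :: String
s = undefined

f :: Int -> Int
f = undefined

main :: IO ()
main = do
  -- The program compiles fine; forcing any of these values fails at runtime.
  r <- try (evaluate n) :: IO (Either SomeException Integer)
  putStrLn (either (const "n is bottom") show r)
```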

> But again, in practice, it's not as much of a deficiency as you seem to think.

It causes difficulties when you want to reason rigorously about your code.




> It causes difficulties when you want to reason rigorously about your code

That's what he said. In practice it doesn't matter.


> In practice it doesn't matter.

I reason rigorously about my code in practice.


I still haven't fully grasped what aspect of Haskell means you can't rigorously reason about your code.

I'm pretty sure you don't mean space leaks, and I'm pretty sure you don't mean global coherence of type classes, which are the two most popular complaints about reasoning about Haskell.

Other than that, can't you just say "this piece of code is at least as defined as the equivalent in a strict language" and leave it at that? What can go wrong?


> I still haven't fully grasped what aspect of Haskell means you can't rigorously reason about your code.

Oh, that particular criticism wasn't aimed at Haskell, but rather at languages without compound values. And I never said you can't reason rigorously about code written in such languages. You can, but it isn't very pleasant.

> I'm pretty sure you don't mean global coherence of type classes, which are the two most popular complaints about reasoning about Haskell.

Far more serious one: Haskell doesn't have a type whose value inhabitants are the integers and nothing else. I would like to say “even Lisp has that”, but nope, it doesn't either: https://news.ycombinator.com/item?id=12328095


> Haskell doesn't have a type whose value inhabitants are the integers and nothing else

Sure, I get that, but I don't get the practical consequences for reasoning. In Haskell, if you write

    myIntFunction !x = expression
then within expression x is an integer and nothing else, the same as if you'd written

    let myIntFunction x = expression
in OCaml. So I understand how it differs theoretically. I don't understand how it differs practically.

What's an example of something you wanted to prove in Haskell but couldn't, because it wasn't strict?


> Sure, I get that, but I don't get the practical consequences for reasoning.

(0) In a language without compound values, you need to manually prove that object states faithfully represent the compound values you care about.

(1) In a dynamically typed language, you need to manually prove that computations produce the classes of values that their continuations expect.

(2) In a lazy language, you need to manually prove that variables will never be substituted with nonterminating computations.
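To make (2) concrete, a small Haskell sketch: even a law as simple as `x - x == 0` carries a hidden totality side condition in a lazy language, because `x` may be bound to a diverging computation.

```haskell
import Control.Exception (SomeException, evaluate, try)

main :: IO ()
main = do
  -- Fine: x is a total value, so the law holds.
  print ((5 :: Int) - 5)
  -- Not fine: when x is bottom, x - x is bottom, not 0.
  r <- try (evaluate ((undefined :: Int) - undefined)) :: IO (Either SomeException Int)
  putStrLn (either (const "bottom, not 0") show r)
```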

> What's an example of something you wanted to prove in Haskell but couldn't, because it wasn't strict?

All the usual type class laws, without the pesky addendum “assuming no free variables are ever substituted with bottom, globally”. (In all fairness, strictness alone doesn't fix the problem either. At least not when laws have free variables of function type.) Redefining what variables can be substituted with is redefining the language's operational semantics, which in turn is effectively creating a new programming language. You would have to prove all the usual things from scratch: that type checking is decidable, that the type system is sound, etc.
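One standard illustration (not the only one): `seq` lets a program distinguish a function from its eta-expansion, so even the law `f = \x -> f x` only holds under a "no bottom" side condition.

```haskell
import Control.Exception (SomeException, evaluate, try)

f :: Int -> Int
f = undefined          -- bottom at function type

g :: Int -> Int
g = \x -> f x          -- eta-expansion of f; a lambda is already in WHNF

main :: IO ()
main = do
  -- seq forces its first argument to WHNF: forcing f diverges,
  -- but forcing g succeeds, so seq observably separates f from g.
  rf <- try (evaluate (f `seq` "forced f")) :: IO (Either SomeException String)
  rg <- try (evaluate (g `seq` "forced g")) :: IO (Either SomeException String)
  putStrLn (either (const "f is bottom under seq") id rf)
  putStrLn (either (const "g is bottom under seq") id rg)
```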

In practice, I can still prove the things I want, but the proofs themselves are less modular in a lazy language. The problem manifests itself when you build large computations from smaller ones:

(0) In a strict language, whether any of the subcomputations terminates is a local issue, and doesn't affect the other subcomputations.

(1) In a lazy language, failure of one computation to be productive may be observed as a termination problem in a different computation.
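A small sketch of (1): `filter` is locally innocent, but when it fails to be productive on an input, the nontermination is observed inside the consumer, `take`.

```haskell
import Control.Exception (evaluate)
import System.Timeout (timeout)

main :: IO ()
main = do
  -- Productive producer: the consumer terminates.
  print (take 3 (filter even [1 :: Integer ..]))
  -- Non-productive on this input: filter never yields an element,
  -- so the termination failure shows up inside `take`.
  r <- timeout 100000 (evaluate (take 1 (filter (< 0) [1 :: Integer ..])))
  putStrLn (maybe "consumer diverged" show r)
```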


I don't.


> Wow, this is even more broken than I previously thought.

I don't agree [#]. Providing a predicate that tests for identity of implementation enables a useful class of optimizations. For example, if you know two sets are implemented by the same object, you don't need to compare their contents to know they're equal. The fact that the information can be misused doesn't make it useless. Non-leakiness of abstractions isn't an unalloyed good.

In fact, because all computations take time and space, no abstraction can be perfectly leak-free.

[#] Except if you want to make the point that the Lisp builtin function currently called 'eql' should have been called 'eq', and the function currently called 'eq' should have been called 'identical-objects-p' or some such long and explicit name. Formally this makes no difference, of course, but as a practical matter it makes erroneous usage of the latter much less likely and easier to spot.

> It causes difficulties when you want to reason rigorously about your code.

In practice, these are relatively minor. Consider for example my FSet collections library. One of the preconditions in the contract of each FSet operation is, of course, that the invariant required for correct operation holds on the object(s) on which the operation is invoked. Suppose we have proven, for each operation, that it correctly implements its contract: if the invariant holds on the argument(s), it holds on the return value.

So now we have an application, a complete program that uses FSet as a library. Among the many things we need to prove is that the preconditions of each FSet operation used in the application are satisfied. Because we've already proven that the FSet operations maintain the required invariant, we need to prove that nothing else in the program breaks that invariant. In a language with unbreakable access control, this would be trivial: no other code has access to the internals of these data structures. In Common Lisp, we have a little more work to do. We have to show that no other code invokes the FSet internal accessors by name; a few calls to 'grep' suffice for this. Then we have to show that there's no chance the code uses Lisp's reflective features to invoke these accessors indirectly: specifically, that no calls to 'intern' exist in the program that could return a symbol that names one of these accessors. This is probably easy. If the program calls 'intern' at all, it probably passes a constant string as the name of the package to intern into, so all we need to show is that that string is not equal to "FSET". If the package name is not constant, this starts to get a little harder, but (a) this is a rare case, and (b) if we can't easily prove that the possible package names come from a small, restricted set, the program probably has a security vulnerability and needs to be fixed anyway.

That's nonzero effort, I concede. But on the scale of all the things we have to do to verify our application, it's fairly trivial.

By way of contrast, let's look at what happens in languages where the default integer type has bounded precision. Now, to figure out the range of inputs over which some procedure satisfies its desired contract, we have to work backwards from each integer operation, looking at the range of values that would cause it to overflow, and figuring out what constraints that places on the inputs. This task can be quite difficult in complicated cases, and even in simple ones, the need to do it is easily overlooked (as shown by a famous example [0]).
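The famous example in [0] is the overflowing binary-search midpoint; a minimal sketch with fixed-width `Int` (assuming a 64-bit platform):

```haskell
lo, hi, midBad, midGood :: Int
lo = maxBound - 1
hi = maxBound
midBad  = (lo + hi) `div` 2        -- textbook formula: lo + hi silently wraps
midGood = lo + (hi - lo) `div` 2   -- repaired formula: stays in range

main :: IO ()
main = do
  print midBad    -- negative, because the sum overflowed
  print midGood   -- maxBound - 1, as intended
```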

I note that even your beloved SML doesn't mandate arbitrary-precision integers [1]. I can't even find, on its Web site, anything saying whether SML/NJ has them or not. Perhaps they are now considered a de facto requirement for a usable SML implementation -- in which case, you see, culture matters.

Look. Formal verifiability is a concern of mine too, and I understand where you're coming from. But this is not a black-and-white world, and practical considerations are also relevant. While language (mis)features do sometimes contribute to the difficulty of verification, most of that difficulty, I strongly suspect, comes from the complexity of the code itself. We could verify machine language programs if we wanted to, and though it would be harder, I can't believe it would be overwhelmingly harder.

[0] https://research.googleblog.com/2006/06/extra-extra-read-all...

[1] http://sml-family.org/Basis/integer.html


> I don't agree [#]. Providing a predicate that tests for identity of implementation enables a useful class of optimizations. For example, if you know two sets are implemented by the same object, you don't need to compare their contents to know they're equal. The fact that the information can be misused doesn't make it useless. Non-leakiness of abstractions isn't an unalloyed good.

You can have identity equality in Standard ML (resp. Haskell) too, it's just a matter of using the right types: a `foo` (resp. `Foo`) has structural equality, but a `foo ref` (resp. `IORef Foo`) has object identity equality. More precisely, equality is always structural, and identity equality is the right notion of structural equality for reference cells. So absolutely nothing is lost w.r.t. what you have in Lisp or Python. And you gain actual compound values.
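In Haskell terms, a sketch of that distinction: the `Eq` instance for `IORef` is object identity, while plain values compare structurally.

```haskell
import Data.IORef (newIORef)

main :: IO ()
main = do
  a <- newIORef (0 :: Int)
  b <- newIORef (0 :: Int)
  -- Structural equality on ordinary values:
  print ([1, 2, 3] == [1, 2, 3])  -- True
  -- Identity equality on reference cells: distinct cells with
  -- equal contents are still different objects.
  print (a == a)                  -- True
  print (a == b)                  -- False
```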

> In practice, these are relatively minor. Consider for example my FSet collections library. One of the preconditions in the contract of each FSet operation is, of course, that the invariant required for correct operation holds on the object(s) on which the operation is invoked. Suppose we have proven, for each operation, that it correctly implements its contract: if the invariant holds on the argument(s), it holds on the return value.

Have you ever actually proven things about code? It's far from a “minor thing” when every operation has unchecked preconditions that you need to manually verify every time you call a function. The only reason I can prove things about my code at a reasonable scale is that I arrange things so that I have to manually prove relatively little; the rest follows automatically from type safety and parametricity.

> (next paragraph)

You use the word “probably” a lot. An argument that relies on “probably” can't convince me that a proposition is true in all cases.

> I note that even your beloved SML doesn't mandate arbitrary-precision integers [1].

Hand-rolling big integers using machine integers as the starting point is easy in Standard ML. The final step is very important: Using abstract types, I hide the representation of big integers from the rest of the program. As far as clients care, they are using big integers, not lists of machine integers, vectors of machine integers, or whatever. Unfortunately, this final step is impossible to perform in Lisp.
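A Haskell sketch of the same construction (toy base-10 limbs, and the names are hypothetical): in a real library the module's export list would omit the `BigNat` constructor, so clients see only the abstract type and its operations.

```haskell
-- In a real library:  module BigNat (BigNat, fromInt, toInt, add) where
-- Exporting `BigNat` without its constructor hides the representation.
newtype BigNat = BigNat [Int]   -- little-endian base-10 digits (toy base)

-- Assumes a non-negative input.
fromInt :: Int -> BigNat
fromInt 0 = BigNat [0]
fromInt n = BigNat (go n)
  where
    go 0 = []
    go m = m `mod` 10 : go (m `div` 10)

toInt :: BigNat -> Int
toInt (BigNat ds) = foldr (\d acc -> d + 10 * acc) 0 ds

-- Ripple-carry addition over the digit lists.
add :: BigNat -> BigNat -> BigNat
add (BigNat xs0) (BigNat ys0) = BigNat (go 0 xs0 ys0)
  where
    go 0 [] [] = []
    go c [] [] = [c]
    go c xs ys =
      let (x, xs') = uncons' xs
          (y, ys') = uncons' ys
          s        = x + y + c
      in s `mod` 10 : go (s `div` 10) xs' ys'
    uncons' []       = (0, [])
    uncons' (d : ds) = (d, ds)

main :: IO ()
main = print (toInt (add (fromInt 999) (fromInt 1)))  -- 1000
```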

> I can't even find, on its Web site, anything saying whether SML/NJ has them or not.

I don't use SML/NJ, because it has a lot of broken extensions that are difficult to reason about, and it uses a questionable external language to automate builds, rather than SML itself. Instead I use Poly/ML and Moscow ML. Both have big integers.

> Perhaps they are now considered a de facto requirement for a usable SML implementation -- in which case, you see, culture matters.

Big integers are a great help, but they aren't considered a de facto requirement. The only requirements are those in the Definition of Standard ML and the Standard ML Basis Library. As I said above, using abstract types, you can roll your own non-leaky big integer abstraction.

> While language (mis)features do sometimes contribute to the difficulty of verification, most of that difficulty, I strongly suspect, comes from the complexity of the code itself.

And the complexity of the code itself comes from the discrepancy between your problem domain and what you can readily express in your programming language.

> We could verify machine language programs if we wanted to, and though it would be harder, I can't believe it would be overwhelmingly harder.

Well, you are overwhelmingly wrong, unless you hand-wave away a significant chunk of your proof obligation.



