Algebraic Subtyping (2016) [pdf]

rjtobin · on March 3, 2017

Also by Stephen (and previously discussed on HN): https://www.cl.cam.ac.uk/~sd601/papers/mov.pdf

As usual, he injects just the right amount of humour. Eg. this footnote in his thesis:

"*This paper is out of print, in Russian, and quite difficult to get a hold of, so the interested reader is directed to a shorter proof by Conway [Con71, p. 105], which is merely out of print."

myst · on March 3, 2017

It's not difficult to get a hold of tho

http://umj.imath.kiev.ua/article/?lang=en&article=10002

yrhz90 · on March 3, 2017

It looks like that volume was produced after this thesis was submitted.

_0w8t · on March 3, 2017

Or this footnote:

1 Eficient in the same sense as ML: tends to run fast, but with some effort pathological cases can be constructed.

benmos · on March 7, 2017

I don't think that's really humour - that's an important piece of information.

acjohnson55 · on March 3, 2017

From just skimming the intro, this seems like very exciting work. And the formulation of the problem in terms of data flow was really novel to me.

I was surprised that I couldn't find any references to Martin Odersky in the paper, whose work for years at EPFL has been focused on unifying functional and object oriented programming under a sound type system. This work produced the Scala programming language. More recently, he reworked the foundations of Scala using a system he developed called the calculus of dependent object types (DOT) [1]. I couldn't seem to find any references to this in the thesis. It's hard to imagine the author was unaware of this work, but I'm really interested in hearing more about how the two approaches compare.

Anyway, I'm imagining this work could have great applications for things like TypeScript, and maybe even Scala.

[1] http://scala-lang.org/blog/2016/02/03/essence-of-scala.html

ezyang · on March 3, 2017

Scala's type inference is not complete; it is marketed as a way to "omit certain type annotations". One of the reasons why Hindley-Milner is so influential is because you really don't need any type annotations: the inference algorithm always works.

As for DOT https://www.cs.purdue.edu/homes/rompf/papers/amin-wf16.pdf DOT is an explicitly typed calculus (similar to System F), for which no attempts at type inference are made. So... they're not really comparable at all.

acjohnson55 · on March 4, 2017

Thanks for the insight.

So, I suppose the follow-up question would be whether the techniques of the paper would make it possible to reformulate DOT without explicit types. Sorry if that's an ignorant question.

phaedrus · on March 3, 2017

It's an interesting coincidence seeing this paper and talks about data flow, because I'm interested in applying typing to logic programming for a practical system and also came to the conclusion that my type system needs to be aware of data flow.

This is because variables in logic programming are usually individually bidirectional, but may not support all combinations. For instance, consider the logic relation "half(a, b) -> a = b/2 or b = 2a". If you pass in "a,_" it fills in the blank for "b" with the number which "a" is half of. If you pass in "_,b" it fills in the blank for "a" with the number that is half of "b".

So, individually each "argument" to the relation is an input and an output. (What in C# would be called a "ref" arg, as opposed to an "out" arg or undecorated in arg.)

However calling "half( _ , _ )" is not a valid use of this relation, and we would like the type checker to complain if it is used in this way. (What would one expect if both arguments are unset? Should it produce all pairs (n, 2n) in Z? Or perhaps all pairs ((float)0.5 n, (int)n)? An uncountably infinite set of pairs of Reals?)

Clearly just marking individual arguments as "in" or "out" is not enough. The typing has to be more like "if I have this data I can get to that data" and "if I have this other data I can get to something else." I can't figure out a way to represent this in static types, but I believe it could be typed checked with dependent types.

The examples with "select" near the beginning of the paper are not exactly the same thing, but are a similar problem and solution: data flow and the direction of the arrows matters.

richard_todd · on March 3, 2017

Exactly, the first thing I searched for was scala, to see how the paper's ideas relate to scala's approach. I was surprised to see no obvious references. One would think it would be high on the list of existing material for anyone researching this area.

Jweb_Guru · on March 3, 2017

I'm really enjoying the thesis (it's super awesome!), but it does do a very, very bad job of citing related work. For example, it completely fails to cite logical relations literature despite using a bunch of techniques from it.

nuclx · on March 3, 2017

I had Oderski in mind as well when i skimmed this. Likewise I expected a Wadler referenced.

ezyang · on March 3, 2017

The lesson of "algebra before syntax" is one that I have seen before, and it is very gratifying to see it pay off such dividends here.

There is also discussion on LtU here: http://lambda-the-ultimate.org/node/5393

agumonkey · on March 8, 2017

Ha, seems like ltu is having issues; cached version https://webcache.googleusercontent.com/search?q=cache:http%3...

Patient0 · on March 3, 2017

I've only read the introduction so far - but already it has taught me something. As a lay-person I am finding it very readable and accessible.

I'd always thought that subtyping inference was only something you need for OO style languages. But he begins with two very clear, non-OO, motivating examples and a key insight: failing to support subtyping ignores the data flow of the program.

vog · on March 3, 2017

I just had a look at it, and this thesis is really well-written. A beautiful piece of work, with the right mix of concrete examples, abstract concepts, and meta-concepts that describe why and how the author developed their concepts. Can't wait to read more of it.

xyzzy123 · on March 3, 2017

Lots of positive comments. Given that I'm an idiot, can anyone summarise why this is interesting?

My view is that this extends the work possible with the type system at compile time (e.g. in an integer sense, which would be pretty cool if you scoped it as that) but doesn't extend to "real" numbers or byte manipulation of data.

Like do I need this level of structure in my data, and am I being short-sighted by thinking of it as constraining?

Perfectly happy to get smashed to heck in exchange for being enlightened :) Thanks!

EDIT: sorry in advance for the degree of my ignorance.

tomp · on March 3, 2017

(Disclaimer: I have yet to read the paper.) It's interesting because it's both important and hard.

It's important because it allows you to infer types in the presence of parametric polymorphism ("generics") and subtyping ("OOP"). For example, consider this function:

  twice(f, x) = f(f(x))
  twice : forall a. (a -> a) -> a -> a             // OCaml type (no subtyping)
  twice : forall a, b <: a. (a -> b) -> a -> b     // "ugly" type with subtyping
  twice : forall a, b. (a -> a & b) -> a -> b      // equivalent to the above, but maybe nicer to see and reason about

In OCaml, the type of this function would be too narrow, so you wouldn't be able to call `twice` with arguments such as `round : float -> int` (assuming `int <: float`). Stephen's previous work [1] infers the third type (`&` means intersection type).

It's hard because subtypes are hard to reason about. Type inference usually works by solving equations; if you have 3 type variables 'a, 'b and 'c, and you know that

  'a == list[int]
  'b == list['c]
  'a == 'b

then you can infer that `'c == int`. This breaks down in presence of subtyping; for example, you cannot use

  'a & 'b == 'c & 'b

to infer that `'a == 'c` (a possible solution is `'a == int`, `'b == int`, `'c == float`).

Type inference in the presence of subtyping is so hard that no programming language does it currently. There are many tricks and approximations - e.g. Scala does type propagation (if it knows the types of function parameters, it can infer the types of most other variables), and bidirectional type inference (it can figure out the types of anonymous function arguments), but AFAIK there is no existing implementation of it. Hopefully this paper changes that. Of course, there is an argument to be made that writing types in code makes it more readable, and therefore types should be written - in general, I agree with this argument, but type inference can still be very helpful (e.g. you write a function with complicated types and ask the compiler to fill in the type; or you write a number of tiny helper functions with obvious types).

[1] MLsub https://www.cl.cam.ac.uk/~sd601/mlsub/

naasking · on March 3, 2017

> It's hard because subtypes are hard to reason about.

I think it's more correct to say that a language with both subtyping and parametric polymorphism is hard. Subtyping on its own or polymorphism on its own are pretty straightforward.

kvb · on March 3, 2017

Though note that an even better type would probably be

    twice : forall a,b,c. ((a->b)&(b->c)) -> a -> c

(e.g. in MLSub, the type for

    twice (fun x -> x::[]) 1

is the unsatisfying

    (rec a = (a list | int) list)

rather than the expected

    (int list) list

)

tomp · on March 3, 2017

You're right, interesting example! Although I'd prefer to "fix" this by incorporating first-class polymorphism [1] with MLsub - I guess non-polymorphic functions satisfying the constraint `(a -> b) & (b -> c)` are quite rare (although I'm sure that you can think of an example).

[1] https://github.com/tomprimozic/type-systems/tree/master/firs...

Edit: although on second thought, just first-class polymorphism isn't enough to fix this... any type system where the function `twice` has the same type as the function `three_times` will exhibit the same problem!

kvb · on March 3, 2017

Non-polymorphic functions of the type `a->a` trivially satisfy it, of course. Also, I think polymorphic functions are the most interesting case where you can't use twice as desired in an ML-like language, so it's too bad that despite the fancier type they don't work too well in MLsub. Here's another example that fails when I'd hope for it to succeed:

  let one = twice (fun o -> o.x) { x = { x = 1 } }

Jweb_Guru · on March 3, 2017

It seems that if you try to ascribe the type you wrote to twice you do get (a -> (a & b)) -> a -> b. I suspect this is a direct result of the "extensional" principles used to design the subtyping relation (so that it forms a lattice with nice properties) and I'm not sure whether it would be at all easy to get around. But I could be wrong, I'm still digesting the paper.

(Actually, I think it conceivably might just be a bug in the current typechecker; ascription isn't working for records AFAICT which makes me think they might not be fully baked).

techno_modus · on March 3, 2017

> It's interesting because it's both important and hard.

Is it true that the problem (of type inference) is hard only in the presence of operations or expressions with types? In other words, if a language does not provide operations or expressions with types then it is more or less obvious how to infer types?

ezyang · on March 3, 2017

It makes some things easier, but you give up principality: without things like intersection types, you end up with programs for which there isn't a "best" type to ascribe them. You can see this in the parent's example: without subtyping constraints or intersection types, there's no way to usefully express the subtyping that is possible in this example. It's generally considered a bad thing if your language infers a type which is incompatible with another type you could have manually written, e.g., (float -> int) -> float -> int, that is also a valid type for the expression. We like there to be a best choice for type inference to pick, but if the type language isn't expressive enough, there won't be one!

mrkgnao · on March 3, 2017

It was worth trying:

    class Effable t where
      f :: forall d. Effable d
        => t -> d

    instance intsAreEffable :: Effable Int where
      f x = ???

(This is PureScript, though. Excuse the bad pun.)

xyzzy123 · on March 3, 2017

Thanks, this is the comment I wanted to "win" the thread. I'm still figuring it out.

amenghra · on March 3, 2017

>Type inference in the presence of subtyping is so hard that no programming language does it currently.

Hack does some cute stuff in this area. E.g. the type checker postpones inferring the type of empty arrays.

ezyang · on March 3, 2017

The thesis is about designing a type system that supports both type inference and subtyping. If you program Java but hate writing explicit type annotations all day, the thesis offers the tantalizing possibility of having a type system that supports all the subtyping you want, but at the same time has as convenient and predictable type inference as ML or Haskell. Conventional wisdom is that this is hard; in this respect, the thesis proves a very surprising result.

xyzzy123 · on March 3, 2017

That's not super interesting (to me personally, an acknowledged idiot) because working programmers are hardly ever confused about types. Intellij pretty much already does that thing.

I have this idea that this paper does something super interesting that I'm too stupid to understand.

Is there a correctness angle I'm totally missing where you can constrain subtypes and ... not sure?

EDIT: I watched a thing earlier today where people felt they could reasonably subtype zero away and therefore never think about NAN as in the range or domain of their mathematical functions.

I want to learn more. Probably my degree of ignorance is such that I'm unable to ask the right questions. Can you help? This is probably an unfortunately hilarious question.

ezyang · on March 3, 2017

It's a little difficult to explain why subtyping is so difficult. One answer is that Hindley-Milner inference is actually really simple; simple enough that I can ask my undergraduates to implement in a week long programming assignment. But how to add subtyping? Now there's a difficult question: one you could write papers about. Along the way, there are a number of desirable properties you want to achieve, like decidability, existence of principal types (for any expression you write, there is a "best" type you can give it), compactness of inferred types, support for records and functions... and it's been very difficult to get all of them. This thesis does it.

virtualwhys · on March 3, 2017

Assuming the thesis is "easily" implementable, will a from scratch ML need to be created, or can existing ML based languages all benefit? i.e. implement in the compiler, and voila, subtyping + inference for all MLers.

ezyang · on March 3, 2017

Unfortunately, it is never as easy as "just" putting in a new type system to a preexisting compiler. In the case of ML and Haskell in particular, we are kind of crazy about GADTs, and GADTs and subtyping are known to interact in complex ways. See for example http://gallium.inria.fr/~remy/gadts/Scherer-Remy:gadts-subty... (PDF)

virtualwhys · on March 3, 2017

That was my intuition.

The instant win scenario for ML based languages is quite appealing from this side of the fence (Scala) where it's not all roses wrt to type inference and subtyping.

Maybe even, gasp, SML can rise from the ashes and bring Rob Harper's dream to fruition (which seems to be that developers stop using Haskell); that or someone implements Rossberg's 1ML, or the theory backing this thesis.

All the MLs, including Haskell, have pretty significant tradeoffs, would be great if there was really 1 ML to rule them all.

Ericson2314 · on March 3, 2017

Yeah 1ML is what really benefits from this: You really want subtyping for modules to handle library evolution, and 1ML means if you want it for modules ya better have it everywhere.

Jweb_Guru · on March 3, 2017

Yeah IIRC width subtyping for modules was the only thing they didn't have complete inference for if you didn't use large types, so this is a huge win. Though I'm nearing the end of the thesis and I'm afraid there may be some things about 1ML that don't jive with this (1ML elaborate to full Fω and gives you explicit access to existentials and higher kinds, which are possibly difficult to unify with the work here).

coldtea · on March 3, 2017

>That's not super interesting (to me personally, an acknowledged idiot) because working programmers are hardly ever confused about types. Intellij pretty much already does that thing.

That's not true at all. For one, because programmers are not constrained to Java programmers. Try to infer the correct types in some C++ template code for example.

akiselev · on March 3, 2017

C++ templates are more a limited form of macros than generic types, though. Macros can make type inference more difficult because they usually don't provide any type bounds for the engine to work with and C++ templates can be especially complex.

ocharles · on March 3, 2017

The point is that you don't need to write the types down anymore. The program is still typed, but the compiler can figure out what the type of variables (in a well typed program) needs to be. As said, this has previously been a hard problem. If you're familiar with the `auto` keyword, think of that - but everywhere.

xyzzy123 · on March 3, 2017

That's nice but basically kotlin already implemented that thing before an academic paper showed us it was possible.

We all knew that and no one cared.

My thinking was that we now need very high resolution thinking about what types mean, for example, "do they come from functions which have these constraints as output", but that's just as ... well, it's unrealistic. How do you annotate the world?

If your output range was obvious to the compiler, that was never a problem to anyone.

chombier · on March 3, 2017

From the Kotlin spec: http://jetbrains.github.io/kotlin-spec/#_type_inference

"Type inference is conservative — it may fail to find suitable assignment of type-arguments to the type-parameters despite that one exist, and the function invocation could be successfully typechecked if the programmer explicitly provided suitable type-arguments."

My guess is that what the thesis presents is complete type inference in presence of subtyping that is decidable, efficient, and that yields types with all the nice properties one could hope for (principality, compactness, ...).

Which is hard.

xyzzy123 · on March 3, 2017

Excellent, so now this will help us write programs better. Thanks. I upvoted you. Looking forward to the blog post.

taejo · on March 3, 2017

> That's nice but basically kotlin already implemented that thing before an academic paper showed us it was possible.

According to Wikipedia, Kotlin first appeared in 2011. Algorithm W for inferring types in the Hindley-Milner type system was published in 1982.

awestroke · on March 3, 2017

Not the same thing at all

xyzzy123 · on March 3, 2017

Thanks, enlightening as always.

chriswarbo · on March 3, 2017

> That's not super interesting (to me personally, an acknowledged idiot) because working programmers are hardly ever confused about types. Intellij pretty much already does that thing.

One reason it's nice to have "toy models" and proofs is that we can gain confidence about how our "foundations" will/won't behave.

For example, Java's type system is well understood; it has some problems (e.g. https://dev.to/rosstate/java-is-unsound-the-industry-perspec... ), but we can think of them happening in 'contrived edge-cases', and we can make conservative claims about what is/isn't a 'contrived edge-case' (i.e. "that looks reasonable" vs "don't do that, even if it works").

Since we have a good understanding of Java's type system, we can do things like generating Java code automatically; this may be as simple as getters/setters in an IDE, or as complicated as a full API compatibility layer (e.g. swig, and many others). Automatically generated code can be very strange, but it's fine if we know that only "reasonable" code will be generated.

Let's compare this situation to IntelliJ's inference features. To make a more apples-to-apples comparison, let's turn IntelliJ's features into programming language features: we make a new language, which looks like Java but doesn't need as many type annotations. Our compiler takes the code it's given, fires up IntelliJ on a headless remote desktop, pastes in the code, clicks on some "infer types" button, copies out the result, kills the IntelliJ/desktop and feeds the generated code to javac.

The question is: what are the rules of our new language? Do we know what code looks "reasonable" and what looks like an "edge-case"? What properties can we be confident about? For example, can we make a claim like "return type annotations aren't required when calling a method"? How can we be sure? If our language does have such properties, do they work nicely together? For example, if we can do `a = b.c();` without annotating `a`, and we can do `a.b(c);` without annotating `c`, can we compose these together to do `a = b.c(d.e())` without annotating `a` `e`? Imagine how frustrating it would be if we couldn't be confident about what would/wouldn't work; all we could do is hit "compile" and cross our fingers, and if we did encounter a problem we wouldn't be sure about how to work around it. Third-party libraries which work perfectly well on their own might cause errors when used together; should we ask upstream to fix their (working) code? Should we abandon the libraries, or maintain in-house patches?

Now imagine that we're writing a code generator; e.g. a port of swig to our new language. What sort of code should we emit? How can we be sure that it'll work, regardless of whatever curveballs our users throw at us?

Even if we surmount all of these challenges, with a mountain of regression tests, what happens when IntelliJ push an update and a bunch of those tests break; do we start by rewriting our language's documentation?

This is why it's useful to have a small, self-contained system (like MLsub) which we can reason about semi-easily; which we can prove properties about. Even if we have to restrict things, like only proving something for a subset such that XYZ, at least we know what those limitations are.

Real systems can then be built on this foundation, with confidence about what can and can't be done.

Veedrac · on March 3, 2017

This system doesn't seem to support type-based dispatch, so it's not directly suitable for Java.

Jweb_Guru · on March 3, 2017

That's not clear to me; unusual cases in Java are even used in some of their co/contravariance examples, and an example of how they can be encoded here are given (notably, without requiring existentials or bounded quantification). In particular, I'm not sure what about type-based dispatch in Java couldn't be replaced here by their "sum-of-products" notation (which allows you to directly reference fields if they are shared by every variant) combined with making { x : A } and { y : B } subtypes of { x : A | y : B }. I wouldn't be surprised if you can replicate a significant percentage of Java with nothing more than the machinery of this paper.

Munksgaard · on March 3, 2017

I've been waiting for this paper ever since I saw Stephens presentation of MLsub [0] at ICFP 2015. I thought the paper had an interesting take on type checking, and it has given me some ideas that I've been wanting to try out for myself.

I look forward to reading the dissertation over the weekend.

[0]: https://www.cl.cam.ac.uk/~sd601/mlsub/

auggierose · on March 3, 2017

Just read the introduction, and it hits all the right spots in explaining why people (including me) have failed to come up with good subtyping in Hindley-Milner world. I expect people to rush to update their various HM-based systems to this new approach. This seems like a really big thing to me.

infinity0 · on March 4, 2017

"The primary tool by which this was achieved was not advanced or complicated algorithms, but attention to the minutiae of basic definitions."

Ericson2314 · on March 3, 2017

> Disjoint union is the coproduct in the category of sets, so we have a good excuse to use the symbol +

This is definitely making fun of somebody

vog · on March 3, 2017

[deleted]

Stupid misunderstanding. Sorry, forget what I wrote. I wanted to delete my comment, as it provides no insight and not even an interesting question. But unfortunately the delete button disappeared here in Hacker News.

Manishearth · on March 3, 2017

"PhD" stands for "Doctor of Philosophy".

runeks · on March 3, 2017

I believe I had the exact same thought as you, in that I was surprised to read "Doctor of Philosophy" on what I assumed was a computer science paper. Now I know better.

_0w8t · on March 3, 2017

And now I know why record-like types are so awkward in many functional languages. A classical type inference just cannot type them requiring heavy type annotations.

xyzzy123 · on March 3, 2017

Thanks. Do you have any useful insights such as programming in a relevant language? Shoot from github first if you want to hulk smash demonstrating competence.

P.S: let me know if I'm not adequately rating your contributions.

dang · on March 3, 2017

Incivility like this will get your account banned on HN, so please don't do it here.

We detached this comment from https://news.ycombinator.com/item?id=13782112 and marked it off-topic.