Hacker News new | past | comments | ask | show | jobs | submit login
Covariance and Contravariance (mikeash.com)
156 points by marvel_boy on Nov 21, 2015 | hide | past | favorite | 67 comments



A fascinating limitation of Go is its lack of support for functions with covariant return types.

Say that you want a function which returns a Reader:

    type ReaderMaker func() io.Reader
A function which returns a concrete type which implements Reader cannot be substituted. This makes sense because interface types are always a two-word tuple, and a concrete type could be a one-word pointer, a big struct, etc.

    var invalid ReaderMaker = func() *io.PipeReader { return &io.PipeReader{} }
More subtly, a function which returns a different but compatible interface type also won't work.

    // This assignment is invalid because although anything that implements
    // ReadCloser also implements Reader, a ReadCloser itable won't necessarily
    // have Read() at the same offset as a Reader itable.
    var invalid ReaderMaker = func() io.ReadCloser { return &io.PipeReader{} }
http://play.golang.org/p/5PPpVZhap_

* edit: fixed swapped types in comment


Variance isn't really applicable to Go at all, because Go doesn't have subtyping. It just has a lot of built-in type coercions in various places in the grammar, "interface to super-interface" being one of them. That explains why covariant-anything doesn't work in Go: there is no coercion between function types.


I quite like Go and I'm writing a lot of it these days, but this is one part I really hate.

Some of what you gain by its lauded "simplicity", you give back with type system wrestling.


I haven't learned GoLang, but what I want to say here is quite general.

I regard that (class names ending in "er") as an antipattern, and have seen them again and again in Java programs. I call them "thingers". Except in simulation programs, objects should just sit there passively and have things done to them by methods. For example, rather than a parser class you should have a parse method, which belongs to a grammar class, and takes a list of tokens as its argument, and returns a parse tree.

Here, it's simpler just to have a method/function/procedure called read() which can be an method belonging to whatever classes of object you want to read from. And if you want to make something, just call new rather than have a factory class.

I'm not the only one who doesn't like thingers: see http://steve-yegge.blogspot.co.uk/2006/03/execution-in-kingd...

Now, before you argue that I don't grok object orientation, I'll tell you that I've added object orientation to Prolog, and to a Lisp dialect (twice). (I've also got several years of OO application programming, in Java and Lisp, behind me.) Perhaps you could call me an OO skeptic though - I can accept that.


Readers in Java are character-based streams. How would you implement a composite i/o stream model then, if Readers are bad-smell thingies? In Java. Adding buffering or counting or throttling to a stream is rather easy using this pattern; is there some other fundamentally sound way of doing that?


You've answered your own question. In Java, what's called a reader is nothing of the sort. It's really a stream (i.e. a thing, rather than a thinger), and I would have called it that, but programmers are now lumbered with it being called a Reader.


So your answer is to search your vocabulary for a proper term that isn't a "nounified-verb"?

I guess I do that anyway, since it feels cleaner, though just as often there isn't one, or it's already taken (like [Input|Output]Stream in my example), and I can't say I really feel that bad about nounifying a verb.

In fact, when you nounify a verb (in other words: turn a function into an object) you can add so much to it. For instance, you could track how long it takes to execute, you can replace some of its constituent components (e.g. GOF Strategy Pattern), you could tune parameters that aren't part of its computational input (something like buffer size, or enabling caching, etc)... these are all things that would otherwise have to be communicated through a lengthy list of parameters, and turning them into object state frees up your actual method signature for the parameters that really matter.


Huh, I thought Go doesn't use variance anywhere (probably because I hit this). Any situation where it does use variance?


My favorite article on this topic is Edward Z. Yang's "Tomatoes are a subtype of vegetables", which illustrates function contravariance with very charming drawings.

http://blog.ezyang.com/2014/11/tomatoes-are-a-subtype-of-veg...


Technically (botanically) Tomatoes are actually a subtype of fruits. (they contain seeds)


In every sense tomatoes are fruit, not just technically.



No, just botanically.

You can't make jam from tomatoes. Or at least it wouldn't be called jam. You can make jam from rhubarb, so in some sense that is a fruit.

Similarly, you can make soup out of tomatoes, but not out of rhubarb.

So, from the point of view of a cook, tomatoes can be considered vegetables and rhubarb fruit.


There are recipes out there for rhubarb soup and tomato jam (nothing fancy, just search on either term).


There's also strawberry soup (http://www.food.com/recipe/strawberry-soup-122037) and carrot jam (http://www.foodandwine.com/recipes/carrot-jam).

I think we need fuzzy classes. :-)


Tomato jam I haven't seen but there is Tomato Chutney (and yes, it's sweet).


Tomatoes are not generally considered culinary fruits, so there is, in fact, a sense of "fruit" on which tomatoes are not fruits. They are biological fruits, which is what someone who invokes a "technical" sense likely means.


Yeah. For some reason some people seem to think that some contexts are superior to others just because the context concerns the scientific background, even though everyone has a prior conext in mind which is the social default.


In a precise language it is obviously utterly useless to have homonyms that are not just distinct but opposed in meaning, unless one is the inferior. Of course that depends on context, because language is inherently underspecified (i think), but i like to think that the acclaim of is the more general, here, whereas cooking actually isn't predominantly concerned with names.


Botanics and cooking are different domains with vaguely overlapping but fundamentally different vocabularies. There is nothing wrong with that, our brains and our social constructs are equipped to deal with that. In culinary context, tomatoes are not fruit. Rationale? Just one example: they are usually not used as dessert topping, whereas fruit is.


"Knowledge is knowing that a tomato is a fruit. Wisdom is not putting it in a fruit salad."

attributed to Miles Kington


Dan Grossman of the University of Washington delivered a Programming Languages course on Coursera.

Thanks to his lecture [1], I'll never forget "Function subtyping is contravariant in its argument types and covariant in its return type" - even though I have to think to remember whether the subtype is the contravariant or covariant, I remember the principle very very clearly due to his antics.

[1] https://www.youtube.com/watch?v=pb_k8h6RuAY


That's a great course, and actually a perfect conceptual base for Swift in a lot of ways.


Covariance and contravariance also have meanings in mathematics, for which this seems like a specialization though I don't have a concrete connection.

In math you often have a class of objects and functions on those objects (say, groups with group homomorphisms, or simpler, types in a programming language with functions that take one type to another). Then in math one often learns about an object by associating another object with it. For example, a topological space can be associated with a specific kind of group that tells you about the space's structure.

Then you can ask how this "association" behaves with the functions on the object. I'll use notation now, and say that for an object X the "associated" object is G(X). Often G carries over to functions, so that if f is a function X -> Y then G(f) is a function G(X) to G(Y). Such an "association" is called covariant because it preserves the ordering of domain -> codomain. The "association" is instead called contravariant if it reverses the order. E.g., f: X -> Y becomes G(f): G(Y) -> G(X). Of course, I'm using "association" but the real word for G is functor.


A generic type with a covariant parameter can be seen as a covariant functor if you pick the right category.

Specifically you need the category where the objects are types and the morphisms indicate inheritance (i.e. A -> B means that A is a subtype of B).

Then a functor F is covariant if A -> B implies F(A) -> F(B). Which is precisely what it means to be a covariant generic type.

Note: you shouldn't confuse this category with the category where the objects are types and the morphisms are functions that map values of one type to values of another type.


If I'm not mistaken, considering the category of subtyping relationship you get exactly that: with the covariant functor you get (in Scala notation) if B <: A then F[B] <: F[A] and with the contravariant functor if B <: A then F[B] >: F[A]

('<:' means subtype and '>:' supertype )


Interestingly, in physics covariance and contravariance of vectors are basically on equal footing. This is pretty different the OP usage for which covariance and contravariance denote either toward or away from the root of a tree (the object hierarchy).

Like you, I'd be interested to know whether there's a deeper connection in the terminology.


So I'm just a computer scientist, not a mathematician or phycist, but I don't think there is a direct connection. The words co/contravariance is used for two different things in mathematics, vectors in physics/differential geometry, and functors in category theory. (https://en.wikipedia.org/wiki/Covariance_and_contravariance).

The computer science meaning is inspired by category theory (and can be seen as a simple instance of it). But the vector/functor meanings seem to be unrelated.

There is a bit of a connection, because the operation ٭ which takes a vector space V to the set of 1-forms V٭ is a contravariant functor. (I.e., linear maps V→U give maps of 1-forms U٭→V٭). But in physics, we say that the members of V are contravariant and the members of V٭ are covariant, i.e. the physics notion is about individual spaces, not about mappings between spaces.


As others have said, you can recover the notation exactly through category theory


Could you link me? Ctrl-F-ing for 'vector' didn't help.


My understanding, though I've never personally made it explicit, is that you view "contra/covariance" as arising not as a property of vector spaces exactly (concretely) but instead from tensors-as-multilinear-maps and you talk about them as functors between transformation groups on vector spaces which either preserve (co-) or reverse (contra-) transformations.

The inner product lets you view vector spaces as spaces of transformations already so the same sort of idea comes out now viewing vector spaces as maps in their own right.


Eric Lippert did a great series on covariance and contravariance and how they are implemented in C# and various design decisions which went in to it. I would highly recommend it if you use C#.

http://blogs.msdn.com/b/ericlippert/archive/tags/covariance+...


There is something in the article that looks wrong. It's suggesting that `let returnsCat: () -> Cat = animalF` should work, where `animalF` returns `Animal`. The Swift playground agrees with my understanding that this is not correct: "Cannot convert value of type '() -> Animal' to specified type '() -> Cat'".


I think is just a matter of editing, the line "It even works with functions:" should probably say something like "It's even the same with functions" as before clearly stated that "let cat: Cat = Animal()" doens't work


"Editing"? You overestimate me. Anyway, thank you both for pointing this out, I've improved (I hope) the wording now.


I remember reading about this issue from my C# days. But, does all this stuff actually help write better programs?

I now work in a language that doesn't really support building "types" in this way and I can't say I miss thinking about this sort of thing nor consider its lack to be a problem. Types in this sense strike me as more of a boondoggle of overhead that gets in the way more than helps achieve the goal of whatever I'm writing.


That's a good question. I don't know the answer.

What language are you working in now? Some really don't support variance (Go, apparently, and C would be an obvious example) but many others support it but just don't enforce it (Python and all the other dynamically-typed duck-typed languages).


As a language designer, I’ve been thinking about this sort of thing lately. It’s basically a tradeoff between how much power you give a programmer to enforce properties of their programs, and how much work you ask of them to do so.

More expressive type systems let you write programs that are more internally consistent (and, by proxy, more correct), but at the cost of more work to write the appropriate types, and to read and understand such types written by others.

A lot of PL research is going into advanced static type systems, and I think that’s a good thing, but it doesn’t necessarily translate into better programmer ergonomics or productivity.

Maybe we just need new notations for expressing types (and code!)—notations that are more in line with how people naturally think about these things.


No shout out to Scala, which handles this all quite nicely? >sad face<


I don't know much Scala, and my blog is mostly focused on the Apple world.

That said, I always love seeing other perspectives, so a comment on the article describing how Scala gets it right would be welcome.


here you go. Variance explained by Martin Odersky in his coursera course https://class.coursera.org/progfun-005/lecture/83 (you need to login to watch it). Or the Scala doc http://docs.scala-lang.org/tutorials/tour/variances.html



Nice article, though I think you're mixing up subtypes and supertypes. Or I am. After reading that I no longer know if "Cat extends Animal" means that Cat is a subtype of Animal, or that it's a supertype of Animal.

E.g.

  let f1: A -> B = ...
  let f2: C -> D = f1

  (...)
  Applied here, the above code works if A is a subtype of C, and if B is a supertype of D.
  Put concretely:

  let f1: Animal -> Animal = ...
  let f2: Cat -> Thing = f1
So now Animal is a subtype of a Cat, and a supertype of Thing?

But I finally understand what covariance and contravariance means, so thanks!

--

Also, related, could somebody more proficient in English than I am help me understand which direction does the substitution go here?

"It says, in short, that an instance of a subclass can always be substituted for an instance of its superclass."

Is it "an instance of subclass can replace an instance of superclass", or "it can be replaced by"? Or is it context-dependent?

I ask because I'm not sure about the grammar aspect, and in my native language (Polish) we have a few ambiguities like that, which I sometimes jokingly write down with an arrow, e.g. "A can be ---substitutes for--> B" or "A can be <--substituted for-- B", to make it damn explicit which direction the word works.


I think the example is right. Cat is a subtype of Animal, and Animal is a subtype of Thing

Cat < Animal < Thing

So if the function takes any animal, of course it can take a cat (so its covariant in argument). And because all animals are things, we can say the output type of f is Thing.. its just less accurate than Animal but still correct - that is the contravariance in the result.

Cat extends Animal means "Cat can do everything Animal can, plus maybe more stuff". So all cats are animals, but not all animals are cats. So Cat is a subtype of Animal.

For the second question,

"an instance of subclass can replace an instance of superclass" is the right one.

Really it makes a lot of sense intuitively. e.g say Int is a subclass of Number (a simpler way to read is just "Int is a Number") . then anything you can do with numbers, you can do with ints too so we can put an int in place of a number.


My first question is about terminology used. I get why the code example works, but the description I quoted says the reverse (i.e. "Animal is a subtype of Cat", and "Animal is a supertype of Thing"). I assume that it's an error then.

Thx for the answer to the second question :). I think my mind put an additional "by" in that sentence and got confused.


> I ask because I'm not sure about the grammar aspect, and in my native language (Polish) we have a few ambiguities like that, which I sometimes jokingly write down with an arrow, e.g. "A can be ---substitutes for--> B" or "A can be <--substituted for-- B", to make it damn explicit which direction the word works.

-------------------------

I think it's confusing for 2 reasons: passive voice and prepositions.

1) Passive voice - instead of saying A substitutes for B (active voice) it says A can be substituted for B (passive voice.)

2) Prepositions - "be substituted" can collocate with "for" or "by" with the opposite meaning!

A <-- can be substituted by <-- B (use B where you would expect A)

A --> can be substituted for --> B (use A where you would expect B)


Yes I think he has made a mistake there. He did get it right earlier in the article he says "This makes Cat a subtype of Animal. That means that all Cats are Animals, but not all Animals are Cats."


The text describing the relationships between A,B,C,D makes a lot more sense to me if the example is switched to

    Thing -> Cat


This is why I like the terms "derived" and "base": pretty much impossible to mix up.


The example is actually wrong. Functions are contravariant on their input and covariant on their output.

The basic premise behind subtypes is that an object of a subtype can be used anywhere that an object of its supertype can be used. Just as an example, if you replace f1 : Animal -> Animal with f2 : Cat -> Thing, there might be some animals that aren't cats, so f2 can't replace f1 everywhere. This is the reason that functions are contravariant on their input.


I know what you mean regarding the second question, and it's not just due to language barrier. I regularly get tense when people talk about "has a dependency" or "is dependent on" - there's something naturally vague about it. It's clearer when people say "blocks" or "requires".

Another example is "subscribes" - I had a lot of trouble understanding Rx (Reactive Extensions) until I realized that they use "subscription" in the reverse way I'm used to. When I hear subscription, I think of how I might subscribe or have a subscription to a magazine. But they mean subscribe/subscription in terms of how a magazine company might "subscribe" a customer. Which completely flips the arrow.


Thanks for pointing that out, I think I've fixed it now. This stuff does bend my brain a little bit when dealing with function types. It's fine for concrete examples, but when I try to generalize it I have to think real hard and still mix it up sometimes.


I didn't expect this:

"This is a key rule: function return values can changed to subtypes, moving down the hierarchy, whereas function parameters can be changed to supertypes, moving up the hierarchy."

The part allowing a super type in an overridden methods parameters stood out. I tried this out in scala and it did not compile. Intuitively it doesn't make sense to me to allow wider types.


If you think about it it's intuitive. Take these type chains:

Thing -> Animal -> Dog Thing -> Animal -> Cat Thing -> Instrument -> Tuba

Lets say Thing has the following properties: - IsVisible - Mass - CanMove

Lets say Tube has the following properties: - PlayNote

And lets say that Cat has: - Meow

Lets take the following contrived example for overriding:

  Class Debugger {
   void debug(Cat cat); 
  }

  Class ThingDebugger : Debugger {
   void debug(Thing thing);
  }
Going "wider" to Thing works because Cats, Dogs, and Tuba all have the properties that a Thing has, because they are instances of Thing. Going "narrower" works for the same reason:

  Class Wrangler {
   Animal wrangle(); 
  }

  Class DogWrangler : Wrangler {
   Dog wrangle();
  }
A Dog can do anything an Animal can do because it is an Animal.

The type system is doing its best to save you from the following:

  Class Janky {
    void ThisIsOK();
    Animal PrepareForCrash();
  }

  Class AlsoJanky : Janky {
    void ThisIsOK();
    Instrument PrepareForCrash();
  }

  Class StillJanky : Janky {
    void ThisIsOK();
    Thing PrepareForCrash();
  }
In this case any mixing of PrepareForCrash will leave you with returned types that don't completely share an interface. If the compiler let you do that, you'd probably do this at some point start mixing the subtypes of Janky and calls to PrepareForCrash would return Tubas when you expect Cats and then all hell would break loose. Anyone who's used scripting languages should be familiar with something like this happening at one point or another.


I still don't get ThingDebugger.debug()

first you can't call super() in that method without a downcast. how does that count as overriding the method?

it's a big problem if you have another subclass of Debugger that does not change the parameter type. you now have 2 subclassed of Debugger, one has a method debug(cat: Cat). the other has a method debug(cat: Thing). that accepts cats, dogs, and tubas. these subtypes of Debugger are not safely interchangeable.

this does not compile in scala, I'm surprised swift would allow it.


I guess it depends on whether the methods are virtual or not? If they're virtual it makes sense that you couldn't override with a wider parameter type, but if they're not it makes sense that you could.


And here I was, thinking this would be about statistics.


And I thought this was about tensor analysis :)

https://en.wikipedia.org/wiki/Covariance_and_contravariance_...


And it was generalised in category theory:

> The use of both terms in the modern context of multilinear algebra is a specific example of corresponding notions in category theory.


It's a new meaning for me too. I knew General Relativity before I could program.


Another link on the topic. This picture[1] I lifted from scala course in coursera.org

[1]http://stackoverflow.com/questions/2723397/java-generics-wha...


A contravariant object transforms like a vector.

A covariant object transforms like the gradient of a scalar.


I understand these words, but not in the combination used.

I guess this is really simple to people with the nessecary maths background to understand this, but for me I just don't see the connection between vectors/gradients and type systems.


I don't think there is any connection. But since the submission was titled "covariant and contravariant" without any kind of context, it seemed appropriate that we could all share our own personal ideas about what these words were supposed to mean.


Ok. the source of the confusion for me then was that I was not previously aware that covariant and contravariant were terms that could be applied to vectors.


I just don't see the connection between vectors/gradients and type systems

There isn't really one besides the terminology.


It's amazing how well Mike Ash explains complex concepts !




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: