Hacker News new | past | comments | ask | show | jobs | submit login
Guido van Rossum: The Theory of Type Hinting for Python 3.5 (quip.com)
149 points by kibwen on Dec 20, 2014 | hide | past | favorite | 53 comments



Reading this pre-PEP, I'm not sure I understand what I will gain from this as a Python programmer. Can anyone explain what we can expect both short-term (first release, Python 3.5) and long-term (later version, accompanied by tooling and more side work)?

a. Better quality and maybe programmer productivity through static checking?

b. Better performance thanks to enforced type letting the compiler do a better job?

c. Something else?

Thanks!


In some code we have the first 5-10 lines are basically in the shape of : assert isinstance(arg1, type) assert isinstance(arg2, SubObjectClass) And so on. Mainly because we deal in code that deals a lot with traditional types (as in signed 8 bit, Strange-Endian 32bit in 16bit words , and so on) which makes code turn nasty in pretty much any language.

Further down these are cast into more native types, and then walk out into public API's.

And there once more you need to do somewhat strict data-validation.

For these cases, typing rules and annotations will be a boon.

And do note, that there is more common a case of human error or refactoring error that causes problem. Someone sending in a default [] array in an object, and somewhere else it's an expected Boolean.


Thanks :)

That matches well what I had in mind with my "a." suggestion.

Now, why cannot we expect "b. Better performance thanks to enforced type letting the compiler do a better job"? Are the semantics of what's being discuss insufficient to provide such benefits? I'm asking that question because I read often that dynamic typing is the #1 performance hit of Python (e.g. [1]). Now that this PEP lets developers provide and enforce type, what's still in the way of reaching the level of performance of statically-typed languages?

[1] Alex Gaynor - Fast Python, Slow Python: https://www.youtube.com/watch?v=7eeEf_rAJds


I don't think the Python VM will use these as actual hints, it's mainly to help refactoring tools and people maintaining large codebases as I understand it. No type is "enforced".


> For these cases, typing rules and annotations will be a boon.

You can use PEP3107 annotations today and automatically validate them using a decorator to eliminate boilerplate code like that. They don't have to be something to look forward to tomorrow.


Related: https://github.com/kislyuk/ensure

A (relatively) high performance runtime annotation type checker.

    from ensure import ensure_annotations
    
    @ensure_annotations
    def f(x: int, y: float) -> float:
        return x+y


(b) is possible as well, but it would probably happen in PyPy or Cython, rather than CPython.


I may be mistaken, but wasn't GVR fairly opposed to anything beyond simple type annotations? Don't get me wrong, I love that he might be turning around on this, but has anyone followed the transition and can provide some context? What led to this turnaround?


10 years ago GVR discussed optional static typing in Python [1]. He liked the idea but he thought it was really hard to properly implement. With PEP 3107 [2], some basic syntax constructs were added for this purpose, but no convention was suggested for specific type annotations, resulting in the feature not being really used. This year GVR started the discussion again [3][4], this time to propose an specific syntax for type annotations and include them in the standard library. This syntax is based on MyPy [5]. GVR has never been opposed to more than simple type annotations. However, he is opposed to implement static type checking in the core of Python. Instead, he suggests to use external tools such as MyPy or specific IDEs.

[1] http://www.artima.com/weblogs/viewpost.jsp?thread=85551 [2] https://www.python.org/dev/peps/pep-3107/ [3] https://mail.python.org/pipermail/python-ideas/2014-August/0... [4] https://mail.python.org/pipermail/python-ideas/2014-August/0... [5] http://www.mypy-lang.org/


Thank you very much for the information! Really cleared it up for me.


That post cheered me up for the day. Things seem to finally move to a (IMHO) great direction for this language. I was thinking about rewriting my latest large project from python to go, but now i think i'll wait a bit to see where this thing leads to.


Is there any documented information on how the decision to use consistency instead of subclassing was made? Naively it seems very similar to say c# or java with Any instead of Object at the root - except that Any is consistent with all types.

That seems odd and I'm not sure I understand the benefits of it. I get that it only applies to things typed as Any (and presumably like with Object, typing as Any is quite rare to want to do - especially with support for union types) but is there an example where you'd want this and the c#/java subclassing would be limiting?


The "documented information" you want is mentioned in the article. Guido links to a blog post by Jeremy Siek and a paper by Vitousek, Siek and Baker. Read the blog post, section "Why subtyping does not work".

http://wphomes.soic.indiana.edu/jsiek/what-is-gradual-typing...

http://wphomes.soic.indiana.edu/jsiek/files/2014/03/retic-py...


`Any` seems to be more akin to C# `dynamic`

The key is that we're talking about _Gradual_ Typing: it's not that "you never want to type things as Any" (if you always used a dynamically typed language like Python you probably never cared too much), or "subclassing is too limiting" (if anything, imho is not limiting enough).

The point is offering a tool to people who might benefit from some static checks, to avoid stumbling into problems at runtime. Those people have a HUGE codebase with dynamic types (basically, everything is already Any) and you need to _Gradual_ly specify and move over parts of the code to be correctly typed.

as keosak writes, if you want a proper explanation of theory behind it, just read Jsiek's blog posts


It reminds me of TypeScript, where everything defaults to type `any`, so it can interoperate with normal JS. This essentially makes static type checking an "opt-in" feature of the language.


Thanks - this is exactly what I was missing. I was trying to figure out the benefit of allowing it in the language, when of course the answer is supporting existing programs.


The paper reference is: Siek & Taha: Gradual Typing for Functional Languages

http://ecee.colorado.edu/~siek/pubs/pubs/2006/siek06:_gradua...

(Of course, the "functional languages" bit isn't critical here --- the paper defines gradual typing for a lambda calculus, which is the usual vehicle for explaining type systems.)

Basically, you want to partition the program into parts with and without known types. If you know the types, they must be equal. If a type is unknown, it is "equal" (consistent) with anything. The conclusion of the paper is that you need a different, symmetric notion of consistency, that is different from subtyping.

Making Any the root of the type hierarchy doesn't work. You explicitly want to allow things like implicitly passing an Any as an argument where, say, an Int is expected, which would be a down-cast. You don't want to allow implicit down-casts all over the place, because that causes the type hierarchy to collapse --- if you allow implicit down casts, you can always up cast to Any, and then immediately down cast to any other type.


http://wphomes.soic.indiana.edu/jsiek/what-is-gradual-typing... (linked in an article) explains `Any` a bit better. "Any is consistent with all types" essentially means runtime checked/dynamic, instead of statically checked.


This very much sounds like a type inference with hinting version of the C type system with a minor extension. It also seems to have many parallels to the Objective C type system with id, but again in a type inference variant.

The three rules to apply to quite directly to standard OO C:

1. If t1 is a subclass of t2, t1 is also consistent with t2. (But not the other way around.)

Using the standard OO C method of subclassing structs, ie.

struct t2 {...}; struct t1 {struct t2 t2; ...};

This is obviously true and the normal subclass relationship, though the C syntax to use this is a bit awkward:

t2_method(&t1->t2, arg1, arg2);

2. Any is consistent with every type. (But Any is not a subclass of every type.)

In C the Any type is void. Using the class definitions above this is entirely valid and produces no warnings:

void v;

struct t1 = v;

So if you have an object of void type you can use it wherever you might require a stricter type. If you pass in something which is not consistent you'll get a runtime error (usually a segfault in the case of C).

3. Every type is a subclass of Any. (Which also makes every type consistent with Any, via rule 1.)

This just says that you can do the following without getting any warnings: struct t1 t1; void v = t1;

And this works quite well in C.

The extension to the C type system, beyond the type inference, is to make these rules recursive, especially in function types. For example, this produces a warning under GCC, though it will compile and run fine:

void f(int* func(int b, int c)) {}

void* g(void b, void c) {return b;}

f(g);

That might just be a limitation of GCC's type checking though.

I think this is quite a good direction to take. C's type system has proven sufficiently powerful over the decades to build large systems and at the same time is trivially bypassed when you paint yourself into a typed corner or you want flexibility strict static type checking finds cumbersome to provide.


> In C the Any type is void. Using the class definitions above this is entirely valid and produces no warnings: > void v; > struct t1 = v;

I assume you mean a pointer to void here. You explicitly cannot have a void value, as void is an incomplete type (it has no size).

  $ gcc -x c - <<<"int main(void) {void a; return 0;}"
  <stdin>: In function ‘main’:
  <stdin>:1:22: error: variable or field ‘a’ declared void
You can have a pointer to void, which works as you describe, though you must also use a pointer to the struct.

  void *v;
  struct t1 *b = v;


>For example, this produces a warning under GCC, though it will compile and run fine:

All function pointer types are convertible to any other function pointer type, and since 'func' is never called as the wrong type there is nothing wrong with your example. It's not a limitation of GCC, it's the intended behavior according to the standard.


Right, the standard allows it, but GCC hasn't yet extended its type checker to give you the help you'd want in this case. The warning effectively says "The standard says this is fine, but in most cases this isn't what you want". Which, in nearly all cases, is correct.

If I pass "void g(int a)" to "f(int func(struct foo f))" the standard says everything is fine, but it's almost certainly going to blow up. GCC helpfully notifies me of this.

The only case where this warning isn't what I want is where some of the pointer types have been replaced with void. That is, passing "void g(void a)" to "f(int func(struct foo f))" should work just like "struct foo f = (void *a);" and output no warning.

The proposal from the article supports this case and other similar recursive cases.


This is exciting - ever since reading Jeremy Siek's post on Gradual Typing [1], I've been missing this in my python and ruby. This could be the killer feature that would both move larger corporations towards python 3!

Two things in the pragmatic side seem hairy though - type declarations in types and `Undefined`.

[1]: https://news.ycombinator.com/item?id=8594079


Yes, this is the first feature I've seen that makes me think about what it would take to move my team over from 2.x. We have a moderate codebase (around 70k lines, last I checked) and are getting to the point where we'd probably get some benefit from types and a type-checking IDE or tool.


See also here for a slightly different but related approach: http://user.it.uu.se/~kostis/Papers/succ_types.pdf



This looks interesting. If such potential features come true in Python 3, I would have better motivation to migrate to it.

However, it does, somehow, with type alias syntax as proposed in this article, make python3 looks even similar to Go.


It would also be nice to see some interface-based type assertions. For instance,

    def comma_sep(items):
        return ','.join(map(str, items))
requires two things:

- items be iterable

- each item have a __str__ method


Ten years ago, this could have accelerated the development of fast CPython replacements. With the incompatible changes in Python 3, all the alternatives to CPython took years to catch up. Or, like Microsoft's Iron Python, were abandoned. Now that PyPy is getting close to the point where it might replace CPython, a major change to the language comes out of nowhere.


These are just typing hints, they don't break compatibility. The ability to annotate function parameters and return values has been there for a while, so PyPy can at first just ignore these annotations and any valid python 3.5 program is also a valid PyPy program. Given that PyPy already does type-specialisation for optimisation, I assume its authors will want to make use of the type hints anyway, so we can expect, that support for this feature will land shortly in PyPy.


Both IronPython and IronRuby got dropped. It was I believe due to lack of interest in these on the CLR and alternatives like Powershell.


I want to get this page as HTML, instead of a Javascript script that builds the page? I can't (easily) send it to my Kindle this way.

This is a TEXT DOCUMENT, right?

(I don't want a downloadable PDF also.)


If you use Amazon's Send To Kindle plugin and highlight the body of the article, it is captured and sent to the kindle beautifully.


It works for me in EWW (and probably w3m), though it complains about an "unsupported browser." This is actually better than average for a modern Javascriptastrophe of a website.


Can we please just do Hindley-Milner? They knew all this stuff in the 1970s...


You are confusing two different things: gradual typing, and type inference.

Type inference infers a static type for every expression in a program. Hindley-Milner is such a type inference algorithm.

Gradual typing allows you to partition your program into typed and untyped parts. The untyped parts need not even be typable under your type system.

Type inference and gradual typing can be combined. See: Siek and Vachharajani: Gradual typing with unification-based inference http://dl.acm.org/citation.cfm?id=1408688

The idea of that paper is that you want to do type inference for the statically typed part of your program in a gradual type system.


Strictly speaking Hindley-Milner is the type system that itself admits tractable inference using the usual unification techniques via the Damas-Milner family of algorithms. Gradual typing itself uses type inference in the Damas-Milner family, although it diverges with it's notion of consistency.


Scala doesn't/cannot use Hindley-Milner, I doubt it could be any easier to apply it to Python. (Clojure core.typed doesn't use hindley milner either afaik. There've been experiments on that front though)

Also, type inference doesn't mix very well with dynamic types ("dynamic is viral", or similar things, I've briefly skimmed this post and I think it makes the point: http://ericlippert.com/2012/11/09/dynamic-contagion-part-two... )


The problem is that HM doesn't integrate super well with dynamic typing. They already tried doing that with Scheme a while ago (search for Soft Typing) but the end result was very complex and hard to use.


No, we can't, HM type inference is for nonunion, nondependent types. Python is of course going to need unions due to its dynamic nature.


That's false. This is gradual typing, which is more recent. Gradual typing allows dynamicity, while HM fails if not every type can be statically inferred. Furthermore, IIRC there are reasons why HM is less/not suited for imperative languages, but they escape me at the moment.


It would be nigh impossible to retrofit in a backwards compatible way. :(


The Python community is fine with that, in major versions.



Look at Shedskin and the papers it references.



Putting the type declaration in comments is a neat idea.


why?


To allow and encourage optional static typing in Python. To make it a language somewhat more suitable for writing complex systems.


And to enable much, much better tool support.


...or you can just use Julia :)


Julia doesn't have gradual typing, to my knowledge. As far as I know it doesn't do any static type checking. I wish it did!


i still have much to learn about language design, but i believe julia has gradual typing. it uses type annotations for the purposes of multiple dispatch (which is awesome, btw) and optimization.




Consider applying for YC's W25 batch! Applications are open till Nov 12.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: