
You cannot write a static compiler for Python, because Python does not have static typing; it is a dynamic language. So that part is a tautology.

I suspect the author is trying to say that you cannot write a compiler to machine code for Python. This is wrong.

Compiling dynamic languages to machine code has been done dozens of times in languages with equivalent or greater dynamic properties (Common Lisp, Scheme, and Smalltalk, for example). A proof of concept here: Python has been implemented in Common Lisp as a DSL (via macros), and it works on the Lisp implementations that compile to machine code. (http://common-lisp.net/project/clpython/manual.html#compilin...)

The truth is that no one can be bothered to do it, because there is little to be gained from a faster Python implementation. All of the slow code lives in small parts that can be rewritten in C (or whatever faster language, so almost any language).
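
If you want a concrete picture of "rewrite the slow part in C": here is a minimal sketch using ctypes from the standard library, assuming a Unix-like system where find_library can locate libm. Real projects would reach for a C extension, Cython, or cffi for anything non-trivial.

    import ctypes
    import ctypes.util

    # Hand the hot inner call to the C library instead of pure Python.
    libm = ctypes.CDLL(ctypes.util.find_library("m"))
    libm.sqrt.restype = ctypes.c_double
    libm.sqrt.argtypes = [ctypes.c_double]

    print(libm.sqrt(2.0))  # 1.4142135623730951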

The mentioned Python compiler projects are all 'research,' as far as I can tell. Doing something that is actually known to work and is difficult would be of no use to someone who is interested in tenure.



> Compiling dynamic languages to machine code has been done dozens of times in languages with equivalent or greater dynamic properties ([...] and Smalltalk, for example)

Do you have sources for a static compiler for Smalltalk? Considering how dynamic that language is, I have some trouble imagining such a thing.

Please note that Alex is not talking about JIT compilers here (for good reason: PyPy has a JIT compiler), but about static compilers.

> The truth is that no one can be bothered to do it, because there is little to be gained from a faster Python implementation.

That's a joke, right? A significant part of PyPy's effort is a faster Python implementation.

> The mentioned Python compiler projects are all 'research,' as far as I can tell.

The only "Python compiler projects" mentioned are ShedSkin and Cython (and both actually compile python-like languages, neither pretends to compiling Python), and neither is a research project, both have purely practical goals (although ShedSkin is completely experimental at this point)


> Do you have sources for a static compiler for Smalltalk

> Considering how dynamic that language is, I have some trouble imagining such a thing.

Smalltalk always uses a virtual machine, but it does not always use a JIT.

I said a static compiler doesn't make sense for a dynamic language (saying you can't do it is tautological; it is like trying to get dry water).

I am talking about dynamic compilation to machine code (not JIT). With that, you can alter how much inlining and optimization happen in nested calls. It is a much-used technique, and I do not need to prove its validity.

Everyone in here seems blind to the possibility, which puzzles me.


> I said a static compiler doesn't make sense for a dynamic language (saying you can't do it is tautological; it is like trying to get dry water).

Not at all: dynamically typed languages have varying amounts of effective dynamicity (and staticity), and some should be static enough to infer most types statically. Erlang, for instance, is not overly dynamic.

> I am talking about dynamic compilation to machine code (not JIT). With that, you can alter how much inlining and optimization happen in nested calls. It is a much-used technique, and I do not need to prove its validity.

You're describing JITs here; why are you saying "not JIT"?

> Everyone in here seems blind to the possibility, which puzzles me.

Everyone "seems blind" because you're describing JITs and saying you're not talking about JITs, you're about as clear as tar during a moonless night here.


No, JIT is a specific type of dynamic compilation, not every type of dynamic compilation. Maybe I mean 'incremental compilation.'

I am not describing JITs; I am describing VM-based languages, which have the ability to incrementally compile function objects statically. Does that help?


Then the confusion probably comes from the fact that Python's main implementation is VM-based. Suggesting what they are already doing as an improvement over what they are already doing is confusing, to say the least. Perhaps they need a better VM, but that is the technique they use. To see Python's bytecode, open up a .pyc file.
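
You don't even need to crack open a .pyc by hand; the standard library's dis module disassembles the bytecode CPython has already compiled a function to:

    import dis

    def add(a, b):
        return a + b

    # CPython compiled `add` to VM bytecode at definition time;
    # dis shows that bytecode.
    dis.dis(add)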


> Perhaps they need a better VM, but that is the technique they use. To see Python's bytecode, open up a .pyc file.

Ohyes is talking about per-function static compilation performed on the fly to machine code, not bytecode.

It seems about halfway between static compilers and JITs really: functions are compiled to actual machine code statically, but the VM can recompile functions or compile new functions and replace old ones (of the same name) on the fly, e.g. during a REPL session.

That's not what Python does: Python code is compiled to VM bytecode, and the VM does not compile it any further.

Under ohyes's scheme, the VM would compile that bytecode further down to machine code (or just skip the bytecode). It's closer to what HiPE does than what Python does.
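
A toy model of that compile-and-replace scheme, with CPython's bytecode compiler standing in for a backend that would emit machine code (the table and helper here are made up for illustration):

    # A VM-managed table of compiled functions; redefining a name
    # swaps in a freshly compiled version on the fly.
    functions = {}

    def define(name, source):
        ns = {}
        exec(compile(source, "<repl>", "exec"), ns)
        functions[name] = ns[name]  # new version replaces the old by name

    define("square", "def square(x):\n    return x * x\n")
    print(functions["square"](4))  # 16
    define("square", "def square(x):\n    return x ** 2\n")
    print(functions["square"](4))  # 16, via the recompiled replacement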


Yes, this is correct.


Yes.


Do you mean AOT compilation? From an earlier post I gleaned that the Unladen Swallow project used the LLVM backend, which is variously described as a JIT, an ahead-of-time compiler, an incremental compiler, and various other things. It's clear there is some confusion in terminology, but I got what you are talking about.

At any rate, if I am reading the earlier post correctly, it was tried and found not to be effective. This surprises me greatly. LLVM is of the highest quality and very fast. I'd love to know why people considered it to have gone "wrong" when it came to Unladen Swallow Python.


Good call, I think you are right. My intended point was that the OP was dismissive of the idea of incremental/AOT compilation as a possibility. I was not terribly clear and may have misread him.

The idea of LLVM is that you can target LLVM IR or LLVM bitcode, and LLVM will provide the platform for your retargetable compiler. It has both a JIT and a native compiler component. You can run the AOT compiler either incrementally or by sucking in a bunch of source and doing C-style static compilation.

I am by no means an expert, obviously, but when I evaluated it for a project it seemed geared towards generating fast code for C/C++-like languages, for which you tend to know the machine types of things and operate in terms of machine floats/doubles/integers/etc., which doesn't seem to be much of a problem for Python, honestly.

The 'virtual machine' is more of a bytecode model (as the name implies, it is low level). You would have to build your own virtual machine (a PythonVM, or what have you) on top of it. This would need to be a complete VM with the ability to generate LLVM bitcode. Then you could take advantage of the SSA transforms, constant folding, and other nice parts of LLVM (peephole optimization, for example).
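
For what it's worth, here is roughly what targeting LLVM from Python looks like with the llvmlite bindings (a later project than this thread, shown only as a sketch): build IR for a trivial i64 add, compile it in-process to native code, and call it through ctypes.

    import ctypes
    import llvmlite.ir as ir
    import llvmlite.binding as llvm

    llvm.initialize()
    llvm.initialize_native_target()
    llvm.initialize_native_asmprinter()

    # Build LLVM IR for: i64 add(i64 a, i64 b) { return a + b; }
    module = ir.Module(name="demo")
    i64 = ir.IntType(64)
    fn = ir.Function(module, ir.FunctionType(i64, (i64, i64)), name="add")
    builder = ir.IRBuilder(fn.append_basic_block("entry"))
    a, b = fn.args
    builder.ret(builder.add(a, b))

    # Compile the module to native code in-process and grab a pointer.
    target_machine = llvm.Target.from_default_triple().create_target_machine()
    engine = llvm.create_mcjit_compiler(llvm.parse_assembly(str(module)),
                                        target_machine)
    engine.finalize_object()

    add = ctypes.CFUNCTYPE(ctypes.c_int64, ctypes.c_int64, ctypes.c_int64)(
        engine.get_function_address("add"))
    print(add(2, 3))  # 5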

But I guess the point is that LLVM takes care of one hard part for you, while a bunch of other difficult parts would still need to be handled, particularly the garbage collector. I'm sure Unladen Swallow's generated code is bleeding fast because of its use of LLVM.

All of this said, I'm pretty sure the project died with Python 3. Maybe this whole discussion is missing the point entirely: how do you write a fast compiler for a language that has no standard? It is bound to change unpredictably, and it would be an incredibly frustrating task.


JIT is dynamic compilation to machine code. Feel free to explain why your technique is not JIT, though.


This is not a JIT.

http://paste.lisp.org/+2N29


A static compiler has nothing to do with static typing; it is just a traditional compiler that passes through the source code and generates the final result of the compilation in one shot, as opposed to a bytecode JIT compiler.

I guess the point is that you cannot write an _efficient_ static compiler, because there is too little information available at compile time. I don't think your examples give a counter-proof here: most Smalltalk compilers are JIT compilers, so that is really an argument in favour of the author's original thesis, and Common Lisp introduces a lot of extra type annotations to achieve good compilation results. Racket (PLT Scheme), the most popular Scheme implementation, uses a JIT compiler as well.


My point is that you do not have to go full static compilation to achieve the benefits of using machine code and static compilation techniques.

Compiled Common Lisp (without any annotations) is much faster than interpreted Python, simply because all of the dynamic dispatch is handled in machine code.

To achieve speeds closer to Java and C++ in Common Lisp, you certainly need type annotations (and a working knowledge of the given compiler).

But we aren't talking about those speeds; we are talking about maybe a 4x improvement, which could be done. Racket is popular because it is easy to use and has a nice library. There are a number of Schemes that are faster than it is.


Cython, in my eyes, would qualify as a "production"-level compiler project - the lxml library uses it, and you occasionally run into it in other projects.

Cython follows the "optional static typing" approach, and implements a rather large (but not complete) subset of Python. It's a very good intermediate solution for those 20% of the code that take up 80% of the running time.
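
As a sketch of that "optional static typing" approach: Cython's pure-Python mode lets the same file run unmodified under CPython, while the annotations tell Cython to generate C-level integer arithmetic when the file is compiled (this assumes the cython package is installed).

    import cython

    # Runs as plain Python; compiled with Cython, the annotations turn
    # the loop into C integer arithmetic.
    def fib(n: cython.int) -> cython.int:
        a: cython.int = 0
        b: cython.int = 1
        for _ in range(n):
            a, b = b, a + b
        return a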


I mostly agree with you, but there are cases where the problem isn't "thing X is too slow" but "after a long period of running, the garbage collector gets bogged down with long-lived cycles and dies". So I think garbage collector improvements would be really helpful. Core interpreter improvements, not so much.
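
For concreteness, this is the kind of garbage that plain reference counting can never reclaim; only the cyclic collector gets it back. A minimal sketch:

    import gc

    class Node:
        def __init__(self):
            self.ref = None

    # A reference cycle: refcounting alone will never free these objects.
    a, b = Node(), Node()
    a.ref, b.ref = b, a
    del a, b

    print(gc.collect())  # reports the unreachable objects found in the cycle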


Absolutely, improving garbage collection is definitely more interesting/important than code that runs 'faster' for arbitrary benchmarks.




