Creating an embeddable Python distribution on OS X (joaoventura.net)
38 points by jventura on June 5, 2016 | hide | past | favorite | 14 comments



Picking files out of homebrew for distribution is generally a bad idea. You’ve fixed the linking issues, but not -mmacosx-version-min.

If you run $ otool -l libpython3.5.dylib and look for LC_VERSION_MIN_MACOSX – you’ll see it’s compiled only for your current OS.

So, if you do this on OS X 10.11, your users will have to have OS X 10.11. It may appear to work on older versions of OS X, until you hit something that doesn’t. For example, when I tried using a homebrew library, it was compiled using newer SSE instructions that weren’t supported on older processors still supported by older OS X versions. So when testing on an old Mac, it crashed with bad instruction at a somewhat random point in execution.
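The otool check described above can be automated. This is a minimal sketch, assuming the load command is printed as a `cmd LC_VERSION_MIN_MACOSX` line followed shortly by a `version X.Y` line; the function names are hypothetical, and the parser is deliberately kept separate from the subprocess call so it can be tried on saved output.

```python
import subprocess


def min_macos_version(otool_output):
    """Return the version recorded under LC_VERSION_MIN_MACOSX, or None.

    Expects the text printed by `otool -l <library>`.
    """
    lines = otool_output.splitlines()
    for index, line in enumerate(lines):
        if line.split() == ["cmd", "LC_VERSION_MIN_MACOSX"]:
            # The "version" field follows within the next few lines.
            for follow in lines[index + 1:index + 4]:
                parts = follow.split()
                if len(parts) == 2 and parts[0] == "version":
                    return parts[1]
    return None


def otool_load_commands(path):
    """Run `otool -l` on a Mach-O file (only works where otool is installed)."""
    result = subprocess.run(
        ["otool", "-l", path],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout
```

On a Mac, `min_macos_version(otool_load_commands("libpython3.5.dylib"))` would give the minimum OS X version the library was built for, or None if that load command is absent.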


Hum, nice catch! Maybe there is a way to change that LC_VERSION_MIN_MACOSX value, although the instructions the compiler has already generated may be a dead end.

I see two possible solutions: either start from the binaries provided by python.org (which support OS X 10.6), or compile Python from source in a way that is backwards compatible. Maybe CPython's makefile has some support for that?


I can confirm that the Python dynamic library installed from python.org does not have any reference to LC_VERSION_MIN_MACOSX. So in theory, it would work on any OS X version.

The only catch is that the official binaries are combined 32/64-bit builds, which means that the shared library is 6MB instead of 2MB for the homebrew version, and the standard library zip is 23MB instead of 10MB for the homebrew version.

I've updated the blog post with those remarks in the end.


If it's an autoconf project you should be able to (as far as I recall) pass appropriate min version and architecture flags to ./configure

Python bundles a bunch of modules one may not need for embedding in either case, and they can be turned into byte code before distribution.


I've managed to compile Python 3.5 from source by setting MACOSX_DEPLOYMENT_TARGET=10.8 in the Makefile (after running ./configure). By default it compiles Python to a static library, which is quite ok, but I am having trouble compiling to a "Framework", which I think would give me the dynamic library.

Although Python is widely used, everything related to this kind of thing seems to be poorly documented.

Edit: you just need to do "./configure --enable-shared" and it will compile Python as a shared library (libpython3.5m.dylib).
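One way to double-check the result of such a build is to ask the new interpreter itself. This is a small sketch using the standard sysconfig module; the function name is hypothetical, and the values can be None on platforms or builds where a variable is not defined (MACOSX_DEPLOYMENT_TARGET is normally only set on OS X builds).

```python
import sysconfig


def build_summary():
    """Report how the running interpreter was configured at build time."""
    return {
        # Truthy when Python was configured with --enable-shared.
        "shared": bool(sysconfig.get_config_var("Py_ENABLE_SHARED")),
        # File name of the Python library produced by the build.
        "library": sysconfig.get_config_var("LDLIBRARY"),
        # Deployment target recorded at configure time (None if not set).
        "deployment_target": sysconfig.get_config_var("MACOSX_DEPLOYMENT_TARGET"),
    }


if __name__ == "__main__":
    for key, value in build_summary().items():
        print(key, "=", value)
```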


I think I've set CFLAGS to something like "-arch x86_64 -mmacosx-version-min=10.8" when running ./configure for autoconf projects.

It's not really specific to Python, so "poorly documented" isn't a surprise.


For Python you just need to do "./configure MACOSX_DEPLOYMENT_TARGET=10.8"..
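For a scripted build, the configure options mentioned in this thread can be assembled in one place. This is only a sketch: the helper name and its parameters are hypothetical, and whether passing CFLAGS on top of MACOSX_DEPLOYMENT_TARGET is needed for a given build is an assumption that would need checking.

```python
import shlex


def configure_command(min_version="10.8", shared=True, cflags=None):
    """Build the argument list for CPython's ./configure.

    Only uses options quoted in this thread: --enable-shared,
    MACOSX_DEPLOYMENT_TARGET=<version> and an optional CFLAGS value.
    """
    args = ["./configure"]
    if shared:
        args.append("--enable-shared")
    args.append("MACOSX_DEPLOYMENT_TARGET=" + min_version)
    if cflags:
        # Assumption: extra compiler flags are passed as a configure variable.
        args.append("CFLAGS=" + cflags)
    return args


if __name__ == "__main__":
    command = configure_command(cflags="-arch x86_64 -mmacosx-version-min=10.8")
    # Print a copy-pastable shell line; run it from the CPython source tree.
    print(" ".join(shlex.quote(arg) for arg in command))
```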

I'm creating another blog post with all the info I got from this thread, regarding compiling Python from sources. I'm going to put the link here when ready..

Edit: http://joaoventura.net/blog/2016/embeddable-python-osx-from-...


Interesting read. I had embedded Python in one of my apps and I wish I had known about being able to zip the distribution. To save space, I compiled all the .py files into .pyc bytecode files, and tediously removed a bunch of modules my app didn't need (tkinter, antigravity, ...), along with setting PYTHONDONTWRITEBYTECODE. You can probably still do this, of course, in addition to zipping it all up.
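The compile-then-zip step can be sketched with the standard library. This assumes a directory of pure-Python modules; the function name and the default skip list are illustrative only. `legacy=True` makes compileall write `foo.pyc` next to `foo.py` instead of into `__pycache__`, which is the layout needed for importing bytecode without the sources.

```python
import compileall
import pathlib
import zipfile


def build_bytecode_zip(src_dir, zip_path, skip=("tkinter", "antigravity")):
    """Compile every .py under src_dir and zip only the resulting .pyc files.

    Top-level modules or packages whose name is in `skip` are left out.
    """
    src = pathlib.Path(src_dir)
    # Write foo.pyc alongside foo.py rather than under __pycache__/.
    compileall.compile_dir(str(src), quiet=1, legacy=True, force=True)
    with zipfile.ZipFile(str(zip_path), "w", zipfile.ZIP_DEFLATED) as archive:
        for pyc in sorted(src.rglob("*.pyc")):
            rel = pyc.relative_to(src)
            if "__pycache__" in rel.parts:
                continue  # ignore any version-tagged caches already present
            if pathlib.PurePath(rel.parts[0]).stem in skip:
                continue  # unwanted top-level module or package
            archive.write(str(pyc), rel.as_posix())
    return zip_path
```

Extension modules (.so files) cannot be imported from a zip, so they would still have to be shipped separately.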

I had compiled Python myself and used static linking instead of dealing with install_name_tool, although I do wonder if I should have built a dynamic library instead. Anyway, compiling from source (or adjusting the homebrew formula?) gives you control and lets you be mindful of things like LC_VERSION_MIN_MACOSX, as already mentioned here.

For my specific use case, I also wanted to use Apple's GCD and have Python do all of its work on a secondary thread. I had to make sure the interpreter would only stay on a single thread (whereas GCD normally doesn't make such a guarantee).
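The single-thread confinement idea can be illustrated in pure Python, which is not how it would look with GCD and the C API. This is only an analogy under that assumption: every job is funneled through one dedicated worker thread, whichever thread submits it. The class name is hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class InterpreterThread:
    """Run every submitted job on one dedicated worker thread."""

    def __init__(self):
        # A pool limited to one worker never uses a second thread.
        self._executor = ThreadPoolExecutor(max_workers=1)

    def run(self, func, *args):
        """Execute func(*args) on the worker thread and wait for the result.

        Must not be called from a job already running on the worker,
        since that job would then wait on itself.
        """
        return self._executor.submit(func, *args).result()

    def close(self):
        self._executor.shutdown(wait=True)


if __name__ == "__main__":
    worker = InterpreterThread()
    print(worker.run(threading.get_ident) == worker.run(threading.get_ident))
    worker.close()
```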


It is extremely likely that the author did not have the option to look at alternatives to Python, but some other scripting languages, for example Lua, Tcl and Guile, are easier to embed because they were designed with that use in mind from the beginning.

Among other things, Tcl and Lua (not sure about the latest Guile) encapsulate their interpreter state well, so one can run independent interpreters in multiple threads that communicate with each other by message passing but without the need for any serialization/deserialization.


If nothing else, I would imagine using Lua or Tcl instead of Python would make the binary size significantly smaller yeah? I seem to have gotten the impression that embedding the CPython runtime, even stripped down, could potentially grow binaries significantly, especially if statically linked.


My rule of thumb is the Lua core is about 100kb, and the Lua standard library is about 100kb. Tcl is about 2MB. (All built as dynamic libraries.)

Lua’s source is pretty malleable. There are things you can rip out from Lua if you don’t need them to make it smaller. I once read a claim that a micro-controller use got it under 20kb.


I prefer to embed a language that I enjoy programming in. I don't know about Tcl, but I enjoy using Python more than Lua. Python isn't that unusual of a choice for embedding either; eg: take a look at LLDB or Sublime Text.

Threading and reference counting may be some ugly aspects of embedding CPython sure, although I don't think manipulating the stack in Lua is that praiseworthy either.


I’ll stick my neck out there and praise Lua. While I will admit the stack is tedious to work with, all C binding bridges I’ve worked with are tedious. (With the exception of languages that use compilers and architecture/platform-specific tricks to pull off transparent bridging. LuaJIT and LuaFFI would be one example. Swift would be another.)

Lua’s design is at least very clean and keeps developers from accidentally creating nasty lifecycle design issues. Lua is very clear about ownership and visibility, which propagates cleanly through all aspects of Lua, such as working seamlessly with Lua’s garbage collection system. The design and code are also 100% portable to any processor/platform that has a C compiler (i.e. they require no platform/arch-specific #ifdefs, which is really useful when bringing up a lesser-used platform or chip). And the design has very clear performance implications. In contrast, there was a recent talk at Google I/O about Android Dalvik vs. ART with respect to JNI bindings and performance. The results were non-obvious to say the least.

There is a research white paper that compared the C binding design tradeoffs of several different languages. http://www.inf.puc-rio.br/~roberto/docs/jucs-c-apis.pdf


I have embedded Ruby and Python before so I found that paper interesting to read. Wish there were mentions of languages I'm unfamiliar with (eg: squirrel, javascript). Anyway, I appreciate the design decisions and no doubt Lua has merits, but my personal experience is I haven't found it to be a pleasant language to use or embed.

Garbage collection is a controversial topic too and, if I understand correctly, the way Lua's GC works is part of the reason it has a stack-based API. I somewhat liked the level of control I had with reference counting when working with CPython. Some applications may even go out of their way to disable the GC and use weak references to guarantee determinism similar to Apple's ARC. Squirrel, born as a competitor to Lua, looks to use a GC approach similar to Python's at a glance.
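The "disable the GC and use weak references" approach can be sketched as follows. This relies on CPython's reference counting (other Python implementations may free objects later), and the Node class and demo function are made up for illustration. The child points back at its parent through a weak reference, so no reference cycle forms and the parent is freed the moment its last strong reference goes away, even with the cycle collector off.

```python
import gc
import weakref


class Node:
    """Tree node whose back-pointer to the parent is a weak reference."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        # A weak back-pointer avoids a parent <-> child reference cycle.
        self.parent = weakref.ref(parent) if parent is not None else None
        if parent is not None:
            parent.children.append(self)


def demo():
    events = []
    was_enabled = gc.isenabled()
    gc.disable()  # rely on reference counting alone
    try:
        root = Node("root")
        child = Node("child", root)
        # Record the moment the root object is actually destroyed.
        weakref.finalize(root, events.append, "root freed")
        del root  # last strong reference: freed immediately on CPython
        parent_alive = child.parent() is not None
    finally:
        if was_enabled:
            gc.enable()
    return events, parent_alive
```

With a strong back-pointer instead, `del root` would leave a cycle behind that only the cycle collector could reclaim.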



