Hacker News
Show HN: Run Python, Ruby, Node.js, C++, Lua in the Browser via x86 to WASM JIT (leaningtech.com)
171 points by apignotti on Nov 22, 2021 | 31 comments



This is neat. I actually managed to run Bash and a custom executable (a 32-bit build of my LIL interpreter, which I compiled under 86Box running Caldera OpenLinux from 1999, which I guess also shows how you can be backwards compatible if you pay attention to it :-P):

    $ /usr/bin/python3  
    Python 3.7.3 (default, Jan 22 2021, 20:04:44) 
    [GCC 8.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os,sys,base64
    >>> os.system("/bin/bash")
    bash: cannot set terminal process group (-1): Bad file descriptor
    bash: no job control in this shell
    user@:/$ exit
    exit
    0
    >>> with open("/usr/lil", "wb") as f:
    ...     f.write(base64.decodebytes(b'...base64 encoded binary goes here...'))
    ... 
    53448
    >>> os.system("/bin/bash")
    bash: cannot set terminal process group (-1): Bad file descriptor
    bash: no job control in this shell
    user@:/$ cd /usr
    user@:/usr$ ls
    bin  foo  games  include  lib  lib64  libx32  lil  local  sbin  share  src
    user@:/usr$ ./lil
    Little Interpreted Language Interactive Shell
    # print [reflect version]
    0.1
    # 
Though it was kinda slow and hung a few times during my attempts (my initial attempt was actually to try and put the entire Free Pascal installer there, but pasting the base64-encoded string in there caused Firefox to eat all of the 32GB of RAM for some reason and made my system unresponsive :-P).
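
The transfer trick in the transcript can be sketched as a round trip: encode the binary on the host, paste the text into the REPL, decode it there. This is a minimal sketch; the helper names `encode_for_paste` and `write_pasted` are hypothetical, and the `chmod` step is an assumption, since the transcript does not show how the file became executable. Note that base64 emits 4 characters for every 3 bytes, so the pasted text is roughly a third larger than the binary.

```python
import base64
import os
import stat
import tempfile


def encode_for_paste(data: bytes) -> str:
    """Host side: turn a binary into text that survives a terminal paste."""
    return base64.encodebytes(data).decode("ascii")


def write_pasted(text: str, dest: str) -> int:
    """REPL side: decode the pasted text and write the binary to dest."""
    raw = base64.decodebytes(text.encode("ascii"))
    with open(dest, "wb") as f:
        written = f.write(raw)
    # Assumption: mark the file executable, as a regular Linux system would require.
    os.chmod(dest, os.stat(dest).st_mode | stat.S_IXUSR)
    return written


if __name__ == "__main__":
    payload = bytes(range(256)) * 4  # stand-in for a real executable
    text = encode_for_paste(payload)
    with tempfile.TemporaryDirectory() as tmp:
        dest = os.path.join(tmp, "lil")
        print(write_pasted(text, dest), "bytes written;", len(text), "characters pasted")
```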

I did notice that reloading the page keeps some files around which makes me wonder, where are these stored? Local storage? I notice that there is 25MB of data stored for the site.


Any disk block which is used or modified is stored locally into an IndexedDB instance. You can find more info in the accompanying blog post: https://medium.com/leaningtech/cheerpx-using-webassembly-to-... ("Data Storage" section)
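
As a rough illustration of the storage model described here, the sketch below keeps every touched block in a local store and falls back to the base image for everything else. It is not CheerpX's implementation: the `BlockOverlay` class, the 4 KiB block size, and the plain dictionary standing in for IndexedDB are all assumptions made for the example.

```python
from typing import Callable, Dict

BLOCK_SIZE = 4096  # assumed block size, chosen for illustration


class BlockOverlay:
    """Serve blocks from a local store, fetching from the base image on a miss."""

    def __init__(self, fetch_base: Callable[[int], bytes]) -> None:
        self._fetch_base = fetch_base       # e.g. a network request in a browser
        self._local: Dict[int, bytes] = {}  # stands in for an IndexedDB instance

    def read(self, index: int) -> bytes:
        if index not in self._local:
            # First use: fetch from the base image and keep a local copy.
            self._local[index] = self._fetch_base(index)
        return self._local[index]

    def write(self, index: int, data: bytes) -> None:
        if len(data) != BLOCK_SIZE:
            raise ValueError("writes must cover exactly one block")
        # Modified blocks live only locally; the base image is never touched.
        self._local[index] = data

    def stored_bytes(self) -> int:
        return len(self._local) * BLOCK_SIZE


if __name__ == "__main__":
    disk = BlockOverlay(lambda index: bytes(BLOCK_SIZE))  # base image of zeros
    disk.read(0)
    disk.write(7, b"\x01" * BLOCK_SIZE)
    print(disk.stored_bytes(), "bytes stored locally")
```

If the local store persists across page loads, as IndexedDB does, a design like this would explain why files survive a reload.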


This is awesome! Thanks for providing more details (was a bit hard to notice your medium post at first) [0].

For folks here interested in doing this kind of thing (one example is building web-available IDEs), the other way to run languages in the browser is to find implementations of the language in JavaScript, like Brython for Python; a few Schemes also come to mind. I wrote a bit about this here [1]. Some people have taken this even further [2, 3] than I did.

And specifically both this OP post and all the links I'm talking about work differently than repl.it does because repl.it has a server where your code runs. Everything I'm talking about runs solely in your browser. Which makes compute free, with a bunch of limitations (io, basically).

[0] https://medium.com/leaningtech/cheerpx-using-webassembly-to-...

[1] https://datastation.multiprocess.io/blog/2021-06-16-language...

[2] https://github.com/fiugd/plugins/tree/main/.templates

[3] https://github.com/viebel/klipse


Thanks!

The cool thing about CheerpX is that it's not limited to interpreters at all. It's a more general framework that can be used to run many Linux tools out of the box; we are currently experimenting with bash and some common commands, for example.


This is cool! But this suffers from multiple translations:

actual_cpu(browser_sandbox(wasm(x86_emulator(python_interpret(python source)))))

Is there some way to do a more direct JIT in wasm? Or maybe just a normal python interpreter written in wasm? Or would that not be faster?


The point of this technology is generality. It is theoretically possible, given enough work, to run Python directly in Wasm. That effort would be Python-specific, though, and you would need to start from scratch for every single language.

CheerpX allows running unmodified x86 programs; it does not matter if it's Python or something else. It does not even have to be a REPL, for that matter. We use this tech in production to run the legacy Flash plugin, for example: https://leaningtech.com/cheerpx-for-flash/


> given enough work, run Python directly in Wasm

It is already possible! See https://github.com/pyodide/pyodide


That's via emscripten. I suspect by "directly" they meant something like Python bytecode -> WASM. Which I think might be possible later, when some of the proposals[1] like GC, exceptions, etc, are done. That sort of fuzzy line where WASM is somewhat like ASM, and somewhat like a language VM.

[1] https://github.com/WebAssembly/proposals
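
For readers unfamiliar with what "Python bytecode" refers to here, CPython's standard `dis` module can list the instructions a function compiles to. The sketch below only inspects bytecode; it does not translate anything to WASM, and the exact opcode names vary between CPython versions. The `opcode_names` helper is a hypothetical name for this example.

```python
import dis


def add(a, b):
    return a + b


def opcode_names(func):
    """Return the names of the bytecode instructions CPython compiled func to."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


if __name__ == "__main__":
    for name in opcode_names(add):
        print(name)
```

A "direct" compiler in the sense discussed above would have to map each of these instructions, plus the object model behind them, onto WASM constructs.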


> It is theoretically possible, given enough work, run Python directly in Wasm.

To be fair that's not just theoretical, the Pyodide project does exactly that:

https://pyodide.org/en/stable/

There have been several ports of Python to JS or to wasm over the years. Pyodide is the most mature today, but another interesting one is pypy.js which ported PyPy and even included a JIT!

But I agree that CheerpX allows running more things. There will always be a use case for "just run an x86 binary" when you don't have the source code. But for things where the source is available a proper port to wasm like Pyodide will be faster.


As pointed out by another user, I was actually referring to compiling Python _directly_ to Wasm. Porting the C _implementation_ (CPython) to Wasm is obviously possible.


Well, Pyodide does port CPython directly to wasm. You're basically running CPython on wasm instead of on x86 - it's just another architecture. That's the natural thing to do I think.

But I see what you're saying, one could also theoretically write a new implementation of the Python language that uses wasm in a more "direct" way. We are exploring exactly that right now using wasm GC, compiling Java and Dart objects directly to wasm GC objects and so forth. That approach still has a lot left to prove, and it may not make sense for all languages, and specifically, I'm not sure if Python would benefit from it at all. So that is very much unclear.

In practice, when you want to run Python on wasm today, I think Pyodide's approach of porting CPython to wasm is the natural way, and it works well. That's what I was getting at.


This is quite impressive.

I would love to see if we can have something similar that doesn't require JS at all, so we can execute x86 programs server-side just using Wasm translation (hi Wasmer).

Here's another interesting project I found recently that I think also fits the asm2wasm translation approach: https://github.com/copy/v86/


The technology can certainly be adapted to non-browser & js-less VMs, but our current focus is the browser use case.

It seems to us that in server-side scenarios it is usually possible to execute apps natively anyway.


> I would love to see if we can have something similar that doesn't require JS at all, so we can execute x86 programs server-side just using Wasm translation (hi Wasmer).

Running user input on a server would definitely be a security risk, but might be nice when everyone is trusted anyway.

Though, if it's server side what would be the benefit of using WASM instead of just running it in the actual languages? I'm always on board for doing weird things just to see if they can be done, so if that's what you mean, I get it. I don't know much about WASM though, so I honestly don't know if there would be any practical benefits of running it server side.


> Though, if it's server side what would be the benefit of using WASM instead of just running it in the actual languages?

Assuming there's some usefulness in running software through Wasm for its universality and sandboxing properties, the answer to this would be twofold:

1. Compiling programs to Wasm natively is a challenge in itself (Emscripten has lowered the barrier by a big margin, but I would argue that it is still harder to use Emscripten than Clang)

2. You may have access to the binary, but it's still hard to have the tooling needed to compile it down to native/wasm code (meaning: having the right environment, compiler, libraries, ...)

By using an asm2wasm translation layer we would solve both the compiler toolchain problem and the library/environment tooling problem for running programs universally and safely.


> Running user input on a server would definitely be a security risk

What is lower-level than containers/VMs? I'm curious about those online sandboxes, which are great: you can fire code in there and it runs. I imagine they sanitize the user input, but if you assume the container (or whatever it is) could fail, can it still be considered secure and separate, like a glove?


This isn't working for me in Firefox. The tab hangs, then crashes.


It loads fine for me, but has severe rendering issues. There is no text, just weird, colored diagonal lines.


If anyone else was confused and was too embarrassed to say it out loud, I was stubbornly typing exit() to try the other languages before I actually read the menu. You have to click on the different REPL names to switch between them. And this link automatically starts you in python3.

This is really great though. Are you planning on adding any other languages, or alternative REPLs? For Python, the option to run bpython or IPython would be fantastic. As far as other languages, I could see this being a savior when I have to fix something written in Perl or PHP once every 3-5 years and need to relearn the syntax. Being able to choose arbitrary older versions of languages would be good for that use case too.

I was going to ask more questions, but then I looked around the site a bit. Is it fair to say this is essentially a demo for the CheerpX emulator, which started as a proprietary way to run legacy Flash applications on modern browsers? If so, then my next question is whether there is any chance of some or all of this becoming open source.


CheerpX is really not limited to REPLs. This demo was intended to showcase the technology and to measure public interest in this specific use case. We don't plan to make this a product in itself, but we are happy to partner with a third party interested in doing so.

The next demo we are planning is actually more ambitious: we will drop users into a bash shell and give them control of the machine. The experience will be something like having your own zero-cost cloud machine, always available and with total data privacy.

At this time we don't plan to make CheerpX open source, although this might change in the future. We are still working to figure out exactly what use cases we should bring to market first, so we prefer to keep our options open.


Very impressive stuff. Even after 20 years of developing, I still have trouble wrapping my mind around the complexities involved here.

If anyone hasn't read the blog post about this, it's a great read:

https://medium.com/@apignotti?p=3306e1b68f06


I would like to register the security issue: "import os; os.system('cat /etc/shadow')" demonstrates that without a kernel to check permissions (and without a user ID to be checked against), all filesystem safety guarantees go out the window. </joke>


Oh no! You have pwned a non-existing cloud VM :-)

More seriously though, you are right. The current implementation of the Linux syscall interface does not enforce permissions and users. We plan to implement those correctly at some point.

This said, most use cases we are currently considering do not have any shared state and are mostly "throw away" execution environments. As such the missing access control checks do not seem to be a significant problem.
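
For context, the check that is currently skipped looks roughly like the classic Unix owner/group/other test. The sketch below is a simplified illustration, not CheerpX or Linux kernel code: it ignores ACLs and capabilities, the `may_read` function is a hypothetical name, and the metadata used for `/etc/shadow` is an assumption that varies between distributions.

```python
import stat


def may_read(mode: int, owner_uid: int, owner_gid: int, uid: int, gids: set) -> bool:
    """Classic Unix read check: root bypasses, then owner, group, other, in that order."""
    if uid == 0:
        return True
    if uid == owner_uid:
        return bool(mode & stat.S_IRUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)


if __name__ == "__main__":
    # Assumed metadata for /etc/shadow: mode 0640, owner uid 0, group gid 42.
    shadow = dict(mode=0o640, owner_uid=0, owner_gid=42)
    print(may_read(uid=1000, gids={1000}, **shadow))  # ordinary user
    print(may_read(uid=0, gids={0}, **shadow))        # root
```

With a check like this in place, the ordinary user in the example is denied and only root (or a member of the owning group) can read the file.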


It's a shame that even the underlying compiler tech seems to be proprietary and intended to be commercially licensed once 'finished'. I feel like a lot of creativity (and optimization?) could be possible if tech like this were available to the commons rather than a years-long pre-announcement of a future commercial product.


Works fine in Firefox. The backspace key doesn't work correctly, which is a fun throwback.


Hey, luajit is fun! It should be the only demo affected by the backspace issue (if it's not, please let us know). While all the other REPLs expect raw terminal input (with all the control characters, for example newline and backspace), luajit expects cooked - or line-edited - terminal input.

This means that we had to support keystroke echo and we had to remap ENTER to get the basic functionality, and that full line-editing support is needed to get complete functionality.
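
To make the raw/cooked distinction concrete, here is a minimal sketch of the line editing a terminal layer has to do before a program expecting cooked input sees anything: echo keystrokes, apply backspace, and turn ENTER into a newline. It is an illustration under assumptions, not the demo's code; the `cook` function and the choice of backspace characters are hypothetical.

```python
from typing import List, Tuple

BACKSPACE_KEYS = {"\x08", "\x7f"}  # BS and DEL, either of which a terminal may send


def cook(keys: str) -> Tuple[List[str], str]:
    """Turn raw keystrokes into completed lines plus the text to echo back."""
    lines: List[str] = []
    echo: List[str] = []
    current: List[str] = []
    for key in keys:
        if key in BACKSPACE_KEYS:
            if current:
                current.pop()
                echo.append("\b \b")  # move left, overwrite with a space, move left
        elif key in ("\r", "\n"):
            # ENTER usually arrives as carriage return; the program expects a newline.
            lines.append("".join(current) + "\n")
            echo.append("\r\n")
            current.clear()
        else:
            current.append(key)
            echo.append(key)
    return lines, "".join(echo)


if __name__ == "__main__":
    lines, echoed = cook("prinx\x7ft(1)\r")
    print(lines)
```

Here the program only ever receives the finished line `print(1)`, while a raw-mode REPL would have received every keystroke, including the DEL character, and handled the editing itself.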


Oh, that's fun. You're right, I went straight for luajit and typed in some praise for Mike Pall. Everyone is weird..


Obligatory reference [0].

[0] The Birth & Death of JavaScript. https://www.destroyallsoftware.com/talks/the-birth-and-death...


No mobile version of Chrome, Firefox, etc. though.


Hopefully Safari will support this one day.


The blocking feature is SharedArrayBuffer, but it should become available in the not-too-far future.



