PyPy.js – PyPy in Your Browser (talkpython.fm)
126 points by mikeckennedy on Nov 9, 2015 | 73 comments



Instead of yet another .js to run * in the browser, maybe we should return the web and browser back to its pure days of marked up text documents with images and move "web apps" to a generalized, native runtime designed explicitly to run dynamically fetched untrusted code. Might be better than the current mess.

Edit:

And to all those thinking "but that's the browser", that's my point. We have the browser and while it works, it's incredibly broken in a lot of ways. I'm proposing a system designed from the ground up to do the job that the browser has currently been hacked and add-on'd into doing.


> a generalized, native runtime designed explicitly to run dynamically fetched untrusted code.

You mean like a browser?


Yeah, except not completely broken with shit like "let's run python in the browser with javascript!" and "let's animate UI components originally designed to be static with an overloaded scripting language!" and a million other things. The browser and web were originally designed to view webpages (read documents) and not run arbitrary untrusted code...

tl;dr you are making the point I refuted in my original post.


Browsers have evolved into exactly what people wanted Java for all along. Except it evolved from the wrong base. Kind of like if an elephant evolved to have intelligence and communication like ours instead of monkeys. That's pretty much the DOM & JS.


When you look at browsers, web, html, http ... the whole stack. It's pretty horrible. How did we get to this point?

...could be worse. At least it's not E-mail.


There's always Notes.. ;-)

Honestly, I don't think the browser is so bad. And once wasm takes hold and tooling catches up, you'll have lots of options.


>> It's pretty horrible. How did we get to this point?

Because ads... are more important to get right than other stuff /sarcasm


I think taking this sentiment to its conclusion is a bit more of a drastic change than that.

It would mean the end of the web, not a return to its beginnings. You wouldn't be dealing with marked-up text, you'd be dealing with binary files (text markup, of course, being a subset thereof). The internet (and whatever emerged on top of it to replace the web) would then be concerned with content, sharing, and identity management only, and applications would be built on top of that.[1]

The trick, as you said, is how to securely run untrusted code. That's what the web (well, I suppose more technically, browsers) are trying to solve; the problem is that the same origin policy, which is at the core of web security, just doesn't work very well. There are a lot of examples where this breaks; here are three off the top of my head:

1. Tracking using javascript to look for unique browser identifiers

2. XSS, CSRF, etc

3. Legitimate cross-site communication

So clearly our existing solution to the "how do you securely execute dynamic, untrusted code" problem isn't cutting it. There's a lot of research (and production code, for the record) going into potentially better approaches though, so I think it's likely we'll see change, if we can get past the inertia behind today's hairball.
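For readers unfamiliar with the rule being criticized: the same-origin policy treats two URLs as the same origin only when scheme, host, and port all match. A minimal Python sketch of that comparison follows. The helper names `origin` and `same_origin` are hypothetical, and the default-port table covering only http and https is an assumption made for illustration; real browsers handle more schemes and special cases.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes this sketch handles (an assumption;
# real browsers cover more schemes and special cases).
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that defines a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Fall back to the scheme's default port when none is given explicitly.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(a, b):
    """Two URLs are same-origin only if scheme, host, and port all match."""
    return origin(a) == origin(b)


print(same_origin("https://example.com/a", "https://example.com:443/b"))  # True
print(same_origin("https://example.com/", "http://example.com/"))         # False
print(same_origin("https://example.com/", "https://api.example.com/"))    # False
```

The last case is the "legitimate cross-site communication" problem from the list: a site and its own API subdomain count as different origins.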

[1] Incidentally this is exactly the approach we're taking with implementations of the Muse protocol (https://github.com/Muterra/doc-muse), which uses encryption to provide private content/sharing/identity management on untrusted servers.


While I completely agree, since it will never happen, why not strip javascript out of a browser (FF?), come up with a new catchy name, and market it as a safe browser. Just need a catchy name..., hmm, maybe Lynx? ;


Lynx actually had its fair share of vulnerabilities due to bugs in the parsing code. Ain't that easy.


NetSurf?


Links, perhaps...


Links used to support JavaScript some time ago [1].

[1]: http://links.twibright.com/user_en.html#ap-javascript


>maybe we should return the web and browser back to its pure days of marked up text documents with images

Good luck with that. You should bring back Gopher while you're at it.


Honestly, Gopher is clean and pleasant.


That would be fantastic.


I think WebAssembly is more or less going to solve this problem in the long run.


You mean like Flash and Java?


It's ironic.


No, he means a properly done sandbox but native and 2D accelerated platform without the DOM cruft.

Not sure why it's so hard, probably because it's in nobody's interests but the users' (as every major player just wants to promote their own platform and have the web play second fiddle to it).


Doesn't sound so different from Java applets.


The core concept, maybe.

Then again, a top tier BMW doesn't sound so different than an Edsel either. They both have internal combustion engines, 4 wheels, a steering wheel, and they take you from place A to place B in similar speeds (in orders of magnitude).

That's because you're focusing on the similarities which don't say anything -- what's important in all cases are the differences.

Which in this case are mostly about the implementation.

- the UI wouldn't be a part of a page, it would be THE page.

- it shouldn't be an over-engineered mess of a GUI toolkit like Swing.

- it should offer modern and capable widgets, instead of the bizarro, limited and uncanny-valley ones that Swing offers (if you don't get into intricate customization of your own).

- it should be faster than the web app it replaces -- applets were often slow to load AND slow to run.

- it shouldn't be controlled by a single company.

- it shouldn't be proprietary.

- it should, if possible, cater to many programming languages.

- it should play natively with the HTTP model, AJAX etc.


So like a canvas element then?


In the sense that Cocoa or Windows top level window hierarchy is also like a "canvas element".

In other words, like a canvas element which has a full, comprehensive, standard UI library with tons of standard, pre-built widgets (not just the meagre web form controls), including a full featured text widget with all the trimmings, plus a free drawing component of its own (for totally custom UIs and controls).


Maybe WebAssembly and WebComponents will finally fix this.

However I started to feel the same way a few years ago when the HTML 5 wave started, after several years of doing web development as well.


What you're describing is exactly what Java Applets were designed to do 20 years ago.

And yet they never achieved a fraction of what "the current mess" has achieved.


I think java apps failed to achieve more mainly because they didn't have decent access to the DOM, and the HTML rendering engine that goes with it.

IMHO that's the killer app of the browser... for all its faults, the HTML+CSS rendering engine is a superior UI description api than anything previously devised.

I remember the brief period I wrote some java apps. The main pain was choosing between rendering a native java gui, or establishing hooks to have it instrument an html interface from behind the scenes. The native gui had better connection to the code... whereas using the DOM was insanely flexible, but talking to the DOM was like trying to drive a car by reaching through the tailpipe.

That I think is why java apps never prospered (outside of the whole "loading" issue... which could have been optimized away if the motivation was there).

---

JS currently has the best of both worlds, and not because of anything native to the language (though it does have many positive points).

I feel like WebAssembly, and things like PyPy.js, are a step in the right direction. Making the DOM and HTML+CSS just another ui library available to the language of the moment. Having PyPy.js is great, because (I assume) where PyPy's python goes, all the other pypy interpreters (Hippy, etc) can follow.


> IMHO that's the killer app of the browser... for all its faults, the HTML+CSS rendering engine is a superior UI description api than anything previously devised.

Except it's horrifically slow. It was never designed for widespread dynamic update, and to work around it we have to go to great lengths like having a virtual DOM. Combine that with the widespread incompatibility of different implementations, and I have trouble seeing it as a good UI framework at all, let alone a superior one.


As a communications platform, sure. As an application platform, I'm not sure I agree. I have used many actual applications packaged as .jar files. They're generally more featureful and performant than we have come to expect from web apps[1]. As a platform-independent, single-file distribution format, Java actually works very well.

Java wasn't ever intended to compete with the Web; it was the Web that got ideas.

[1] I know, citation needed. Okay. Show me the web app that looks like this, that doesn't kill my EeePC: https://upload.wikimedia.org/wikipedia/commons/2/27/Gephi-07...


Honestly, the things people did with applets probably haven't been replicated by JS in the browser, especially when native got involved (if I'm remembering correctly).


Nobody is stopping you from making a browser that doesn't run JS at all.. nobody will use it, but nobody is stopping you from doing it... there's several decent rendering codebases you could strip JS out of and disable by default. Of course you won't be able to get to most content, but hell, it will be PURE.


I agree that there is a mess, but what kind of mess is it? What about it makes it a mess? Is it perhaps because the JS interpreter is embedded (the DOM calls it)? Is it that the DOM is poorly suited for what we used to call "rich" internet applications? Or is it Javascript itself that is the problem?


For starters, the DOM API is horrible.


Beyond the unpleasant experience of using it, what is actually wrong with it?


The main problem with the DOM API is that when used from JS it's a pain, because it was designed largely by people wanting to do DOM in XML documents in Java on the server side; the use on the web was considered less important at the time. Many of the resulting issues (e.g. lack of use of options objects when it would be appropriate, numeric "enums" etc) flow from this historical accident.


Very easy to have all kinds of memory leaks right now for one. It is really a combination of Javascript and the DOM. Unpleasant to use is definitely a big thing too. If it was easy and pleasant, nobody would complain.


Is it really easier to have memory leaks here than in any other garbage-collected language with first-class functions and closures? If you stick references to objects somewhere permanent, you will leak them. If you don't, you won't....


Ok, it's not that simple:

http://www.javascriptkit.com/javatutors/closuresleak/index3....

The last example is a good demonstration of that. Sure, at the end of the day it's all about not cleaning the references. All I'm saying is that it is super easy to accidentally not clean the references. In other languages or even in javascript itself without the DOM, it's not nearly that easy.


The exhibits on that page seem to be quoted from https://msdn.microsoft.com/en-us/library/bb250448%28v=vs.85%... which is a 10-year-old article that was marked as obsolete 4 years ago. And it's specific to old IE versions; the problems are fixed in newer IE versions. And even in IE6 as of 2012; see https://support.microsoft.com/en-us/kb/929874

Back in 2005 Gecko (Firefox) had issues like this too, but a cycle collector was added in 2006 that handles this sort of problem.

For DOM nodes, issues like this basically don't happen in modern browsers.

There _are_ still some existing bugs of this sort in browsers for objects other than DOM nodes (especially WebKit and Blink; modern IE and Gecko are way ahead here, though Blink is actively working on fixing this in their implementation). But you typically have to work quite a bit to hit these cases; all the common cases are handled correctly because otherwise pretty much any web site you left open for an hour or two would use up all the RAM on your machine.

I do agree that if you have to worry about compat with old/unpatched browser versions then you run into some serious problems with this sort of thing. If you have to deal with that, you have other serious problems too, of course.

And, again, the source of all the "leak while the website is live" behaviors I have seen recently is websites doing things like sticking more and more stuff in global arrays. That's just a basic no-no in a garbage collected language, obviously.
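The "global array" leak is not specific to JS or the DOM; it can be reproduced in any garbage-collected language. A small Python sketch, where `Node`, `cache`, and `handle_event` are hypothetical names chosen for illustration and a weak reference is used only to observe whether the object is still alive:

```python
import gc
import weakref


class Node:
    """Stand-in for any object a page might allocate (hypothetical)."""


cache = []  # module-level list: everything appended here stays reachable


def handle_event(keep):
    node = Node()
    if keep:
        cache.append(node)  # the "leak": a permanent strong reference
    return weakref.ref(node)  # a weak reference does not keep node alive


kept = handle_event(keep=True)
dropped = handle_event(keep=False)
gc.collect()  # make sure anything unreachable has been reclaimed

print(kept() is not None)  # True: pinned by the global list
print(dropped() is None)   # True: nothing referenced it, so it was freed
```

The collector is doing its job in both cases; the first object survives only because the program still holds a reference to it.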


IIRC it was particularly hard for MS to improve on (got much better starting with IE8)... I worked on an extjs app around 09 that would just eat itself in IE6 (the corporate browser at the time) if the app was open more than 3-4 hours. It was pretty bad.. FF didn't have any issues at the time, and could run open all day. Even then, still much better than the late 90's, when the NN/IE4 browsers were still hanging on, and needed to be supported... that was anti-fun.


Given any abstraction on a turing machine you will find it is easy to have memory leaks.


WebAssembly is a generalised, native runtime for dynamic untrusted code. You can do things that wouldn't be possible in JS.


Are you perhaps talking about something like the Seif project from Crockford/PayPal?


seems like the browser is becoming another virtual machine.


I may get criticism from some, but sometimes I wish we did get other languages supported by most browsers such as maybe Lua, of course Python, as well as maybe a few others as an alternative to the JS. It would be useful for those whose workflow doesn't fully involve JS.


WebAssembly should be a game-changer for that by removing the current need to compile and actually run JavaScript:

https://github.com/pypyjs/pypyjs/issues/145


In his PyCon 2015 presentation, the main author (from Mozilla) explains why it isn't so simple to just include Python in FF: https://youtu.be/PiBfOFqDIAI?t=1611


What's sad is that Firefox already had Python working inside <script> tags (for addons only, though) back in 2005: https://bugzilla.mozilla.org/show_bug.cgi?id=255942


A few notes on that implementation, since I have some firsthand experience with both the implementation and its removal:

1) "For addons only" is an important qualification. It means you don't have to worry about sandboxing at all, since addons are already all-powerful (or at least were at the time).

2) It added a _lot_ of complexity throughout the DOM. There are tons of places in the web platform that basically assume you have a "JS value" and things fall apart if that's not the case. Most simply, what happens if python code sets document.body.onclick and then JS code reads it, or vice versa? It's possible that this complexity and the resulting performance impact is worth carrying around if the use cases are compelling enough, but they just weren't.

3) The python implementation in question did not have a good solution for the problem of cross-language cycles. Those were solved for JS and C++ in Firefox by creating a cycle collector that tightly integrates with the JS GC and the refcounting system the C++ side uses. But that was only possible due to full control over the JS GC implementation (and hence the ability to change it to flag objects as "only owned by C++ stuff" and whatnot). Doing the same thing with Python would be a nontrivial undertaking, just for the C++ and Python bits. Dealing with cycles that involve all three languages would be _quite_ hard.
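To make the cycle problem in point 3 concrete for a single refcounted runtime: plain reference counting cannot free objects that point at each other, so a separate cycle collector is needed. The sketch below assumes CPython, which pairs refcounting with a cycle detector exposed through the `gc` module. It only illustrates the single-language case and is not a model of Firefox's collector; `Obj` and `probe` are hypothetical names.

```python
import gc
import weakref


class Obj:
    """Plain object that can hold a reference to another object."""

    def __init__(self):
        self.other = None


gc.disable()  # turn off the automatic cycle detector for the demo

a, b = Obj(), Obj()
a.other, b.other = b, a  # a -> b -> a: a reference cycle
probe = weakref.ref(a)   # lets us observe whether `a` is still alive

del a, b  # refcounts never reach zero because of the cycle
alive_after_del = probe() is not None

gc.collect()  # explicit cycle collection finds and frees the pair
alive_after_collect = probe() is not None
gc.enable()

print(alive_after_del, alive_after_collect)  # True False on CPython
```

When the two objects live in different runtimes, neither collector can see the whole cycle on its own, which is the difficulty the comment describes.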



    <script type="text/vbscript">
       Sub link_onClick
          ' Do some processing!
       End Sub
    </script>


What's funny, is I wrote and used a lot of JS in classic ASP... the only PITA was dealing with COM iterators, which in JS was just not intuitive at all. But it did save me a lot in being able to reuse validation logic client and server.. it also allowed for some nice serialization/communications... probably why I like node so much now.. it's just less of a disconnect...

Closest I've come to JS client and server was with ActionScript via Flex/Flash and VB.Net on the server... having XML literals on both sides was really nice, though I prefer JS on both ends... I don't miss C# much at all.


I had done a little bit of vbscript back in the day when it was on IE. I probably just copied and pasted sample code though.


How does it compare with Brython? http://www.brython.info/


PyPy.js literally starts with PyPy compiled to asm.js, so you intrinsically get relatively comprehensive Python compatibility right out of the box. Brython, as I understand it, is an independent attempt at implementing a subset of Python in JavaScript.

Edit: rfk addresses this directly around 46:00 in the interview. Paraphrased: "The big trade-off Brython does to get good performance is the number model. Things there map pretty directly to JavaScript. But if you overflow or do something different, you suddenly won't get Python's semantics. If you're writing new code for Brython, it's a good set of trade-offs, and in some ways you get tighter integration with the browser and the DOM. But if you're trying to take some existing code off the shelf and run it... you're more likely to run into issues."
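The "number model" trade-off can be shown without a browser. Python integers are arbitrary precision, while a JavaScript Number is an IEEE 754 double. In the sketch below a Python `float` (also an IEEE 754 double) stands in for a JS Number. This only illustrates the kind of divergence being described; it makes no claim about what Brython actually does with such values.

```python
# Python integers are arbitrary precision, so this is exact.
big = 2**53 + 1
print(big)  # 9007199254740993

# A Python float is an IEEE 754 double, the same representation a
# JavaScript Number uses, so it serves here as a stand-in.
as_double = float(2**53) + 1.0
print(int(as_double))         # 9007199254740992 (the +1 is lost to rounding)
print(int(as_double) == big)  # False
```

Code that relies on exact large-integer arithmetic would therefore behave differently if integers were mapped straight onto doubles.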


However, he is wrong about the state of Brython https://groups.google.com/d/msg/brython/NRHEaoxAfgw/xMipOAYN...


He addressed Brython at PyCon '15. I don't see the slides online, but the video is at https://www.youtube.com/watch?v=PiBfOFqDIAI.

Abstract:

> PyPy.js is an experiment in building a fast, compliant, in-browser python interpreter. By compiling the PyPy interpreter into javascript, and retargeting its JIT compiler to emit asmjs code at runtime, it is possible to run python code in the browser at speeds competitive with a native python environment. This talk will demonstrate the combination of technologies that make such a thing possible, the results that have been achieved so far, and the challenges that still remain when trying to take python onto javascript's home turf.

> We'll cover: an overview of PyPy and why it's a good fit for this type of project; an introduction to asmjs and the rise of javascript as a compile target; what it looks like when you smoosh these two technologies together; a comparison with other approaches such as brython and PythonJS; and some concrete suggestions for how the result might be useful in practice.


I had a conversation about Brython vs PyPy.js on Reddit 6 months ago[1].

The short of it is that I wouldn't trust Brython to handle edge-cases that PyPy.js will, and that PyPy.js is at least as fast for me when warmed up. That said, I've lots of praise for the Brython project, and it's a way lighter dependency (both in size and warm-up times).

[1] https://www.reddit.com/r/Python/comments/33m7io/comparing_th...


In the future we might be able to compile to Web Assembly.

https://github.com/WebAssembly/design/blob/master/README.md


I'm amazed that it works, but not sure it's all that useful. Python is OK, but it's not really much better than Javascript for the things people do client-side.

Why do so many people hate the DOM? You need some kind of representation of the page, and the DOM is just a tree representation of it. It's not hard to access or modify from a program.

And why this sudden enthusiasm for running programs in the browser? Is it because Java applets and Flash have finally been removed from most browsers, and people needed those capabilities?


It comes down to the fact that people do need to do things in a browser context... Java and Flash just had horrible security track records. In the end, if Adobe had made Flash more standardized, and if they could have gotten the format closer to xapp/silverlight packages, around JS, SVG and MP3 native in the browsers, they could have kept making money on the tooling, and been better off. That was my hope when Adobe bought out Macromedia... didn't work out that way.

In the end, people are learning what works best... In the end it depends.. I think it comes down to people wanting cleaner application development targeting the browser.. I think modern node/babel/webpack apps are much better as a workflow (I also like React). However, I don't begrudge people wanting to target other environments.. I think source maps go a long way to making it nicer to do in practice.

I'm also appreciating the resurgence of more FP concepts over OO everywhere... Outside of controls/components, I don't think classes work well for most JS needs.


Java and Flash were supposed to be sandboxed, but weren't. Is there a convincing demonstration that "asm.js" is more secure? Or is it just that nobody has done a big "asm.js" exploit yet? "asm.js" exploits have been found.[1][2] There are probably others already being exploited.

[1] http://www.scip.ch/en/?vuldb.12180 [2] https://www.mozilla.org/en-US/security/advisories/mfsa2015-2...


asm.js should have the same sandbox model that JS in general has. Which, while not perfect, has been a bit better than Flash/Java... Removing all programmatic functionality from the browser simply isn't a reasonable option. And imho it's better than the binary plugin option.


Seriously... do we really need to invent yet another implementation of Python that runs on the web in Javascript? I mean it with heart. How about we just have one language and one implementation and forget about it? Wouldn't that be super ideal?


Even if it's not extremely useful it's an interesting project. People tend to like doing interesting things for fun.


Per comments above, pypy.js is not presented as a trivial but interesting thing, but as something useful.


It's easy to see how this could be useful for somebody though.

Imagine you're a scientist, and you have a large, pure python codebase - let's say for intelligently predicting the number of foos in a bar - and you want to make that available via the browser, in order to visualize the progress of your foo-hunter.

This makes it a trivial task - and means you can use cool web-based visualization tools (D3, etc), and combine with other useful APIs.


I think most scientific apps written in Python rely on native non-pure-python (written in C, Fortran etc...) modules which I'd expect to not work in this context.


If you wanted to go crazy you could use Emscripten but I could easily see pushing the C+Python parts to a server-side web service and then using Pypy.js for the rest.


Except that for historical reasons, js is a very rushed/badly designed language. More precisely it mixes moments of brilliance and very stupid design choices.

Ruby, Python, Lua are in terms of language design way superior. Coffeescript and co emerged because it was more reliable to have a complex stack than to use the JS language directly.

That's the only high-level language that produced this situation. Python/Ruby & co are not perfect but good/reliable enough. You don't program in another language that compiles to python to avoid its verbosity and its traps.

If python had been chosen for the browser, we would have used for the last 15 years: modules, sane scoping rules, classes, iterators, etc. Yep, that's something like ecma6 without the ugly baggage.

15 years, damn it.

So no, please give us at least Python in the browser and let people experiment with other language design choices and let them find something superior to Python. At some point, it will be obsolete like any other tool. I don't want to be forced to use it because it is the only choice.


> Wouldn't that be super ideal?

No!

Languages (and language implementations) have different strengths and weaknesses. No one language may hope to serve all needs equally well. While for what you do your single language may be sufficient, there are people and use cases where it's better to use something different.

I'd rather embrace the diversity than try to artificially limit available choices.


I wonder if this is a toy/technology demo, or a really solid Python execution environment.

Tech demos/toys are an interesting distraction for a minute. Deep efforts to build a full execution environment warrant deeper attention.


A browser is not a decent platform for anything.



