What is WebAssembly? (2015) (medium.com/javascript-scene)
140 points by VeilEm on Jan 11, 2016 | 66 comments



What's the difference between asm.js and WebAssembly?

"The initial implementation of WebAssembly support in browsers will be based on asm.js and PNaCl". PNaCl? https://en.wikipedia.org/wiki/WebAssembly , https://en.wikipedia.org/wiki/Asm.js

So is WebAssembly merely an asm.js v2 where all browser vendors agreed on a standard?


More specifically, asm.js can be seen as a WebAssembly alpha, and will exist as a shim until most browsers can run WebAssembly. Both WebAssembly and asm.js can be targeted through Emscripten.
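
To make that concrete, here is a rough sketch of pushing the same C file through Emscripten to either target (the emcc flag spellings are from memory and may differ between Emscripten releases):

    /* hello.c - a trivial program to push through Emscripten.
     *
     * Roughly (exact flags may vary by release):
     *   emcc hello.c -O2 -o hello.js             # asm.js output today
     *   emcc hello.c -O2 -s WASM=1 -o hello.js   # .wasm output once supported
     */
    #include <stdio.h>

    int main(void) {
        printf("hello from C, via LLVM, via Emscripten\n");
        return 0;
    }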


No, it's a (LLVM-based) bytecode representation. http://nativeclient.googlecode.com/svn/data/site/pnacl.pdf


WebAssembly doesn't actually have anything to do with LLVM. There's an LLVM backend that can generate it, but that's no different from how Emscripten acts as an LLVM -> asm.js pipeline.


PNaCl is based on LLVM - if WebAssembly doesn't use that technology, then what's the connection to PNaCl?


A lot of useful information from PNaCl has informed WebAssembly, and a bunch of the same developers are involved.


Note, this is from last June. It was submitted a few times, but apparently only one had a comment: https://news.ycombinator.com/item?id=9742556


I just want a web framework that works like Qt or GTK, or even a desktop framework that can be styled/themed independently of function.

WebAssembly should give us the opportunity for this.



No, that's not really what it's about. Perhaps you're thinking of something more like NW.js or Electron.


I think they mean a framework like Qt for building web apps, not a web framework for generating native apps like Qt.


I've read this article several times, but I still haven't understood how JS and WASM will live together.

If I can use C/C++ with static typing (assuming I can do DOM manipulation from my C/C++ program), why should I use JS?


Why would you ever use any language over C/C++? Different abstraction levels, type systems (or lack thereof), ecosystems of libraries, even syntaxes all contribute to a decision in choosing a language.

You might not like JS personally but there are plenty of things to like about it and many other people do.

In practical terms, browsers will support JS forever for backwards compatibility even if all new code is written in WASM.


Personally I really like JavaScript and have used it for many years.


What happens when a monopoly is broken? Healthy competition ensues. The "misunderstood" language will still be used by devoted fans, and the rest of the developers will be able to have their pick, as it should have been to begin with.


There was an article on the history of Internet standards that I really liked; it noted that the most effective and adaptable standards had all essentially been API-like, rigidly-defined-but-simple boundaries.

The relationship between WASM and JS seems to be "js/js virtual machines are a de facto standard, so let's not repeat the mistakes of Java plugins and instead build off that foot-in-the-door."


C/C++ doesn't have any sensible way to interact with the DOM and the JS GC heap directly, so that's one reason you'll want javascript code to stick around. Your website/application will probably need access to some JS libraries for things like analytics, as well.
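
To illustrate that bridge, here is a minimal sketch using Emscripten's EM_ASM escape hatch (an Emscripten convenience, not part of WebAssembly itself): the C side can only reach the DOM by handing a snippet of JavaScript back to the host engine.

    #include <emscripten.h>

    int main(void) {
        /* The JavaScript between the parentheses is handed back to the
           surrounding JS engine and runs there; that is the only place
           the DOM is actually reachable. The C side never holds a DOM
           object directly. */
        EM_ASM(
            document.title = 'updated from C';
        );
        return 0;
    }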


You could mix and match modules and libraries, perhaps.

You could make the same argument about any higher level language, even those with as many quirks as JS, like PHP ;)


Because using C or C++ for a website frontend would make your code 10x longer and more likely to have bugs.


Is this assertion actually based on any data?


FTR, Turbo Pascal also allowed you to write in assembler - the built-in assembly was well integrated with the language, too, doing things like referencing record offsets symbolically. If you could live with the limitations, it was pretty nice.


> No built-in automatic garbage collector

This is cool, because this means people will implement their own garbage collection mechanism, which could lead to interesting innovations.


> interesting innovations

Modern computers getting slow at 20 tabs of typical news sites articles is already pretty interesting. But I can't wait to see how we can innovate this even further!


> 20 tabs of typical news sites articles

I'm impressed you can read 20 articles at once! (Said only half jokingly. I do the same thing, but optimizing my behavior is probably the better solution to slowness)


That reminds me of an awfully old joke: "I wish I had enough disk space for a 2-gigabyte swap file. — Why the hell do you need a swap file so large? — I don't. I just need the space."


The set of people who have enough expertise to innovate in GC above and beyond what current browser JS engines do would fit in one small conference room. And most of those folks already work on browser JS engines. :)


This is absolutely true, and I wish I could send this to everyone who claims "just use [deferred] reference counting instead of tracing GC" :)


They plan to add a standardised garbage collector at a later stage to support other languages that require one, so they can compile straight to WebAssembly.


1. If people can implement their own GC, then they can make it fit exactly their specific use-case. For example, a GC for Haskell might be totally different from a GC for JS.

2. If web-assembly incorporates a GC, then it will become needlessly complicated.

3. Unnecessary complication in web-assembly also means unnecessary room for security flaws.

(4. What good is a web assembly if you can't implement an efficient garbage collector in it?)


In addition to the other commenter mentioning it will use the JS GC of the browser for #2 (which has tons of manpower behind it), the issue with #4 is that you would need threads w/ shared memory [1], which seems as far out as a native GC. Also, for #3, it's not "unnecessary" for a multitude of developers and languages. Web development without (at least optional) GC is going in the wrong direction.

1 - https://github.com/WebAssembly/design/issues/104


> which has tons of manpower behind it

What is wrong with open-sourcing GC code as independent libraries, so that web-app makers or (more likely) compiler developers (with less skill/time) can use them at will?

> the issue with #4 is that you would need threads w/ shared memory

But this is exactly what we need for other applications as well. For example, how would you send a large immutable data-structure across two threads? By copying it? Of course not, you just share pointers, meaning that the address space should be shared. If this is not possible, then that is a major flaw in WASM's design.
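
As a minimal sketch of that pointer-sharing argument, using POSIX threads as a stand-in for the proposed shared-memory wasm threads (nothing below is the actual wasm API; it only illustrates why a shared address space matters):

    #include <pthread.h>
    #include <stdio.h>

    /* A large table built once by the main thread; the worker reads it
       through a pointer instead of receiving a copy. This only works
       because both threads see the same address space. */
    enum { N = 1000000 };
    static int table[N];

    static void *worker(void *arg) {
        const int *shared = arg;                  /* same memory, zero copying */
        printf("worker sees table[0] = %d\n", shared[0]);
        return NULL;
    }

    int main(void) {
        table[0] = 42;                            /* "build" the data structure */
        pthread_t t;
        pthread_create(&t, NULL, worker, table);  /* pass a pointer, not the data */
        pthread_join(t, NULL);
        return 0;
    }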

I think you have the impression that incorporating a GC into WASM makes life easier. But it doesn't. It is the exact opposite. WASM should be as simple as possible.


It would be fine if it were an independent library. I am not under the impression that incorporating a GC into WASM makes life easier. I am under the impression that having a GC available makes life easier. To think that you can magically recompile the existing GCs that V8 or WhateverMonkey are using to WASM is naive. Exposing the ability to request memory, add roots, and handle weak references, etc. will make it more usable. Waiting until WASM has the ability to do threaded memory management at the same level as the existing GCs and have them rewritten (or refactored) for it may be too long. I would also support a polyfill approach where a GC shared-lib-sort-of interface is exposed and tied to existing implementations at first to keep it out of WASM.


> To think that you can magically recompile the existing GCs that V8 or WhateverMonkey are using to WASM is naive.

I don't think it is. What I would like to see is that compiler writers keep control over their GC.

> Waiting until WASM has the ability to do threaded memory management at the same level as the existing GCs and have them rewritten (or refactored) for it may be too long.

I don't see the big problem. We already have assembly without GCs, and it is called VirtualBox (or VMware). Why not use something like that? (Yes, there are still some security issues, but these can be solved much more easily than when hairy GC code becomes part of the game.)

In other words, let's first create a real assembly language with a simple but adequate instruction set, and make it secure. This is what I would call "assembly language" anyway.


I think the value in having a standardized GC is simpler interop between wasm languages.

There's nothing that would prevent writing your own GC if you felt the need.
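
For a sense of what "your own GC" would sit on top of: wasm hands you a flat linear memory, so the simplest building block is an arena/bump allocator the module manages itself. A sketch in plain C, with no wasm-specific APIs assumed:

    #include <stddef.h>
    #include <stdint.h>

    /* A chunk of the module's memory managed entirely by the module. */
    static uint8_t arena[1 << 20];   /* 1 MiB reserved up front */
    static size_t  arena_top = 0;

    void *arena_alloc(size_t n) {
        n = (n + 7) & ~(size_t)7;                 /* keep 8-byte alignment */
        if (arena_top + n > sizeof arena)
            return NULL;                          /* out of arena space */
        void *p = arena + arena_top;
        arena_top += n;
        return p;
    }

    /* "Collection" in the simplest scheme is just resetting the arena,
       e.g. once per frame or per request; a real GC would trace live
       objects from a root set instead. */
    void arena_reset(void) { arena_top = 0; }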


Not to mention the horror of inventing and deploying an efficient binary packed web execution format, only to slow things down even further in 5 years by seeing webpages load myGC_v1.2.5 + myGC_v1.3.1 + nuGC_v2.1.3 etc etc


This can be approached the same way as shared binaries. Just load the relevant libraries once, then cache.


WebAssembly's GC isn't just some arbitrary new GC, it's the JavaScript GC. The benefit there is so wasm apps can integrate with JS and the DOM directly the same way things like Python or Ruby plugins written in C integrate with those GCs. Nothing prevents wasm apps from using their own GC as well.
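
For comparison, this is roughly what that integration looks like on the Python side: a C extension builds an object and hands ownership to Python's memory manager (CPython API, shown only to illustrate the analogy, with error handling omitted):

    #include <Python.h>

    /* The list created here is owned and reclaimed by Python's memory
       manager; the C code only manipulates reference counts. */
    static PyObject *make_list(PyObject *self, PyObject *args) {
        (void)self; (void)args;
        PyObject *list = PyList_New(0);           /* we hold one reference */
        PyObject *item = PyLong_FromLong(42);
        PyList_Append(list, item);                /* the list takes its own reference */
        Py_DECREF(item);                          /* drop ours; Python tracks the rest */
        return list;                              /* ownership passes to the caller */
    }

    static PyMethodDef methods[] = {
        {"make_list", make_list, METH_NOARGS, "Build a list from C"},
        {NULL, NULL, 0, NULL}
    };

    static struct PyModuleDef demo_module = {
        PyModuleDef_HEAD_INIT, "demo", NULL, -1, methods
    };

    PyMODINIT_FUNC PyInit_demo(void) { return PyModule_Create(&demo_module); }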


The built-in WebAssembly GC could just be the native JS GC, which is very, very good for most dynamic languages. There's really not that much difference between GCing JS, Lua, Python, and MyDynamicLanguageWithRubySyntaxAndLuaSemanticsAndNewFeatureX. I don't see why they'd need to do anything other than just expose the GC that's already there. Browser makers are smart enough to know "invent new GC that solves every conceivable language's memory problems" isn't going to work.

Since WebAsm is still low-level, functional languages would still have the freedom to implement their own, more efficient GC (avoid the write barrier in a lot of cases, scan only the young heap most of the time, etc.).


> There's really not that much difference between GCing JS, Lua, Python, and MyDynamicLanguageWithRubySyntaxAndLuaSemanticsAndNewFeatureX

This sounds a lot like "640k ought to be enough for everyone" :)

Also, it only addressed my first point.


> 4. What good is a web assembly if you can't implement an efficient garbage collector in it?

There's nothing stopping you from implementing your own GC. But having a standardized GC (the JS engine's GC) means much simpler interop between wasm languages.


I read the news back in June but haven't followed WebAssembly since. Any interesting developments lately?


The V8 team recently announced that they will be implementing WASM. https://groups.google.com/forum/#!topic/v8-users/PInzACvS5I4


> A new language: WebAssembly code defines an AST (so does JavaScript) represented in a binary format. You can author and debug in a text format so it’s readable.

Heh, since every binary executable has non-compiled source code somewhere out there, can we call them readable too?


I don't think that's what they're saying; they're saying that wasm has a somewhat-readable textual representation. Not a precompiled/decompiled program, but a one-to-one mapping from the AST to the text format. Or something.

Think of it like how assembly is to machine code, but probably more readable.


The intent is exactly like assembly/machine code, though they haven't really nailed down the format yet. Current implementation(s?) just use s-exprs, though I don't think they want that to be the final form.


I am still a little skeptical about this new language. It is not so hard to write optimizable JS. And when you need a real performance boost, most such tasks are parallelizable, so you can use low-level WebGL or WebCL.


This isn't meant to replace the code you're writing, but rather serve as a target architecture for compilers. It may also end up speeding up your JavaScript if you compile it down, but the biggest difference will be to the current "compile-to-javascript" languages (TypeScript, Elm, Dart, etc.)


No browsers support WebCL. I believe Mozilla has stated that they prefer to wait to get WebGL compute shaders so that they can end up with one system to maintain instead of both.

In my experience, carefully optimized JavaScript is still 10x slower than equivalent C++. C++ to JS brings that down to 2x.


The overhead of creating and transferring data between web workers can be very limiting, especially when we are talking about low level, high performance operations.


Sounds like it's opening up a whole new world of potential exploits.


No more than the typical bounds checking done for asm.js byte buffers. It's nothing like NaCl.


Another thing to block in my browser I suppose.


Why? It's functionally equivalent (at least for now) to asm.js, which is just a subset of JavaScript.


This sounds great, but it's going to make it so, so much easier to spy on people and hide all kinds of nastiness in web applications.

Worse is that tons of web applications that get exploited won't be understood by the people who run them. Wordpress is a total minefield right now and it gets hacked all the time. What'll happen once they have a module that compiles up web assembly output for all the other modules running to speed things up?

EDIT: wordpress the thing you install yourself, not wordpress the hosted app


First:

WebAssembly will have a readable text format for view-source functionality: https://github.com/WebAssembly/design/blob/master/TextFormat...

Second:

"...so much easier to spy on people and hide all kinds of nastiness in web applications."

What kind of spying and nastiness? WebAssembly will not have more access to your data than the JavaScript API; it will have access to the same APIs as JavaScript. For example, if it wants your location it will call the same HTML5 geolocation API, with the same restrictions and the same permission popup. Setting cookies will likewise go through the same API. And thanks to the built-in developer tools in browsers you can check the outgoing requests to see what is sent and to where.


A readable text format at a much lower level of abstraction than JavaScript is right now. Different is different.

It's not so much about spying on your computer and gaining extra access, it's obvious (short of implementation bugs) that you won't gain any additional privileges that way.

But what you will gain is a way to obfuscate extremely well "report such and such to some webserver" in a way that's difficult to detect. For example, you can hide the entropy inside of a fairly innocent looking URL and without a lot of digging you won't know what that entropy represents. It can look like just a plain jane resource request and the webserver can serve up the exact same resource no matter what the entropy is, but also record that entropy for a back-channel way of exfiltrating information from your browser.

Finally, it opens up a whole new world of compiler attack. Right now the attacks against wordpress involve writing some information into a file and making it look "weird but I don't know what it does so I'd better not touch it".

What happens when breaking into a wordpress install means that you can execute the equivalent of the untraceable compiler login exploit insertion attack? You can't perform this attack without 1) a compiler and 2) a low-level target that's hard to understand. You don't even need to perform a stage 3 attack, which is the most sophisticated; a stage 2 would do fine.

https://www.ece.cmu.edu/~ganger/712.fall02/papers/p761-thomp...


By this logic, you could argue that because the code is compiled it will be hard to understand and find exploits.

I don't think this is going to be an issue, or be any different than where we are now.


How is this at all different than now?


Right now most JavaScript is readable-ish, and if you see something totally unreadable, that's a good indication it's probably malware.

Once you have machine code that's not terribly human readable it gets a lot easier to hide things.


> if you see something totally unreadable, that's a good indication it's probably malware.

That's definitely not true. Minified JS is everywhere.


I take it you don't know what a minifier is or how it can save upwards of 50% of your payload size.


I absolutely do, but thanks for being condescending nonetheless.

Read C source. Then go read the machine code that C compiles into. I assure you that the C is far, far more readable even if you've HEAVILY obfuscated it.

In order to better understand this, I present you with a StackOverflow answer: http://stackoverflow.com/a/331474

This:

    int get_int(int c);

    int main(void) {
        int a = 1, b = 2;
        return get_int(a) + b;
    }

Might yield this:

    00000000 <main>:
    int get_int(int c);

    int main(void) { /* here, the prologue creates the frame for main */
       0:   8d 4c 24 04             lea    0x4(%esp),%ecx
       4:   83 e4 f0                and    $0xfffffff0,%esp
       7:   ff 71 fc                pushl  -0x4(%ecx)
       a:   55                      push   %ebp
       b:   89 e5                   mov    %esp,%ebp
       d:   51                      push   %ecx
       e:   83 ec 14                sub    $0x14,%esp
        int a = 1, b = 2; /* setting up space for locals */
      11:   c7 45 f4 01 00 00 00    movl   $0x1,-0xc(%ebp)
      18:   c7 45 f8 02 00 00 00    movl   $0x2,-0x8(%ebp)
        return get_int(a) + b;
      1f:   8b 45 f4                mov    -0xc(%ebp),%eax
      22:   89 04 24                mov    %eax,(%esp)
      25:   e8 fc ff ff ff          call   26 <main+0x26>
      2a:   03 45 f8                add    -0x8(%ebp),%eax
    } /* the epilogue runs, returning to the previous frame */
      2d:   83 c4 14                add    $0x14,%esp
      30:   59                      pop    %ecx
      31:   5d                      pop    %ebp
      32:   8d 61 fc                lea    -0x4(%ecx),%esp
      35:   c3                      ret
I don't know why people find this notion that web assembly probably will make it easier to hide nefarious payloads so offensive. It's demonstrably true! People find out about open source projects "calling home" much, much faster than they do closed source projects.

Go look at the spec. It's at a much lower level of abstraction than JavaScript is. https://github.com/WebAssembly/design/blob/master/AstSemanti...


There's no real size difference if you gzip it at the webserver.


That is incorrect. Theoretically, it should be true, but in practice it is not. You get the most savings by minifying and gzipping.


In practice, the difference has been negligible, and I've seen it be entirely absent. Chasing small fractions of a percent of total CSS file size is a waste of time and effort.


This is already true for plenty of "compile-to-javascript" languages. The textual format of wasm should be as easy to read as the output of those languages, if not easier.



