> Generating audio samples via JS is an escape hatch, the same way rasterizing to a raw framebuffer is an escape hatch.
But that escape hatch is where all the interesting innovation happens! It's great that canvas exists, and it's much more widely used than WebGL is, because it's more flexible and depends on less legacy cruft. You don't have to use it, but I do, and I'd like a "canvas for audio" too.
To make matters worse, the Web Audio stuff is much less flexible than OpenGL. You can at least write more or less arbitrary graphics code in OpenGL: it's not just an API for playing movie clips filtered through a set of predefined MovieFilter nodes. You can generate textures procedurally, program shaders, render arbitrary meshes with arbitrary lighting, do all kinds of stuff. If this were still the era of the fixed-function OpenGL 1.0 pipeline, it'd be another story, but today's OpenGL at least is a plausible candidate for a fully general programmable graphics pipeline.
Web Audio seems targeted more at being an audio player with a fixed chain of filter/effect nodes than at being a fully programmable audio pipeline. How are you going to do real procedural music on the web, something closer to what you can do in Puredata or SuperCollider or even Processing, without being able to write to something resembling an audio sink? Apple cares about stuff like Logic Express, yes, but that isn't a programmable synth or capable of procedural music, while what I care about is the web becoming a usable procedural-music platform. One alternative is to do DSP in JS; another is to require you to write DSP code in a domain-specific language, the way you hand shaders off to WebGL. But Web Audio does the first badly and the second not at all!
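For what it's worth, the one escape hatch Web Audio does offer is a JS callback node, and using it looks roughly like this (a minimal sketch; the node has been renamed across versions, and the callback runs on the main thread, which is a big part of why I say it does this badly):

    var ctx = new (window.AudioContext || window.webkitAudioContext)();
    // Older builds call this createJavaScriptNode(bufferSize, inChannels, outChannels).
    var node = ctx.createScriptProcessor(2048, 1, 1);
    var phase = 0;
    node.onaudioprocess = function (e) {
      var out = e.outputBuffer.getChannelData(0);    // Float32Array of raw samples
      for (var i = 0; i < out.length; i++) {
        out[i] = 0.2 * Math.sin(phase);              // trivial 440 Hz sine "synth"
        phase += 2 * Math.PI * 440 / ctx.sampleRate;
        if (phase > 2 * Math.PI) phase -= 2 * Math.PI;
      }
    };
    node.connect(ctx.destination);

Any GC pause or layout hiccup on the main thread glitches that callback, which is exactly the problem.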
> Audio is a solved problem outside JS
Yeah, and the way it's solved is that outside JS, you can just write a synth that outputs to the soundcard...
WebGL is less used than Canvas mostly because 3D and linear algebra are much harder for most people to work with than 2D. Also, people work with raw canvas image arrays far more rarely than they do with the high-level functions (stroke/fill/etc.).
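For concreteness, the two levels look something like this (a throwaway sketch):

    // High-level canvas: what most people actually write
    var ctx = document.createElement('canvas').getContext('2d');
    ctx.fillStyle = '#c00';
    ctx.fillRect(10, 10, 50, 50);

    // The raw-pixel escape hatch: the canvas analogue of writing your own samples
    var img = ctx.getImageData(0, 0, 64, 64);
    for (var i = 0; i < img.data.length; i += 4) {
      img.data[i] = 255 - img.data[i];   // invert the red channel by hand
    }
    ctx.putImageData(img, 0, 0);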
OpenGL was still a better API than a raw framebuffer even when it was just a fixed-function pipeline. Minecraft, for example, is implemented purely with fixed-function stuff, no shaders. It isn't going to work if it has to rasterize via JS.
Yes, there are edge-case people doing procedural music, but that is a rare use case compared to the more general case of people writing games who need audio with attenuation, 3D positional HRTF, doppler effects, etc. That's the sweet spot the majority of developers need. Today's 3D hardware includes features like geometry shaders and tessellation, but most games don't use them.
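That sweet spot is roughly what the existing node graph already hands you, something like this (a rough sketch; exact property names and prefixes have shifted between browser versions):

    var ctx = new (window.AudioContext || window.webkitAudioContext)();
    var source = ctx.createBufferSource();   // assume source.buffer gets a decoded clip
    var panner = ctx.createPanner();
    panner.panningModel = 'HRTF';            // binaural spatialization
    panner.distanceModel = 'inverse';        // distance attenuation
    panner.setPosition(5, 0, -2);            // emitter position in 3D space
    panner.setVelocity(1, 0, 0);             // feeds the doppler calculation
    source.connect(panner);
    panner.connect(ctx.destination);
    source.start(0);                         // older builds: source.noteOn(0)

No per-sample JS anywhere in that chain, which is the point.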
Would OpenSL/AL work a lot better if it had "audio shaders"? Yes. But if your argument is that you want to write a custom DSP, then you don't want the Audio Data API; what you want is some form of OpenAL++ that exposes an architecture-neutral shader language for audio DSPs and actually compiles your shader and uploads it to the DSP. Or you want OpenCL plus a pathway for scheduling the shaders and copying the data to the hardware that does not involve the browser event loop.
That said, if there were a compelling need for the stuff you're asking for, it would have been done years ago. Neither the professional audio apps nor the game developers have been begging Microsoft (DirectSound), Apple, or Khronos to make audio shaders. There was a company not too long ago, Aureal, which tried to be the "3dfx of audio" and failed; it turns out most people just need a set of basic sound primitives they can chain together.
I have real sympathy for your use case. For years I dreamed of sounds being generated in games a la PhysX: really simulating sound waves in the environment and calculating true binaural audio, the way the Oculus Rift wants to deliver video to your senses, taking head position into account. To literally duplicate the quality of binaural audio recordings, programmatically.
But we're not there; the industry seems to have let us down. There is no SGI, nor 3dfx, nor Nvidia/AMD "of audio" to lead the way, and we certainly aren't going to get there by dumping a framebuffer from JS.
Right now, the target for all this stuff, WebGL, Web Audio, et al., is exposing APIs to bring high-performance, low-latency games to the web. I just don't see doing attenuation or HRTF in JS as compatible with that.
I agree that for games the market hasn't really been there, and they're probably served well enough by the positional-audio stuff plus a slew of semi-standard effects. And I realize games are the big commercial driver of this stuff, so if they don't care, we won't get the "nVidia of audio".
I'm not primarily interested in games myself, though, but in computer-music software, interactive audio installations, livecoding, real-time algorithm and data sonification, etc. And for those use cases I think the fastest way forward really is just: 1) a raw audio API; and 2) fast JS engines. Some kind of audio shader language would perhaps be even better, but it isn't strictly necessary, and I'd rather not wait forever for it. To be honest, I'd be happy if I could do on the web platform today what I could do in C in 2000, which is not that demanding a level of performance. V8 plus TypedArrays gets us pretty close from a pure code-execution perspective, certainly close enough to do some interesting stuff.
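To give a sense of the level of DSP I mean, here's the kind of per-sample loop that V8 plus TypedArrays already runs comfortably at real-time block rates (a minimal sketch: a naive sawtooth through a one-pole low-pass, rendered into a Float32Array):

    function renderBlock(buf, sampleRate, freq, cutoff, state) {
      var phase = state.phase, y = state.y;
      var inc = freq / sampleRate;
      var a = 1 - Math.exp(-2 * Math.PI * cutoff / sampleRate);  // one-pole coefficient
      for (var i = 0; i < buf.length; i++) {
        var saw = 2 * phase - 1;     // naive saw in [-1, 1]
        y += a * (saw - y);          // one-pole low-pass
        buf[i] = 0.25 * y;
        phase += inc;
        if (phase >= 1) phase -= 1;
      }
      state.phase = phase;
      state.y = y;
    }

    var block = new Float32Array(2048);
    renderBlock(block, 44100, 110, 800, { phase: 0, y: 0 });

That's basically the same loop I would have written in C in 2000.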
Two interesting things I've run across in that vein that are starting to move procedural-audio stuff onto the web platform:
There are already quite a few interactive-synth type apps on mobile, so mobile devices can do it, hardware-wise. They're just currently mostly apps rather than web apps. But if you can do DSP in Dalvik, which isn't really a speed demon, I don't see why you can't do it in V8.
Edit: oops, the 2nd one is in Flash rather than JS. Take it instead as an example of the stuff it would be nice not to have to do in Flash...