Using the Web Audio API to Make a Modem (martinmelhus.com)
307 points by matsemann on Oct 14, 2017 | 63 comments



Having worked on a similar project, my experience is that signal-to-noise is a very challenging problem when you have an air gap. Inevitably, it opens up room for minute errors caused by acoustics. The only "reliable" workarounds I devised were to increase broadcast duration and to rebroadcast the same data with simple checksums.

This was what I came up with: https://github.com/Ravenstine/airsocket

Note its extreme crappiness is due to a combination of my inexperience and my desire to build as many parts from scratch as possible. The Web Audio API is only used to play audio, but the oscillator and decoding functions were written by me. It barely works. It's probably better to go full Web Audio all the way since I'm sure there's hardware acceleration (particularly with AnalyserNode). The author's method of encoding bytes is also far superior to what I came up with.

I wish such an article had been published when I was trying to devise my own audio modem. He does an excellent job going over every step and explaining what's going on. Documentation and tutorials around Web Audio used to be too convoluted, with basic examples too few and far between.
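
For anyone curious what the "rebroadcast with simple checksums" workaround could look like, here is a minimal sketch. It is not taken from airsocket; the function names and the one-byte additive checksum are assumptions chosen for brevity. The receiver just keeps the first copy whose checksum matches.

```javascript
// Hypothetical helpers (not from airsocket): frame a payload with a
// one-byte additive checksum and pick the first intact rebroadcast.

// Sum of all bytes modulo 256.
function checksum(bytes) {
  let sum = 0;
  for (const b of bytes) sum = (sum + b) & 0xff;
  return sum;
}

// Append the checksum byte to the payload.
function frame(payload) {
  return [...payload, checksum(payload)];
}

// Return the payload if the trailing checksum matches, otherwise null.
function unframe(received) {
  if (received.length < 1) return null;
  const payload = received.slice(0, -1);
  const expected = received[received.length - 1];
  return checksum(payload) === expected ? payload : null;
}

// Given several received copies of the same frame, return the first valid one.
function firstValid(copies) {
  for (const copy of copies) {
    const payload = unframe(copy);
    if (payload !== null) return payload;
  }
  return null;
}

const sent = frame([72, 105]); // the bytes of "Hi"
const corrupted = [72, 104, sent[2]]; // one byte damaged in transit
console.log(firstValid([corrupted, sent])); // recovered from the second copy
```

An additive checksum misses some error patterns (two errors that cancel out, for instance), so a CRC would be a sturdier choice in practice.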


Even better than rebroadcasting is Forward Error Correction[1].

It's the basis of pretty much all modern radio-based telecommunication protocols (WiFi/3G/LTE/etc).

Basic FEC can go a long way, especially if your packet sizes are large, latency is high or you have a lot of channel congestion.

[edit]

To give you an idea of how awesome FEC can be, quite a few ham operators make EME (Earth->Moon->Earth) contacts via JT65 with FEC[2].

[1] https://en.wikipedia.org/wiki/Forward_error_correction

[2] https://en.wikipedia.org/wiki/Earth%E2%80%93Moon%E2%80%93Ear...
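
To make the FEC idea concrete, here is a rough sketch using Hamming(7,4), a textbook block code that adds three parity bits to every four data bits and can correct any single flipped bit per codeword. It is picked purely for brevity; the protocols mentioned above use far stronger codes, and the function names are my own.

```javascript
// Encode a 4-bit nibble (0..15) into a 7-bit Hamming codeword (array of bits).
// Layout (1-based positions): p1 p2 d1 p3 d2 d3 d4
function hammingEncode(nibble) {
  const d1 = (nibble >> 3) & 1;
  const d2 = (nibble >> 2) & 1;
  const d3 = (nibble >> 1) & 1;
  const d4 = nibble & 1;
  const p1 = d1 ^ d2 ^ d4; // covers positions 1, 3, 5, 7
  const p2 = d1 ^ d3 ^ d4; // covers positions 2, 3, 6, 7
  const p3 = d2 ^ d3 ^ d4; // covers positions 4, 5, 6, 7
  return [p1, p2, d1, p3, d2, d3, d4];
}

// Decode a 7-bit codeword, correcting at most one flipped bit.
function hammingDecode(bits) {
  const c = bits.slice();
  const s1 = c[0] ^ c[2] ^ c[4] ^ c[6];
  const s2 = c[1] ^ c[2] ^ c[5] ^ c[6];
  const s3 = c[3] ^ c[4] ^ c[5] ^ c[6];
  const errorPos = s1 + 2 * s2 + 4 * s3; // 1-based position, 0 means no error
  if (errorPos !== 0) c[errorPos - 1] ^= 1;
  return (c[2] << 3) | (c[4] << 2) | (c[5] << 1) | c[6];
}

const codeword = hammingEncode(0b1011);
codeword[4] ^= 1; // simulate one bit flipped by channel noise
console.log(hammingDecode(codeword) === 0b1011); // true
```

Unlike rebroadcasting, the receiver repairs the error without needing a second copy, at the cost of sending 7 bits for every 4.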


I have a WebAudio modem that works reasonably well, and I went a different direction with it. Mine uses a C DSP/SDR library with good error correction, run through emscripten to generate the samples.

You can try it here https://quiet.github.io/quiet-js


Actually, I think it was your project that gave me the idea to attempt near-ultrasonic transmission. I could only get to ~15,000 Hz before it would just not work, and I never figured out why that was. Maybe there's a flaw in how I used the Goertzel algorithm. (The reason I went with Goertzel is that I had built it for a previous project and wanted to see how far I could take it.)

Is that example page new? That's brilliant. Your ultrasonic transmission seems to work pretty consistently.
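
For readers who haven't met the Goertzel algorithm mentioned above: it measures the energy at one target frequency in a block of samples without computing a full FFT. A minimal sketch follows; the function names and the test tone are my own assumptions, and it does not try to diagnose the ~15 kHz limit.

```javascript
// Return the squared magnitude of the targetHz component in a block of samples.
function goertzelPower(samples, sampleRate, targetHz) {
  const n = samples.length;
  const k = Math.round((n * targetHz) / sampleRate); // nearest DFT bin
  const omega = (2 * Math.PI * k) / n;
  const coeff = 2 * Math.cos(omega);
  let s1 = 0;
  let s2 = 0;
  for (const x of samples) {
    const s0 = x + coeff * s1 - s2;
    s2 = s1;
    s1 = s0;
  }
  return s1 * s1 + s2 * s2 - coeff * s1 * s2;
}

// Hypothetical helper: generate n samples of a unit-amplitude sine tone.
function tone(freqHz, sampleRate, n) {
  const out = new Float64Array(n);
  for (let i = 0; i < n; i++) {
    out[i] = Math.sin((2 * Math.PI * freqHz * i) / sampleRate);
  }
  return out;
}

const sampleRate = 44100;
const block = tone(15000, sampleRate, 441); // 15 kHz lands exactly on bin 150
const atTarget = goertzelPower(block, sampleRate, 15000);
const offTarget = goertzelPower(block, sampleRate, 10000);
console.log(atTarget > offTarget); // the 15 kHz bin dominates
```

The target has to stay below half the sample rate, and the block length sets how finely neighbouring frequencies can be told apart.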


A lot of microphones and speakers have built-in hardware low-pass filters that kill anything above 15 kHz.


Someone pointed out low-pass filters to you, but that's not the entire story. Built-in speakers in phones and laptops generally don't have a naturally extended frequency response.

A while ago, I made audio measurements of the built-in speakers on the iPhone 7 Plus - https://imgur.com/gallery/DRbu5

The iPhone's extension is not entirely atypical of such devices, and leaves you with a reliable passband of 500 Hz to 10 kHz - above and below those frequencies, you're getting too much noise.

If you're airgapping with audio cables, you'll have a reliable passband of 20 - 20 000 Hz or thereabouts, with some caveats for built-in audio on devices such as a Raspberry Pi that fizzle out at 15 kHz.

As noted by brian-armstrong's comment, you'll only get reliable near-ultrasonic airgapping over a wired connection.


This isn't correct, by the way. Quiet's near-ultrasonic mode (~19kHz) works fine from several feet away using phone hardware. As long as you're using relatively narrowband communication, it works fine.


I created this, so please let me explain myself.

After stumbling over a really cool Web Audio API synthesizer I wanted to play around with the API myself. I used my work situation for inspiration, but I just wanted a fun pet project and didn't really bother to deep dive into the theory of modems. Hence the naive implementation and lack of error correction.

So, do I actually use this at work? Truth be told, I've only ever used it to show off to my colleagues. USB drives are way more practical in every situation where I might want to use this. But the modem is able to provide the tool I set out to make: to copy ASCII text between two computers!

Glad to see that you guys liked it!


Combine it with WebRTC and implement T.30. Find a WebRTC/SIP/PSTN gateway. Send a Fax from the browser!!

You might laugh, but we do have an API for sending DTMF tones in WebRTC..


So why can't your development machine be internet connected?


What was the cool synthesizer you found?



May not be this one but was on HN some time ago:

https://inverted3.gitlab.io/drum-machine/


A discussion about the Web Audio API from a couple of weeks ago with the title "I don't know who the Web Audio API is designed for": https://news.ycombinator.com/item?id=15240762

I'm still not convinced that this API is actually useful for any application that needs to play sound or music, but it definitely seems to have the "interesting synthesizer experiments" segment covered.


https://inverted3.gitlab.io/drum-machine/ is a working multi channel drum machine using Web Audio.


There are a few trackers too:

https://github.com/steffest/BassoonTracker

https://github.com/mbitsnbites/soundbox

https://github.com/pgregory/wetracker

Also, I wrote a WebExtension that lets you apply pan and gain to audio in YouTube videos (and other videos that are not cross-domain) using Web Audio https://addons.mozilla.org/en-US/firefox/addon/soundfixer/


Yes, the API lets you build 1979-style synths out of a toybox of generators and filters. However it lacks support for the most common task in real-world digital audio: computing samples into a buffer and queueing it for playback. (There is ScriptProcessorNode but it’s deprecated and performance is not up to audio rendering standards.)


>There is ScriptProcessorNode but it’s deprecated and performance is not up to audio rendering standards.

On the web, "deprecated" doesn't mean you can't use it. It can take a long time for "officially deprecated by the standards" to become "I can't use it anymore" (obsoleted).

The replacement is the "AudioWorklet" interface and the purpose is to fix those exact synchronization issues. I wouldn't expect ScriptProcessorNode to disappear from the browser until the replacement is actually rolled out.

The Web Audio API started as the Audio Data API in Firefox, which was nothing but a buffer to blast samples into, then the Web Audio API was a refinement/competing standard above that extremely simplistic API (I adapted a Javascript-based .mod player to use AudioData instead of sticking an <audio> tag in the page circa 2011).

You've been able to generate and play back samples in browser (without Flash) for ~6-7 years now, they're simply making official changes to the underlying API in the hopes of improving it. "Deprecated" is just bureaucracy.


It's not reasonable to call it a refinement when it removed the central feature of the Audio Data API (feeding samples). Web Audio was a competing API, designed by a former developer of Core Audio (at Apple) once he moved to Google. They shipped it without a standardization process, at which point it became the de-facto audio API for the web.


> it removed the central feature of the Audio Data API (feeding samples)

Again, I've written actual code with both. With all due respect, I'm not sure what parallel universe you're from. (Do they spell it Berenstein on your side?)

I can pull up my GitHub commit from 2014 and look at where I changed my moz-compatible

> dynamicAudio.write(modPlayer.getSamples(bufferLength));

to

> var processor = audioContext.createScriptProcessor(bufferSize, 0, 2);
> [......]
> processor.connect(audioContext.destination);

etc, and it still played sound.

If you're unaware how Amiga .mod/.xm works, it is a REQUIREMENT that you can provide raw samples to the output source, because the samples are embedded into the file.

You can feed samples to the browser and have it play them, right here, right now, just like you've been able to for years. If you don't like the status of ScriptProcessor as deprecated and think AudioWorklet sucks, that's a whole separate issue, but you can feed samples now and you'll STILL be able to feed samples when AudioWorklet obsoletes ScriptProcessor.
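
For readers who haven't seen it, feeding computed samples through a ScriptProcessorNode looks roughly like the sketch below. This is not the .mod player code referred to above: `fillSine` is a hypothetical stand-in for whatever generates the samples, and the 4096 buffer size and 440 Hz tone are arbitrary assumptions. The browser wiring is guarded so the snippet also loads where Web Audio is unavailable.

```javascript
// Fill a Float32Array with a sine wave and return the phase to continue from,
// so consecutive buffers join without a click.
function fillSine(out, freqHz, sampleRate, phase) {
  const step = (2 * Math.PI * freqHz) / sampleRate;
  for (let i = 0; i < out.length; i++) {
    out[i] = Math.sin(phase);
    phase += step;
  }
  return phase % (2 * Math.PI);
}

// Browser-only wiring; skipped where Web Audio is unavailable (e.g. Node).
if (typeof AudioContext !== "undefined") {
  const audioContext = new AudioContext();
  // 0 input channels, 2 output channels, as in the snippet quoted above.
  const processor = audioContext.createScriptProcessor(4096, 0, 2);
  let phase = 0;
  processor.onaudioprocess = (event) => {
    const left = event.outputBuffer.getChannelData(0);
    const right = event.outputBuffer.getChannelData(1);
    phase = fillSine(left, 440, audioContext.sampleRate, phase);
    right.set(left); // same signal on both channels
  };
  processor.connect(audioContext.destination);
}
```

Current browsers may keep the context suspended until a user gesture, so in practice this would be started from a click handler.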

>Web Audio was a competing API

I already mentioned it was a competing standard in my comment.

>They shipped it without a standardization process

The Audio Data API was one dude hacking on Firefox (David Humphrey) and then other people got interested. One of the requirements of the standards process is that there are multiple implementations in the wild. You need to throw it in your browser if you want it to ever be standardized.

And they DID standardize it early on. You can go read old copies of the W3C standard submitted by Google.

I was sad to see Mozilla's version go away but browser vendors eventually decided it was the losing implementation - including Mozilla.

Mozilla didn't HAVE to submit to the whims of others, they could have dug in their heels if they thought their version was better (see: The Video tag, and everyone who just so happens to be part of the MPEG-LA patent pool saying no to a patent-unencumbered, open video standard baked in to the browser)


which does fall under the "interesting synthesizer experiments" umbrella

I suppose the Web Audio API is just badly named.


A friend and I once entered a competition to see who could transmit a text message first between two computers using DAQ boards. The judges walked around the room giving every team two wires. By the time they got to us they had run out, so we had to use paperclips. Our implementation did single-bit handshaking. Most of the other teams were trying to use a clock-based solution. Our throughput was super slow, but because our solution was stupid-simple we got it working first and won!


Could you elaborate? Did you use something like Morse Code?


coreboot has something similar to provide an early boot console in a device without "simple" output devices, using the PC speaker[0][1]. I doubt many people use it, but it's one of the things that can save the day.

Although in case of "this corporate computer is air gapped", I wonder what corp policies that little project violated.

[0] Sender (highly hardware specific): https://review.coreboot.org/cgit/coreboot.git/tree/src/drive...

[1] Receiver: https://review.coreboot.org/cgit/coreboot.git/tree/util/spkm...


Woah, I had no idea this existed. This is pretty neat!


That's a cool project.

For others to experiment with, there is also quiet-js that is similar: https://github.com/quiet/quiet-js


Ah hey, I made this. I was going to post it but it looks like I don't have to :)

Here's the live demo of it https://quiet.github.io/quiet-js


That's cool. I managed to whistle:  00 @€@`@€€@@@€@€€€€@``€€€€€`@@€€@ €€@@€€`€€@€€` `€ €€`


While we're talking about Air Gaps, it's probably worth mentioning GSMem (an {x86,} internal bus as a GSM cellular transceiver (modem)); from Wikipedia:

https://en.wikipedia.org/wiki/Air_gap_malware


Maybe an "acoustic coupler" of some kind would help? They look so cool, I've always wanted an excuse to use one for something.

https://en.wikipedia.org/wiki/Acoustic_coupler


Ah yes... my fond memories of 300 baud and being able to read my program listing as it printed on the teletype or, when lucky, the video terminal. Thankfully I missed 110 baud.



Cool demo! From the title, I mistakenly thought it might be to make a dial-up compliant modem in JavaScript. But nope, it's a custom toy protocol instead.

The encoder (sound playback) works in Chrome and Firefox, but the decoder (microphone recording) works in Chrome but not Firefox. Also, the encoder/decoder can't handle non-ASCII text; maybe it is only grabbing the low 8 bits of each UTF-16 code unit.
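
To illustrate that guess (I haven't checked the article's source, so `lowByteEncode` is purely hypothetical): keeping only the low 8 bits of each UTF-16 code unit mangles anything above U+00FF, while going through UTF-8 bytes first survives the round trip.

```javascript
// Hypothetical lossy encoder: keep only the low 8 bits of each UTF-16 code unit.
function lowByteEncode(text) {
  const bytes = [];
  for (let i = 0; i < text.length; i++) {
    bytes.push(text.charCodeAt(i) & 0xff);
  }
  return bytes;
}

// Interpret each byte as one character (the inverse only for U+0000..U+00FF).
function lowByteDecode(bytes) {
  return String.fromCharCode(...bytes);
}

// Lossless alternative: send the UTF-8 bytes and decode them on the other side.
function utf8Encode(text) {
  return Array.from(new TextEncoder().encode(text));
}

function utf8Decode(bytes) {
  return new TextDecoder().decode(new Uint8Array(bytes));
}

console.log(lowByteDecode(lowByteEncode("€")) === "€"); // false: U+20AC is truncated
console.log(utf8Decode(utf8Encode("blåbær €")) === "blåbær €"); // true
```

The UTF-8 route sends more than one byte for non-ASCII characters, so the modem itself would need to carry full 8-bit bytes.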


Lots of weird anecdotes.

Why is the computer unable to be connected to the internet?

Why are you pasting so much code from stack overflow to visual studio?

Cool implementation - should try ultrasonic...


I used to work for Sophos, the British anti-virus company. We had three networks in the office: green, purple and red. Green was the corporate network that everyone had access to, mostly used to read email in Lotus Notes. Purple was an ad hoc network used by developers for various testing machines. Red was connected to the Internet. A few privileged people had a PC on their desk connected to the red network, but most developers did not. There were a couple of red-network PCs around the office that we could use if we needed to look something up.

If we actually needed to download a file (say, an open source project, or some tutorial or documentation), we'd download it there and run a command that emailed it to us. The email would arrive in Notes with the attachment stripped off for virus checking. Some time later, we'd get another email saying that the attachment had been scanned and was available in another Notes database.

It wasn't the best learning environment.


At least you learned what kind of jobs to avoid.


That seems harsh. It was an anti-virus company. Imagine the mayhem if they didn't have that kind of isolation. Nor is that the most isolated that working environments get, not by a long measure; there exist employers who will destroy your phone if you accidentally bring it to work.


Given the track record of security vulnerabilities in anti-virus products, they clearly put the emphasis on the wrong things.


I think he meant Lotus Notes.


It seems like it was just a story. Like tethering his cell phone would have been a more realistic option also.


Guessing some contractual clause that the device not be connected to any other network.


So he connected it to another network.


One limited to 7 bit ASCII and cut and paste.



If you work in a SCIF or similar office environment you have to place your phone, and other personal electronic devices in a locker before entering the main office.


Setting up a modem to get around an air gap seems like it would violate that.


Maybe he's working onboard a navy ship?


Would the signal from the headphones need to be boosted to make the connection more reliable?

It might be useful / fun to implement a 2-wire-protocol or other IoT protocol. I'm thinking you should be able to implement up to 2-wire given the R and L channels of the headphones.


Also worth checking: minimodem http://www.whence.com/minimodem/ (and an audio cable if your laptop has a line-in minijack!)


All I can think of with this is another way to bridge air gaps but accessible to more people.


So did the author manage to successfully load a website using this modem? How did they teach their operating system to use the modem?


Can you please add an export function to the encoder? It's a nice way to create electronic music!


Very cool. Funny that a Norwegian would settle for an implementation limited to 7-bit ASCII though.


Er, because many Nordic characters wouldn't pass, if that wasn't clear.


Interesting security implications for air gapped systems, gotta yank the 3.5mm too!


And the bus... https://news.ycombinator.com/item?id=15473481

Good luck patching THAT hole.


Fun project. But what about an ethernet cable, bluetooth, NFC, wifi,...


...USB drive, USB keyboard HID driven by the laptop, QR code to webcam, CD-ROM, floppy...


IP over Avian Carriers



[flagged]


> Did they purposefully violate the policies put in place on them? Yes!

Unproven. The _motivation_ is that they have a dev. machine w/o Internet access, leading to them implementing this. There's nothing to suggest they went beyond seeing if it was feasible or that they actually used it on the dev. machine (the photo of it "in action" shows the decoder being loaded from martme.github.io, strongly suggesting that machine already has functional Internet access).


Yeah, and fire everyone else, too, because they didn't forget everything they knew before touching the computer, so they transmitted information from the internet to the computer by typing it in, thus connecting it to the internet, you can't have that!


You do understand this is just a cool hack and proof of concept?



