Extracting hidden data from an audio file with the FFT (wolchok.org)
52 points by swolchok on May 24, 2010 | 27 comments



Reminds me of this article on Aphex Twin (and others) embedding images in spectrograms: http://www.bastwood.com/aphex.php


Fascinating, especially the quality of the spiral, the way it stands out.


Aphex Twin does a lot of interesting stuff. I recall watching an interview with him where he claimed that not only does he make all his music using software, but he actually has written some programs which compose the music for him. Can't remember where I saw that though.


Why would you believe him?


He got loads of software from the French organisation IRCAM (which Ted Nelson says he visited in 2006).


As someone who has created many puzzle-hunts and participated in some, I really hope that the article is abbreviated because if they went from FFT to Wikipedia article in that few steps they are a truly amazing team (and I would feel bad).

Then again, it is DEFCON CTF they're trying to get into, so it does require some amazing problem-solving skills.


I elided the hour or so spent trying to figure out what Power PAQ was. Google has no results for PAQ804; as the writeup says, a teammate eventually Binged it. I also elided the debugging needed to get the FFT to work (but you need to know what the decompressor is before deciding that it doesn't work!). I had commented out the line that applies a Hamming window to the samples before the FFT because I didn't understand it, but the decoder didn't work right until I added the window back. My theory is that it worked OK until the "background" (i.e., the song) got louder.
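
A rough sketch of why the window might matter, not the code from the writeup: a loud tone that does not land exactly on an FFT bin leaks energy into distant bins, and a Hamming window reduces that leakage. The sample rate, frame length, and both frequencies below are made-up values for illustration only.

```python
import cmath
import math

def hamming(n):
    """Symmetric Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def bin_magnitude(samples, freq_hz, rate_hz):
    """Magnitude of one DFT bin at freq_hz, computed by direct correlation."""
    acc = 0j
    for i, s in enumerate(samples):
        acc += s * cmath.exp(-2j * math.pi * freq_hz * i / rate_hz)
    return abs(acc)

# All of these values are assumptions chosen for the demo.
RATE = 8000        # samples per second
N = 240            # one 30 ms frame at 8 kHz
LOUD_HZ = 1010.0   # stand-in for a loud "song" component, off bin center
PROBE_HZ = 2000.0  # stand-in for a data frequency; nothing is sent here

# The frame contains only the loud tone, so any energy measured at
# PROBE_HZ is pure leakage.
frame = [math.sin(2 * math.pi * LOUD_HZ * i / RATE) for i in range(N)]
windowed = [s * w for s, w in zip(frame, hamming(N))]

leak_rect = bin_magnitude(frame, PROBE_HZ, RATE)
leak_hamming = bin_magnitude(windowed, PROBE_HZ, RATE)

print("leakage without window:", leak_rect)
print("leakage with Hamming window:", leak_hamming)
```

If the detector thresholds the magnitude at the data frequency, leakage like this would only start causing false marks once the background gets loud enough, which would be consistent with the theory above.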


PAQ is a series of compression programs developed by Matt Mahoney and a ton of compression enthusiasts who keep tweaking the PAQ algorithm (be it for better memory use, speed, or, most importantly, compression ratio on a corpus).

Check out their fabulous journey at

http://mattmahoney.net/dc/

The name PAQ804 says that it is the fourth modification to the initial PAQ8 archiver.


Well, there weren't really many more steps to take. Between the FFT and the Wikipedia article there was only the PAQ compression, which didn't require prior knowledge. And they had to ignore a few red herrings, but the spectrogram helped there. I wonder how the 30ms rate was ascertained, though - pure guess?


The 30ms rate could be ascertained by zooming in on the spectrogram and looking at the minimum width of a mark or space in the encoded data.
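
As a rough sketch of the same idea in code (the on/off sequence and the 10 ms frame size are made up, and this is not how the article's author measured it): once each spectrogram column is classified as mark or space, the shortest interior run gives an estimate of the bit period.

```python
from itertools import groupby

def shortest_run_ms(states, frame_ms):
    """Duration in ms of the shortest run of identical states."""
    runs = [sum(1 for _ in group) for _, group in groupby(states)]
    # The first and last runs may be cut off by the edges of the
    # recording, so skip them when there are enough runs to do so.
    interior = runs[1:-1] if len(runs) > 2 else runs
    return min(interior) * frame_ms

# Hypothetical input: one entry per 10 ms spectrogram column,
# 1 = tone present (mark), 0 = tone absent (space).
states = [0] * 2 + [1] * 3 + [0] * 6 + [1] * 3 + [0] * 3 + [1] * 9 + [0] * 1

bit_period_ms = shortest_run_ms(states, 10)
print(bit_period_ms)  # 30
```

Longer runs should then come out as whole multiples of that period (here 60 ms and 90 ms), which is a handy sanity check on the estimate.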


That's exactly what I did -- the spectrogram is from sndfile-spectrogram, not baudline. baudline helpfully draws a line parallel to the frequency axis and shows you what point on the time axis you're at, so measuring times is easy.


the article could use a little more context/backstory too...


There's a tiny bit and an index of other writeups now at http://scott.wolchok.org/ctf2010/


Articles like this remind me that I'm just a web app kiddie, copying and pasting code from Google, and that there's way more complex hackery going on in the world which I'll never really understand.


Tip: get an Arduino or a DSP board, a soldering iron, and some basic electronics if you can afford it, and spend a few months with it as a hobby.

It will increase your insights into programming tremendously.


Dude, chill, I was in the same position when I was 13. It is important to remember to never stop learning. Remember to learn the foundation first: how to program, how computers work, Unix, TCP/IP networking. For electronics, the basics are circuit theory, Boolean algebra, and so forth. With the foundation, you'll be able to understand all this.


But a little less now than there was before reading the article?


and your comment reminds me that I'm not alone ;)


Did anybody else find it shocking that a utility to decompress a small file would use 1.7G of memory to do so?


I was assuming it was bytes allocated and freed; this system has 4 GB of memory, I had a few VMs plus chrome with many tabs open running at the time, and I didn't notice any thrashing.


Does anyone have the file in question mirrored somewhere? Its hosting seems to be down from here.


Local mirror should be up at http://scott.wolchok.org/ctf2010/t500_47b7c308e6d24e14c5.bin (also now linked from the article).


Interesting, but the way the article is presented... Feels like it is written by Scr33p7 K1ddi3.


I'm not a script kiddie, I'm a grad student in security. I did, however, bang out that article in one go after only partially recovering from the 50+-hour contest, so it's not the most coherent.


You may think it sounds like it's written by a script kiddie, but he solved one of the most difficult challenges in the game and did all of this in only a few hours -- likely on little to no sleep.

Half of the competition requires exploitation of network services. This year included x86, x86-64, sparc64, ppc, compiled python, a java interpreter statically linked into a x86 binary, among a bunch of other random ones. You can't be a script kiddie and do well in this competition.


Actually, I had just woken up from sleeping ~8 hours about 3-4 hours before the question was released, and we didn't do any of the exploitation of networked services except Pwtent Pwnables 200 (compiled Python) and Binary L33tness 100 and 200 (trivial patches in a debugger). I was tired by the time the rest came out, and the other reversers on the team burned themselves out on the PP400 (the PPC question) and Binary L33tness 300 (a huge red herring that actually printed the answer, encoded in hex when it was run). Thanks for defending me, though.


That's because the kiddies had to learn chaos text from us in order to read .nfo files and the like.



