
Some of it is that people ask the wrong questions. On a loudness-war-wrecked pop song I may not be able to tell 128 kbps from the original, but on specific content I have been able to tell. I'm not even claiming golden ears or anything; some specific audio content is the audio equivalent of visual confetti [1], and anyone can hear the difference, because the codec isn't even close. And let me underline, I mean, anyone. No special claims being made here.

But all in all, that content is relatively rare, and generally transient even in the music it appears in.

[1]: https://www.youtube.com/watch?v=r6Rp-uo6HmI




The giveaway for low-to-mid nitrate MP3 is the hi-hats. The lower the nitrate, the more you get a sort of temporal ghosting that sounds like an almost “crunchy”, swishy sizzle, a bit like a jazz player using brushes, but more lo-fi.


I agree, and my hypothesis is that it's exacerbated by the combination of three particular things:

1. It's a high frequency complex waveform with a fast envelope, so it demands bitrate.

2. Drum miking often involves multiple mics spaced apart, so more than one typically picks up any given cymbal with a phase offset, and those mics are panned quite differently, leading to a very "wide" result, i.e., left and right output is fairly uncorrelated as seen on a vectorscope [0].

3. A perceptual codec at a given total bitrate often sounds better when stored as a mid-side transformation (instead of storing a left channel and a right channel, store a L+R "mid" a.k.a. sum channel and a L-R "side" a.k.a. difference channel), also known as "joint stereo" which is a common flag on MP3 encoders, because it allows for assigning more bits to the mid channel (correlated signals) and fewer bits to the side channel (uncorrelated signals). More bits for mono center-panned stuff like vocals is the goal, which is generally for the best, but fewer bits remain available for wide stuff like those cymbals! Contrast with regular stereo mode where half of the total bitrate is assigned to each channel. MP3 below 256kbps typically needs joint stereo mode enabled in order to sound decent.

[0] https://en.m.wikipedia.org/wiki/Vectorscope#Audio
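To make point 3 concrete, here's a minimal Python sketch of the mid-side transform. It uses plain float sample lists for illustration; a real MP3 encoder applies this idea to frequency-domain coefficients, but the arithmetic is the same:

```python
# Sketch of the mid-side (M/S) transform behind MP3's joint stereo mode.
# Samples are floats in [-1.0, 1.0]; values chosen are exact binary
# fractions so the roundtrip is bit-exact.

def ms_encode(left, right):
    """Turn L/R samples into a mid (sum) and a side (difference) channel."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """Recover L/R from mid/side: L = M + S, R = M - S."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

# A center-panned signal (identical L and R, like a lead vocal) lands
# entirely in the mid channel; the side channel is all zeros and costs
# almost nothing to encode.
left = [0.5, -0.25, 0.125]
right = [0.5, -0.25, 0.125]
mid, side = ms_encode(left, right)
assert side == [0.0, 0.0, 0.0]

# The transform itself is lossless: decoding restores the original
# channels. The bitrate savings come from how few bits the (usually
# quiet) side channel needs -- unless the source is very "wide".
l2, r2 = ms_decode(mid, side)
assert l2 == left and r2 == right
```

Wide, uncorrelated material like spaced-pair cymbals puts real energy into the side channel, which is exactly where the encoder planned to spend the fewest bits.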


A realistic take on the issues involved. I never knew what “joint stereo” meant. Great explanation.

If anyone has a good cymbal crash sample at 24/96 or better that they can provide, it seems like it would be a great example for intentional differentiation of various compressed versions.


Vinyl actually encodes stereo much the same way.


FM radio as well: originally one channel, then a second was added, and the second is used for the difference signal (the first already being the sum signal, as with any conversion to mono). Overhauling the whole thing to broadcast left and right discretely would destroy backwards compatibility, and the newly-added subcarrier had worse SNR (thus receivers ignore it until reception is sufficiently strong) so it only made sense to use it for the difference.
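The backwards compatibility falls out of the arithmetic. A rough sketch (illustrative amplitudes only, ignoring the actual subcarrier modulation):

```python
# Sketch of FM stereo's backward-compatible sum/difference scheme.
# The main channel carries L+R (all a mono receiver ever sees); the
# later-added subcarrier carries L-R. Real FM modulates these onto a
# 38 kHz subcarrier; here we just show the recombination math.

def fm_multiplex(left, right):
    """Return (main, subcarrier): the sum and difference signals."""
    return left + right, left - right

def mono_receiver(main, sub):
    """A mono radio ignores the subcarrier entirely and still works."""
    return main / 2

def stereo_receiver(main, sub):
    """A stereo radio recombines: L = (sum+diff)/2, R = (sum-diff)/2."""
    return (main + sub) / 2, (main - sub) / 2

main, sub = fm_multiplex(0.75, 0.25)
assert mono_receiver(main, sub) == 0.5            # average of L and R
assert stereo_receiver(main, sub) == (0.75, 0.25)  # full stereo recovered
```

This also explains the graceful degradation: when the noisy subcarrier is discarded, the output doesn't break, it just collapses to the mono sum.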


Low nitrate MP3 is a fantastic typo.


I was going to say this - cymbals are often very noticeably bad on MP3 recordings.


Well, most classical pieces are very compressible, because not much is going on. Punk rock, or any other music where a lot is happening at the same time, can suffer very audibly from 128 kbit lossy compression. So you can hear lossy compression better in a loud pop song than in other music.


I don't agree with this from my own experience. To me, classical music at high compression suffers far worse than modern bands.


My unscientific guess would be that classical music might have wider dynamic range than “normal” music. So the same compression amount affects the one with more range first (classical).


Higher dynamic range, and typically also more 'pure'. The introduced compression artifacts stand out more in simple waveforms than in waveforms that are a sum of many more layers of sound.


Your guess would be correct [1]. Some samples [2].

[1] https://www.mdpi.com/2624-599X/4/3/42

[2] http://www.harmjschoonhoven.com/mp3-quality.html


A typical classical symphony requires 50-80 instruments, several dozen of which might play at the same time, while a typical punk song has maybe four.


Compression algorithms don't care about instrument counts, they care about the complexity of the signal. So there being a total of maybe hundreds of instruments doesn't matter, it's only about how many are playing at any given time. There might be a few dozen instruments playing at any given moment in a classical recording, but they're all highly tuned instruments with several of each one playing just about the exact same note.

Grungy rock music might only have a few instruments, but they're often purposefully highly distorted and have people pretty much screaming and shouting, leading to the actual signal being closer to literally noise.

So the closer you are to literally noise, the less compressible your signal is.

Imagine an image with a dozen sharp, clear, colorful squares. Now imagine a similar-resolution image with only 5 colors, but they're different shapes, kind of fuzzy, and really more like gradients than pure color. Which is going to compress more easily?


This is an interesting argument, but not convincing. You are essentially claiming that the information content of a classical recording is low, and could be replicated with fewer, simpler instruments. However, I suspect that many of the interesting subtleties that make a recording sound “beautiful” would be lost that way. Surely such nuances are why classical musicians are so passionate about their instruments.

(FWIW, I’m way more of a punk fan myself, and usually find most classical music pretty boring.)


> You are essentially claiming that the information content of a classical recording is low, and could be replicated with fewer, simpler instruments.

No, I do agree having multiple instruments does lead to a wider sound than just a single one. Plus multiple instruments will probably help balance out a single one being out of tune or not quite hitting the note right. But still, these instruments are usually tuned to produce something much closer to pure tones and their harmonic overtones than a guitar going through half a dozen distortion and effect pedals, then through a compressor, along with a guy screaming all over the place into a microphone.

Also, when strumming a guitar you're almost always playing essentially six strings at once, a whole chord with only one instrument. Meanwhile on a flute or a trumpet or a clarinet or a violin, a single player is only playing a single note at a time, the equivalent of a single guitar string. So during strumming sections a guitar is almost like 6 instruments in terms of signal complexity, and a rhythm guitar strumming plus a lead guitar picking strings is really almost like 7 instruments played by two people, compared to many orchestral instruments.

Just look at these two spectrograms. Look at the rock song where there's a lot of distorted guitar, bass, drums, and singing going on and compare that to an active part of the classical recording. See how the classical recording has a lot more clean, straight lines while the rock song is a lot more fuzzy? Imagine if these were images, which would be more complicated to accurately compress? That's not really a great analogy, but it is touching on the same concept.

Rock song: https://youtu.be/BVsp23B8dWo?t=62

Classical song: https://youtu.be/Txp-pHU2K6w?t=210



