I'm glad I'm not the only one that uses youtube for music.
I often wonder how much google would save on bandwidth if they let me stream audio only versions of things. The vast majority of the time, I'm not even looking at the visual component of the "video", just listening to the song.
I suppose that this would add computational complexity, as well as additional storage, though. I'm not terribly familiar with the way that youtube stores video.
Can anyone chime in on this? "Extracting" audio from the video containers that youtube uses; how hard is this to do on the server side?
I mentioned this the opposite way around a little while ago (mute the audio in countries where it isn't licensed, rather than blocking the video outright), but the conclusion was generally that the video and audio are muxed together into a single deliverable. The file is almost certainly pre-generated (once for each resolution) to avoid the server-side costs of merging them together for streaming.
I guess in theory you could generate a 'demux mapping' and the client could request byte-ranges corresponding to only one channel, but that seems incredibly complicated, would generate huge requests, and is probably bypassable on the client-side anyway (more of an issue for my idea than yours).
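To make the 'demux mapping' idea concrete: FLV at least makes this easy to build, because the file is a flat sequence of self-describing tags, each marked as audio, video, or script data. This is purely a sketch against the public FLV tag layout (the function name is made up; how YouTube actually stores things is unknown to us):

```python
def flv_tag_ranges(data: bytes):
    """Yield (tag_type, start, end) byte ranges for each tag body in an
    FLV byte string. Tag types: 8 = audio, 9 = video, 18 = script data."""
    if data[:3] != b"FLV":
        raise ValueError("not an FLV stream")
    pos = int.from_bytes(data[5:9], "big")  # DataOffset field in the header
    pos += 4  # skip PreviousTagSize0
    while pos + 11 <= len(data):
        tag_type = data[pos]
        size = int.from_bytes(data[pos + 1:pos + 4], "big")
        start = pos + 11        # body begins after the 11-byte tag header
        yield tag_type, start, start + size
        pos = start + size + 4  # skip body plus the 4-byte PreviousTagSize
```

A server (or client) could then answer an "audio only" request by fetching just the ranges where tag_type == 8, which is roughly the byte-range scheme described above — though, as noted, that's a lot of small ranges per file.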
I suspect the reason they don't is legal concerns. The vast majority of videos uploaded to youtube seem to be things people recorded themselves, whereas the vast majority of mp3s people share in almost any context are pirated music. There are probably small communities where plenty of user-created mp3s get shared, but that doesn't seem to hold for any sharing platform of significant size.
Grooveshark manages to get licenses for all these "gray" mp3s after the fact, but I doubt youtube would be able to do so as easily considering their larger market position would be more threatening to the labels.
It wouldn't add computational complexity or use extra storage. Surely the audio is already stored as MP3 or some similar format; after all, wasn't MP3 invented to be the audio layer of MPEG video files?
My question comes down to how they're stored. [Obviously] I'm not an expert on video encoding or containers.
FLV is a container, yes? Containing, I'm guessing, MP3 audio and H.264 video. H.264 works out well for youtube because they can also use it in their HTML5 video player.
So are the H.264 blobs and the MP3 blobs stored as discrete files, then packaged when a video is loaded and sent down the tubes? If so, then yeah, obviously, it would be really easy for youtube to serve "audio only"; probably "a few lines of code".
But if the MP3 and H.264 (again, H.264 is an assumption) are stored as one [flv] package, they would have to be unpacked before they could be sent down as their individual components.
Again, this is a shortcoming in my understanding of video containers, so maybe I'm missing the point on this completely. (As in: maybe "unpacking" an FLV is trivial)
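For what it's worth, "unpacking" a container is indeed close to trivial once you have the whole file: demuxing is a stream copy, not a re-encode, so the server would only be repackaging bytes. A sketch using ffmpeg from Python (hypothetical filenames; assumes the container's audio track is MP3, so copying the bitstream yields a valid .mp3):

```python
import subprocess

def demux_cmd(src: str, dst: str) -> list[str]:
    # -vn drops the video track; '-acodec copy' passes the existing audio
    # bitstream through untouched, so no transcoding (and no quality loss).
    return ["ffmpeg", "-y", "-i", src, "-vn", "-acodec", "copy", dst]

def demux_audio(src: str, dst: str) -> None:
    """Stream-copy the audio track out of a container file with ffmpeg."""
    subprocess.run(demux_cmd(src, dst), check=True)
```

The CPU cost of a stream copy is close to zero compared with decoding or encoding, which is why tools can demux faster than real time.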
Most videos uploaded in the past couple of years would be compressed as MPEG-4 (H.264, specifically), but regardless of the video format the audio/video would be muxed into one file. If the video track consists of one still image, the impact (server-side compression, storage, delivery) of the video track would be minimal.