Haha, from the title I expected this to be a solution for generating MP4s from arbitrary data that you could then upload to YouTube for backup.
This is definitely really cool, too. I like that it shows that a node in a filesystem is just another way to represent some resource, but in a way that's a little more familiar to most people than something like procfs.
That's a fascinating idea. Unlike gmail, there isn't an upper bound on how much you can store on YouTube. If you could encode data in a way that's resilient against re-encoding, you could use YouTube for steganography as well. Or, for that matter, see http://dataliberation.blogspot.com/2012/09/your-youtube-orig... .
That said, it's also amazingly wasteful, and abusing a free service for a purpose it isn't intended for.
Speaking of corruption, I found this[1] video a while back. On mobile the video shows up fine, but on desktop it's just a gray screen, although the thumbnails work.
I can confirm that the mp4 I downloaded is not (detectably by my desktop video player) corrupt. I am not sure what it is, but it is not corrupt. (Except for the filename, but that is because javascript is brain-damaged as regards strings.)
Unless what you were encoding wasn't meant to be consumed as a bytestream. If you encoded a resilient optical format (like UPC or QR codes), transcoding shouldn't be a deal breaker. Obviously it's not optimized for backing up a hard disk, though.
Interesting - say video at 60fps, encoding one QR code per frame: it would be highly resistant to transcoding errors, and very easy to extract the information from again given the standard format.
It wouldn't be terribly efficient, though. Wikipedia says the maximum capacity is 2,953 bytes per QR code [1]. So 2,953 bytes/frame * 60 frames/s ≈ 177 KB/s of data. I guess that's what you get for encoding it in a visual (human-readable) format instead of a datastream directly.
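Back-of-the-envelope sketch in Python, assuming QR version 40 at the lowest error-correction level (the 2,953-byte figure above); the numbers are just that arithmetic, nothing YouTube-specific:

    # rough throughput for one QR code per video frame
    BYTES_PER_QR = 2953        # QR version 40, error correction level L
    FPS = 60

    bytes_per_second = BYTES_PER_QR * FPS           # 177,180 B/s ~= 177 KB/s
    gb_per_hour = bytes_per_second * 3600 / 1e9     # ~0.64 GB per hour of video
    print(f"{bytes_per_second / 1000:.0f} KB/s, {gb_per_hour:.2f} GB/hour")

So roughly two-thirds of a gigabyte per hour of 60fps video, before adding any error correction beyond what the QR codes already carry.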
You have sound too. There must be an audio equivalent that would have a similar level of durability to a QR code. I don't know if YouTube ever drops frames during compression. Perhaps using their 4K support would help get a bit more data in.
Maybe using a Fourier/wavelet/whatever transform would be the way to go, just like in digital watermarking techniques. Both high capacity and robustness would seem easier to achieve.
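Something like the classic DCT-watermark trick, sketched below with numpy/scipy; the block size, coefficient slot, and strength are all made-up parameters, not anything tested against YouTube's encoder:

    import numpy as np
    from scipy.fftpack import dct, idct

    def dct2(block):
        return dct(dct(block, axis=0, norm="ortho"), axis=1, norm="ortho")

    def idct2(block):
        return idct(idct(block, axis=0, norm="ortho"), axis=1, norm="ortho")

    def embed_bit(block, bit, strength=20.0):
        # hide one bit per 8x8 luma block by forcing the sign of a
        # mid-frequency coefficient (slot [3, 4] chosen arbitrarily)
        coeffs = dct2(block.astype(float))
        coeffs[3, 4] = strength if bit else -strength
        return np.clip(idct2(coeffs), 0, 255)

    def extract_bit(block):
        return dct2(block.astype(float))[3, 4] > 0

Capacity is tiny (one bit per block), but mid-frequency coefficients are the usual compromise between surviving lossy compression and staying invisible, which is the whole point here.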
Interesting. I guess we can experiment with how many bits can be packed into a video frame. Is there a guarantee that YT doesn't change the frame rate?
Another idea is YT as a code repo: essentially, one makes a movie that shows the code files. On retrieval, OCR can be applied to turn the movie back into text.
I know that YouTube used to support only up to 30fps video, but IIRC they now support 60fps. This became a thing because people making videos of themselves playing the newest generation of consoles (PS4 / XBOne) want to upload in high quality, and the consoles now do 1080p at 60fps.
If this is a concern for people (recording at 60fps to upload for 60fps), I doubt that Google would downgrade the framerate except for maybe the lower quality versions of the video (does 60fps really matter for 240p video?).
What they meant was taking arbitrary data and turning it into a valid video file that plays back. For instance, you could read each bit off as audio (zero, one, zero, zero...). The quest is to find the most efficient, yet resilient way to do so.
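A minimal sketch of that audio variant, assuming a naive two-tone scheme (the frequencies and timings here are made up, and raw FSK like this would not survive YouTube's audio codec without more care):

    import math, struct, wave

    SAMPLE_RATE = 44100
    BIT_DURATION = 0.01          # 10 ms per bit -> ~100 bits/s, very slow
    FREQ = {0: 1000, 1: 2000}    # arbitrary tone per bit value

    def bits(data):
        for byte in data:
            for i in range(8):
                yield (byte >> (7 - i)) & 1

    def encode(data, path="payload.wav"):
        samples_per_bit = int(SAMPLE_RATE * BIT_DURATION)
        with wave.open(path, "wb") as wav:
            wav.setnchannels(1)
            wav.setsampwidth(2)
            wav.setframerate(SAMPLE_RATE)
            for bit in bits(data):
                freq = FREQ[bit]
                frames = b"".join(
                    struct.pack("<h", int(32767 * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)))
                    for n in range(samples_per_bit))
                wav.writeframes(frames)

    encode(b"hello youtube")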
They said they would like some completely new encoding technique that would get transcoded without any data loss. So my point was that such a _new_ encoding would be rejected by YT at the very first step, before transcoding even happens.
No, they mean a new encoding within the video and audio. A watermark is encoded in video, even though it's just visual data. Encoding can mean different things at different levels.
I've toyed with the idea before - it's actually pretty easy to encode files into images, and if you were more clever than I was, you could probably figure out how to use color to represent larger chunks of data instead of just black-and-white squares. It's also not size-friendly, mind you: the files I generated were about 2x bigger than the originals.
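Something along these lines, packing three bytes of the file into each RGB pixel with Pillow (a rough sketch, not what I actually used; the padding scheme and file names are just illustrative):

    import math
    from PIL import Image

    def file_to_image(path, out="encoded.png"):
        data = open(path, "rb").read()
        data += b"\x00" * (-len(data) % 3)              # pad to whole RGB pixels
        side = math.ceil(math.sqrt(len(data) // 3))
        data += b"\x00" * (side * side * 3 - len(data)) # pad to a square image
        Image.frombytes("RGB", (side, side), data).save(out)

    def image_to_file(path):
        # in practice you'd also record the original length to strip the padding
        return Image.open(path).convert("RGB").tobytes()

    file_to_image("backup.tar")

PNG round-trips every byte exactly; the moment it gets transcoded into lossy video frames, though, this falls apart, which is why the QR/watermark ideas above are needed.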
If there were a way to make massive changes to existing videos, it would be feasible to use them as storage - that would definitely be an interesting approach.
Anyway, advanced data encoding would be needed to protect it from damage after format changes. Well, it's not impossible, and not even that hard to achieve.
Neat! Any plans to make the different video qualities quickly available? Maybe as a subdirectory named after the resolution, e.g. search/movie/480p.mp4, or maybe movie.1080p.mp4?
Oh, I don't want to clutter it too much, it's better to keep it simple.
It definitely needs a mount option to set the default quality. Changing it later might be tricky; maybe passing options in directory names would be a good idea? For example: mkdir "foobar [q:720p]"
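Something like this could pull the tag back out of the name (purely hypothetical syntax, matching the mkdir example above; the default would come from a mount option):

    import re

    DEFAULT_QUALITY = "480p"   # assumed mount-wide default

    def parse_dirname(name):
        # split "foobar [q:720p]" into ("foobar", "720p")
        m = re.search(r"\s*\[q:(\d+p)\]\s*$", name)
        if m:
            return name[:m.start()].rstrip(), m.group(1)
        return name, DEFAULT_QUALITY

    print(parse_dirname("foobar [q:720p]"))   # ('foobar', '720p')
    print(parse_dirname("foobar"))            # ('foobar', '480p')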
Some time ago, YouTube started providing audio and video separately. Because of that, for full videos the whole file needs to be downloaded before reading (it's hard to merge the streams on the fly). For just audio or just video, streaming is intended, but it needs to be reimplemented.
I think these monolithic files are provided only for up to 720p quality, mainly for compatibility with older Flash players. To download the maximum quality with youtube-dl, you have to use "-f bestvideo+bestaudio" - then, youtube-dl will download both streams, and merge them for you with an ffmpeg invocation.
Yes. You can force the streamable MP4 download with the -f option (-f 22), but it seems not all videos are encoded that way. Livestreamer might be the tool of choice if you don't want to use your browser to watch a YouTube video.
There's a Chrome extension called AlienTube that shows reddit comments instead (useful for finding obscure niche subreddits, too). Or there are plenty of extensions to hide comments.
Kudos, that's an excellent idea - out-of-the-box thinking, data structuring, and creating value, all with basic tools. A tip of the hat. Nice work!
I have to think that through. Search pages are tricky, as you can't simply jump to a given page. They're identified by tokens, which aren't known a priori - you just get the adjacent ones along with the search results.
A reasonable solution would be the ability to set a maximum number of results and to disable next/prev. All the desired results would show up in the search directory (if you can assume that results beyond some point are useless).
I think the most obvious way to expose that remote behaviour in the filesystem is to have the directory /search/2/ appear only after the directory /search/1/ has been stat'd.
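A bare-bones fusepy sketch of that behaviour - only the visibility logic, with the actual YouTube search/token handling hand-waved away:

    import errno, stat, sys
    from fuse import FUSE, FuseOSError, Operations

    class SearchFS(Operations):
        def __init__(self):
            self.visible_pages = 1                     # /search/1 exists up front

        def _dir_attrs(self):
            return dict(st_mode=(stat.S_IFDIR | 0o755), st_nlink=2)

        def getattr(self, path, fh=None):
            if path in ("/", "/search"):
                return self._dir_attrs()
            parts = path.strip("/").split("/")
            if len(parts) == 2 and parts[0] == "search" and parts[1].isdigit():
                page = int(parts[1])
                if page <= self.visible_pages:
                    if page == self.visible_pages:
                        self.visible_pages += 1        # stat'ing page N reveals page N+1
                    return self._dir_attrs()
            raise FuseOSError(errno.ENOENT)

        def readdir(self, path, fh):
            if path == "/search":
                return [".", ".."] + [str(i) for i in range(1, self.visible_pages + 1)]
            return [".", ".."]

    if __name__ == "__main__":
        FUSE(SearchFS(), sys.argv[1], foreground=True)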
Side note: I find it very, very interesting to think about a filesystem of unknown content and size. It's basically an infinite tree exposed through FUSE. Nothing crazy, but interacting with it directly is inspiring.
Back in the kernel 2.2 days there was an explosion of experimental filesystems, like ftpfs, cdfs/CD-ripping fs, and more. Now that we have widespread FUSE bindings, it seems we'll see a filesystem renaissance.
Right, but FTP servers and CDs are finite data sets. Nobody has access to YouTube's full index, so the virtual fs has to generate/navigate on the fly. Imagine a GoogleSearchFS - that makes things a little different.
Storing a PDF or zip file is neither a requirement for nor the definition of a filesystem. You can do neither of those with a CD-ROM or a DVD, and both of them contain filesystems.