
Nice article! The motion compensation bit could be improved, though:

> The only thing moving really is the ball. What if you could just have one static image of everything on the background, and then one moving image of just the ball. Wouldn't that save a lot of space? You see where I am going with this? Get it? See where I am going? Motion estimation?

Reusing the background isn't motion compensation -- you get that effect just by encoding the differences between successive frames, so the unchanging parts are encoded very efficiently.

Motion compensation is for when, say, the camera follows the ball and the background moves. Rather than encoding the raw difference between frames, you figure out that most of the frame moved and encode the difference between the current frame and a shifted version of the blocks from a previous frame.
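
Here's a toy sketch of the distinction in Python/NumPy (the function names and the single global vector are mine, nothing codec-specific): plain differencing subtracts co-located pixels, while motion compensation shifts the previous frame by an estimated motion vector first and subtracts that.

    import numpy as np

    def frame_difference(prev, cur):
        # Plain inter-frame differencing: pixels that didn't change become
        # zero and compress very well; anything that moved leaves residuals.
        return cur.astype(np.int16) - prev.astype(np.int16)

    def motion_compensated_difference(prev, cur, dy, dx):
        # Motion compensation: shift the previous frame by the estimated
        # vector (dy, dx) before differencing, so a panning camera also
        # yields near-zero residuals. Real codecs do this per block and
        # pad/clamp at the edges instead of wrapping around like np.roll.
        shifted = np.roll(prev, (dy, dx), axis=(0, 1))
        return cur.astype(np.int16) - shifted.astype(np.int16)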

Motion compensation won't work particularly well for a tennis ball, partly because it's spinning rapidly (so the ball looks distinctly different in consecutive frames), but more importantly because the ball occupies a tiny fraction of the frame, so it doesn't help that much.

Motion compensation should work much better for things like moving cars and moving people.




Your example seems to assume translation only. I wonder how difficult/useful it would be to identify other kinds of time-varying characteristics (translation, rotation, scale, hue, saturation, brightness, etc) of partial scene elements in an automated way.

Along the same lines, it would be interesting to figure out an automated time-varying-feature detection algorithm to determine which kinds of transforms are the right ones to encode.

Do video encoders already do something like this? It seems like a pretty difficult problem since there are so many permutations of applicable transformations.
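
Real encoders mostly don't work this way (see the sibling replies), but here's a naive sketch of the idea, with all names and the candidate set made up: fit a few parametric models between frames and keep whichever one leaves the smallest residual.

    import numpy as np

    def residual_cost(pred, cur):
        # Sum of absolute differences: a rough proxy for how expensive the
        # residual will be to encode after a given prediction.
        return np.abs(cur.astype(np.int32) - pred.astype(np.int32)).sum()

    def pick_best_model(prev, cur, candidates):
        # candidates: dict of name -> function(prev) -> predicted frame.
        # Keep whichever parametric transform explains the new frame best.
        costs = {name: residual_cost(fn(prev), cur) for name, fn in candidates.items()}
        return min(costs, key=costs.get), costs

    # Hypothetical candidate set: no change, a small pan, a brightness fade.
    candidates = {
        "identity":    lambda p: p,
        "pan_right_2": lambda p: np.roll(p, 2, axis=1),
        "fade_plus10": lambda p: np.clip(p.astype(np.int32) + 10, 0, 255),
    }

The hard part, as you say, is that the space of plausible transforms and their parameters explodes quickly, which is why real encoders restrict themselves to per-block translation plus a few cheap extras.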


> I wonder how difficult/useful it would be to identify other kinds of time-varying characteristics (translation, rotation, scale, hue, saturation, brightness, etc) of partial scene elements in an automated way.

That's how Framefree worked. It segments the image into layers, computes a full morph, including movement of the boundary, between successive frames for each layer, and transmits the before and after for each morph. Any number of frames can be interpolated between keyframes, which allows for infinite slow motion without jerk.[1] You can also upgrade existing content to higher frame rates.
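
Decode/playback for that kind of scheme would look roughly like the sketch below (my own generic warp-and-blend interpolation, not Framefree's actual algorithm): given a dense displacement field between two keyframes, warp both toward an arbitrary time t and cross-dissolve.

    import numpy as np
    from scipy.ndimage import map_coordinates

    def interpolate(key_a, key_b, flow, t):
        # key_a, key_b: consecutive keyframes (H x W, grayscale for brevity).
        # flow: (2, H, W) displacement field (dy, dx) taking key_a to key_b.
        # t in [0, 1]: where the synthesized frame sits between the keyframes.
        h, w = key_a.shape
        ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
        # Backward-warp each keyframe toward time t (crude: the flow is
        # sampled at the target position rather than at the true source).
        from_a = map_coordinates(key_a, [ys - t * flow[0], xs - t * flow[1]], order=1)
        from_b = map_coordinates(key_b, [ys + (1 - t) * flow[0], xs + (1 - t) * flow[1]], order=1)
        # Cross-dissolve the two warped keyframes.
        return (1 - t) * from_a + t * from_b

Sweep t from 0 to 1 in as many steps as you like and you get arbitrarily slow motion, or 120 FPS output from 24/30 FPS keyframes.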

This was developed back in 2006 by the Kerner Optical spinoff of Lucasfilm.[2] It didn't catch on, partly because decompression and playback require a reasonably good GPU, and partly because Kerner Optical went bust. The segment-into-layers technology was repurposed for making 3D movies out of 2D movies, and the compression product was dropped. There was a Windows application and a browser plug-in. The marketing was misdirected - somehow it was targeted at digital signs with limited memory, a tiny niche.

It's an idea worth revisiting. Segmentation algorithms have improved since 2006. Everything down to midrange phones now has a GPU capable of warping a texture. And it provides a way to drive a 120FPS display from 24/30 FPS content.

[1] http://creativepro.com/framefree-technologies-launches-world... [2] https://web.archive.org/web/20081216024454/http://www.framef...


John, do you know where all the patents on Framefree ended up?


Ask Tom Randoph, who was CEO of FrameFree. He's now at Quicksilver Scientific in Denver.


Some venture IP company in Tokyo called "Monolith Co." also had rights in the technology.[1] "As of today (Sept. 5, 2007), the company has achieved a compression rate equivalent to that of H.264 and intends to further improve the compression rate and technology, Monolith said."[2] (This is not Monolith Studios, a game development company in Osaka.) Monolith appears to be defunct.

The parties involved with Framefree were involved in fraud litigation around 2010.[3] The case record shows various business units in the Cayman Islands and the Isle of Jersey, along with Monolith in Japan and Framefree in Delaware. No idea what the issues were. It looks like the aftermath of failed business deals.

The inventors listed on the patents are Nobuo Akiyoshi and Kozo Akiyoshi.[4]

[1] https://www.youtube.com/watch?v=VBfss0AaNaU [2] http://techon.nikkeibp.co.jp/english/NEWS_EN/20070907/138905... [3] http://www.plainsite.org/dockets/x8gi572m/superior-court-of-... [4] http://patents.justia.com/inventor/nobuo-akiyoshi


Great detective work. I suspect the IP is now a total mess - with luck, nobody has been paying the patent renewal fees and everything is now free.


Most codecs split the image into prediction blocks (for example, 16x16 for MPEG-2, or from 4x4 to 64x64 for VP9). Each of these blocks has its own motion vector. All of the transformations you mentioned look like a translation if you look at them locally, so they can all be fairly well represented by this. Codecs have, in the past, attempted global motion compensation, which tries to fully model a camera (rotating, translating, lens distortion, zooming), but all of those extra parameters are very difficult to search for.
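
A brute-force version of that per-block search, as a sketch (NumPy, names mine; real encoders do this with SAD/SATD in heavily optimized SIMD or hardware, over multiple block sizes and reference frames):

    import numpy as np

    def block_motion_search(prev, cur, block=16, search=8):
        # Exhaustive (full) search: for each block of the current frame, try
        # every offset within +/- `search` pixels in the previous frame and
        # keep the one with the lowest sum of absolute differences (SAD).
        # Assumes frame dimensions are multiples of the block size.
        h, w = cur.shape
        vectors = np.zeros((h // block, w // block, 2), np.int32)
        for by in range(0, h, block):
            for bx in range(0, w, block):
                target = cur[by:by + block, bx:bx + block].astype(np.int32)
                best, best_cost = (0, 0), np.inf
                for dy in range(-search, search + 1):
                    for dx in range(-search, search + 1):
                        y, x = by + dy, bx + dx
                        if y < 0 or x < 0 or y + block > h or x + block > w:
                            continue
                        cand = prev[y:y + block, x:x + block].astype(np.int32)
                        cost = np.abs(target - cand).sum()
                        if cost < best_cost:
                            best, best_cost = (dy, dx), cost
                vectors[by // block, bx // block] = best
        return vectors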

Daala and AV1's PVQ is an example of a predictor for contrast and brightness (in a very broad sense).


Yes, H.264 has brightness/fade compensation for past frames. It's called "weighted prediction".
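
In spirit it's just a scale-and-offset applied to the motion-compensated reference before the residual is computed. A rough sketch of the idea (integer weight/offset in the style of H.264 explicit weighted prediction; the real bitstream semantics have more details):

    import numpy as np

    def weighted_prediction(ref_block, weight, log_wd, offset):
        # Scale and offset the reference block before using it as the
        # predictor, so a fade or global brightness change leaves only a
        # small residual. Rounded shift, then clipped to the 8-bit range.
        rounding = (1 << log_wd) >> 1
        pred = ((ref_block.astype(np.int32) * weight + rounding) >> log_wd) + offset
        return np.clip(pred, 0, 255).astype(np.uint8)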

The previous-generation codec, MPEG-4 Part 2 ASP (aka DivX & XviD), had "global motion compensation", which could encode scaling and rotation, but like most things in that codec it was broken in practice. Most very clever ideas in compression either take too many bits to describe or can't be done in hardware.


> It seems like a pretty difficult problem since there are so many permutations of applicable transformations.

That's part of why video encoding can be very slow --- with motion compensation, to produce the best results the encoder should search through all the possible motion vectors and pick the one that gives the best match. To speed things up, at a slight cost in compression ratio, not all of them are searched, and there are heuristics on choosing a close-to-optimal one instead: https://en.wikipedia.org/wiki/Block-matching_algorithm
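
For example, a three-step search (one of the heuristics described on that page) halves a coarse step each round and only checks the centre plus its eight neighbours, so it tests a couple dozen candidates instead of every offset in the window. A sketch, with names of my own:

    import numpy as np

    def sad(a, b):
        return np.abs(a.astype(np.int32) - b.astype(np.int32)).sum()

    def three_step_search(prev, cur, by, bx, block=16, step=4):
        # Heuristic block matching: test the current centre and its eight
        # neighbours at the current step size, move to the best, halve the
        # step, repeat. Far fewer SAD evaluations than exhaustive search,
        # at a small cost in match quality.
        h, w = prev.shape
        target = cur[by:by + block, bx:bx + block]
        best_y, best_x = by, bx
        best_cost = sad(target, prev[by:by + block, bx:bx + block])
        while step >= 1:
            cy, cx = best_y, best_x
            for dy in (-step, 0, step):
                for dx in (-step, 0, step):
                    y, x = cy + dy, cx + dx
                    if y < 0 or x < 0 or y + block > h or x + block > w:
                        continue
                    cost = sad(target, prev[y:y + block, x:x + block])
                    if cost < best_cost:
                        best_cost, best_y, best_x = cost, y, x
            step //= 2
        return best_y - by, best_x - bx   # the chosen motion vector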


Now I'm out of my depth, but I think motion compensation does okay at rotation and scaling. The motion vector varies throughout the frame, and I think codecs interpolate it, so all kinds of warping can be represented.
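
Right, and the per-block vectors alone already go a long way, because a rotation or zoom looks like a plain translation at the scale of a single block. A rough illustration (mine, assuming the frame-to-frame motion is a known affine map, which an encoder would of course have to estimate):

    import numpy as np

    def blockwise_vectors_from_affine(h, w, A, t, block=16):
        # Frame-to-frame motion p' = A @ p + t (covers rotation, zoom,
        # shear). Sampled at each block centre, it collapses to one
        # ordinary translational motion vector per block.
        vectors = {}
        for by in range(0, h, block):
            for bx in range(0, w, block):
                p = np.array([by + block / 2.0, bx + block / 2.0])
                vectors[(by, bx)] = (A @ p + t) - p   # local displacement
        return vectors

    # Example: a 1-degree rotation about the frame origin plus a small pan.
    theta = np.radians(1.0)
    A = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    vecs = blockwise_vectors_from_affine(720, 1280, A, t=np.array([0.0, 2.0]))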


As evidence of this, sometimes when an I-frame is dropped from a stream or you jump around in a stream you can see the texture of what was previously on the screen wrapped convincingly around the 3D surface of what's now supposed to be on the screen, all accomplished with 2D motion vectors.



