I'm Not Bruce: Google's WebM Video Codec .webm

Wednesday, May 19, 2010


A really nice breakdown of VP8 is here if you're interested, but the summary is this:

"The spec consists largely of C code copy-pasted from the VP8 source code — up to and including TODOs, “optimizations”, and even C-specific hacks, such as workarounds for the undefined behavior of signed right shift on negative numbers. In many places it is simply outright opaque. Copy-pasted C code is not a spec. I may have complained about the H.264 spec being overly verbose, but at least it’s precise. The VP8 spec, by comparison, is imprecise, unclear, and overly short, leaving many portions of the format very vaguely explained. Some parts even explicitly refuse to fully explain a particular feature, pointing to highly-optimized, nigh-impossible-to-understand reference code for an explanation. There’s no way in hell anyone could write a decoder solely with this spec alone."

"WebM includes:

VP8, a high-quality video codec we are releasing today under a BSD-style, royalty-free license
Vorbis, an already open source and broadly implemented audio codec
a container format based on a subset of the Matroska media container"

====

Google has announced WebM, which is not a huge surprise given all the rumors. What is a bit surprising is that you can already check out the latest FFmpeg version with support for it from Subversion here.

====

"WebM is an open, royalty-free, media file format designed for the web.

WebM defines the file container structure, video and audio formats. WebM files consist of video streams compressed with the VP8 video codec and audio streams compressed with the Vorbis audio codec. The WebM file structure is based on the Matroska container."
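Since the container is a Matroska subset, a WebM file should start with the same EBML header as any Matroska file, just with a different DocType string. Here is a minimal sketch of how you might sniff that in Python. The function name sniff_container is made up for illustration, and the check is a heuristic, not a real EBML parser: it assumes the DocType element (ID 0x42 0x82) shows up in the first bytes you hand it and that its size fits in one byte.

```python
EBML_MAGIC = bytes([0x1A, 0x45, 0xDF, 0xA3])  # EBML header element ID
DOCTYPE_ID = bytes([0x42, 0x82])              # EBML DocType element ID


def sniff_container(head: bytes) -> str:
    """Classify the first bytes of a file as 'webm', 'matroska', or 'unknown'.

    Hypothetical helper: a byte-level heuristic, not a full EBML parser.
    """
    if not head.startswith(EBML_MAGIC):
        return "unknown"

    # Look for the DocType element somewhere after the magic bytes.
    pos = head.find(DOCTYPE_ID, len(EBML_MAGIC))
    if pos < 0 or pos + 3 > len(head):
        return "unknown"

    # Assumption: the size is a one-byte EBML varint (top bit set).
    size_byte = head[pos + 2]
    if not size_byte & 0x80:
        return "unknown"
    size = size_byte & 0x7F

    doctype = head[pos + 3 : pos + 3 + size].rstrip(b"\x00")
    if doctype == b"webm":
        return "webm"
    if doctype == b"matroska":
        return "matroska"
    return "unknown"
```

To try it on a real file, read the first few dozen bytes in binary mode and pass them in, e.g. sniff_container(open(path, "rb").read(64)).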

====

I'll goof around with it later on (possibly next week) and see what's what, but I don't think we'll be editing with it any time soon.

====

"Reversed complexity encoding / Z-frames: A type of frame that in encoding complexity is lower than a regular P-frame but is more compact. The catch is decoding computation needs to be higher. Specific technical approaches might include:

a combination of multi-frame processing with Wyner-Ziv coding
mixed quality encoding, where the Z-frames are non-reference frames interspersed with regular P-frames and coded at lower quality than the target quality. The decoder recovers a higher quality version of the Z-frames by multi-frame processing using information in the neighboring higher quality P-frames.
supplement the information transmitted for the Z-frame with additional helper information to regularize the reconstruction process on the decoder side. This could be a Wyner-Ziv layer, or something different."

====
Also of interest:
====

VP8 uses 14 bits for width and height, so the maximum resolution is 16384x16384 pixels. VP8 places no constraints on framerate or datarate.

and

The Developer Preview releases of browsers supporting WebM are not yet fully optimized and therefore have a higher computational footprint for screen rendering than we expect for the general releases. The computational efficiencies of the VP8 codec are more accurately measured today using codec-level development tools in the SDKs. Optimizations of the browser implementations are forthcoming.
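On the 14-bit width and height fields mentioned above: a 14-bit unsigned field holds 2^14 = 16384 distinct values, so strictly speaking the largest value it can store is 16383. Below is a rough Python sketch of pulling the dimensions out of a raw VP8 keyframe. It assumes the keyframe header layout from the VP8 bitstream guide (a three-byte frame tag, the start code 0x9d 0x01 0x2a, then two little-endian 16-bit fields whose low 14 bits are width and height and whose top 2 bits are a scaling hint). The name keyframe_dimensions is made up for illustration, and this is not how any particular decoder does it.

```python
import struct

DIM_BITS = 14
DIM_MASK = (1 << DIM_BITS) - 1  # 0x3FFF, i.e. 16383

START_CODE = b"\x9d\x01\x2a"    # follows the 3-byte frame tag on keyframes


def keyframe_dimensions(frame: bytes):
    """Return (width, height) for a raw VP8 keyframe, else None.

    Hypothetical helper; assumes `frame` starts at the VP8 frame tag.
    """
    if len(frame) < 10:
        return None

    # Bit 0 of the frame tag is 0 for keyframes, 1 for inter frames.
    is_keyframe = (frame[0] & 0x01) == 0
    if not is_keyframe or frame[3:6] != START_CODE:
        return None

    # Two little-endian 16-bit fields: low 14 bits size, top 2 bits scale.
    w_raw, h_raw = struct.unpack_from("<HH", frame, 6)
    return w_raw & DIM_MASK, h_raw & DIM_MASK
```

Masking with DIM_MASK is what throws away the two scaling bits, which is also why no frame can ever report a dimension above 16383 with this layout.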

====
"Higher computational footprint" means it's really inefficient at the moment and eats a lot of CPU. They do, however, say in their FAQ:
====

"If I have a video card that accelerates video playback, will it accelerate VP8?
The performance of VP8 is very good in software, and we’re working closely with many video card and silicon vendors to add VP8 hardware acceleration to their chips."