You can download the Bali libvpx snapshot (version 0.9.6) from the WebM Project Downloads page or clone it from our Git repository.
For Bali we focused on making the encoder faster while continuing to improve its video quality. Using our previous releases (our initial 0.9.0 launch release and "Aylesbury") as benchmarks, we’ve seen the following high-level encoder improvements:
- "Best" mode average encoding speed: On x86 processors, Bali runs 4.5x as fast as our initial release and 1.35x as fast as Aylesbury.
- "Good" mode average encoding speed: Bali is 2.7x faster than our initial release and 1.4x faster than Aylesbury.
- On ARM platforms with NEON extensions, real-time encoding of video telephony content is 7% faster than Aylesbury on a single-core ARM Cortex-A9, 15% faster on dual-core and 26% faster on quad-core.
- On the NVIDIA Tegra2 platform, real-time encoding is 21-36% faster than Aylesbury, depending on encoding parameters.
- "Best" mode average quality improved 6.3% over Aylesbury using the PSNR metric.
- "Best" mode average quality improved 6.1% over Aylesbury using the SSIM metric.
- Implemented a new "constrained quality" (CQ) data rate control mode. Across a large set of videos, this mode shifts bits away from videos where they can't provide significant visual benefit and toward videos where they can.
- Achieved more consistently high video quality across entire video clips. We now use a better two-pass rate control option that no longer favors early sections of videos.
- Greatly improved quality on "noisy" source videos through temporal filtering of alternate reference frames.
- Improved visual quality of scene transitions by allocating fewer bits to the transition itself and more to the frame immediately after the transition occurs.
- Achieved much faster encoding by predicting motion vectors more accurately and by improving the algorithms that select predictors for small blocks.
- Added or rewrote assembly code for functions related to alternate reference (alt-ref) frames, noise reduction, quantization and sum of absolute differences (SAD) to improve performance.
- Improved usage of multiple processor cores by cutting the overhead related to thread synchronization.
- Made multi-threading optimizations on ARM platforms to improve real-time encoding speed.
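
For readers unfamiliar with the sum of absolute differences mentioned in the list above, here is a minimal scalar sketch in C of what such a function computes. It is an illustration only, not the libvpx code: the function name `sad_block` and its signature are hypothetical, and the optimized routines in the library are written in assembly for specific block sizes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Sum of absolute differences between two w x h blocks of 8-bit samples.
 * Each stride is the distance in bytes between the starts of two
 * consecutive rows. Hypothetical helper for illustration; not a libvpx API. */
static unsigned int sad_block(const uint8_t *src, int src_stride,
                              const uint8_t *ref, int ref_stride,
                              int w, int h)
{
    unsigned int sad = 0;

    assert(src != NULL && ref != NULL);

    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            /* Widen to int before subtracting so the difference can be negative. */
            sad += (unsigned int)abs((int)src[x] - (int)ref[x]);
        }
        src += src_stride;  /* advance to the next row of the source block */
        ref += ref_stride;  /* advance to the next row of the reference block */
    }
    return sad;
}
```

A motion search evaluates a cost like this for many candidate blocks, which is why speeding up such inner loops has a visible effect on overall encoding time.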
John Luther is Product Manager of the WebM Project.