VP8 "Constrained Quality" (CQ) Encoding Explained
Thursday, March 17, 2011 | 2:20 PM
In the Bali release post, we mentioned that we've added a new encoding mode called "constrained quality" (CQ) to the VP8 Codec SDK (libvpx).
The idea for CQ mode arose as we began testing approaches for encoding WebM versions, in multiple resolutions, of every file in the YouTube corpus. Approaching video encoding on such an immense scale sets one to thinking very carefully about how every bit is used; wasting even small amounts of data across many millions of files adds up very quickly, translating to higher storage and bandwidth costs.
After trying a few approaches it became apparent that we needed not a better way to allocate bits within each WebM file, but rather a better way to distribute them across all the WebM files. The result was CQ mode.
I presented the slides below at the February WebM Summit to explain CQ in general terms and summarize its benefits to content publishers when applied across large collections of WebM files. I hope you find it informative and welcome your feedback in the comments.
Paul Wilkins is a Senior Software Engineer for the WebM Project.


12 comments:
Tim said...
Apparently the page has a redirect loop, but I'm guessing from the text the tl;dr is "libvpx can now encode at a specific quality rather than bitrate."?
I.e. like "x264 --crf " and "oggenc -q "?
Or is this something more awesome? :-)
March 17, 2011 2:33 PM
Tim said...
Ah, you can open the link in a new tab and it works:
https://docs.google.com/present/embed?id=dcd3zw7n_156wdtnjc7&interval=5&size=l&pli=1
Apparently it isn't constant quality. I think what they're saying is that in a collection of videos with very different sizes (such as YouTube, which has videos all the way from 240p to 1080p), the marginal increase in quality per bit varies a lot.
So if you add an extra 10 kb/s to a 240p video it will make much more difference to the quality than adding 10 kb/s to a 1080p video. In this case constant quality is probably not the best way to spend all your bits. Instead they are trying to take bits from places where they make little difference in quality, and use them in other videos where they make a lot of difference.
They've implemented it (roughly) by having a maximum per-frame quality (I think).
March 17, 2011 2:48 PM
blog said...
Could you briefly explain why constant quantizer causes a visible quality drop on easier clips? Is this true generally for all DCT-based codecs, or a "feature" of libvpx? Does constant quant correspond worse to visual quality at low bitrates than high bitrates, or how should one explain it?
March 17, 2011 4:04 PM
Astrophizz said...
Hmm, a new feature primarily for large video collections. Wonder who was behind this...
March 17, 2011 11:14 PM
Paul said...
So… several issues here.
@Tim… You are sort of right and sort of wrong in your assessment. The redeployment of bits is not from, say, a 720p clip to a 240p clip, but rather within the set of clips coded at a particular resolution.
You are right, though, that the marginal increase in quality for a given number of bits varies a lot, and this is why CQ mode helps. However, it is not just the case that the quality gain for a hard clip or a larger format clip is less for a given number of bits. It is also the case that as you get to higher and higher quality levels, the number of extra bits required for the same perceptual increment in quality rises sharply.
Bits are taken from easier clips in a way that (hopefully) doesn't have any visible impact on quality and redistributed to other clips to bring their quality up a bit. However, for the hardest clips, the trade-off in terms of improvement in quality for a few extra bits is poor, and in any case really hard clips tend to be close to the limit of our maximum bandwidth target (important from the perspective of streamability). Consequently, for YouTube we have chosen settings that apply most of the extra bits preferentially in the mid range. Given the distribution of clips within the YouTube corpus, this makes a big difference.
@Blog… The problem with a constant quantizer, or constant quality measured in some other way (using PSNR or SSIM, for example), is that subjectively speaking you actually need the quality to be better on simpler and slower moving content. To take an extreme example, defects that would be disturbing in a still image may go unnoticed or be quite acceptable in a high motion car chase. The objective was to recode a corpus such that the average bandwidth remains the same (while improving subjective quality for users). To get the same average rate with a constant quantizer you would have to pick quite a high value, which would cause some of the easier clips to look a lot worse than they did before. Even so, some of the hardest clips would be way too big. Slide 6 sort of illustrates this, though in fact it is worse than the slide shows: there was one really hard clip in our test set which went out to 6 megabits.
Slide 9 shows the behavior of the CQ mode. The easiest clips show a sizeable drop in bandwidth but are still hovering around 45-55 dB and look fine. Clips in the mid range get the most boost and there are far fewer clips down below 40 dB, but as can be seen even more clearly in slide 10 (where I plotted a much larger random set), the algorithm still does a good job of imposing an upper bandwidth constraint (here set at 800K). Once a clip hits this level it pretty much reverts to VBR behavior and will hit the target if the maximum and minimum quantizer set by the user allow.
@Astrophizz.
We may have had a particular large video collection in mind when we developed this, but I
hope it will help others to get a better trade off across their collections too, with less need
for clip by clip intervention.
Lots of words but I hope this answers your questions.
March 18, 2011 1:07 PM
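The behavior Paul describes above (easy clips dropping in bandwidth, hard clips held at an upper bandwidth constraint) can be pictured with a very rough toy model. This is only a sketch under loud assumptions: it is not libvpx's rate control, the function name `cq_allocate` is hypothetical, and the clip names and bitrates are made-up numbers; only the 800 cap echoes the figure mentioned in the comment. A real encoder does not know the "needed" rate in advance and works through quantizer limits instead.

```python
# Toy model of constrained-quality (CQ) allocation. NOT libvpx's rate control.
# All clip names and "needed" bitrates are hypothetical illustrative numbers.

def cq_allocate(rate_needed_kbps, cap_kbps):
    """Give each clip the rate it needs at the chosen quality level,
    but never more than the upper bandwidth constraint."""
    return {name: min(rate, cap_kbps) for name, rate in rate_needed_kbps.items()}

# Assumed bitrate (kb/s) each clip would need to reach the chosen quality level.
needed = {
    "easy_slideshow": 200,
    "mid_talking_head": 600,
    "hard_car_chase": 2000,
}

allocation = cq_allocate(needed, cap_kbps=800)
for name, rate in allocation.items():
    print(f"{name}: {rate} kb/s")
```

In this sketch the easy and mid-range clips use only what they need, while the hard clip is clamped to the cap, which loosely corresponds to the point where, per the comment, the encoder "pretty much reverts to VBR behavior".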
Kristoff said...
I've been playing around with CQ mode for a bit, but I'm having some issues. For some reason, 2-pass with CQ is giving me A/V sync problems (1-pass is fine). Also, how come 2-pass mode with CQ produces a stream that is over half the size of the 1-pass CQ stream with all settings the same?
March 19, 2011 2:56 PM
Kristoff said...
Actually, I revise that statement a little bit. One-pass in CQ mode seems to revert to VBR even if I set the cq-level to say 60. Two-pass in CQ mode seems to work as expected, but I still don't know why 2-pass messes up the A/V sync.
March 19, 2011 4:08 PM
Kristoff said...
Sorry, one last post. I get A/V sync issues on all 2-pass modes, including VBR. And it doesn't always revert to VBR in 1-pass; it's just that if I encode a DVD resolution source with target-bitrate something huge, like 8000, and cq-level really high, like 63, I get something that has a bitrate of about 2500 kbps, which seems odd. I'd expect a much lower bitrate than that for 480p video with a 63 quantizer. If I use 2-pass with settings like these, the bitrate is infinitesimal as you'd expect, but, again, A/V sync problems.
March 19, 2011 6:16 PM
Kristoff said...
Ok, I lied. One more post. Just thought I'd mention I've figured out the sync problem. Still genuinely interested in learning more about what causes the great difference between 2-pass and 1-pass with CQ mode.
March 20, 2011 12:38 AM
verb3k said...
@Kristoff: The comment section is not the place for usage questions/extended discussions. Post your questions here: http://groups.google.com/a/webmproject.org/group/webm-discuss
March 20, 2011 1:03 AM
Kristoff said...
It is neither my intention to post a question on usage nor discuss the matter at length. I am simply noting the performance disparity between 1-pass and 2-pass in the new CQ mode, and questioning whether this is a feature or a bug.
March 20, 2011 2:09 AM
Paul said...
@Kristoff.
There was a problem relating to CQ mode in one pass that I fixed last week. You might want to check out the latest build.
If it is still not behaving as you might expect, can you file a problem report with more details so I can follow it up?
Note, however, that CQ mode is always likely to work better in two pass modes.
Thanks Paul
March 21, 2011 1:56 PM