Posted: Tue 18 May 2004, 2:41 Post subject: I, P and B picture questions
I was reading about this in another post, but thought I would start a new one for this question.
In the other post you seem to imply that I and P pictures, without Bs, should be used only when you have the room for higher bitrates, 6500+ (and, of course, detect scene changes is on).
This may sound like a "duh" question, but do I, P and B pictures each take up the same amount of room? If I use 1 I and 14 P, does that take up more room in the end than, say, 1 I, 5 P and 2 B?
I found that information a bit confounding. I've read so many different things here that I cannot keep up. I thought B pictures should only be used when scene changes weren't being detected, otherwise they should never be used? And they should be used for fast motion, too?
Joined: 04 Feb 2003 Posts: 587 Location: Lisboa, Portugal
Posted: Thu 20 May 2004, 0:21 Post subject:
The actual space a group of frames takes up depends on the stream's bitrate. If your average bitrate is set to 6500 kb/s, then 1 second of video will take up (approximately) 6500 kb, regardless of the GOP structure. What the picture type influences is how those bits are distributed within the group of frames.
Typically, if an I-picture uses 1500 bits, a P-picture will use about 500 and a B-picture will use about 250.
If a frame is similar to the previous one, a P- or B-picture will produce the same quality as an I-picture while taking up much less space (around 1/4th). Since there is a limit to the overall bitrate, using more P- and B-pictures lets you "save" some bits for the I-pictures. This will improve the quality of the I-pictures and will also end up improving the quality of the P- and B-pictures (since they are based on the I-pictures).
Let's say you have to keep the bitrate below 6500 kb/s and are only using I-pictures. Since there are 25 frames per second (PAL), this means a maximum of 260 kb per frame.
Now, if instead of using just I-pictures, you use a 5-frame GOP with one I-picture and 4 P-pictures, you can use about 560 kb for the I-picture and 185 kb for each P-picture. Increase the GOP to 15 frames (the maximum for PAL) and you (or rather, the encoder) can use 690 kb for the I-picture and 230 kb for each P-picture.
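For anyone who wants to check that arithmetic, here is a minimal sketch. It assumes the encoder splits a GOP's budget strictly by the 1500 : 500 : 250 rule of thumb quoted above (6 : 2 : 1); real encoders adapt to the content, and the names `WEIGHTS` and `bits_per_picture` are made up for this illustration.

```python
# Relative picture sizes from the rule of thumb above: I = 1500, P = 500, B = 250.
WEIGHTS = {"I": 6, "P": 2, "B": 1}

def bits_per_picture(bitrate_kbps, fps, gop):
    """Return approximate kilobits per picture type for one GOP.

    gop is a string such as "IPPPP". The fixed weights are an assumption
    for illustration, not something an encoder is obliged to follow.
    """
    gop_budget = bitrate_kbps / fps * len(gop)   # kilobits available to the whole GOP
    total_weight = sum(WEIGHTS[t] for t in gop)  # how many "units" the GOP contains
    unit = gop_budget / total_weight             # kilobits per unit of weight
    return {t: WEIGHTS[t] * unit for t in sorted(set(gop))}

if __name__ == "__main__":
    for gop in ("I", "I" + "P" * 4, "I" + "P" * 14):
        sizes = bits_per_picture(6500, 25, gop)
        print(len(gop), {t: round(kb) for t, kb in sizes.items()})
```

At 6500 kb/s and 25 fps this comes out within a few kb of the 260, 560/185 and 690/230 figures in the post.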
If there is no limit to your bitrate (or if there's a very high limit, such as 30 Mb/s or more), you can use I-pictures only.
B-pictures are a special case. They do give you better compression than P-pictures (because they can "look into the future"), but they have one very undesirable feature: they cannot be used as a basis for another frame (that's the price of looking into the future). For example, if you have a GOP like this:
I P P P P P
The 4th frame can be based on parts of the 3rd frame. So even if it is very different from the first frame in the GOP, it can still have good quality (as long as it's similar to the previous frame). However, if your GOP looks like this:
I B B B B B B B P
The B-pictures can only be based on the first or last frames (the I- and P-pictures). Even if the 4th frame is very similar to the 3rd frame, it doesn't benefit the least bit from that. It has to try to use parts of the (possibly very different) first and / or last frames.
The example above uses a "closed GOP" to make the explanation simpler; if your encoder supports open GOPs you could use something like this instead:
I B B B B B B B
And the B-pictures would be based on the first frame of this GOP and the first frame of the next GOP. Anyway, that's not relevant for the point, right now.
If you add a few P-pictures between the B-pictures, things get much better, since B-pictures can be based on P-pictures. For example:
I B P B P B P B P B P B
Will produce pretty good results, since the B-pictures can always look at nearby P- and I-pictures for "useful" data (i.e., similar areas of the image).
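To make the "who can look at whom" rule concrete, here is a small sketch that lists the frames each picture may be predicted from. It assumes a closed GOP and the simple rule described in this thread (a P-picture uses the previous I/P, a B-picture uses the nearest I/P on each side); the function name `references` is hypothetical.

```python
def references(gop):
    """Map each frame (1-based, display order) to the frames it may be based on.

    Assumes a closed GOP: a trailing B-picture only gets a backward reference.
    """
    anchors = [n for n, t in enumerate(gop, start=1) if t in "IP"]
    refs = {}
    for n, t in enumerate(gop, start=1):
        before = [a for a in anchors if a < n]   # I/P pictures earlier in the GOP
        after = [a for a in anchors if a > n]    # I/P pictures later in the GOP
        if t == "I":
            refs[n] = []                         # I-pictures stand alone
        elif t == "P":
            refs[n] = before[-1:]                # nearest earlier I/P only
        else:
            refs[n] = before[-1:] + after[:1]    # nearest I/P on each side
    return refs

if __name__ == "__main__":
    for gop in ("IPPPPP", "IBBBBBBBP", "IBPBPBPBPBPB"):
        print(gop, references(gop)[4])           # what can the 4th frame use?
```

For the three GOPs discussed above, the 4th frame can use frame 3, frames 1 and 9, and frames 3 and 5 respectively, which is why the last layout behaves so much better than the second.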
So, great, bring on the B-pictures, right? Not exactly.
Even in a situation like the one described above, B-pictures still "throw away" part of your bitrate, since no frames can be based on them. And the more B-pictures you use, the further apart each P-picture is from the others, which makes their quality worse. And since the B-pictures are based on the surrounding P-pictures... you guessed it, they end up looking worse, too.
So it's all a matter of balance. Usually, putting a single B-picture between P-pictures won't cause a significant loss of quality, but two or more start to bring the quality down. If you are limited to a very low bitrate, then your quality will never be brilliant anyway, and using a few B-pictures (2 or even 3) between the P-pictures might actually end up improving things, especially if there isn't a lot of motion.
Scene change detection will improve the quality of the movie (at the cost of encoding time) whenever you use P- or B-pictures. If you are encoding with I-pictures only, scene change detection is useless (what scene change detection does is insert I-pictures at strategic frames; if all your frames are already I-pictures, that's pointless).
The overall GOP length (which is basically the distance between I-pictures) can (should) be tweaked according to the type of footage. If you have a lot of fast, complex motion, you will benefit from more I-pictures (even though this forces the encoder to allocate fewer bits for each). If most of the motion is slow or regular (meaning frames are similar to their "neighbours"), then P-pictures will be able to handle things fine, which leaves more bits for the I-pictures.
Generally, if your encoder has a good "scene change detection" algorithm, you should use very long GOPs (e.g., I=1 and P=14) and let the encoder insert I-pictures when it feels that's necessary. If the encoder isn't smart enough to do that, follow the advice above (use shorter GOPs when there's a lot of motion, and longer when the motion is slower).
Note that a good "scene change detection" algorithm isn't one that actually detects scene changes. It's one that detects when one frame is different enough from the previous one to justify using an I-picture. This can happen several times during a scene.
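A toy version of that idea, purely as a sketch: it starts a new GOP whenever the mean absolute pixel difference to the previous frame exceeds a threshold, or when the GOP has reached its maximum length. The metric, the `threshold` value and the name `pick_picture_types` are assumptions for illustration; this is not how TMPGEnc or any particular encoder decides.

```python
def pick_picture_types(frames, threshold=30.0, max_gop=15):
    """Return a string of 'I'/'P', one character per frame.

    frames: list of equal-length sequences of pixel values (0-255).
    threshold: hypothetical mean absolute difference that justifies an I-picture.
    """
    types = []
    gop_len = 0                                  # frames in the current GOP so far
    for n, frame in enumerate(frames):
        if n == 0:
            start_gop = True                     # a stream has to start with an I-picture
        else:
            prev = frames[n - 1]
            diff = sum(abs(a - b) for a, b in zip(frame, prev)) / len(frame)
            start_gop = diff > threshold or gop_len >= max_gop
        if start_gop:
            types.append("I")
            gop_len = 1
        else:
            types.append("P")
            gop_len += 1
    return "".join(types)

if __name__ == "__main__":
    calm = [[10] * 4, [12] * 4]                  # two nearly identical frames
    cut = [[200] * 4, [201] * 4]                 # a very different pair
    print(pick_picture_types(calm + cut))
```

Note that nothing here knows what a "scene" is; it only compares neighbouring frames, so it can fire several times within one scene if the picture changes enough.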
Also, keep in mind that you'll only be able to set chapter marks or cut the resulting MPEG stream at points where a GOP starts, so using shorter GOPs will give you a bit more control later. If you are creating the "final" MPEG stream, though, it's best to use long GOPs and ensure that there are GOPs starting at the places where you'll want to create chapter marks (in TMPGEnc, you do this with the "force picture type setting" option). If you don't care if a chapter starts exactly on that frame, then just let the encoder handle GOP placement. At worst, you'll have to move a chapter mark 1/2 a second back or forward.
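If you already know where the GOPs start, moving a chapter mark to the nearest legal position is a simple lookup. A minimal sketch, assuming you have a sorted list of GOP start frames from some analysis tool (the list below is invented, and `snap_to_gop_start` is a hypothetical helper):

```python
from bisect import bisect_left

def snap_to_gop_start(frame, gop_starts):
    """Return the GOP start frame closest to the requested chapter frame.

    gop_starts must be sorted and non-empty (min() raises ValueError otherwise).
    """
    i = bisect_left(gop_starts, frame)           # first GOP start at or after 'frame'
    candidates = gop_starts[max(i - 1, 0):i + 1] # its neighbour on the left, plus itself
    return min(candidates, key=lambda start: abs(start - frame))

if __name__ == "__main__":
    starts = [0, 15, 30, 42, 57]                 # made-up GOP start frames
    for wanted in (0, 15, 40, 100):
        print(wanted, "->", snap_to_gop_start(wanted, starts))
```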
Now, that was simple, wasn't it?
RMN
~~~
Note: This message has been edited to clarify the restrictions imposed by closed GOPs.
Last edited by RMN on Tue 4 Mar 2008, 20:27; edited 5 times in total
Holy cow, that was the most thorough and easy to understand explanation I have ever read anywhere! This definitely gets added to my bookmarks to reference again and point others with the same question in the right direction.
I suppose that only took you about two hours to write, too.
On the subject of using all I-pictures only... why does the bitrate have to be very high to use exclusively I-pictures? A standard 8000 CBR wouldn't cut it? I'm thinking about those cases where I'm only mastering a 10 minute clip, for example, which would leave plenty of room on a DVD in a normal situation.
I use TMPGEnc for my encoding, which supposedly has a great scene change detection algorithm. I always default to 1 I 14 P 0 B. I take it if the "detect scene changes" algorithm is good, there should never be a need for B pictures? Unless the bitrate is a really low amount.
Posted: Fri 21 May 2004, 3:26 Post subject:
Presence wrote:
On the subject of using all I-pictures only... why does the bitrate have to be very high to use exclusively I-pictures? A standard 8000 CBR wouldn't cut it?
Not if you want good quality. At 8000 kb/s, in NTSC, each frame is limited to about 270 kb (that's kilobits, which corresponds to 33 kB, or kilobytes).
If you take a typical 720x480 "photographic" image and save it as a JPEG in Photoshop, adjusting the compression level so the resulting file is 33 kB, you'll see this corresponds to quality level "1" (which is scientifically known as "crap").
To get good quality at 720x480, you'll usually need about 100 kB per frame (depends on the actual image, of course, some things need more, some less, interlaced frames are harder to compress, etc.). Doing the same test as above, you'll see that 100 kB corresponds to Photoshop's JPEG quality level "8" (scientific term: pretty good).
100 kB is 800 kb. Multiply that by 29.97 (for NTSC) and you end up with 24 Mb/s. It's not a coincidence that DV (which uses a compression method very similar to I-picture MPEG) uses 25 Mb/s for the video.
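The same numbers in a few lines of code, in case you want to try other bitrates or PAL's 25 fps. The function names are made up for this sketch, and the 100 kB "good quality" figure is just the rule of thumb from this post.

```python
def frame_budget(bitrate_kbps, fps):
    """Return (kilobits, kilobytes) available per frame when every frame is an I-picture."""
    kbit = bitrate_kbps / fps
    return kbit, kbit / 8                        # 8 bits per byte

def bitrate_for_frame_size(kbytes_per_frame, fps):
    """Return the bitrate in kb/s needed to give every frame the stated size."""
    return kbytes_per_frame * 8 * fps

if __name__ == "__main__":
    kbit, kbyte = frame_budget(8000, 29.97)
    print(f"8000 kb/s NTSC, I-only: {kbit:.0f} kb = {kbyte:.1f} kB per frame")
    needed = bitrate_for_frame_size(100, 29.97)
    print(f"100 kB per frame needs about {needed / 1000:.1f} Mb/s")
```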
Presence wrote:
I take it if the "detect scene changes" algorithm is good, there should never be a need for B pictures? Unless the bitrate is a really low amount.
The use of B-pictures isn't really related to scene change detection. What scene change detection (or its lack) influences is how long your GOPs can (should) be. Without scene change detection, long GOPs will cause problems when the scene changes (because, all of a sudden, no data from the previous frame can be used in the new frame, and the algorithm panics and you get a lot of "dancing blocks", until the next GOP starts).
The decision to use B-pictures (or not) depends on your bitrate and on how much motion a given scene has. If your bitrate is low and there is little or no motion, B-pictures will improve the overall quality (because the bits they save can improve the other frames). If there's a lot of motion, B-pictures will increase the distance between P-pictures, which makes it harder for a P-picture to re-use parts of the previous one, which in turn worsens the quality of the whole movie.
So, basically, stick to I and P only as long as your bitrate is above 6500 kb/s. When going below that, add one B-picture. When going below 3500, add another. And if your clip doesn't have any fast motion, add another one (as long as the bitrate is under 8000 or so). So that's 3 B-pictures maximum (for a "calm" clip at less than 3.5 Mb/s).
It's not an exact science, but for most types of footage these values work well.
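Written out as code, that rule of thumb looks like this. It is only a restatement of the thresholds above (6500, 3500 and 8000 kb/s), not a formula from any standard, and `suggested_b_pictures` is a hypothetical name.

```python
def suggested_b_pictures(bitrate_kbps, fast_motion):
    """Return how many B-pictures to place between P-pictures (0 to 3)."""
    count = 0
    if bitrate_kbps < 6500:                      # below 6500 kb/s: add one
        count += 1
    if bitrate_kbps < 3500:                      # below 3500 kb/s: add another
        count += 1
    if not fast_motion and bitrate_kbps < 8000:  # calm footage: one more
        count += 1
    return count

if __name__ == "__main__":
    for rate, fast in ((9000, False), (7000, True), (7000, False), (5000, True), (3000, False)):
        print(rate, "fast" if fast else "calm", suggested_b_pictures(rate, fast))
```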
As to the number of P-pictures, keep it as high as possible if you have scene detection enabled on the encoder. If not, adjust it depending on how much motion your movie has (more motion => shorter GOPs).
Posted: Fri 22 Sep 2006, 21:55 Post subject:
compusic wrote:
If the B-pictures look into future P-pictures, then the encoder will encode those P-pictures first, and the B-pictures later?
Correct. If a GOP looks like this:
123456789
IBBBPBBBP
It will actually be encoded (and stored in the file) like this:
159234678
IPPBBBBBB
Or like this:
152349678
IPBBBPBBB
In other words, all the I- and P-pictures required to decompress a certain B-picture are stored (and compressed, and decompressed) before that B-picture. They are still played back in the correct sequence, but they have to be decoded out of sequence, otherwise the player wouldn't be able to reconstruct the B-pictures.
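The reordering can be sketched in a few lines. This version holds back each run of B-pictures until the next I- or P-picture has been emitted, which gives the second ordering shown above (152349678). It assumes the simple "nearest I/P on each side" rule, and `coded_order` is a made-up name.

```python
def coded_order(gop):
    """Turn a display-order GOP string into a list of (frame number, type) in stored order."""
    stored, waiting = [], []
    for n, t in enumerate(gop, start=1):
        if t == "B":
            waiting.append((n, t))               # needs a future I/P, so hold it back
        else:
            stored.append((n, t))                # I/P goes out first...
            stored.extend(waiting)               # ...then the Bs that were waiting for it
            waiting.clear()
    # Trailing Bs (open GOP) would really wait for the next GOP's I-picture;
    # this sketch just appends them.
    stored.extend(waiting)
    return stored

if __name__ == "__main__":
    order = coded_order("IBBBPBBBP")
    print("".join(str(n) for n, _ in order))
    print("".join(t for _, t in order))
```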
During playback, the decoder will typically do this:
- Decode frame 1 (I)
+ Display frame 1
- Decode frame 5 (P)
- Decode frame 2 (B, based on 1 and 5)
+ Display frame 2
- Decode frame 3 (B, based on 1 and 5)
+ Display frame 3
- Decode frame 4 (B, based on 1 and 5)
+ Display frame 4
+ Display frame 5
- Decode frame 9 (P)
- Decode frame 6 (B, based on 5 and 9)
+ Display frame 6
- Decode frame 7 (B, based on 5 and 9)
+ Display frame 7
- Decode frame 8 (B, based on 5 and 9)
+ Display frame 8
+ Display frame 9
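The list above can be generated mechanically for any closed GOP. A minimal sketch, assuming the GOP starts with an I-picture, ends on an I- or P-picture, and every B-picture uses the nearest I/P on each side (`playback_steps` is a hypothetical helper, not part of any decoder):

```python
def playback_steps(gop):
    """Return the decode/display steps for a closed GOP given in display order."""
    if not gop.startswith("I"):
        raise ValueError("this sketch expects the GOP to start with an I-picture")
    steps, waiting, prev = [], [], None
    for n, t in enumerate(gop, start=1):
        if t == "B":
            waiting.append(n)                    # cannot decode until the next I/P is in
            continue
        steps.append(f"- Decode frame {n} ({t})")
        for b in waiting:                        # both references are now available
            steps.append(f"- Decode frame {b} (B, based on {prev} and {n})")
            steps.append(f"+ Display frame {b}")
        waiting.clear()
        steps.append(f"+ Display frame {n}")
        prev = n
    if waiting:
        raise ValueError("trailing B-pictures need the next GOP's I-picture")
    return steps

if __name__ == "__main__":
    print("\n".join(playback_steps("IBBBPBBBP")))
```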
Technically, the encoder could base frame 7 (for example) on frame 1 as well, and it could base frame 2 (for example) on frame 9, but I'm not sure this is allowed by MPEG-2 (I'm pretty sure it isn't allowed by MPEG-1). MPEG-4 does allow it. So, for MPEG-1 (and probably MPEG-2), the encoder will only look at the nearest I- or P-pictures. The decoder doesn't really need to worry about that; it's up to the encoder to ensure that all the data necessary to decode frame X is present in the file before the description of frame X itself.
compusic wrote:
"no frames can be based on B-pictures."
For a GOP such as I P B P: even if the 4th frame (the P-picture) is very similar to the B-picture, it still has to look at the 2nd frame (the P-picture)?
Correct. Since frame 3 (B) can be based on frame 4 (P), naturally frame 4 cannot be based on frame 3 (otherwise you'd have a cyclic dependency; a sort of "infinite loop").
RMN
~~~
Last edited by RMN on Tue 4 Mar 2008, 20:33; edited 1 time in total
Posted: Sun 24 Sep 2006, 1:30 Post subject:
P-pictures can only be based on preceding frames in the video, they cannot "look into the future" for similarities. That is the fundamental difference between P- and B-pictures.
B-pictures are (or can be; it's up to the encoder) based on multiple different frames (usually just two, but I think it's "legal" to base them on macroblocks from any frame in the GOP - don't quote me on that, though; I'm not sure about all the restrictions that apply to DVD-compliant MPEG-2). In any case, the encoder will always look at (at least) two frames when encoding a B-picture.
The "B" in "B-picture" stands for "bidirectional" (or "bidirectionally predicted").
RMN
~~~
Last edited by RMN on Tue 4 Mar 2008, 20:35; edited 1 time in total
Posted: Wed 21 Feb 2007, 3:19 Post subject:
According to a couple of papers I read recently, in MPEG-2 the encoder is indeed restricted to basing B-pictures on the two nearest non-B-pictures. In newer standards, like MPEG-4, B-pictures can be based on as few or as many frames as the encoder decides (and can even be based on other B-pictures - as long as they are encoded before the current one, naturally).
I have updated a couple of the messages above to make this clearer.