THE TRICKS TO MPEG'S SUCCESS
OxfordEnglish4IT
The most common system for compressing video is MPEG. It works like this. The single data stream off the CD-ROM is split into video and audio components, which are then decompressed using separate algorithms. The video is processed to produce individual frames as follows. Imagine a sequence of frames depicting a bouncing ball on a plain background. The very first frame is called an Intra Frame (I-frame). I-frames are compressed using only the information in the picture itself, just as in conventional bitmap compression techniques such as JPEG.
An I-frame is followed by one or more predicted frames (P-frames). The only data stored for a P-frame is the difference between it and the I-frame it is based on. For example, in the case of a bouncing ball, the P-frame is stored simply as a description of how the position of the ball has changed from the previous I-frame. This takes up a fraction of the space that would be used if the P-frame were stored as a picture in its own right. Shape or colour changes are also stored in the P-frame. The next P-frame may in turn be based on this P-frame, and so on. Storing only the differences between frames gives a massive reduction in the amount of information needed to reproduce the sequence. Only a few P-frames are allowed before a new I-frame is introduced into the sequence as a new reference point, since a small margin of error creeps in with each P-frame.
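The idea of storing only what has changed can be sketched in a few lines of Python. This is a simplified illustration, not the actual MPEG algorithm: frames are small grids of brightness values, the "ball" is a single pixel, and the names make_frame, encode_p and decode_p are made up for this example.

```python
def make_frame(width, height, ball_x, ball_y):
    """Plain background (0) with a one-pixel 'ball' (255)."""
    frame = [[0] * width for _ in range(height)]
    frame[ball_y][ball_x] = 255
    return frame


def encode_p(reference, current):
    """Keep only the pixels that differ from the reference frame."""
    return {
        (row, col): current[row][col]
        for row in range(len(current))
        for col in range(len(current[0]))
        if current[row][col] != reference[row][col]
    }


def decode_p(reference, changes):
    """Rebuild the frame by applying the stored changes to the reference."""
    frame = [row[:] for row in reference]
    for (row, col), value in changes.items():
        frame[row][col] = value
    return frame


i_frame = make_frame(8, 8, ball_x=1, ball_y=1)     # stored in full
next_frame = make_frame(8, 8, ball_x=2, ball_y=3)  # the ball has moved
p_data = encode_p(i_frame, next_frame)

assert decode_p(i_frame, p_data) == next_frame
print(len(p_data), "changed pixels out of", 8 * 8)  # 2 changed pixels out of 64
```

Only two pixels need to be stored for the second frame: the one the ball left and the one it moved to.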
Between I and P-frames are bi-directional frames (B-frames), based on the nearest I or P-frames both before and after them. In our bouncing ball example, the picture in a B-frame is stored as the difference between the previous I or P-frame and the B-frame, and as the difference between the B-frame and the following I or P-frame. To recreate the B-frame when playing back the sequence, the MPEG algorithm uses a combination of the two references. There may be a number of B-frames between I or P-frames. No other frame is ever based on a B-frame, so B-frames don't propagate errors as P-frames do.
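One simple way to combine two references is to average them and store only what is left over. The sketch below assumes exactly that; it is an illustration rather than the full MPEG method, and the names predict_b, encode_b and decode_b are made up for this example. The frames here show a plain area that gradually gets brighter.

```python
def predict_b(prev_ref, next_ref):
    """Predict the in-between frame as the average of the two references."""
    return [
        [(a + b) // 2 for a, b in zip(prev_row, next_row)]
        for prev_row, next_row in zip(prev_ref, next_ref)
    ]


def encode_b(prev_ref, next_ref, current):
    """Store only the difference between the real frame and the prediction."""
    prediction = predict_b(prev_ref, next_ref)
    return [
        [c - p for c, p in zip(cur_row, pred_row)]
        for cur_row, pred_row in zip(current, prediction)
    ]


def decode_b(prev_ref, next_ref, residual):
    """Rebuild the frame from both references plus the stored difference."""
    prediction = predict_b(prev_ref, next_ref)
    return [
        [p + d for p, d in zip(pred_row, diff_row)]
        for pred_row, diff_row in zip(prediction, residual)
    ]


prev_ref = [[10, 10], [10, 10]]   # earlier I or P-frame
next_ref = [[30, 30], [30, 30]]   # later I or P-frame
b_frame = [[20, 20], [20, 21]]    # the frame in between

residual = encode_b(prev_ref, next_ref, b_frame)
assert decode_b(prev_ref, next_ref, residual) == b_frame
print(residual)  # [[0, 0], [0, 1]]
```

Because the prediction is already close to the real frame, almost all of the stored values are zero.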
Typically, you will have two or three B-frames between I or P-frames, and perhaps three to five P-frames between I-frames.
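The pattern described above can be written out as a short Python sketch. It builds one group of frames in the order they are shown and lists which frames each one is based on. The function names gop_pattern and references are made up for this example, and the choice of three P-frames and two B-frames is just one of the typical combinations mentioned.

```python
def gop_pattern(num_p=3, num_b=2):
    """One I-frame, then num_p P-frames, with num_b B-frames in each gap."""
    return "I" + ("B" * num_b + "P") * num_p + "B" * num_b


def references(pattern):
    """Map each frame's position to the positions of the frames it is based on."""
    anchors = [i for i, kind in enumerate(pattern) if kind in "IP"]
    anchors.append(len(pattern))  # the I-frame that opens the next group
    refs = {}
    for i, kind in enumerate(pattern):
        if kind == "I":
            refs[i] = ()  # compressed on its own
        elif kind == "P":
            refs[i] = (max(a for a in anchors if a < i),)
        else:  # B-frame: nearest I or P-frame before and after
            refs[i] = (
                max(a for a in anchors if a < i),
                min(a for a in anchors if a > i),
            )
    return refs


pattern = gop_pattern(num_p=3, num_b=2)
print(pattern)  # IBBPBBPBBPBB

refs = references(pattern)
print(refs[3])   # (0,)    the first P-frame is based on the I-frame
print(refs[1])   # (0, 3)  a B-frame uses the I-frame before and the P-frame after
```

Note that no position holding a B-frame ever appears in the reference lists, which is why B-frames cannot pass errors on to other frames.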