RFC 2429 (rfc2429) - Page 2 of 17


RTP Payload Format for the 1998 Version of ITU-T Rec



Alternative Format: Original Text Document



RFC 2429                         H.263+                     October 1998


   The 1998 version of ITU-T Recommendation H.263 added numerous coding
   options to improve codec performance over the 1996 version.  The 1998
   version is referred to as H.263+ in this document.  Among the new
   options, the ones with the biggest impact on the RTP payload
   specification and the error resilience of the video content are the
   slice structured mode, the independent segment decoding mode, the
   reference picture selection mode, and the scalability mode.  This
   section summarizes the impact of these new coding options on
   packetization.  Refer to [4] for more information on coding options.

   The slice structured mode was added to H.263+ for three purposes: to
   provide enhanced error resilience capability, to make the bitstream
   more amenable to use with an underlying packet transport such as RTP,
   and to minimize video delay.  The slice structured mode supports
   fragmentation at macroblock boundaries.

   With the independent segment decoding (ISD) option, a video picture
   frame is broken into segments and encoded in such a way that each
   segment is independently decodable.  Utilizing ISD in a lossy network
   environment helps to prevent the propagation of errors from one
   segment of the picture to others.

   The reference picture selection mode allows the use of an older
   reference picture rather than the one immediately preceding the
   current picture.  Usually, the last transmitted frame is implicitly
   used as the reference picture for inter-frame prediction.  If the
   reference picture selection mode is used, the data stream carries
   information on what reference frame should be used, indicated by the
   temporal reference as an ID for that reference frame.  The reference
   picture selection mode can be used with or without a back channel,
   which provides information to the encoder about the internal status
   of the decoder.  However, no special provision is made herein for
   carrying back channel information.

   H.263+ also includes bitstream scalability as an optional coding
   mode.  Three kinds of scalability are defined: temporal, signal-to-
   noise ratio (SNR), and spatial scalability.  Temporal scalability is
   achieved via the disposable nature of bi-directionally predicted
   frames, or B-frames. (A low-delay form of temporal scalability known
   as P-picture temporal scalability can also be achieved by using the
   reference picture selection mode described in the previous
   paragraph.)  SNR scalability permits refinement of encoded video
   frames, thereby improving the quality (or SNR).  Spatial scalability
   is similar to SNR scalability except the refinement layer is twice
   the size of the base layer in the horizontal dimension, vertical
   dimension, or both.





Bormann, et. al.            Standards Track