RTP Payload Format for JPEG-compressed Video
RFC 2035           RTP Payload Format for JPEG Video        October 1996

in one or more passes. Each pass (called a frame in the JPEG standard) is further broken down into one or more scans. Within each scan, there are one to four components, which represent the three components of a color signal (e.g., "red, green, and blue", or a luminance signal and two chrominance signals). These components can be encoded as separate scans or interleaved into a single scan.

Each frame and scan is preceded by a header containing optional definitions for compression parameters like quantization tables and Huffman coding tables. The headers and optional parameters are identified with "markers" and comprise a marker segment; each scan appears as an entropy-coded bit stream within two marker segments. Markers are aligned to byte boundaries and (in general) cannot appear in the entropy-coded segment, allowing scan boundaries to be determined without parsing the bit stream.

Compressed data is represented in one of three formats: the interchange format, the abbreviated format, or the table-specification format. The interchange format contains definitions for all the tables used by the entropy-coded segments, while the abbreviated format might omit some, assuming they were defined out-of-band or by a "previous" image.

The JPEG standard does not define the meaning or format of the components that comprise the image. Attributes like the color space and pixel aspect ratio must be specified out-of-band with respect to the JPEG bit stream. The JPEG File Interchange Format (JFIF) [4] is a de facto standard that provides this extra information using an application marker segment (APP0). Note that a JFIF file is simply a JPEG interchange format image along with the APP0 segment. In the case of video, additional parameters must be defined out-of-band (e.g., frame rate, interlaced vs. non-interlaced, etc.).
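The byte-aligned marker property described above can be illustrated with a minimal sketch that locates marker segments and scan boundaries by inspecting byte values only, without decoding any entropy-coded data. The function name `find_markers` and the `sample` stream are hypothetical and are not part of this memo. The marker codes, the zero-byte stuffing after a literal 0xFF, and the restart markers (the "in general" exception) are taken from the JPEG standard [1] rather than from the text above.

```python
# Marker codes from the JPEG standard; the constant names are illustrative.
TEM, SOI, EOI, SOS, APP0 = 0x01, 0xD8, 0xD9, 0xDA, 0xE0
RST_FIRST, RST_LAST = 0xD0, 0xD7


def find_markers(data: bytes):
    """Return (offset, marker_code) pairs for each marker in a JPEG stream.

    Only byte values are examined; entropy-coded bits are never decoded.
    """
    markers = []
    i, n = 0, len(data)
    while i + 1 < n:
        if data[i] != 0xFF:
            i += 1
            continue
        code = data[i + 1]
        if code == 0x00:
            # Stuffed zero: a literal 0xFF inside entropy-coded data.
            i += 2
        elif code == 0xFF:
            # Fill byte before a marker; re-examine the next 0xFF.
            i += 1
        elif RST_FIRST <= code <= RST_LAST or code in (SOI, EOI, TEM):
            # Standalone markers carry no length field. Restart markers
            # may appear inside an entropy-coded segment.
            markers.append((i, code))
            i += 2
        else:
            # Marker segment: a 2-byte big-endian length follows the
            # marker; it counts itself but not the marker bytes.
            markers.append((i, code))
            if i + 3 >= n:
                break  # truncated segment header
            length = (data[i + 2] << 8) | data[i + 3]
            i += 2 + length
    return markers


# Hypothetical, hand-built stream (not a decodable image): SOI, a JFIF
# APP0 segment, one scan header, a few entropy-coded bytes, then EOI.
sample = (
    bytes([0xFF, SOI])
    + bytes([0xFF, APP0, 0x00, 0x10]) + b"JFIF\x00"
    + bytes([1, 1, 0, 0, 1, 0, 1, 0, 0])
    + bytes([0xFF, SOS, 0x00, 0x08, 1, 1, 0, 0, 63, 0])
    + bytes([0x12, 0xFF, 0x00, 0x34, 0xFF, 0xD0, 0x56])
    + bytes([0xFF, EOI])
)

for offset, code in find_markers(sample):
    print(f"offset {offset:3d}: marker 0xFF{code:02X}")
```

In this sketch the entropy-coded data of a scan is the run of bytes between the end of an SOS segment and the next marker that is not a restart marker, so a scan boundary falls out of the marker list directly.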
While the JPEG standard provides a rich set of algorithms for flexible compression, cost-effective hardware implementations of the full standard have not appeared. Instead, most hardware JPEG video codecs implement only a subset of the sequential DCT mode of operation. Typically, marker segments are interpreted in software (which "re-programs" the hardware) and the hardware is presented with a single, interleaved entropy-coded scan represented in the YUV color space.

2.  JPEG Over RTP

To maximize interoperability among hardware-based codecs, we assume the sequential DCT operating mode [1, Annex F] and restrict the set of predefined RTP/JPEG "type codes" (defined below) to single-scan, interleaved images. While this is more restrictive than even

Berc, et. al.               Standards Track



