RTP Payload Format for MPEG-4 Audio/Visual Streams

Organization: China Interactive Publishing Network (http://www.china-pub.com/)

RFC Document Chinese Translation Program (http://www.china-pub.com/compters/emook/aboutemook.htm)

E-mail: Ouyang@china-pub.com

Translator: Li Chao (LICC_LI, LICC_LI @ sina.com)

Translation time: 2001-4-26

Copyright: The copyright of this Chinese translation belongs to China Interactive Publishing Network. It may be freely reprinted for non-commercial use, provided that the translation and copyright information of this document is retained.

Network Working Group                                         Y. Kikuchi
Request for Comments: 3016                                       Toshiba
Category: Standards Track                                      T. Nomura
                                                                     NEC
                                                             S. Fukunaga
                                                                     Oki
                                                               Y. Matsui
                                                              Matsushita
                                                               H. Kimata
                                                                     NTT
                                                           November 2000

RTP Payload Format for MPEG-4 Audio/Visual Streams

(RFC 3016: RTP Payload Format for MPEG-4 Audio/Visual Streams)

Status of this Memo

This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.

Copyright Notice

Copyright (c) The Internet Society (2000). All Rights Reserved.

Abstract

This document describes RTP payload formats for carrying MPEG-4 Audio and MPEG-4 Visual bitstreams without using MPEG-4 Systems. To map MPEG-4 Audio/Visual bitstreams directly onto RTP packets, it specifies the use of the RTP header fields and the fragmentation rules. It also specifies the MIME type registrations and the use of the Session Description Protocol (SDP).

Table of Contents

Status of this Memo
Copyright Notice
Abstract
1. Introduction
1.1 MPEG-4 Visual RTP payload format
1.2 MPEG-4 Audio RTP payload format
2. Conventions used in this document
3. RTP Packetization of MPEG-4 Visual bitstream
3.1 Use of RTP header fields for MPEG-4 Visual
3.2 Fragmentation of MPEG-4 Visual bitstream
3.3 Examples of packetized MPEG-4 Visual bitstream
4. RTP Packetization of MPEG-4 Audio bitstream
4.1 RTP Packet Format
4.2 Use of RTP Header Fields for MPEG-4 Audio
4.3 Fragmentation of MPEG-4 Audio bitstream
5. MIME type registration for MPEG-4 Audio/Visual streams
5.1 MIME type registration for MPEG-4 Visual
5.2 SDP usage of MPEG-4 Visual
5.3 MIME type registration of MPEG-4 Audio
5.4 SDP usage of MPEG-4 Audio
6. Security Considerations
7. References
8. Authors' Addresses
9. Copyright Notice
Acknowledgments

1. Introduction

The RTP payload formats described in this document specify how MPEG-4 Audio [3][5] and MPEG-4 Visual [2][4] streams are fragmented and mapped directly onto RTP packets.

These RTP payload formats enable applications to transport MPEG-4 Audio/Visual streams without using the synchronization and stream management functionality of MPEG-4 Systems. They are intended for systems that have their own stream management functionality and therefore need none from MPEG-4 Systems. H.323 terminals are one example: MPEG-4 Audio/Visual streams are managed not by MPEG-4 Systems Object Descriptors but by H.245, and the streams are mapped directly onto RTP packets without using the MPEG-4 Systems Sync Layer. Other examples are SIP and RTSP, which use MIME and SDP. The MIME types and SDP usage defined in this document directly specify the attributes of Audio/Visual streams (such as media type, packetization format, and codec configuration) without using MPEG-4 Systems.

The obvious benefit is that these MPEG-4 Audio/Visual RTP payload formats can be handled in a unified way together with the formats defined for non-MPEG-4 codecs. The disadvantage is that interoperability with environments based on MPEG-4 Systems may be difficult; other payload formats may be better suited to those applications.

In such cases the semantics of the RTP headers need to be clearly defined, including their association with MPEG-4 Audio/Visual data elements. In addition, it is beneficial to define fragmentation rules of RTP packets for MPEG-4 Video streams, so as to enhance error resiliency by using the error resilience tools provided inside the MPEG-4 Video stream.

1.1 MPEG-4 Visual RTP payload format

MPEG-4 Visual is a visual coding standard with many new features: high coding efficiency; high error resiliency; multiple, arbitrary-shape object-based coding; etc. [2]. It covers a wide range of bitrates, from tens of kbps to several Mbps. It also covers a wide variety of networks, ranging from those guaranteed to be almost error-free to mobile networks with high error rates.

Because MPEG-4 Visual is used on such a wide variety of networks, the fragmentation rules defined in this document should not be too restrictive. A rule such as "a single video packet shall always be mapped onto a single RTP packet" may be inappropriate. On the other hand, careless, media-unaware fragmentation may degrade error resiliency and bandwidth efficiency. The fragmentation rules described in this document are flexible, but they define the minimum set of rules needed to prevent meaningless fragmentation while making use of the error resilience functionality of MPEG-4 Visual.

The fragmentation rules recommend not mapping more than one VOP into an RTP packet, so that the RTP timestamp uniquely indicates the VOP time framing. On the other hand, MPEG-4 video may generate VOPs of very small size, such as an empty VOP (vop_coded=0) containing only a VOP header, or an arbitrary-shaped VOP with a small number of coding blocks. To reduce the overhead in such cases, the fragmentation rules permit concatenating multiple VOPs in one RTP packet. (See fragmentation rule (4) in Section 3.2 and the marker bit and timestamp in Section 3.1.)

For video coding tools such as H.261 or MPEG-1/2, an additional media-specific RTP header is defined, and it is effective in helping to recover picture headers corrupted by packet loss. MPEG-4 Visual, however, already has error resilience functionality for recovering corrupt headers, and this can be used on RTP/IP networks as well as on other networks (H.223/mobile, MPEG-2/TS, etc.). Therefore, no extra RTP header fields are defined in this MPEG-4 Visual RTP payload format.

1.2 MPEG-4 Audio RTP payload format

MPEG-4 Audio is a new kind of audio standard that integrates many different types of audio coding tools. The Low-overhead MPEG-4 Audio Transport Multiplex (LATM) manages sequences of audio data with relatively small overhead. In audio-only applications it is therefore desirable to map LATM-based MPEG-4 Audio bitstreams directly onto RTP packets without using MPEG-4 Systems.

LATM has the following multiplexing features:

- Carrying configuration information with audio data,
- Concatenation of multiple audio frames in one audio stream,
- Multiplexing multiple objects (programs),
- Multiplexing scalable layers.

The last two features are not needed in RTP transmission. Therefore, these two features MUST NOT be used in applications based on the RTP packetization specified by this document. Because LATM was developed only for natural audio coding tools, not for synthesis tools, it is difficult to transmit Structured Audio (SA) data and Text-to-Speech Interface (TTSI) data by LATM. Therefore, SA data and TTSI data MUST NOT be transported by the RTP packetization in this document.

For transmission of scalable streams, the audio data of each layer SHOULD be packetized into different RTP packets, so that the layers can be treated differently at the IP level, for example by some means of differentiated service. On the other hand, all configuration data of the scalable streams is contained in one LATM configuration data element, "StreamMuxConfig", and every scalable layer shares that StreamMuxConfig. The mapping between each layer and its configuration data is achieved by the LATM header information attached to the audio data. To indicate the dependency information of the scalable streams, a restriction is applied to the dynamic assignment rule of payload type (PT) values (see Section 4.2).

For MPEG-4 Audio coding tools, as for other audio coders, if the payload is a single audio frame, packet loss does not impair the decodability of adjacent packets. Therefore, MPEG-4 Audio does not require an additional media-specific header for error recovery. Existing RTP protection mechanisms, such as Generic Forward Error Correction (RFC 2733) and Redundant Audio Data (RFC 2198), MAY be applied to improve error resiliency.

2. Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

3. RTP Packetization of MPEG-4 Visual bitstream

This section specifies RTP packetization rules for MPEG-4 Visual content. An MPEG-4 Visual bitstream is mapped directly onto RTP packets without adding extra header fields or removing any Visual syntax elements. The Combined Configuration/Elementary stream mode MUST be used, so that the configuration information is carried to the same RTP port as the elementary stream. (See 6.2.1 "Start codes" of ISO/IEC 14496-2 [2][9][4].)

The configuration information MAY additionally be specified by some out-of-band means. For an H.323 terminal, the H.245 codepoint "decoderConfigurationInformation" MUST be used for this purpose. For systems that use MIME content types and SDP parameters, such as SIP and RTSP, the optional parameter "config" MUST be used to specify the configuration information (see 5.1 and 5.2).

When the short video header mode is used, the RTP payload format for H.263 SHOULD be used (the format defined in RFC 2429 is RECOMMENDED, but the RFC 2190 format MAY be used for compatibility with older implementations).

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X|  CC   |M|     PT      |       sequence number         | RTP
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           timestamp                           | Header
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           synchronization source (SSRC) identifier            |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|            contributing source (CSRC) identifiers             |
|                             ....                              |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                                                               | RTP
|       MPEG-4 Visual stream (byte aligned)                     | Pay-
|                                                               | load
|                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               :...OPTIONAL RTP padding        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 1 - An RTP packet for MPEG-4 Visual stream

3.1 Use of RTP header fields for MPEG-4 Visual

Payload Type (PT): The assignment of an RTP payload type for this new packet format is outside the scope of this document and is not specified here. It is expected that the RTP profile for a particular class of applications will assign a payload type for this encoding; if that is not done, a payload type in the dynamic range SHALL be chosen by means of an out-of-band signaling protocol (e.g., H.245, SIP).

Extension (X) bit: Defined by the RTP profile used.

Sequence Number: Incremented by one for each RTP data packet sent, starting, for security reasons, with a random initial value.

Marker (M) bit: The marker bit is set to 1 to indicate the last RTP packet (or the only RTP packet) of a VOP. When multiple VOPs are carried in the same RTP packet, the marker bit is also set to 1.

Timestamp: The timestamp indicates the sampling instant of the VOP contained in the RTP packet. A constant random offset is added for security reasons.

- When multiple VOPs are carried in the same RTP packet, the timestamp indicates the earliest of the VOP times among the VOPs carried in the packet. The timestamps of the remaining VOPs are derived from the timestamp fields in the VOP header (modulo_time_base and vop_time_increment).
- If the RTP packet contains only configuration information and/or Group_of_VideoObjectPlane() fields, the timestamp of the next VOP in coding order is used.
- If the RTP packet contains only visual_object_sequence_end_code information, the timestamp of the immediately preceding VOP in coding order is used.

The resolution of the timestamp is set to its default value of 90 kHz, unless specified otherwise by an out-of-band means.

Other header fields are used as described in RFC 1889 [8].
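The header rules above can be illustrated with a minimal Python sketch. It assumes a fixed 12-byte RTP header (no CSRC list, no padding, no extension) and the default 90 kHz clock; the function names `build_rtp_header` and `vop_timestamp` are hypothetical and chosen only for this illustration.

```python
import struct

def build_rtp_header(payload_type, seq, timestamp, ssrc, marker):
    """Pack a minimal 12-byte RTP header: V=2, P=0, X=0, CC=0."""
    byte0 = 2 << 6                                    # version 2, no padding/extension/CSRC
    byte1 = (int(bool(marker)) << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1,
                       seq & 0xFFFF,                  # 16-bit sequence number
                       timestamp & 0xFFFFFFFF,        # 32-bit timestamp
                       ssrc & 0xFFFFFFFF)

def vop_timestamp(vop_time_seconds, random_offset, clock_rate=90000):
    """RTP timestamp of a VOP: sampling instant in clock ticks plus a constant random offset."""
    return (round(vop_time_seconds * clock_rate) + random_offset) & 0xFFFFFFFF

# Last (or only) packet of a VOP sampled at t = 1.0 s, dynamic payload type 98.
header = build_rtp_header(98, seq=1, timestamp=vop_timestamp(1.0, 1000),
                          ssrc=0x11223344, marker=True)
```

In a real sender the initial sequence number and the timestamp offset would both be chosen at random, as the field descriptions require; fixed values are used here only to keep the example reproducible.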

3.2 Fragmentation of MPEG-4 Visual bitstream

A fragmented MPEG-4 Visual bitstream is mapped directly onto the RTP payload without adding extra header fields or removing any Visual syntax elements. The Combined Configuration/Elementary streams mode is used. The following rules apply to the fragmentation.

In the following, "header" means one of:

- Configuration information (Visual Object Sequence Header, Visual Object Header and Video Object Layer Header)
- visual_object_sequence_end_code
- The header of the entry point function for an elementary stream (Group_of_VideoObjectPlane(), VideoObjectPlane(), video_plane_with_short_header(), MeshObject() or FaceObject())
- The video packet header (video_packet_header(), excluding next_resync_marker())
- The header of gob_layer()

See 6.2.1 "Start codes" of ISO/IEC 14496-2 [2][9][4] for the definition of the configuration information and the entry point functions.

(1) Configuration information and Group_of_VideoObjectPlane() fields SHALL be placed at the beginning of the RTP payload (just after the RTP header) or just after the header of the syntactically upper layer function.

(2) If one or more headers exist in the RTP payload, the RTP payload SHALL begin with the header of the syntactically highest function.

Note: The visual_object_sequence_end_code is regarded as the lowest function.

(3) A header SHALL NOT be split across multiple RTP packets.

(4) Different VOPs SHOULD be fragmented into different RTP packets, so that one RTP packet consists of the data bytes associated with a unique VOP time instance (indicated in the timestamp field of the RTP header). As an exception, multiple consecutive VOPs MAY be carried within one RTP packet in decoding order if the VOPs are small.

Note: When multiple VOPs are carried in one RTP payload, the timestamps of the VOPs after the first one may be calculated by the decoder. This operation is necessary only for RTP packets in which the marker bit equals 1 and the beginning of the RTP payload corresponds to a start code. (See timestamp and marker bit in Section 3.1.)

(5) It is RECOMMENDED that a single video packet be sent as a single RTP packet. The size of a video packet SHOULD be adjusted so that the resulting RTP packet is not larger than the path MTU.

Note: Rule (5) does not apply when video packets are disabled by the coder configuration (by setting resync_marker_disable in the VOL header to 1), or in coding tools where video packets are not supported. In this case, a VOP MAY be split at arbitrary byte positions.

A video packet starts with the VOP header or the video packet header, followed by motion_shape_texture(), and ends with next_resync_marker() or next_start_code().
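As a rough illustration of rules (2), (3) and (5), the following Python sketch groups the video packets of one VOP into RTP payloads. It is a simplified model, not a complete packetizer: it assumes the encoder has already delivered each video packet as a separate byte string (the first one starting with the VOP header), and the name `packetize_vop` and the `max_payload` parameter are hypothetical. Because it only ever concatenates whole video packets, every payload begins with a header and no header is split.

```python
def packetize_vop(video_packets, max_payload):
    """Group the video packets of ONE VOP into RTP payloads.

    Returns a list of (payload, marker) pairs. The marker is True only
    for the last payload of the VOP, as described for the M bit.
    """
    payloads = []
    current = b""
    for vp in video_packets:
        if len(vp) > max_payload:
            # Rule (5): the encoder should size video packets to fit the path MTU.
            raise ValueError("video packet exceeds max_payload")
        if current and len(current) + len(vp) > max_payload:
            payloads.append(current)      # close the payload at a video packet boundary
            current = b""
        current += vp
    if current:
        payloads.append(current)
    return [(p, i == len(payloads) - 1) for i, p in enumerate(payloads)]

# Three video packets of one VOP with a 1000-byte payload budget.
result = packetize_vop([b"a" * 600, b"b" * 600, b"c" * 300], max_payload=1000)
```

Setting `max_payload` close to the size of a single video packet gives one video packet per RTP packet, which is the recommended mapping; a larger budget concatenates several video packets per RTP packet and trades loss resilience for lower header overhead.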

3.3 Examples of packetized MPEG-4 Visual bitstream

Figure 2 shows examples of RTP packets generated according to the criteria in Section 3.2.

(a) is an example of the first RTP packet, or a random access point, of an MPEG-4 Visual bitstream that contains configuration information. According to criterion (1), the Visual Object Sequence Header (VS header) is placed at the beginning of the RTP payload, preceding the Visual Object Header and the Video Object Layer Header (VO header, VOL header). Because the fragmentation rules in Section 3.2 guarantee that configuration information, which starts with visual_object_sequence_start_code, is always placed at the beginning of the RTP payload, an RTP receiver can detect a random access point by checking whether the first 32-bit field of the RTP payload is visual_object_sequence_start_code.

(b) is another example of an RTP packet containing configuration information. It differs from example (a) in that the RTP packet also contains a video packet of the VOP that follows the configuration information. Because the configuration information is relatively short (typically tens of bytes), an RTP packet containing only the configuration information would increase the overhead. The configuration information and the subsequent GOV and/or (part of a) VOP can therefore be packetized into a single RTP packet, as in this example.

(c) is an example of an RTP packet that contains a Group_of_VideoObjectPlane (GOV). Following criterion (1), the GOV is placed at the beginning of the RTP payload. An RTP packet containing only a GOV, whose length is 7 bytes, would waste RTP/IP header overhead. Therefore, (part of) the following VOP can be placed in the same RTP packet, as shown in this example.

(d) is an example in which one video packet is packetized into one RTP packet. This kind of packetization is recommended when the packet loss rate of the underlying network is high. Even when the RTP packet containing the VOP header is lost, the other RTP packets can be decoded by using the HEC (Header Extension Code) information in the video packet header. No extra RTP header field is necessary.

(e) is an example in which more than one video packet is packetized into one RTP packet. This kind of packetization saves RTP/IP header overhead when the bit rate of the underlying network is low. However, it decreases packet loss resiliency, because the loss of a single RTP packet discards multiple video packets. The optimal number of video packets in an RTP packet and the length of the RTP packet can be determined from the packet loss rate and the bit rate of the underlying network.

(f) is an example in which video packets are disabled by setting resync_marker_disable in the VOL header to 1. In this case, a VOP may be split into multiple RTP packets at arbitrary byte positions; for example, a VOP can be split into fixed-length packets. This kind of coder configuration and RTP fragmentation may be used when the underlying network is guaranteed to be almost error-free. It is not recommended in error-prone environments, because it provides only poor packet loss resiliency.
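The random access point check described in example (a) can be sketched in a few lines of Python. The start code value 0x000001B0 is an assumption taken from ISO/IEC 14496-2 rather than from this document, although it matches the first four octets of the "config" example in Section 5.2; the function name `is_random_access_point` is hypothetical.

```python
# visual_object_sequence_start_code, assumed to be 0x000001B0 per ISO/IEC 14496-2.
VISUAL_OBJECT_SEQUENCE_START_CODE = bytes.fromhex("000001B0")

def is_random_access_point(rtp_payload):
    """True if the first 32 bits of the RTP payload are visual_object_sequence_start_code."""
    return rtp_payload[:4] == VISUAL_OBJECT_SEQUENCE_START_CODE

# Payload starting with configuration information (beginning of the Section 5.2 config example).
config_payload = bytes.fromhex("000001B001000001B509")
print(is_random_access_point(config_payload))  # prints True
```

The check relies on the payload being passed without the RTP header; a receiver would strip the fixed header and any CSRC entries first.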

Figure 3 shows examples of RTP packets that are prohibited by the criteria in Section 3.2.

Fragmenting a header into multiple RTP packets, as in (a), not only increases the overhead of RTP/IP headers but also decreases error resiliency. It is therefore prohibited by criterion (3).

When more than one video packet is concatenated into an RTP packet, a VOP header or video_packet_header() must not be placed in the middle of the RTP payload. The packetization in (b) is not allowed by criterion (2), for reasons of error resiliency. Compare this example with Figure 2(d): in both cases two video packets are mapped onto two RTP packets, but the packet loss resiliency is not the same. If the second RTP packet is lost, both video packets 1 and 2 are lost in the case of Figure 3(b), whereas only video packet 2 is lost in the case of Figure 2(d).

    +------+------+------+------+
(a) | RTP  |  VS  |  VO  | VOL  |
    |header|header|header|header|
    +------+------+------+------+

    +------+------+------+------+------------+
(b) | RTP  |  VS  |  VO  | VOL  |Video Packet|
    |header|header|header|header|            |
    +------+------+------+------+------------+

    +------+-----+------------------+
(c) | RTP  | GOV |Video Object Plane|
    |header|     |                  |
    +------+-----+------------------+

    +------+------+------------+  +------+------+------------+
(d) | RTP  | VOP  |Video Packet|  | RTP  |  VP  |Video Packet|
    |header|header|    (1)     |  |header|header|    (2)     |
    +------+------+------------+  +------+------+------------+

    +------+------+------------+------+------------+------+------------+
(e) | RTP  |  VP  |Video Packet|  VP  |Video Packet|  VP  |Video Packet|
    |header|header|    (1)     |header|    (2)     |header|    (3)     |
    +------+------+------------+------+------------+------+------------+

    +------+------+------------+  +------+------------+
(f) | RTP  | VOP  |VOP fragment|  | RTP  |VOP fragment|
    |header|header|    (1)     |  |header|    (2)     | ___
    +------+------+------------+  +------+------------+

Figure 2 - Examples of RTP packetized MPEG-4 Visual bitstream

    +------+-------------+  +------+------------+------------+
(a) | RTP  |First half of|  | RTP  |Last half of|Video Packet|
    |header|  VP header  |  |header|  VP header |            |
    +------+-------------+  +------+------------+------------+

    +------+------+----------+  +------+---------+------+------------+
(b) | RTP  | VOP  |First half|  | RTP  |Last half|  VP  |Video Packet|
    |header|header| of VP(1) |  |header| of VP(1)|header|    (2)     |
    +------+------+----------+  +------+---------+------+------------+

Figure 3 - Examples of prohibited RTP packetization for MPEG-4 Visual bitstream

4. RTP Packetization of MPEG-4 Audio bitstream

This section specifies RTP packetization rules for MPEG-4 Audio bitstreams. MPEG-4 Audio streams MUST be formatted by the LATM (Low-overhead MPEG-4 Audio Transport Multiplex) tool. The LATM-based streams are then mapped onto RTP packets as described below.

4.1 RTP Packet Format

A LATM-based stream consists of a sequence of audioMuxElements, each of which includes one or more audio frames. A complete audioMuxElement, or a part of one, SHALL be mapped directly onto an RTP payload without removing any audioMuxElement syntax elements (see Figure 4). The first byte of each audioMuxElement SHALL be located at the first payload position of an RTP packet.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X|  CC   |M|     PT      |       sequence number         | RTP
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           timestamp                           | Header
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           synchronization source (SSRC) identifier            |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|            contributing source (CSRC) identifiers             |
|                             ....                              |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                                                               | RTP
:                 audioMuxElement (byte aligned)                : Payload
|                                                               |
|                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               :...OPTIONAL RTP padding        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 4 - An RTP packet for MPEG-4 Audio

To decode the audioMuxElement, the following muxConfigPresent information must be indicated by an out-of-band means. When SDP is used for this indication, the MIME parameter "cpresent" corresponds to the muxConfigPresent information (see Section 5.3).

muxConfigPresent: If this value is set to 1 (in-band mode), the audioMuxElement SHALL include an indication bit "useSameStreamMux" and MAY include the configuration information for audio compression, "StreamMuxConfig". The useSameStreamMux bit indicates whether the StreamMuxConfig element of the previous frame also applies to the current frame. If the useSameStreamMux bit indicates that the StreamMuxConfig of the previous frame is to be used, and the previous frame has been lost, the current frame may not be decodable. Therefore, in in-band mode, the StreamMuxConfig element SHOULD be transmitted repeatedly, depending on the network conditions. Conversely, if muxConfigPresent is set to 0 (out-of-band mode), the StreamMuxConfig element must be transmitted by an out-of-band means. In the case of SDP, the MIME parameter "config" is used (see Section 5.3).
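The in-band/out-of-band distinction can be sketched as follows. This is a hedged illustration only: it assumes, following the AudioMuxElement syntax of ISO/IEC 14496-3, that useSameStreamMux is the very first bit of the audioMuxElement when muxConfigPresent is 1. That bit position is not stated in this document, and the function name `reuses_previous_config` is hypothetical.

```python
def reuses_previous_config(audio_mux_element, mux_config_present):
    """Inspect the useSameStreamMux bit of an audioMuxElement.

    Returns None in out-of-band mode (muxConfigPresent = 0), where the
    element carries no such bit and StreamMuxConfig comes from e.g. SDP.
    In in-band mode, returns True if the previous StreamMuxConfig is reused.
    Assumption: useSameStreamMux is the MSB of the first octet.
    """
    if not mux_config_present:
        return None
    if not audio_mux_element:
        raise ValueError("empty audioMuxElement")
    return bool(audio_mux_element[0] & 0x80)

# In-band element whose first bit is 0: a StreamMuxConfig follows in the element.
print(reuses_previous_config(b"\x20\x00", mux_config_present=True))  # prints False
```

A receiver that sees True here after losing the previous frame has no valid StreamMuxConfig, which is why repeated transmission of the configuration is recommended in in-band mode.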

4.2 Use of RTP Header Fields for MPEG-4 Audio

Payload Type (PT): The assignment of an RTP payload type for this new packet format is outside the scope of this document and is not specified here. It is expected that the RTP profile for a particular class of applications will assign a payload type for this encoding; if that is not done, a payload type in the dynamic range SHALL be chosen by means of an out-of-band signaling protocol (e.g., H.245, SIP). When RTP payload types are assigned dynamically for scalable streams, a different value SHOULD be assigned to each layer. The values SHOULD be assigned in order of enhancement layer dependency, with the base layer having the smallest value.

Marker (M) bit: The marker bit indicates audioMuxElement boundaries. It is set to 1 to indicate that the RTP packet contains a complete audioMuxElement or the last fragment of an audioMuxElement.

Timestamp: The timestamp indicates the sampling instant of the first audio frame contained in the RTP packet. For security reasons, it is recommended that timestamps start at a random value. Unless specified otherwise by an out-of-band means, the resolution of the timestamp is set to its default value of 90 kHz.

Sequence Number: Incremented by one for each RTP packet sent, starting, for security reasons, with a random value.

Other header fields are used as described in RFC 1889 [8].
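To make the timestamp resolution concrete, the sketch below computes how far the RTP timestamp advances per audio frame. The frame sizes used (160 samples for a 20 ms CELP frame at 8 kHz, 1024 samples for an AAC frame) are illustrative assumptions, not values defined by this document, and the function name `timestamp_increment` is hypothetical. A `Fraction` is returned because the 90 kHz default does not always divide evenly by the audio sampling rate.

```python
from fractions import Fraction

def timestamp_increment(frame_samples, sampling_rate, clock_rate=90000):
    """RTP timestamp ticks covered by one audio frame, as an exact fraction."""
    return Fraction(frame_samples * clock_rate, sampling_rate)

# Clock rate equal to the sampling rate: the increment equals the frame length in samples.
print(timestamp_increment(160, 8000, clock_rate=8000))   # prints 160
# Default 90 kHz clock with the same 20 ms frame.
print(timestamp_increment(160, 8000))                    # prints 1800
```

When the result is not an integer, a sender has to accumulate the exact fraction and round per packet; setting the clock rate to the audio sampling rate, as the SDP examples in Section 5.4 do, avoids this.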

4.3 Fragmentation of MPEG-4 Audio bitstream

It is RECOMMENDED to put one audioMuxElement in each RTP packet. This poses no problem if the audioMuxElement is small enough that the RTP packet containing it does not exceed the path MTU. Otherwise, the audioMuxElement MAY be fragmented and spread across multiple packets.

5. MIME type registration for MPEG-4 Audio/Visual streams

The following sections describe the MIME type registrations for MPEG-4 Audio/Visual streams. The MIME type registration and SDP usage for the MPEG-4 Visual stream are described in Sections 5.1 and 5.2; the MIME type registration and SDP usage for the MPEG-4 Audio stream are described in Sections 5.3 and 5.4.

5.1 MIME type registration for MPEG-4 Visual

MIME media type name: video

MIME subtype name: MP4V-ES

Required parameters: none

Optional parameters:

rate: This parameter is used only for RTP transport. It indicates the resolution of the timestamp field in the RTP header. If this parameter is not specified, its default value of 90000 (90 kHz) is used.

profile-level-id: A decimal representation of the MPEG-4 Visual Profile and Level indication value (profile_and_level_indication) defined in Table G-1 of ISO/IEC 14496-2 [2][4]. This parameter MAY be used in the capability exchange or session setup procedure to indicate the MPEG-4 Visual Profile and Level combination of which the MPEG-4 Visual codec is capable. If this parameter is not specified, its default value of 1 (Simple Profile/Level 1) is used.

config: This parameter SHALL be used to indicate the configuration of the corresponding MPEG-4 Visual bitstream. It SHALL NOT be used to indicate the codec capability in the capability exchange procedure. It is a hexadecimal representation of an octet string that expresses the MPEG-4 Visual configuration information, as defined in subclause 6.2.1 "Start codes" of ISO/IEC 14496-2 [2][4][9]. The configuration information is mapped onto the octet string MSB-first: the first bit of the configuration information SHALL be located at the MSB of the first octet. The configuration information indicated by this parameter SHALL be the same as the configuration information in the corresponding MPEG-4 Visual stream, except for first_half_vbv_occupancy and latter_half_vbv_occupancy, if present, which may vary in the repeated configuration information inside an MPEG-4 Visual stream. (See 6.2.1 "Start codes" of ISO/IEC 14496-2.)

Example usages of these parameters are:

- MPEG-4 Visual Simple Profile/Level 1:
  Content-type: video/mp4v-es; profile-level-id=1

- MPEG-4 Visual Core Profile/Level 2:
  Content-type: video/mp4v-es; profile-level-id=34

- MPEG-4 Visual Advanced Real Time Simple Profile/Level 1:
  Content-type: video/mp4v-es; profile-level-id=145
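As an illustration of the "config" encoding, the sketch below converts between the hexadecimal parameter value and the octet string it represents, using the value that appears in the Simple Profile SDP example of Section 5.2. The helper names `decode_config` and `encode_config` are hypothetical; only standard Python `bytes` methods are used.

```python
def decode_config(config_hex):
    """Hexadecimal 'config' parameter value -> configuration octets (MSB-first)."""
    return bytes.fromhex(config_hex)

def encode_config(config_octets):
    """Configuration octets -> hexadecimal string suitable for the 'config' parameter."""
    return config_octets.hex().upper()

# Value taken from the SDP example in Section 5.2.
CONFIG = "000001B001000001B5090000010000000120008440FA282C2090A21F"
octets = decode_config(CONFIG)
print(octets[:4].hex())  # prints 000001b0
```

Because each octet maps to exactly two hexadecimal digits with the most significant nibble first, the first bit of the configuration information ends up in the MSB of the first octet, as the parameter definition requires.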

Published specification:

The specifications for MPEG-4 Visual streams are presented in ISO/IEC 14496-2 [2][4][9]. The RTP payload format is described in RFC 3016.

Encoding considerations:

The video bitstream MUST be generated according to the MPEG-4 Visual specification (ISO/IEC 14496-2). The video bitstream is binary data and MUST be encoded for non-binary transport (for email, Base64 encoding is sufficient). This type is also defined for transfer via RTP. The RTP packets MUST be packetized according to the MPEG-4 Visual RTP payload format defined in RFC 3016.

Security considerations:

See Section 6 of RFC 3016.

Interoperability considerations:

MPEG-4 Visual provides a large and rich set of tools for the coding of visual objects. For effective implementation of the standard, subsets of the MPEG-4 Visual tool sets have been provided for use in specific applications. These subsets, called 'Profiles', limit the size of the tool set a decoder is required to implement. To restrict computational complexity, one or more Levels are set for each Profile. A Profile@Level combination allows:

- a codec builder to implement only the subset of the standard that is needed, while maintaining interworking with other MPEG-4 devices within the same combination, and
- checking whether MPEG-4 devices comply with the standard ('conformance testing').

The visual stream SHALL be compliant with the MPEG-4 Visual Profile@Level specified by the parameter "profile-level-id". Interoperability between a sender and a receiver may be achieved by specifying the parameter "profile-level-id" in the MIME content, or by arranging in the capability exchange/announcement procedure to set this parameter to the same value on both sides.

Applications which use this media type:

Audio and visual streaming and conferencing tools, Internet messaging and email applications.

Additional information: none

Person & email address to contact for further information:

The authors of RFC 3016 (see Section 8).

Intended usage: COMMON

Author/Change controller:

The authors of RFC 3016 (see Section 8).

5.2 SDP usage of MPEG-4 Visual

The MIME media type video/MP4V-ES string is mapped to fields in the Session Description Protocol (SDP), RFC 2327, as follows:

- The MIME type (video) goes in SDP "m=" as the media name.
- The MIME subtype (MP4V-ES) goes in SDP "a=rtpmap" as the encoding name.
- The optional parameter "rate" goes in "a=rtpmap" as the clock rate.
- The optional parameters "profile-level-id" and "config" go in the "a=fmtp" line to indicate the coder capability and configuration, respectively. These parameters are expressed as a MIME media type string, in the form of a semicolon-separated list of parameter=value pairs.

The following are some examples of media representation in SDP:

Simple Profile/Level 1, rate=90000 (90 kHz), "profile-level-id" and "config" are present in the "a=fmtp" line:

    m=video 49170/2 RTP/AVP 98
    a=rtpmap:98 MP4V-ES/90000
    a=fmtp:98 profile-level-id=1;config=000001B001000001B5090000010000000120008440FA282C2090A21F

Core Profile/Level 2, rate=90000 (90 kHz), "profile-level-id" is present in the "a=fmtp" line:

    m=video 49170/2 RTP/AVP 98
    a=rtpmap:98 MP4V-ES/90000
    a=fmtp:98 profile-level-id=34

Advanced Real Time Simple Profile/Level 1, rate=90000 (90 kHz), "profile-level-id" is present in the "a=fmtp" line:

    m=video 49170/2 RTP/AVP 98
    a=rtpmap:98 MP4V-ES/90000
    a=fmtp:98 profile-level-id=145
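The mapping from MIME parameters to SDP lines can be sketched as a small generator. This is an illustrative helper, not part of the specification: the function name `video_sdp_lines` and its arguments are hypothetical, and the port is passed as a string so that the "port/number of ports" form used in the examples can be reproduced.

```python
def video_sdp_lines(port, payload_type, rate=90000,
                    profile_level_id=None, config=None):
    """Build the m=, a=rtpmap and (if needed) a=fmtp lines for video/MP4V-ES."""
    lines = [
        f"m=video {port} RTP/AVP {payload_type}",          # MIME type -> media name
        f"a=rtpmap:{payload_type} MP4V-ES/{rate}",         # subtype and rate
    ]
    params = []
    if profile_level_id is not None:
        params.append(f"profile-level-id={profile_level_id}")
    if config is not None:
        params.append(f"config={config}")
    if params:
        # Optional parameters form a semicolon-separated list of parameter=value pairs.
        lines.append(f"a=fmtp:{payload_type} " + ";".join(params))
    return lines

# Core Profile/Level 2 example.
for line in video_sdp_lines("49170/2", 98, profile_level_id=34):
    print(line)
```

The a=fmtp line is omitted entirely when neither optional parameter is given, in which case a receiver falls back to the default profile-level-id of 1.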

5.3 MIME type registration of MPEG-4 Audio

MIME media type name: audio

MIME subtype name: MP4A-LATM

Required parameters:

rate: The rate parameter indicates the RTP timestamp clock rate. The default value is 90000. Other rates MAY be specified only if they are set to the same value as the audio sampling rate (number of samples per second).

Optional parameters:

profile-level-id: A decimal representation of the MPEG-4 Audio Profile Level indication value defined in ISO/IEC 14496-1 ([6] and its amendments). This parameter indicates which MPEG-4 Audio tool subsets the decoder is capable of using. If this parameter is not specified in the capability exchange or session setup procedure, its default value of 30 (Natural Audio Profile/Level 1) is used.

object: A decimal representation of the MPEG-4 Audio Object Type value defined in ISO/IEC 14496-3 [5]. This parameter specifies the tool to be used by the coder. It can be used to limit the capability within the specified "profile-level-id".

bitrate: The data rate of the audio bitstream.

cpresent: A boolean parameter that indicates whether audio payload configuration data has been multiplexed into the RTP payload (see Section 4.1). A value of 0 indicates that the configuration data has not been multiplexed into the RTP payload; 1 indicates that it has. The default, if the parameter is omitted, is 1.

config: A hexadecimal representation of an octet string that expresses the audio payload configuration data "StreamMuxConfig", as defined in ISO/IEC 14496-3 [5] (see Section 4.1). The configuration data is mapped onto the octet string MSB-first: the first bit of the configuration data SHALL be located at the MSB of the first octet. In the last octet, zero-padding bits, if necessary, SHALL follow the configuration data.

ptime: RECOMMENDED duration of each packet, in milliseconds.
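The MSB-first mapping and zero padding described for "config" can be sketched as follows. The input here is an arbitrary bit string used purely for illustration, not a valid StreamMuxConfig, and the function name `bits_to_config_hex` is hypothetical.

```python
def bits_to_config_hex(bits):
    """Map a string of '0'/'1' characters onto octets MSB-first, zero-padding the last octet."""
    if set(bits) - {"0", "1"}:
        raise ValueError("bits must contain only '0' and '1'")
    padded = bits + "0" * (-len(bits) % 8)            # pad up to a whole number of octets
    octets = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    return octets.hex().upper()

# Twelve bits occupy two octets; the last four bits of the second octet are padding.
print(bits_to_config_hex("100100010010"))  # prints 9120
```

The first character of the bit string lands in the MSB of the first octet, and any padding appears only after the configuration data in the final octet.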

Released:

This article describes the load format specification.

Coding specifications follow ISO / IEC 14496-3 [3] [5].

Code consideration:

This type is only defined as being used to transmit through RTP.

Safety considerations:

See Section 6 of RFC 3016.

Interoperability considerations:

MPEG-4 Audio provides a large and rich set of tools for coding audio objects. To allow the standard to be implemented effectively, subsets of the MPEG-4 Audio tool set have been defined, similar to those of MPEG-4 Visual (see 5.1). The tools used by the audio stream should conform to the Profile@Level specified by the "profile-level-id" parameter. Interoperability between a sender and a receiver can be achieved by specifying the "profile-level-id" parameter in the MIME content, or by setting the parameter to the same value on both sides during capability exchange. In addition, the "object" parameter can be used to limit the capability within the specified Profile@Level during capability exchange.

Applications which use this media type:

Audio and video streaming and conferencing tools.

Additional information: none

Person and email address to contact for further information:

See Section 8 of RFC 3016.

Intended usage: COMMON

Author / change controller:

See Section 8 of RFC 3016.

5.4 SDP usage of MPEG-4 Audio

The MIME media type audio/MP4A-LATM string is mapped to fields in the Session Description Protocol (SDP), RFC 2327, as follows:

• The MIME type (audio) goes in SDP "m=" as the media name.

• The MIME subtype (MP4A-LATM) goes in SDP "a=rtpmap" as the encoding name.

• The required parameter "rate" goes in "a=rtpmap" as the clock rate.

• The optional parameter "ptime" goes in the SDP "a=ptime" attribute.

• The optional parameter "profile-level-id" goes in the "a=fmtp" line to indicate the coder capability. The "object" parameter also goes in the "a=fmtp" attribute, as do the payload-format-specific parameters "bitrate", "cpresent" and "config". These parameters are written as a semicolon-separated list of "parameter=value" pairs, in the form of the MIME media type string.
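The mapping above can be sketched as a small Python helper that assembles the SDP lines for one audio stream. It is an illustration only: the function name build_sdp_media and its arguments are hypothetical, the values are written out unchecked, and the "; " separator is simply one spelling of the semicolon-separated list.

```python
from typing import Dict, List, Optional


def build_sdp_media(port: int, payload_type: int, rate: int,
                    ptime: Optional[int] = None,
                    fmtp: Optional[Dict[str, object]] = None) -> List[str]:
    """Return the SDP lines describing one audio/MP4A-LATM stream."""
    lines = [
        # MIME type "audio" becomes the media name of the "m=" line.
        f"m=audio {port} RTP/AVP {payload_type}",
        # MIME subtype is the encoding name, "rate" is the clock rate.
        f"a=rtpmap:{payload_type} MP4A-LATM/{rate}",
    ]
    if fmtp:
        # profile-level-id, object, bitrate, cpresent and config all
        # go on a single "a=fmtp" line as parameter=value pairs.
        params = "; ".join(f"{name}={value}" for name, value in fmtp.items())
        lines.append(f"a=fmtp:{payload_type} {params}")
    if ptime is not None:
        lines.append(f"a=ptime:{ptime}")
    return lines
```

Calling build_sdp_media(49230, 96, 8000, ptime=20, fmtp={"profile-level-id": 9, "object": 8, "cpresent": 0, "config": "9128b1071070"}) yields four lines, starting with "m=audio 49230 RTP/AVP 96".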

Here are some examples of the media representation in SDP.

For a 6 kb/s CELP bitstream (audio sampling rate of 8 kHz):

  m=audio 49230 RTP/AVP 96
  a=rtpmap:96 MP4A-LATM/8000
  a=fmtp:96 profile-level-id=9; object=8; cpresent=0; config=9128b1071070
  a=ptime:20

For a 64 kb/s AAC LC stereo bitstream (audio sampling rate of 24 kHz):

  m=audio 49230 RTP/AVP 96
  a=rtpmap:96 MP4A-LATM/24000
  a=fmtp:96 profile-level-id=1; bitrate=64000; cpresent=0; config=9122620000

In the two examples above, the audio configuration data is described only in SDP and is not multiplexed into the RTP payload.

In addition, the clock rate is set to the audio sampling rate.
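The clock rate determines how far the RTP timestamp advances from one packet to the next. The sketch below shows the arithmetic under the assumption that every packet carries exactly "ptime" milliseconds of audio, which this document only recommends and does not require; the helper name timestamp_ticks is hypothetical.

```python
def timestamp_ticks(clock_rate_hz: int, ptime_ms: int) -> int:
    """RTP timestamp units covered by one packet of ptime_ms milliseconds."""
    ticks, remainder = divmod(clock_rate_hz * ptime_ms, 1000)
    if remainder:
        # The duration is not a whole number of clock ticks.
        raise ValueError("ptime is not a whole number of clock ticks")
    return ticks
```

With the 8 kHz clock of the CELP example and a ptime of 20, each packet spans 160 timestamp units; with a 90000 Hz clock the same 20 ms spans 1800 units.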

If the clock rate is set to its default value and the audio sampling rate is needed, it can be obtained by parsing the "config" parameter, as in the following example:

  m=audio 49230 RTP/AVP 96
  a=rtpmap:96 MP4A-LATM/90000
  a=fmtp:96 object=8; cpresent=0; config=9128b1071070

The following example shows the case where the audio configuration data appears in the RTP payload:

  m=audio 49230 RTP/AVP 96
  a=rtpmap:96 MP4A-LATM/90000
  a=fmtp:96 object=2; cpresent=1
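A receiver has to read such "a=fmtp" lines and apply the defaults given in the parameter descriptions (profile-level-id 30, cpresent 1). The following Python sketch shows one way to do that. It assumes that parameter names are matched case-insensitively and that values are kept as strings; parse_fmtp and config_is_in_band are hypothetical names, not part of any SDP library.

```python
# Defaults stated in the parameter descriptions of this document.
DEFAULTS = {"profile-level-id": "30", "cpresent": "1"}


def parse_fmtp(line: str) -> dict:
    """Parse an 'a=fmtp:<pt> name=value; ...' line into a dict."""
    prefix, _, rest = line.strip().partition(" ")
    if not prefix.startswith("a=fmtp:"):
        raise ValueError("not an a=fmtp line")
    params = dict(DEFAULTS)
    for item in rest.split(";"):
        item = item.strip()
        if not item:  # tolerate a trailing semicolon
            continue
        name, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"malformed parameter: {item!r}")
        params[name.strip().lower()] = value.strip()
    return params


def config_is_in_band(params: dict) -> bool:
    """True when the configuration data is multiplexed into the RTP payload."""
    return params["cpresent"] == "1"
```

For the last example above, parse_fmtp("a=fmtp:96 object=2; cpresent=1") keeps the default profile-level-id of 30 and config_is_in_band reports True, so the receiver would look for the configuration in the payload rather than in SDP.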

6. Security considerations

RTP packets using the payload format described in this specification are subject to the security considerations discussed in the RTP specification [8]. This means that confidentiality of the media streams is achieved by encryption. Because the data compression used with this payload format is applied end-to-end, encryption can be performed on the compressed data, so there is no conflict between the two operations.

The complete MPEG-4 system allows the transport of a wide range of content, including Java applets (MPEG-J) and scripts. Because this payload format is restricted to audio and video streams, such active content cannot be transported in this format.

7. References

1 Bradner, S., "The Internet Standards Process - Revision 3", BCP 9, RFC 2026, October 1996.

2 ISO/IEC 14496-2:1999, "Information technology - Coding of audio-visual objects - Part 2: Visual".

3 ISO/IEC 14496-3:1999, "Information technology - Coding of audio-visual objects - Part 3: Audio".

4 ISO/IEC 14496-2:1999/Amd.1:2000, "Information technology - Coding of audio-visual objects - Part 2: Visual, Amendment 1: Visual extensions".

5 ISO/IEC 14496-3:1999/Amd.1:2000, "Information technology - Coding of audio-visual objects - Part 3: Audio, Amendment 1: Audio extensions".

6 ISO/IEC 14496-1:1999, "Information technology - Coding of audio-visual objects - Part 1: Systems".

7 Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

8 Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", RFC 1889, January 1996.

9 ISO/IEC 14496-2:1999/Cor.1:2000, "Information technology - Coding of audio-visual objects - Part 2: Visual, Technical Corrigendum 1".

8. Authors' addresses

Yoshihiro Kikuchi
Toshiba Corporation
1, Komukai Toshiba-cho, Saiwai-ku, Kawasaki, 212-8582, Japan
Email: yoshihiro.kikuchi@toshiba.co.jp

Yoshinori Matsui
Matsushita Electric Industrial Co., Ltd.
1006, Kadoma, Kadoma-shi, Osaka, Japan
Email: matsui@drl.mei.co.jp

Toshiyuki Nomura
NEC Corporation
4-1-1, Miyazaki, Miyamae-ku, Kawasaki, Japan
Email: t-nomura@ccm.cl.nec.co.jp

Shigeru Fukunaga
Oki Electric Industry Co., Ltd.
1-2-27 Shiromi, Chuo-ku, Osaka 540-6025, Japan
Email: fukunaga444@oki.co.jp

Hideaki Kimata
Nippon Telegraph and Telephone Corporation
1-1, Hikari-no-oka, Yokosuka-shi, Kanagawa, Japan
Email: kimata@nttvdt.hil.ntt.co.jp

9. Copyright statement

Copyright (c) The Internet Society (2000). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

Funding for the RFC Editor function is currently provided by the Internet Society.
