Organization: China Interactive Publishing Network (http://www.china-pub.com/)
RFC Document Chinese Translation Program (http://www.china-pub.com/compters/emook/aboutemook.htm)
E-mail: Ouyang@china-pub.com
Translator: Li Chao (LICC_LI, LICC_LI@sina.com)
Translation date: 2001-4-26
Copyright: The copyright of this Chinese translation belongs to China Interactive Publishing Network. It may be freely
reproduced for non-commercial purposes, provided that the translation and copyright information of this document is retained.

Network Working Group                                         Y. Kikuchi
Request for Comments: 3016                                       Toshiba
Category: Standards Track                                      T. Nomura
                                                                     NEC
                                                             S. Fukunaga
                                                                     Oki
                                                               Y. Matsui
                                                              Matsushita
                                                               H. Kimata
                                                                     NTT
                                                           November 2000

     RTP Payload Format for MPEG-4 Audio/Visual Streams (RFC 3016)

Status of this Memo

This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state and
status of this protocol. Distribution of this memo is unlimited.

Copyright Notice
Copyright (c) The Internet Society (2000). All Rights Reserved.

Abstract

This document describes RTP payload formats for carrying MPEG-4 Audio
and MPEG-4 Visual bitstreams without using MPEG-4 Systems. For the
purpose of mapping MPEG-4 Audio/Visual bitstreams directly onto RTP
packets, it specifies the use of the RTP header fields and the
fragmentation rules. It also specifies the MIME type registrations and
the use of the Session Description Protocol (SDP).

Table of Contents

Status of this Memo
Copyright Notice
Abstract
1. Introduction
   1.1 MPEG-4 Visual RTP payload format
   1.2 MPEG-4 Audio RTP payload format
2. Requirements Terminology
3. RTP Packetization of MPEG-4 Visual bitstream
   3.1 Use of RTP header fields for MPEG-4 Visual
   3.2 Fragmentation of MPEG-4 Visual bitstream
   3.3 Examples of packetized MPEG-4 Visual bitstream
4. RTP Packetization of MPEG-4 Audio bitstream
   4.1 RTP Packet Format
   4.2 Use of RTP Header Fields for MPEG-4 Audio
   4.3 Fragmentation of MPEG-4 Audio bitstream
5. MIME type registration for MPEG-4 Audio/Visual streams
   5.1 MIME type registration for MPEG-4 Visual
   5.2 SDP usage of MPEG-4 Visual
   5.3 MIME type registration of MPEG-4 Audio
   5.4 SDP usage of MPEG-4 Audio
6. Security Considerations
7. References
8. Authors' Addresses
9. Copyright Notice
Acknowledgments

1. Introduction

The RTP payload formats described in this document specify how MPEG-4
Audio streams [3][5] and MPEG-4 Visual streams [2][4] are to be
fragmented and mapped directly onto RTP packets.

These RTP payload formats allow applications to transport MPEG-4
Audio/Visual streams without using the synchronization and stream
management functionality of MPEG-4 Systems. They are intended for
systems that already have their own stream management functionality and
therefore need no such functionality from MPEG-4 Systems. H.323
terminals are one example: there, MPEG-4 Audio/Visual streams are
managed by H.245 rather than by MPEG-4 Systems Object Descriptors, and
the streams are mapped directly onto RTP packets without using the
MPEG-4 Systems Sync Layer. Other examples are SIP and RTSP, which use
MIME and SDP. The MIME types and SDP usage defined in this document
specify the attributes of the Audio/Visual streams (e.g., media type,
packetization format and codec configuration) directly, without using
MPEG-4 Systems.

The obvious benefit is that these MPEG-4 Audio/Visual RTP payload
formats can be handled in a unified way together with the formats
defined for non-MPEG-4 codecs. The disadvantage is that
interoperability with environments that use MPEG-4 Systems may be
difficult; other payload formats may be better suited to those
applications.

In such cases the semantics of the RTP headers need to be defined
clearly, including their association with MPEG-4 Audio/Visual data
elements. In addition, it is beneficial to define the fragmentation
rules of RTP packets for MPEG-4 Video streams so that error resiliency
is enhanced by the error resiliency tools provided inside the MPEG-4
Video stream.

1.1 MPEG-4 Visual RTP payload format

MPEG-4 Visual is a visual coding standard with the following features:
high coding efficiency; high error resiliency; multiple,
arbitrary-shape object-based coding; etc. [2]. It covers bitrates from
tens of kbps to several Mbps, and it covers a wide variety of networks,
ranging from those guaranteed to be almost error-free to mobile
networks with high error rates.

With respect to the fragmentation rules for an MPEG-4 Visual bitstream
defined in this document, note that because MPEG-4 Visual is used on
such a wide variety of networks, fragmentation should not be restricted
too much. A rule such as "a single video packet shall always be mapped
onto a single RTP packet" would be inappropriate. On the other hand,
careless, media-unaware fragmentation may degrade error resiliency and
bandwidth efficiency. The fragmentation rules described in this
document are flexible, but they define a minimum rule set that prevents
meaningless fragmentation while making use of the error resiliency
functionality of MPEG-4 Visual.

The fragmentation rules recommend not mapping more than one VOP onto an
RTP packet, so that the RTP timestamp uniquely indicates the VOP time
framing. On the other hand, MPEG-4 video may generate very small VOPs,
such as an empty VOP (vop_coded=0) containing only a VOP header, or an
arbitrary-shaped VOP with a small number of coding blocks. To reduce
the overhead in such cases, the fragmentation rules permit
concatenating multiple VOPs in one RTP packet. (See fragmentation rule
(4) in section 3.2 and the marker bit and timestamp in section 3.1.)

For video coding tools such as H.261 or MPEG-1/2, an additional
media-specific RTP header helps to recover picture headers corrupted by
packet loss. MPEG-4 Visual already provides error resiliency
functionality for recovering corrupted headers, and it can be used on
RTP/IP networks as well as on other networks (H.223/mobile, MPEG-2/TS,
etc.). Therefore, no extra RTP header fields are defined in this MPEG-4
Visual RTP payload format.

1.2 MPEG-4 Audio RTP payload format

MPEG-4 Audio is a new audio standard that integrates many different
audio coding tools. The Low-overhead MPEG-4 Audio Transport Multiplex
(LATM) manages sequences of audio data with relatively small overhead.
In audio-only applications it is therefore desirable to map LATM-based
MPEG-4 Audio bitstreams directly onto RTP packets without using MPEG-4
Systems.

LATM has the following multiplexing features:

- Carrying configuration information with the audio data,
- Concatenating multiple audio frames in one audio stream,
- Multiplexing multiple objects (programs),
- Multiplexing scalable layers.

The last two features are not needed in RTP transmission. Therefore,
these two features MUST NOT be used in applications based on the RTP
packetization specified in this document. Since LATM was developed for
natural audio coding tools only, not for synthesis tools, it is
difficult to transmit Structured Audio (SA) data and Text to Speech
Interface (TTSI) data with LATM. Therefore, SA data and TTSI data MUST
NOT be transported by the RTP packetization in this document either.

For the transmission of scalable streams, the audio data of each layer
SHOULD be packetized into different RTP packets, so that the layers can
be treated differently at the IP level, for example by some form of
differentiated service. On the other hand, all configuration data of
the scalable streams is contained in one LATM configuration element,
"StreamMuxConfig", and every scalable layer shares that
StreamMuxConfig. The mapping between each layer and its configuration
data is established by the LATM header information attached to the
audio data. To indicate the dependency information of the scalable
streams, a restriction is applied to the dynamic assignment rule for
payload type (PT) values (see section 4.2).

For MPEG-4 Audio coding tools, as for other audio coders, if the
payload is a single audio frame, packet loss does not impair the
decodability of adjacent packets. Therefore, MPEG-4 Audio does not
require an additional media-specific header for error recovery.
Existing RTP protection mechanisms, such as Generic Forward Error
Correction (RFC 2733) and Redundant Audio Data (RFC 2198), MAY be
applied to improve error resiliency.

2. Requirements Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119.

3. RTP Packetization of MPEG-4 Visual bitstream

This section specifies the RTP packetization rules for MPEG-4 Visual
content. An MPEG-4 Visual bitstream is mapped directly onto RTP packets
without adding extra header fields or removing any Visual syntax
elements. The Combined Configuration/Elementary stream mode MUST be
used, so that the configuration information is carried to the same RTP
port as the elementary stream. (See 6.2.1 "Start codes" of ISO/IEC
14496-2 [2][9][4].)

The configuration information MAY additionally be specified by some
out-of-band means. For an H.323 terminal, the H.245 codepoint
"decoderConfigurationInformation" MUST be used for this purpose. For
systems that use MIME content types and SDP parameters, such as SIP and
RTSP, the optional parameter "config" MUST be used to specify the
configuration information (see 5.1 and 5.2).

When the short video header mode is used, the RTP payload format for
H.263 SHOULD be used (the format defined in RFC 2429 is RECOMMENDED,
but the RFC 2190 format MAY be used for compatibility with older
implementations).

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X|  CC   |M|     PT      |       sequence number         | RTP
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           timestamp                           | Header
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           synchronization source (SSRC) identifier            |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |            contributing source (CSRC) identifiers             |
   |                             ....                              |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                                                               | RTP
   |       MPEG-4 Visual stream (byte aligned)                     | Pay-
   |                                                               | load
   |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               :...OPTIONAL RTP padding        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 1 - An RTP packet for MPEG-4 Visual stream

3.1 Use of RTP header fields for MPEG-4 Visual

Payload Type (PT): The assignment of an RTP payload type for this new
packet format is outside the scope of this document and is not
specified here. It is expected that the RTP profile for a particular
class of applications will assign a payload type for this encoding; if
that is not done, a payload type in the dynamic range SHALL be chosen
by means of an out-of-band signaling protocol (e.g., H.245, SIP, etc.).

Extension (X) bit: Defined by the RTP profile used.

Sequence Number: Incremented by one for each RTP data packet sent,
starting, for security reasons, with a random initial value.

Marker (M) bit: The marker bit is set to one to indicate the last RTP
packet (or the only RTP packet) of a VOP. When multiple VOPs are
carried in the same RTP packet, the marker bit is also set to one.

Timestamp: The timestamp indicates the sampling instant of the VOP
contained in the RTP packet. A constant random offset is added for
security reasons.

- When multiple VOPs are carried in the same RTP packet, the timestamp
  indicates the earliest of the VOP times among the VOPs carried in the
  packet. The timestamps of the remaining VOPs are derived from the
  timestamp fields in the VOP header (modulo_time_base and
  vop_time_increment).
- If the RTP packet contains only configuration information and/or
  Group_of_VideoObjectPlane() fields, the timestamp of the next VOP in
  coding order is used.
- If the RTP packet contains only visual_object_sequence_end_code
  information, the timestamp of the immediately preceding VOP in coding
  order is used.

The resolution of the timestamp is set to its default value of 90 kHz,
unless otherwise specified by an out-of-band means.

Other header fields are used as described in RFC 1889 [8].
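
As an illustration of the header fields described above, the following is a minimal sketch, not a reference implementation. The function names `pack_rtp_header` and `vop_timestamp` are hypothetical, the header is assumed to have no padding, no extension and no CSRC entries, and payload type 98 is taken from the dynamic-range examples in section 5.2.

```python
import struct


def pack_rtp_header(payload_type, seq, timestamp, ssrc, marker=False):
    """Pack the 12-byte fixed RTP header (assumes V=2, P=0, X=0, CC=0)."""
    if not 0 <= payload_type <= 127:
        raise ValueError("payload type must fit in 7 bits")
    byte0 = 2 << 6                               # version 2, no padding/extension/CSRC
    byte1 = (int(marker) << 7) | payload_type    # marker bit followed by PT
    return struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def vop_timestamp(random_offset, vop_time_seconds, clock_rate=90000):
    """RTP timestamp of a VOP: random offset plus sampling time in clock ticks."""
    ticks = round(vop_time_seconds * clock_rate)
    return (random_offset + ticks) & 0xFFFFFFFF  # the field wraps at 32 bits


# Last RTP packet of a VOP sampled 1/30 s after the start of the stream.
header = pack_rtp_header(98, 0x1234, vop_timestamp(1000, 1 / 30),
                         0xDEADBEEF, marker=True)
```

With the default 90 kHz clock, consecutive VOPs of a 30 frames-per-second stream are 3000 ticks apart.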

3.2 Fragmentation of MPEG-4 Visual bitstream

A fragmented MPEG-4 Visual bitstream is mapped directly onto the RTP
payload without adding extra header fields or removing any Visual
syntax elements. The Combined Configuration/Elementary stream mode is
used. The following rules apply to the fragmentation.

In the following, "header" means one of:

- Configuration information (Visual Object Sequence Header, Visual
  Object Header and Video Object Layer Header)
- visual_object_sequence_end_code
- The header of the entry point function for an elementary stream
  (Group_of_VideoObjectPlane(), VideoObjectPlane(),
  video_plane_with_short_header(), MeshObject() or FaceObject())
- The video packet header (video_packet_header() excluding
  next_resync_marker())
- The header of gob_layer()

See 6.2.1 "Start codes" of ISO/IEC 14496-2 [2][9][4] for the definition
of the configuration information and the entry point functions.

(1) Configuration information and Group_of_VideoObjectPlane() fields
SHALL be placed at the beginning of the RTP payload (just after the RTP
header) or just after the header of the syntactically upper layer
function.

(2) If one or more headers exist in the RTP payload, the RTP payload
SHALL begin with the header of the syntactically highest function.
Note: The visual_object_sequence_end_code is regarded as the lowest
function.

(3) A header SHALL NOT be split across multiple RTP packets.

(4) Different VOPs SHOULD be fragmented into different RTP packets, so
that one RTP packet consists of the data bytes associated with a unique
VOP time instant (indicated in the timestamp field of the RTP header).
As an exception, multiple consecutive VOPs MAY be carried within one
RTP packet in decoding order if the VOPs are small.

Note: When multiple VOPs are carried in one RTP payload, the timestamps
of the VOPs after the first one may be calculated by the decoder. This
operation is necessary only for RTP packets in which the marker bit
equals one and the beginning of the RTP payload corresponds to a start
code. (See timestamp and marker bit in section 3.1.)

(5) It is RECOMMENDED that a single video packet is sent as a single
RTP packet. The size of a video packet SHOULD be adjusted so that the
resulting RTP packet is not larger than the path MTU.

Note: Rule (5) does not apply when the video packet is disabled by the
coder configuration (by setting resync_marker_disable in the VOL header
to 1), or in coding tools where the video packet is not supported. In
this case, a VOP MAY be split at arbitrary byte positions.

The video packet starts with the VOP header or the video packet header,
followed by motion_shape_texture(), and ends with next_resync_marker()
or next_start_code().
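
The rules above can be sketched as follows for the common case where the encoder emits video packets. This is an illustrative sketch under stated assumptions, not the document's algorithm: the function name `packetize_vop` and the `max_payload` parameter are hypothetical, the input is assumed to be the complete video packets of one VOP in decoding order (the first one starting with the VOP header), and `max_payload` is assumed to be the path MTU minus the RTP/UDP/IP header sizes.

```python
def packetize_vop(video_packets, max_payload):
    """Group the video packets of ONE VOP into RTP payloads.

    Only whole video packets are concatenated, so every payload begins
    with a VOP header or video packet header and no header is split
    (rules (2) and (3)). Returns a list of (payload, marker) tuples.
    """
    payloads = []
    current = b""
    for vp in video_packets:
        if len(vp) > max_payload:
            # Rule (5): the encoder should size video packets to fit the path MTU.
            raise ValueError("video packet larger than max_payload")
        if current and len(current) + len(vp) > max_payload:
            payloads.append(current)      # flush before the payload would overflow
            current = b""
        current += vp
    if current:
        payloads.append(current)
    # The marker bit is 1 only on the last (or only) RTP packet of the VOP.
    return [(p, i == len(payloads) - 1) for i, p in enumerate(payloads)]


# Three video packets of one VOP, first grouped, then one per RTP packet.
vps = [b"a" * 400, b"b" * 500, b"c" * 300]
grouped = packetize_vop(vps, max_payload=1000)
single = packetize_vop(vps, max_payload=500)
```

Choosing `max_payload` close to the video packet size yields the recommended one-video-packet-per-RTP-packet layout; a larger value trades loss resiliency for lower header overhead.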

3.3 Examples of packetized MPEG-4 Visual bitstream

Figure 2 shows examples of RTP packets generated according to the
criteria described in 3.2.

(a) is an example of the first RTP packet, or a random access point, of
an MPEG-4 Visual bitstream containing the configuration information.
According to criterion (1), the Visual Object Sequence Header (VS
header) is placed at the beginning of the RTP payload, preceding the
Visual Object Header and the Video Object Layer Header (VO header, VOL
header). Since the fragmentation rules defined in 3.2 guarantee that
the configuration information, which starts with
visual_object_sequence_start_code, is always placed at the beginning of
the RTP payload, RTP receivers can detect a random access point by
checking whether the first 32-bit field of the RTP payload is
visual_object_sequence_start_code.

(b) is another example of an RTP packet containing the configuration
information. It differs from example (a) in that the RTP packet also
contains a video packet of the VOP that follows the configuration
information. Since the configuration information is relatively short
(typically tens of bytes), an RTP packet containing only the
configuration information would increase the overhead. The
configuration information and the subsequent GOV and/or (part of a) VOP
can therefore be packetized into a single RTP packet, as in this
example.

(c) is an example of an RTP packet that contains a
Group_of_VideoObjectPlane (GOV). Following criterion (1), the GOV is
placed at the beginning of the RTP payload. Generating an RTP packet
that contains only a GOV, whose length is 7 bytes, would waste RTP/IP
header overhead. Therefore, (part of) the following VOP can be placed
in the same RTP packet, as shown in this example.

(d) is an example where one video packet is packetized into one RTP
packet. This kind of packetization is recommended when the packet loss
rate of the underlying network is high. Even when the RTP packet
containing the VOP header is lost, the other RTP packets can be decoded
by using the HEC (Header Extension Code) information in the video
packet header. No extra RTP header field is necessary.

(e) is an example where more than one video packet is packetized into
one RTP packet. This kind of packetization saves RTP/IP header overhead
when the bit rate of the underlying network is low. However, it
decreases packet loss resiliency, because a single RTP packet loss
discards multiple video packets. The optimal number of video packets in
an RTP packet and the length of the RTP packet can be determined from
the packet loss rate and the bit rate of the underlying network.

(f) is an example where the video packet is disabled by setting
resync_marker_disable in the VOL header to 1. In this case, a VOP may
be split into multiple RTP packets at arbitrary byte positions. For
example, a VOP can be split into fixed-length packets. This kind of
coder configuration and RTP fragmentation may be used when the
underlying network is guaranteed to be almost error-free. It is not
recommended in an error-prone environment, since its packet loss
resiliency is poor.
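
The random access point check described in example (a) can be sketched as below. The helper name `is_random_access_point` is hypothetical; the start code value 0x000001B0 is the visual_object_sequence_start_code of ISO/IEC 14496-2 and also appears at the start of the "config" example in section 5.2.

```python
# visual_object_sequence_start_code as a 32-bit big-endian value.
VISUAL_OBJECT_SEQUENCE_START_CODE = bytes.fromhex("000001B0")


def is_random_access_point(rtp_payload):
    """True if the payload starts with the Visual Object Sequence Header."""
    return rtp_payload[:4] == VISUAL_OBJECT_SEQUENCE_START_CODE


# The configuration bytes from the "config" example in section 5.2.
config = bytes.fromhex(
    "000001B001000001B5090000010000000120008440FA282C2090A21F")
```

A receiver that joins mid-stream could, for instance, discard payloads until this check succeeds and start decoding from there.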

Figure 3 shows examples of RTP packets prohibited by the criteria of
3.2.

Splitting a header across multiple RTP packets, as in (a), not only
increases the RTP/IP header overhead but also decreases error
resiliency. Therefore, it is prohibited by criterion (3).

When more than one video packet is concatenated into an RTP packet, a
VOP header or video_packet_header() shall not be placed in the middle
of the RTP payload. The packetization in (b) is not allowed by
criterion (2), for reasons of error resiliency. Compared with Figure
2(d), both cases map two video packets onto two RTP packets, but their
packet loss resiliency differs: if the second RTP packet is lost, both
video packets 1 and 2 are lost in the case of Figure 3(b), whereas only
video packet 2 is lost in the case of Figure 2(d).

    +------+------+------+------+
(a) | RTP  |  VS  |  VO  | VOL  |
    |header|header|header|header|
    +------+------+------+------+

    +------+------+------+------+------------+
(b) | RTP  |  VS  |  VO  | VOL  |Video Packet|
    |header|header|header|header|            |
    +------+------+------+------+------------+

    +------+-----+------------------+
(c) | RTP  | GOV |Video Object Plane|
    |header|     |                  |
    +------+-----+------------------+

    +------+------+------------+  +------+------+------------+
(d) | RTP  | VOP  |Video Packet|  | RTP  |  VP  |Video Packet|
    |header|header|    (1)     |  |header|header|    (2)     |
    +------+------+------------+  +------+------+------------+

    +------+------+------------+------+------------+------+------------+
(e) | RTP  |  VP  |Video Packet|  VP  |Video Packet|  VP  |Video Packet|
    |header|header|    (1)     |header|    (2)     |header|    (3)     |
    +------+------+------------+------+------------+------+------------+

    +------+------+------------+  +------+------------+
(f) | RTP  | VOP  |VOP fragment|  | RTP  |VOP fragment|
    |header|header|    (1)     |  |header|    (2)     |
    +------+------+------------+  +------+------------+

    Figure 2 - Examples of RTP packetized MPEG-4 Visual bitstream

    +------+-------------+  +------+------------+------------+
(a) | RTP  |First half of|  | RTP  |Last half of|Video Packet|
    |header|  VP header  |  |header| VP header  |            |
    +------+-------------+  +------+------------+------------+

    +------+------+----------+  +------+---------+------+------------+
(b) | RTP  | VOP  |First half|  | RTP  |Last half|  VP  |Video Packet|
    |header|header| of VP(1) |  |header| of VP(1)|header|    (2)     |
    +------+------+----------+  +------+---------+------+------------+

    Figure 3 - Examples of prohibited RTP packetization for MPEG-4
               Visual bitstream

4. RTP Packetization of MPEG-4 Audio bitstream

This section specifies the RTP packetization rules for MPEG-4 Audio
bitstreams. MPEG-4 Audio streams MUST be formatted by the LATM tool,
and the LATM-based streams are then mapped onto RTP packets as
described below.

4.1 RTP Packet Format

LATM-based streams consist of a sequence of audioMuxElements, each of
which includes one or more audio frames. A complete audioMuxElement, or
a part of one, SHALL be mapped directly onto an RTP payload without
removing any audioMuxElement syntax elements (see Figure 4). The first
byte of each audioMuxElement SHALL be located at the first payload
position of an RTP packet.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X|  CC   |M|     PT      |       sequence number         | RTP
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           timestamp                           | Header
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           synchronization source (SSRC) identifier            |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |            contributing source (CSRC) identifiers             |
   |                             ....                              |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                                                               | RTP
   :                 audioMuxElement (byte aligned)                : Payload
   |                                                               |
   |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               :...OPTIONAL RTP padding        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

            Figure 4 - An RTP packet for MPEG-4 Audio

To decode the audioMuxElement, the muxConfigPresent information
described below must be indicated by an out-of-band means. When SDP is
used for this indication, the MIME parameter "cpresent" corresponds to
the muxConfigPresent information (see section 5.3).

muxConfigPresent: If this value is set to 1 (in-band mode), the
audioMuxElement SHALL include an indication bit "useSameStreamMux" and
MAY include the configuration information for audio compression,
"StreamMuxConfig". The useSameStreamMux bit indicates whether the
StreamMuxConfig element of the previous frame also applies to the
current frame. If the useSameStreamMux bit indicates that the
StreamMuxConfig of the previous frame is to be used, and the previous
frame has been lost, the current frame may not be decodable. Therefore,
in in-band mode, the StreamMuxConfig element SHOULD be transmitted
repeatedly, depending on the network condition. Conversely, if
muxConfigPresent is set to 0 (out-of-band mode), the StreamMuxConfig
element must be transmitted by an out-of-band means. In the case of
SDP, the MIME parameter "config" is used (see section 5.3).

4.2 Use of RTP Header Fields for MPEG-4 Audio

Payload Type (PT): The assignment of an RTP payload type for this new
packet format is outside the scope of this document and is not
specified here. It is expected that the RTP profile for a particular
class of applications will assign a payload type for this encoding; if
that is not done, a payload type in the dynamic range shall be chosen
by means of an out-of-band signaling protocol (e.g., H.245, SIP, etc.).
In the dynamic assignment of RTP payload types for scalable streams, a
different value SHOULD be assigned to each layer. The assigned values
SHOULD follow the order of enhancement layer dependency, with the base
layer having the smallest value.

Marker (M) bit: The marker bit indicates audioMuxElement boundaries. It
is set to one to indicate that the RTP packet contains a complete
audioMuxElement or the last fragment of an audioMuxElement.

Timestamp: The timestamp indicates the sampling instant of the first
audio frame contained in the RTP packet. For security reasons, it is
recommended that timestamps start at a random value. Unless specified
by an out-of-band means, the resolution of the timestamp is set to its
default value of 90 kHz.

Sequence Number: Incremented by one for each RTP packet sent, starting,
for security reasons, with a random initial value.

Other header fields are used as described in RFC 1889 [8].
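
The timestamp and sequence number behaviour above comes down to simple modular arithmetic. The sketch below is illustrative only; `timestamp_increment` and `advance` are hypothetical helper names, and the 20 ms packet duration and 8000 Hz clock are taken from the CELP example in section 5.4.

```python
def timestamp_increment(ptime_ms, clock_rate=90000):
    """RTP timestamp ticks covered by a packet lasting ptime_ms milliseconds."""
    ticks, remainder = divmod(ptime_ms * clock_rate, 1000)
    if remainder:
        raise ValueError("packet duration is not a whole number of clock ticks")
    return ticks


def advance(seq, timestamp, ticks):
    """Next sequence number (16-bit wrap) and timestamp (32-bit wrap)."""
    return (seq + 1) & 0xFFFF, (timestamp + ticks) & 0xFFFFFFFF


# 20 ms packets: clock equal to the 8 kHz sampling rate vs. the 90 kHz default.
ticks_8k = timestamp_increment(20, 8000)
ticks_90k = timestamp_increment(20)
```

Both fields start from random values, so a sender has to handle the wrap-around that `advance` applies from the very first packets.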

4.3 Fragmentation of MPEG-4 Audio bitstream

It is RECOMMENDED to put one audioMuxElement in each RTP packet. This
is no problem if the audioMuxElement is small enough that the RTP
packet containing it does not exceed the path MTU. Otherwise, the
audioMuxElement MAY be fragmented and spread across multiple packets.
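
A minimal sketch of this fragmentation, combined with the marker bit rule of section 4.2, might look as follows. The function name `fragment_audio_mux_element` and the `max_payload` parameter (assumed to be the path MTU minus the RTP/UDP/IP header sizes) are hypothetical.

```python
def fragment_audio_mux_element(element, max_payload):
    """Split one audioMuxElement into RTP payloads of at most max_payload bytes.

    Returns (payload, marker) tuples; the marker is True only for the
    packet that carries the complete element or its last fragment.
    """
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    chunks = [element[i:i + max_payload]
              for i in range(0, len(element), max_payload)]
    return [(chunk, i == len(chunks) - 1) for i, chunk in enumerate(chunks)]


# A small element fits in one packet; a large one is spread across three.
small = fragment_audio_mux_element(b"\x00" * 300, 1000)
large = fragment_audio_mux_element(b"\x00" * 2500, 1000)
```

Because each element is fragmented on its own, its first byte always lands at the start of an RTP payload, as section 4.1 requires.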

5. MIME type registration for MPEG-4 Audio/Visual streams

The following sections describe the MIME type registrations for MPEG-4
Audio/Visual streams. The MIME type registration and SDP usage for the
MPEG-4 Visual stream are described in sections 5.1 and 5.2; the MIME
type registration and SDP usage for the MPEG-4 Audio stream are
described in sections 5.3 and 5.4.

5.1 MIME type registration for MPEG-4 Visual

MIME media type name: video

MIME subtype name: MP4V-ES

Required parameters: none

Optional parameters:

   rate: This parameter is used only for RTP transport. It indicates
   the resolution of the timestamp field in the RTP header. If this
   parameter is not specified, the default value of 90000 (90 kHz) is
   used.

   profile-level-id: A decimal representation of the MPEG-4 Visual
   Profile and Level indication value (profile_and_level_indication)
   defined in Table G-1 of ISO/IEC 14496-2 [2][4]. This parameter MAY
   be used in the capability exchange or session setup procedure to
   indicate the MPEG-4 Visual Profile and Level combination of which
   the MPEG-4 Visual codec is capable. If this parameter is not
   specified, the default value of 1 (Simple Profile/Level 1) is used.

   config: This parameter SHALL be used to indicate the configuration
   of the corresponding MPEG-4 Visual bitstream. It SHALL NOT be used
   to indicate the codec capability in the capability exchange
   procedure. It is a hexadecimal representation of an octet string
   that expresses the MPEG-4 Visual configuration information, as
   defined in subclause 6.2.1 "Start codes" of ISO/IEC 14496-2
   [2][4][9]. The configuration information is mapped onto the octet
   string on an MSB-first (most significant bit first) basis. The first
   bit of the configuration information SHALL be located at the MSB of
   the first octet. The configuration information indicated by this
   parameter SHALL be the same as the configuration information in the
   corresponding MPEG-4 Visual stream, except for
   first_half_vbv_occupancy and latter_half_vbv_occupancy, if present,
   which may vary in the repeated configuration information inside an
   MPEG-4 Visual stream (see 6.2.1 "Start codes" of ISO/IEC 14496-2).

   Example usages of these parameters are:

   - MPEG-4 Visual Simple Profile/Level 1:
        Content-type: video/mp4v-es; profile-level-id=1

   - MPEG-4 Visual Core Profile/Level 2:
        Content-type: video/mp4v-es; profile-level-id=34

   - MPEG-4 Visual Advanced Real Time Simple Profile/Level 1:
        Content-type: video/mp4v-es; profile-level-id=145

Published specification:
   The specifications for MPEG-4 Visual streams are presented in
   ISO/IEC 14496-2 [2][4][9]. The RTP payload format is described in
   RFC 3016.

Encoding considerations:
   Video bitstreams MUST be generated according to the MPEG-4 Visual
   specifications (ISO/IEC 14496-2). A video bitstream is binary data
   and MUST be encoded for non-binary transport (for email, the Base64
   encoding is sufficient). This type is also defined for transfer via
   RTP. The RTP packets MUST be packetized according to the MPEG-4
   Visual RTP payload format defined in RFC 3016.

Security considerations:
   See section 6 of RFC 3016.

Interoperability considerations:
   MPEG-4 Visual provides a large and rich set of tools for the coding
   of visual objects. For effective implementation of the standard,
   subsets of the MPEG-4 Visual tool sets have been provided for use in
   specific applications. These subsets, called 'Profiles', limit the
   size of the tool set a decoder is required to implement. To restrict
   computational complexity, one or more Levels are set for each
   Profile. A Profile@Level combination allows:

   - a codec builder to implement only the subset of the standard that
     is needed, while maintaining interworking with other MPEG-4
     devices included in the same combination, and

   - checking whether MPEG-4 devices comply with the standard
     ('conformance testing').

   The visual stream SHALL be compliant with the MPEG-4 Visual
   Profile@Level specified by the parameter "profile-level-id".
   Interoperability between a sender and a receiver may be achieved by
   specifying the parameter "profile-level-id" in the MIME content, or
   by arranging in the capability exchange/announcement procedure to
   set this parameter mutually to the same value.

Applications which use this media type:
   Audio and visual streaming and conferencing tools, Internet
   messaging and email applications.

Additional information: none

Person & email address to contact for further information:
   The authors of RFC 3016 (see section 8).

Intended usage: COMMON

Author/Change controller:
   The authors of RFC 3016 (see section 8).

5.2 SDP usage of MPEG-4 Visual

The MIME media type video/MP4V-ES string is mapped to fields in the
Session Description Protocol (SDP), RFC 2327, as follows:

- The MIME type (video) goes in SDP "m=" as the media name.

- The MIME subtype (MP4V-ES) goes in SDP "a=rtpmap" as the encoding
  name.

- The optional parameter "rate" goes in "a=rtpmap" as the clock rate.

- The optional parameters "profile-level-id" and "config" go in the
  "a=fmtp" line to indicate the coder capability and configuration,
  respectively. These parameters are expressed as a MIME media type
  string, in the form of a semicolon-separated list of parameter=value
  pairs.

The following are some examples of media representation in SDP:

Simple Profile/Level 1, rate=90000 (90 kHz), "profile-level-id" and
"config" are present in the "a=fmtp" line:
  m=video 49170/2 RTP/AVP 98
  a=rtpmap:98 MP4V-ES/90000
  a=fmtp:98 profile-level-id=1;config=000001B001000001B50900000100000001
     20008440FA282C2090A21F

Core Profile/Level 2, rate=90000 (90 kHz), "profile-level-id" is
present in the "a=fmtp" line:
  m=video 49170/2 RTP/AVP 98
  a=rtpmap:98 MP4V-ES/90000
  a=fmtp:98 profile-level-id=34

Advanced Real Time Simple Profile/Level 1, rate=90000 (90 kHz),
"profile-level-id" is present in the "a=fmtp" line:
  m=video 49170/2 RTP/AVP 98
  a=rtpmap:98 MP4V-ES/90000
  a=fmtp:98 profile-level-id=145
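
On the receiving side, the "a=fmtp" parameters have to be split and the hexadecimal "config" string turned back into octets. The following is a rough sketch with hypothetical helper names (`parse_fmtp`, `decode_config`); it assumes the attribute has already been joined into a single line and does no validation beyond the prefix check.

```python
def parse_fmtp(line):
    """Split an SDP 'a=fmtp' line into (payload_type, parameter dict)."""
    prefix = "a=fmtp:"
    if not line.startswith(prefix):
        raise ValueError("not an a=fmtp line")
    pt_text, _, param_text = line[len(prefix):].strip().partition(" ")
    params = {}
    for item in param_text.split(";"):
        item = item.strip()
        if not item:
            continue                       # tolerate a trailing semicolon
        name, _, value = item.partition("=")
        params[name.strip().lower()] = value.strip()
    return int(pt_text), params


def decode_config(params):
    """Octet string carried by the hexadecimal 'config' parameter, or None."""
    hex_text = params.get("config")
    return bytes.fromhex(hex_text) if hex_text else None


# The Simple Profile/Level 1 example from this section, as one line.
pt, params = parse_fmtp(
    "a=fmtp:98 profile-level-id=1;"
    "config=000001B001000001B5090000010000000120008440FA282C2090A21F")
config_octets = decode_config(params)
# profile-level-id defaults to 1 (Simple Profile/Level 1) when absent.
profile_level = int(params.get("profile-level-id", 1))
```

Since the first bit of the configuration sits at the MSB of the first octet, the decoded bytes can be handed to the decoder unchanged.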

5.3 MIME type registration of MPEG-4 Audio

MIME media type name: audio

MIME subtype name: MP4A-LATM

Required parameters:

   rate: The rate parameter indicates the RTP timestamp clock rate. The
   default value is 90000. Other rates MAY be specified only if they
   are set to the same value as the audio sampling rate (number of
   samples per second).

Optional parameters:

   profile-level-id: A decimal representation of the MPEG-4 Audio
   Profile Level indication value defined in ISO/IEC 14496-1 ([6] and
   its amendments). This parameter indicates which MPEG-4 Audio tool
   subsets the decoder is capable of using. If this parameter is not
   specified in the capability exchange or session setup procedure, the
   default value of 30 (Natural Audio Profile/Level 1) is used.

   object: A decimal representation of the MPEG-4 Audio Object Type
   value defined in ISO/IEC 14496-3 [5]. This parameter specifies the
   tool to be used by the coder. It can be used to limit the capability
   within the specified "profile-level-id".

   bitrate: The data rate of the audio bitstream.

   cpresent: A boolean parameter that indicates whether the audio
   payload configuration data has been multiplexed into the RTP payload
   (see section 4.1). A 0 indicates that the configuration data has not
   been multiplexed into the RTP payload; a 1 indicates that it has.
   The default, if the parameter is omitted, is 1.

   config: A hexadecimal representation of an octet string that
   expresses the audio payload configuration data "StreamMuxConfig", as
   defined in ISO/IEC 14496-3 [5] (see section 4.1). The configuration
   data is mapped onto the octet string on an MSB-first (most
   significant bit first) basis. The first bit of the configuration
   data SHALL be located at the MSB of the first octet. In the last
   octet, zero-padding bits, if necessary, shall follow the
   configuration data.

   ptime: RECOMMENDED duration of each packet in milliseconds.

Published specification:
   The payload format specification is described in this document. The
   encoding specifications are described in ISO/IEC 14496-3 [3][5].

Encoding considerations:
   This type is only defined for transfer via RTP.

Security considerations:
   See section 6 of RFC 3016.

Interoperability considerations:
   MPEG-4 Audio provides a large and rich set of tools for the coding
   of audio objects. For effective implementation of the standard,
   subsets of the MPEG-4 Audio tool sets, similar to those of MPEG-4
   Visual, have been provided (see section 5.1).

   The audio stream SHALL be compliant with the MPEG-4 Audio
   Profile@Level specified by the parameter "profile-level-id".
   Interoperability between a sender and a receiver may be achieved by
   specifying the parameter "profile-level-id" in the MIME content, or
   by arranging in the capability exchange procedure to set this
   parameter mutually to the same value. Furthermore, the "object"
   parameter can be used to limit the capability within the specified
   Profile@Level in the capability exchange.

Applications which use this media type:
   Audio streaming and conferencing tools.

Additional information: none

Person & email address to contact for further information:
   See section 8 of RFC 3016.

Intended usage: COMMON

Author/Change controller:
   See section 8 of RFC 3016.
5.4 SDP usage of MPEG-4 Audio
The MIME media type audio/MP4A-LATM string is mapped to fields in SDP (RFC 2327) as follows:
• The MIME type (audio) goes in SDP "m=" as the media name.
• The MIME subtype (MP4A-LATM) goes in SDP "a=rtpmap" as the encoding name.
• The required parameter "rate" goes in "a=rtpmap" as the clock rate.
• The optional parameter "ptime" goes in the SDP "a=ptime" attribute.
• The optional parameter "profile-level-id" goes in the "a=fmtp" line to indicate the coder capability. The
parameter "object" goes in the "a=fmtp" attribute. The payload-format-specific parameters "bitrate",
"cpresent" and "config" also go in the "a=fmtp" line. These parameters are written as a semicolon-separated
list of "parameter=value" pairs, following the MIME media type string.
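The mapping rules above can be sketched as a small helper that assembles the SDP lines from the MIME parameters. The function name `build_sdp_media` and its argument names are hypothetical; the port, payload type and parameter values are taken from the CELP example in this section.

```python
def build_sdp_media(port, payload_type, rate, ptime=None, fmtp=None):
    """Return the SDP lines for one audio/MP4A-LATM stream.

    fmtp is a dict of MIME parameters (profile-level-id, object, bitrate,
    cpresent, config) written as a semicolon-separated parameter=value list.
    """
    lines = [
        f"m=audio {port} RTP/AVP {payload_type}",        # MIME type -> media name
        f"a=rtpmap:{payload_type} MP4A-LATM/{rate}",     # subtype and clock rate
    ]
    if fmtp:
        params = ";".join(f"{name}={value}" for name, value in fmtp.items())
        lines.append(f"a=fmtp:{payload_type} {params}")
    if ptime is not None:
        lines.append(f"a=ptime:{ptime}")
    return lines


celp = build_sdp_media(
    49230, 96, 8000, ptime=20,
    fmtp={"profile-level-id": 9, "object": 8, "cpresent": 0,
          "config": "9128b1071070"},
)
print("\n".join(celp))
```

The sketch writes the "a=fmtp" attribute on a single line; the examples in this section fold it only to fit the page width.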
The following are examples of the media representation in SDP:
For a 6 kb/s CELP bitstream (audio sampling rate of 8 kHz),
  m=audio 49230 RTP/AVP 96
  a=rtpmap:96 MP4A-LATM/8000
  a=fmtp:96 profile-level-id=9; object=8; cpresent=0;
    config=9128b1071070
  a=ptime:20
For a 64 kb/s AAC LC stereo bitstream (audio sampling rate of 24 kHz),
  m=audio 49230 RTP/AVP 96
  a=rtpmap:96 MP4A-LATM/24000
  a=fmtp:96 profile-level-id=1; bitrate=64000; cpresent=0;
    config=9122620000
In the above two examples, the audio configuration data is described only in SDP and is not multiplexed into
the RTP payload. In addition, the clock rate is set to the audio sampling rate.
If the clock rate is set to its default value and the audio sampling rate needs to be known, it can be
obtained by parsing the "config" parameter, as in the following example:
  m=audio 49230 RTP/AVP 96
  a=rtpmap:96 MP4A-LATM/90000
  a=fmtp:96 object=8; cpresent=0; config=9128b1071070
The following example shows the case where the audio configuration data appears in the RTP payload:
  m=audio 49230 RTP/AVP 96
  a=rtpmap:96 MP4A-LATM/90000
  a=fmtp:96 object=2; cpresent=1
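On the receiving side, the "a=fmtp" line can be split back into parameters to decide where the configuration data comes from. The following is a minimal sketch: the helper names are hypothetical, the attribute is assumed to be already unfolded onto one line, and lower-casing the parameter names is an assumption of this sketch. The default of 1 for "cpresent" follows the parameter definition in 5.3.

```python
def parse_fmtp(line):
    """Split 'a=fmtp:<pt> name=value; ...' into (payload_type, params)."""
    prefix = "a=fmtp:"
    if not line.startswith(prefix):
        raise ValueError("not an a=fmtp line")
    payload_type, _, rest = line[len(prefix):].partition(" ")
    params = {}
    for item in rest.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing semicolon
        name, _, value = item.partition("=")
        params[name.strip().lower()] = value.strip()
    # "cpresent" defaults to 1 when the parameter is omitted.
    params.setdefault("cpresent", "1")
    return int(payload_type), params


def config_in_band(params):
    """True when the configuration data is multiplexed into the RTP payload."""
    return params["cpresent"] == "1"


pt, out_of_band = parse_fmtp("a=fmtp:96 object=8; cpresent=0; config=9128b1071070")
_, in_band = parse_fmtp("a=fmtp:96 object=2; cpresent=1")
_, defaulted = parse_fmtp("a=fmtp:96 object=2")

print(pt, config_in_band(out_of_band), out_of_band["config"])
print(config_in_band(in_band), config_in_band(defaulted))
```

When `config_in_band` returns False, the receiver would take the configuration from the "config" value instead of the RTP payload.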
6. Security considerations
RTP packets using the payload format described in this specification are subject to the security
considerations discussed in the RTP specification [8]. This means that confidentiality of the media streams is
achieved by encryption. Because the data compression used with this payload format is applied end-to-end,
encryption can be performed on the compressed data, so there is no conflict between the two operations.
The complete MPEG-4 system allows transport of a wide range of content, including Java applets (MPEG-J) and
scripts. Since this payload format is restricted to audio and video streams, such active content cannot be
transported with it.
7. References
1 Bradner, S., "The Internet Standards Process - Revision 3", BCP 9, RFC 2026, October 1996.
2 ISO/IEC 14496-2:1999, "Information technology - Coding of audio-visual objects - Part 2: Visual".
3 ISO/IEC 14496-3:1999, "Information technology - Coding of audio-visual objects - Part 3: Audio".
4 ISO/IEC 14496-2:1999/Amd.1:2000, "Information technology - Coding of audio-visual objects - Part 2:
Visual, Amendment 1: Visual extensions".
5 ISO/IEC 14496-3:1999/Amd.1:2000, "Information technology - Coding of audio-visual objects - Part 3:
Audio, Amendment 1: Audio extensions".
6 ISO/IEC 14496-1:1999, "Information technology - Coding of audio-visual objects - Part 1: Systems".
7 Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
8 Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson, "RTP: A Transport Protocol for Real-Time
Applications", RFC 1889, January 1996.
9 ISO/IEC 14496-2:1999/Cor.1:2000, "Information technology - Coding of audio-visual objects - Part 2:
Visual, Technical Corrigendum 1".
8. Authors' addresses
Yoshihiro Kikuchi
Toshiba Corporation
1, Komukai Toshiba-cho, Saiwai-ku, Kawasaki, 212-8582, Japan
Email: yoshihiro.kikuchi@toshiba.co.jp
Yoshinori Matsui
Matsushita Electric Industrial Co., Ltd.
1006, Kadoma, Kadoma-Shi, Osaka, Japan
Email: matsui@drl.mei.co.jp
Toshiyuki Nomura
NEC Corporation
4-1-1, Miyazaki, Miyamae-ku, Kawasaki, Japan
Email: t-nomura@ccm.cl.nec.co.jp
Shigeru Fukunaga
Oki Electric Industry Co., Ltd.
1-2-27 Shiromi, Chuo-ku, Osaka 540-6025, Japan
Email: fukunaga444@oki.co.jp
Hideaki Kimata
Nippon Telegraph and Telephone Corporation
1-1, Hikari-no-oka, Yokosuka-shi, Kanagawa, Japan
Email: kimata@nttvdt.hil.ntt.co.jp
9. Copyright statement
Copyright (C) The Internet Society (2000). All Rights Reserved.
This document and translations of it may be copied and furnished to others, and derivative works that
comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and
distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice
and this paragraph are included on all such copies and derivative works. However, this document itself may
not be modified in any way, such as by removing the copyright notice or references to the Internet Society or
other Internet organizations, except as needed for the purpose of developing Internet standards in which
case the procedures for copyrights defined in the Internet Standards process must be followed, or as
required to translate it into languages other than English.
The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its
successors or assigns.
This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY
AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT
LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Acknowledgement
Funding for the RFC Editor function is currently provided by the Internet Society.