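Multimedia File Formats: RMVB
[Date: 2016-07] [Status: Open]
RM/RMVB is a proprietary container format from RealNetworks. Common file extensions are rm, ra, and rmvb.
It usually carries Real's proprietary codecs, for example sipro, cook, atrc, ralf, and raac for audio, and RV10, RV20, RV30, and RV40 for video.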
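0. Why study a multimedia container format
The goal is to answer the following questions:
- How is data organized in the container?
- Which codecs can the container carry, and how is their data stored?
- What metadata does the container hold? What program information?
- For containers that support multiple programs, how do you locate the matching audio, video, and subtitle streams?
- How do you determine the playback duration of a program in the container?
- How do you extract audio, video, and subtitle data from the container and hand it to a decoder? Are timestamps available?
- Does the container support seeking? What auxiliary information helps?
- Can it be streamed directly?
- Where is the most authoritative documentation for the format?
- Which tools help analyze anomalies or errors in the container?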
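1. RM file format overview
RealMedia File Format (RMFF) is a tag-based file format. Each tag uses a four-byte identifier (FOURCC) to mark the element type.
The basic building block of an RM file is the chunk. Each chunk is laid out as follows: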
==============
ID(FOURCC)
--------------
size(4 byte)
--------------
data([size])
==============
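The ID of each chunk determines how its data field is parsed. Top-level chunks can contain sub-chunks.
A typical chunk layout is shown in the figure below:
An RM file usually has three parts: the header section, the data section, and the index section. Each section consists of multiple chunks, as shown in the figure below:
The following sections describe each section in detail.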
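The chunk layout described above can be walked with a few lines of Python. This is a minimal sketch, not a reference implementation: the function name iter_chunks is made up for illustration, and it assumes that fields are big-endian and that size counts the whole chunk, including the 8 bytes of ID and size.

```python
import io
import struct

def iter_chunks(stream):
    """Yield (fourcc, size, payload) for each top-level chunk in a binary stream."""
    while True:
        head = stream.read(8)
        if len(head) < 8:
            return  # fewer than 8 bytes left: no further chunk header
        fourcc, size = struct.unpack(">4sI", head)  # assumed big-endian
        if size < 8:
            return  # implausible size; stop instead of looping forever
        # Assumption: size covers the 8 header bytes too, so the payload is size - 8.
        yield fourcc, size, stream.read(size - 8)

# Synthetic bytes for illustration only, not taken from a real file.
demo = (struct.pack(">4sI", b".RMF", 18) + bytes(10)
        + struct.pack(">4sI", b"PROP", 12) + bytes(4))
for fourcc, size, payload in iter_chunks(io.BytesIO(demo)):
    print(fourcc.decode("ascii"), size, len(payload))
```

For a real file the DATA chunk can be very large, so a practical reader would seek past it instead of reading the payload into memory.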
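2. RM file header (header section)
RMFF is a tag-based format, so the chunks in the header section do not appear in a fixed order, except that the RealMedia File Header must be the first chunk. The chunks that follow include the Properties Header, the Media Properties Header, and the Content Description Header.
RealMedia File Header
The RealMedia File Header is typically used to identify the file format, and each RM file has exactly one. It contains the following fields: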
RealMedia_File_Header
{
UINT32 object_id;
UINT32 size;
UINT16 object_version;
if ((object_version == 0) || (object_version == 1))
{
UINT32 file_version;
UINT32 num_headers;
}
}
The fields are described in the table below:
field | type | description |
---|---|---|
object_id | UINT32 | The unique object ID for a RealMedia File ('.RMF'). All RealMedia files begin with this identifier. |
size | UINT32 | The size of the RealMedia header section in bytes. |
object_version | UINT16 | The version of the RealMedia File Header object. All files created according to this specification have an object_version number of 0 (zero) or 1. |
file_version | UINT32 | The version of the RealMedia file. This member is present on all RealMedia_File_Header objects with an object_version of 0 (zero) or 1. |
num_headers | UINT32 | The number of headers in the header section that follow the RealMedia File Header. This member is present on all RealMedia_File_Header objects with an object_version of 0 (zero) or 1. |
Note: the tables that follow omit the object_version conditions; consult the specification for details.
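As a quick illustration of the structure above, the following sketch checks the '.RMF' identifier and reads the remaining fields. The function name parse_file_header is hypothetical, and big-endian byte order is an assumption.

```python
import struct

def parse_file_header(buf):
    """Parse a RealMedia_File_Header from the first bytes of a file."""
    object_id, size, object_version = struct.unpack_from(">4sIH", buf, 0)
    if object_id != b".RMF":
        raise ValueError("not a RealMedia file")
    header = {"size": size, "object_version": object_version}
    if object_version in (0, 1):
        # file_version and num_headers follow the 10 fixed bytes.
        file_version, num_headers = struct.unpack_from(">II", buf, 10)
        header.update(file_version=file_version, num_headers=num_headers)
    return header

# Synthetic header for illustration only.
sample = struct.pack(">4sIHII", b".RMF", 18, 0, 0, 5)
print(parse_file_header(sample))
```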
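RM Properties Header
The Properties Header describes the general media properties of an RMF file.
The RM system uses the data in this object when processing the data in an RM file or stream. An RMF file has only one Properties Header. It contains the following fields: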
Properties_Header
{
UINT32 object_id;
UINT32 size;
UINT16 object_version;
if (object_version == 0)
{
UINT32 max_bit_rate;
UINT32 avg_bit_rate;
UINT32 max_packet_size;
UINT32 avg_packet_size;
UINT32 num_packets;
UINT32 duration;
UINT32 preroll;
UINT32 index_offset;
UINT32 data_offset;
UINT16 num_streams;
UINT16 flags;
}
}
The fields are described in the table below:
field | type | description |
---|---|---|
object_id | UINT32 | The unique object ID for a Properties Header ('PROP'). |
size | UINT32 | The 32-bit size of the Properties Header in bytes. |
object_version | UINT16 | The version of the Properties Header object. All files created according to this specification have an object_version number of 0 (zero). |
max_bit_rate | UINT32 | The maximum bit rate required to deliver this file over a network. |
avg_bit_rate | UINT32 | The average bit rate required to deliver this file over a network. |
max_packet_size | UINT32 | The largest packet size (in bytes) in the media data. |
avg_packet_size | UINT32 | The average packet size (in bytes) in the media data. |
num_packets | UINT32 | The number of packets in the media data. |
duration | UINT32 | The duration of the file in milliseconds. |
preroll | UINT32 | The number of milliseconds to prebuffer before starting playback. |
index_offset | UINT32 | The offset in bytes from the start of the file to the start of the index header object. This value can be 0 (zero), which indicates that no index chunks are present in this file. |
data_offset | UINT32 | The offset in bytes from the start of the file to the start of the Data Section. Note: There can be a number of Data_Chunk_Headers in a RealMedia file. The data_offset value specifies the offset in bytes to the first Data_Chunk_Header. The offsets to the other Data_Chunk_Headers can be derived from the next_data_header field in a Data_Chunk_Header. |
num_streams | UINT16 | The total number of media properties headers in the main headers section. |
flags | UINT16 | Bit flags describing characteristics of the file. |
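RM Media Properties Header
The Media Properties Header describes the media properties specific to each stream in an RM file. Every stream has its own Media Properties Header. It contains the following fields: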
Media_Properties_Header
{
UINT32 object_id;
UINT32 size;
UINT16 object_version;
if (object_version == 0)
{
UINT16 stream_number;
UINT32 max_bit_rate;
UINT32 avg_bit_rate;
UINT32 max_packet_size;
UINT32 avg_packet_size;
UINT32 start_time;
UINT32 preroll;
UINT32 duration;
UINT8 stream_name_size;
UINT8[stream_name_size] stream_name;
UINT8 mime_type_size;
UINT8[mime_type_size] mime_type;
UINT32 type_specific_len;
UINT8[type_specific_len] type_specific_data;
}
}
The fields are described in the table below:
field | type | description |
---|---|---|
object_id | UINT32 | The unique object ID for a Media Properties Header ("MDPR"). |
size | UINT32 | The size of the Media Properties Header in bytes. |
object_version | UINT16 | The version of the Media Properties Header object. |
stream_number | UINT16 | The stream_number (synchronization source identifier) is a unique value that identifies a physical stream. Every data packet that belongs to a physical stream contains the same STREAM_NUMBER. The STREAM_NUMBER enables a receiver of multiple physical streams to distinguish which packets belong to each physical stream. |
max_bit_rate | UINT32 | The maximum bit rate required to deliver this stream over a network. |
avg_bit_rate | UINT32 | The average bit rate required to deliver this stream over a network. |
max_packet_size | UINT32 | The largest packet size (in bytes) in the stream of media data. |
avg_packet_size | UINT32 | The average packet size (in bytes) in the stream of media data. |
start_time | UINT32 | The time offset in milliseconds to add to the time stamp of each packet in a physical stream. |
preroll | UINT32 | The time offset in milliseconds to subtract from the time stamp of each packet in a physical stream. |
duration | UINT32 | The duration of the stream in milliseconds. |
stream_name_size | UINT8 | The length of the following stream_name member in bytes. |
stream_name | UINT8[] | A nonunique alias or name for the stream. The size of this member is variable. |
mime_type_size | UINT8 | The length of the following mime_type field in bytes. |
mime_type | UINT8[] | A nonunique MIME style type/subtype string for data associated with the stream. The size of this member is variable. |
type_specific_len | UINT32 | The length of the following type_specific_data in bytes. The type_specific_data is typically used by the data type renderer to initialize itself in order to process the physical stream. |
type_specific_data | UINT8[] | The type_specific_data is typically used by the data type renderer to initialize itself in order to process the physical stream. The size of this member is variable. |
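Because this header mixes fixed-width fields with three length-prefixed members, parsing it means tracking a running position. The sketch below shows one way to do that; parse_media_properties is a hypothetical name, and big-endian byte order is assumed.

```python
import struct

_MDPR_FIXED = struct.Struct(">H7I")  # stream_number, then seven UINT32 fields

def parse_media_properties(chunk):
    """Parse a whole MDPR chunk (bytes), starting at its object_id."""
    object_id, _size, version = struct.unpack_from(">4sIH", chunk, 0)
    if object_id != b"MDPR" or version != 0:
        raise ValueError("not a version 0 Media Properties Header")
    (stream_number, max_bit_rate, avg_bit_rate, max_packet_size,
     avg_packet_size, start_time, preroll, duration) = _MDPR_FIXED.unpack_from(chunk, 10)
    pos = 10 + _MDPR_FIXED.size
    name_len = chunk[pos]                      # UINT8 length prefix
    stream_name = chunk[pos + 1:pos + 1 + name_len]
    pos += 1 + name_len
    mime_len = chunk[pos]                      # UINT8 length prefix
    mime_type = chunk[pos + 1:pos + 1 + mime_len]
    pos += 1 + mime_len
    (extra_len,) = struct.unpack_from(">I", chunk, pos)  # UINT32 length prefix
    type_specific_data = chunk[pos + 4:pos + 4 + extra_len]
    return {
        "stream_number": stream_number,
        "avg_bit_rate": avg_bit_rate,
        "start_time": start_time,
        "preroll": preroll,
        "duration": duration,
        "stream_name": stream_name,
        "mime_type": mime_type,
        "type_specific_data": type_specific_data,
    }

# Synthetic chunk for illustration only.
_name, _mime, _extra = b"Audio Stream", b"audio/x-pn-realaudio", b"\x01\x02\x03"
_body = (struct.pack(">H7I", 1, 64000, 64000, 640, 640, 0, 1500, 60000)
         + bytes([len(_name)]) + _name + bytes([len(_mime)]) + _mime
         + struct.pack(">I", len(_extra)) + _extra)
sample = struct.pack(">4sIH", b"MDPR", 10 + len(_body), 0) + _body
info = parse_media_properties(sample)
print(info["stream_number"], info["mime_type"], info["duration"])
```

The mime_type is what a demuxer would use to pick a decoder, and type_specific_data is handed to that decoder as its initialization data.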
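RM Logical Stream Properties Header
An RM file can carry multiple programs. RM files typically use a logical stream to group several physical streams. A logical stream records which physical streams it is made of, plus properties that identify it (such as language or packet grouping).
Logical streams are also stored in a Media Properties Header, with a mime type prefixed by "logical-". For example, if a RealAudio physical stream has the mime type audio/x-pn-multirate-realaudio, the corresponding logical stream has the mime type logical-audio/x-pn-multirate-realaudio. The figure below shows an example of how a logical stream is composed:
In the Media Properties Header of a logical stream, the type_specific_data field contains a LogicalStream structure.
A file also has one special logical stream with the MIME type logical-fileinfo. It holds information about the whole file, and a file can contain only one such stream.
LogicalStream Structure
It contains the following fields: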
LogicalStream
{
ULONG32 size;
UINT16 object_version;
if (object_version == 0)
{
UINT16 num_physical_streams;
UINT16 physical_stream_numbers[num_physical_streams];
ULONG32 data_offsets[num_physical_streams];
UINT16 num_rules;
UINT16 rule_to_physical_stream_number_map[num_rules];
UINT16 num_properties;
NameValueProperty properties[num_properties];
}
};
The fields are described in the table below:
field | type | description |
---|---|---|
size | UINT32 | The size of the LogicalStream structure in bytes. |
object_version | UINT16 | The version of the LogicalStream structure. |
num_physical_streams | UINT16 | The number of physical streams that make up this logical stream. The physical stream numbers are stored in a list immediately following this field. These physical stream numbers refer to the stream_number field found in the Media Properties Object for each physical stream belonging to this logical stream. |
physical_stream_numbers | UINT16[] | The list of physical stream numbers that comprise this logical stream. The size of this structure member is variable. |
data_offsets | UINT32[] | The list of data offsets indicating the start of the data section for each physical stream. The size of this structure member is variable. |
num_rules | UINT16 | The number of ASM rules for the logical stream. Each physical stream in the logical stream has at least one ASM rule associated with it or it will never get played. The mapping of ASM rule numbers to physical stream numbers is stored in a list immediately following this member. These physical stream numbers refer to the stream_number field found in the Media Properties Object for each physical stream belonging to this logical stream. |
rule_to_physical_stream_map | UINT16[] | The list of physical stream numbers that map to each rule. Each entry in the map corresponds to a 0-based rule number. The value in each entry is set to the physical stream number for the rule. For example: rule_to_physical_stream_map[0] = 5 This example means physical stream 5 corresponds to rule 0. All of the ASM rules referenced by this array are stored in the first name-value pair of this logical stream which must be called "ASMRuleBook" and be of type "string". Each rule is separated by a semicolon. The size of this structure member is variable. |
num_properties | UINT16 | The number of NameValueProperty structures contained in this structure. These name/value structures can be used to identify properties of this logical stream (for example, language). |
properties | NameValueProperty[] | The list of NameValueProperty structures (see NameValueProperty Structure below for more details). As mentioned above, it is required that the first name-value pair be a string named "ASMRuleBook" and contain the ASM rules for this logical stream. The size of this structure member is variable. |
NameValueProperty Structure
It contains the following fields:
NameValueProperty
{
ULONG32 size;
UINT16 object_version;
if (object_version == 0)
{
UINT8 name_length;
UINT8 name[name_length];
INT32 type;
UINT16 value_length;
UINT8 value_data[value_length];
}
}
The fields are described in the table below:
field | type | description |
---|---|---|
size | UINT32 | The size of the NameValueProperty structure in bytes. |
object_version | UINT16 | The version of the NameValueProperty structure. |
name_length | UINT8 | The length of the name data. |
name | UINT8[] | The name string data. |
type | UINT32 | The type of the value data. This member can take one of three values (any other value is undefined): 0 = 32-bit unsigned integer property; 1 = buffer; 2 = string. |
value_length | UINT16 | The length of the value data. |
value_data | UINT8[] | The value data. |
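The LogicalStream and NameValueProperty structures nest, so a sketch of reading both together may help. It assumes big-endian fields and takes the type_specific_data of a "logical-" Media Properties Header as input. The function names and the returned dictionary keys are made up for illustration.

```python
import struct

def parse_name_value(buf, pos=0):
    """Parse one NameValueProperty at pos; return (name, type, value, next_pos)."""
    _size, version = struct.unpack_from(">IH", buf, pos)
    if version != 0:
        raise ValueError("unsupported NameValueProperty version")
    pos += 6
    name_len = buf[pos]
    name = bytes(buf[pos + 1:pos + 1 + name_len])
    pos += 1 + name_len
    value_type, value_len = struct.unpack_from(">IH", buf, pos)
    pos += 6
    raw = bytes(buf[pos:pos + value_len])
    # Type 0 is a 32-bit unsigned integer; buffers (1) and strings (2) stay as bytes.
    value = int.from_bytes(raw, "big") if value_type == 0 else raw
    return name, value_type, value, pos + value_len

def parse_logical_stream(buf):
    """Parse a LogicalStream structure from type_specific_data."""
    _size, version = struct.unpack_from(">IH", buf, 0)
    if version != 0:
        raise ValueError("unsupported LogicalStream version")
    pos = 6
    (num_streams,) = struct.unpack_from(">H", buf, pos)
    pos += 2
    stream_numbers = struct.unpack_from(">%dH" % num_streams, buf, pos)
    pos += 2 * num_streams
    data_offsets = struct.unpack_from(">%dI" % num_streams, buf, pos)
    pos += 4 * num_streams
    (num_rules,) = struct.unpack_from(">H", buf, pos)
    pos += 2
    rule_to_stream = struct.unpack_from(">%dH" % num_rules, buf, pos)
    pos += 2 * num_rules
    (num_props,) = struct.unpack_from(">H", buf, pos)
    pos += 2
    properties = {}
    for _ in range(num_props):
        name, _type, value, pos = parse_name_value(buf, pos)
        properties[name] = value
    return {
        "physical_stream_numbers": stream_numbers,
        "data_offsets": data_offsets,
        "rule_to_stream": rule_to_stream,
        "properties": properties,
    }
```

With the result in hand, rule_to_stream[n] gives the physical stream for ASM rule n, and properties[b"ASMRuleBook"] holds the semicolon-separated rule text described in the table.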
RM Content Description Header
The Content Description Header holds the title, author, copyright, and comment information for an RM file. It contains the following fields:
Content_Description
{
UINT32 object_id;
UINT32 size;
UINT16 object_version;
if (object_version == 0)
{
UINT16 title_len;
UINT8[title_len] title;
UINT16 author_len;
UINT8[author_len] author;
UINT16 copyright_len;
UINT8[copyright_len] copyright;
UINT16 comment_len;
UINT8[comment_len] comment;
}
}
The fields are described in the table below:
field | type | description |
---|---|---|
object_id | UINT32 | The unique object ID for the Content Description Header ("CONT"). |
size | UINT32 | The size of the Content Description Header in bytes. |
object_version | UINT16 | The version of the Content Description Header object. |
title_len | UINT16 | The length of the title data in bytes. Note that the title data is not null-terminated. |
title | UINT8[title_len] | An array of ASCII characters that represents the title information for the RealMedia file. The size of this member is variable. |
author_len | UINT16 | The length of the author data in bytes. Note that the author data is not null-terminated. |
author | UINT8[author_len] | An array of ASCII characters that represents the author information for the RealMedia file. The size of this member is variable. |
copyright_len | UINT16 | The length of the copyright data in bytes. Note that the copyright data is not null-terminated. |
copyright | UINT8[] | An array of ASCII characters that represents the copyright information for the RealMedia file. The size of this member is variable. |
comment_len | UINT16 | The length of the comment data in bytes. Note that the comment data is not null-terminated. |
comment | UINT8[] | An array of ASCII characters that represents the comment information for the RealMedia file. The size of this member is variable. |
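All four members follow the same pattern, a UINT16 length followed by that many bytes with no null terminator, so a loop is enough to read them. A minimal sketch, assuming big-endian lengths (parse_content_description is a hypothetical name):

```python
import struct

def parse_content_description(chunk):
    """Parse a whole CONT chunk (bytes), starting at its object_id."""
    object_id, _size, version = struct.unpack_from(">4sIH", chunk, 0)
    if object_id != b"CONT" or version != 0:
        raise ValueError("not a version 0 Content Description Header")
    pos = 10
    fields = {}
    for key in ("title", "author", "copyright", "comment"):
        (length,) = struct.unpack_from(">H", chunk, pos)
        pos += 2
        # The table describes ASCII text; undecodable bytes are replaced, not fatal.
        fields[key] = chunk[pos:pos + length].decode("ascii", errors="replace")
        pos += length
    return fields

# Synthetic chunk for illustration only.
_texts = [b"Demo Title", b"Somebody", b"(c) nobody", b""]
_body = b"".join(struct.pack(">H", len(t)) + t for t in _texts)
sample = struct.pack(">4sIH", b"CONT", 10 + len(_body), 0) + _body
print(parse_content_description(sample))
```

Real files, especially those produced outside English-speaking regions, may store text in a local encoding instead of ASCII, in which case the decode step would need to change.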
3. RM Data Section
The start of the Data Section is given by the data_offset field of the Properties Header. An RM Data Section usually consists of one Data Chunk Header followed by multiple interleaved media data packets.
Data Chunk Header
It marks the start of a data chunk. Most RM files have a single data chunk; very large files may have several. It contains the following fields:
Data_Chunk_Header
{
UINT32 object_id;
UINT32 size;
UINT16 object_version;
if (object_version == 0)
{
UINT32 num_packets;
UINT32 next_data_header;
}
}
The fields are described in the table below:
field | type | description |
---|---|---|
object_id | UINT32 | The unique object ID for the Data Chunk Header ('DATA'). |
size | UINT32 | The size of the Data Chunk in bytes. The size includes the size of the header plus the size of all the packets in the data chunk. |
object_version | UINT16 | The version of the Data Chunk Header object. |
num_packets | UINT32 | Number of packets in the data chunk. |
next_data_header | UINT32 | Offset from start of file to the next data chunk. A non-zero value refers to the file offset of the next data chunk. A value of zero means there are no more data chunks in this file. This field is not typically used. |
Data Packet
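The Data Chunk Header is immediately followed by num_packets data packets. The packets may come from multiple streams, but they are stored in ascending timestamp order. Each data packet is laid out as follows: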
Media_Packet_Header
{
UINT16 object_version;
if ((object_version == 0) || (object_version == 1))
{
UINT16 length;
UINT16 stream_number;
UINT32 timestamp;
if (object_version == 0)
{
UINT8 packet_group;
UINT8 flags;
}
else if (object_version == 1)
{
UINT16 asm_rule;
UINT8 asm_flags;
}
UINT8[length] data;
}
else
{
StreamDone();
}
}
The fields are described in the table below:
field | type | description |
---|---|---|
object_version | UINT16 | The version of the Media Packet Header object. |
length | UINT16 | The length of the packet in bytes. |
stream_number | UINT16 | The 16-bit alias used to associate data packets with their associated Media Properties Header. |
timestamp | UINT32 | The time stamp of the packet in milliseconds. |
packet_group | UINT8 | The packet group to which the packet belongs. If packet grouping is not used, set this field to 0 (zero). |
flags | UINT8 | Flags describing the properties of the packet. The following flags are defined: HX_RELIABLE_FLAG = 1: if set, the packet is delivered reliably; HX_KEYFRAME_FLAG = 2: if set, the packet is part of a key frame or in some way marks a boundary in your data stream. |
asm_rule | UINT16 | The ASM rule assigned to this packet. |
asm_flags | UINT8 | Contains HX_ flags that dictate stream switching points. |
data | UINT8[length] | The application-specific media data. The size of this member is variable. |
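Extracting media data for a decoder comes down to reading these packet headers one after another. The sketch below handles both header versions. It assumes big-endian fields and, importantly, that length counts the whole packet including its header (the table only says "the length of the packet"), so treat that as something to verify against real files. The name parse_packet is illustrative.

```python
import struct

HX_KEYFRAME_FLAG = 2  # value taken from the flags row in the table above

def parse_packet(buf, pos=0):
    """Parse one media packet at pos; return (info, next_pos)."""
    version, length, stream_number, timestamp = struct.unpack_from(">HHHI", buf, pos)
    info = {"stream_number": stream_number, "timestamp": timestamp}
    if version == 0:
        packet_group, flags = struct.unpack_from(">BB", buf, pos + 10)
        info.update(packet_group=packet_group,
                    keyframe=bool(flags & HX_KEYFRAME_FLAG))
        header_len = 12
    elif version == 1:
        asm_rule, asm_flags = struct.unpack_from(">HB", buf, pos + 10)
        info.update(asm_rule=asm_rule, asm_flags=asm_flags)
        header_len = 13
    else:
        raise ValueError("unknown packet version: stream is done")
    # Assumption: length covers header plus payload.
    info["data"] = bytes(buf[pos + header_len:pos + length])
    return info, pos + length

# Synthetic packet for illustration only.
sample = struct.pack(">HHHIBB", 0, 12 + 5, 1, 4000, 0, HX_KEYFRAME_FLAG) + b"hello"
packet, next_pos = parse_packet(sample)
print(packet["stream_number"], packet["timestamp"], packet["keyframe"], packet["data"])
```

The stream_number routes each payload to the decoder set up from the matching Media Properties Header, and timestamp is the presentation time in milliseconds.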
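4. RM Index Section
The Index Section stores the mapping from time to file offset for audio and video keyframes.
An index chunk usually consists of an Index Chunk Header followed by a series of index records.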
Index Chunk Header
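The Index Chunk Header marks the start of an index chunk; its offset is given by the index_offset field of the Properties Header. It holds the properties of the index chunk. It contains the following fields: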
Index_Chunk_Header
{
u_int32 object_id;
u_int32 size;
u_int16 object_version;
if (object_version == 0)
{
u_int32 num_indices;
u_int16 stream_number;
u_int32 next_index_header;
}
}
The fields are described in the table below:
field | type | description |
---|---|---|
object_id | UINT32 | The unique object ID for the Index Chunk Header ("INDX"). |
size | UINT32 | The size of the Index Chunk in bytes. |
object_version | UINT16 | The version of the Index Chunk Header object. |
num_indices | UINT32 | Number of index records in the index chunk. |
stream_number | UINT16 | The stream number for which the index records in this index chunk are associated. |
next_index_header | UINT32 | Offset from start of file to the next index chunk. This member enables RealMedia file format readers to find all the index chunks quickly. A value of zero for this member indicates there are no more index headers in this file. |
index records
An index record maps a timestamp to the offset of a data packet. It contains the following fields:
IndexRecord
{
UINT16 object_version;
if (object_version == 0)
{
u_int32 timestamp;
u_int32 offset;
u_int32 packet_count_for_this_packet;
}
}
The fields are described in the table below:
field | type | description |
---|---|---|
object_version | UINT16 | The version of the Index Record object. |
timestamp | UINT32 | The time stamp (in milliseconds) associated with this record. |
offset | UINT32 | The offset from the start of the file at which this packet can be found. |
packet_count_for_this_packet | UINT32 | The packet number of the packet for this record. This is the same number of packets that would have been seen had the file been played from the beginning to this point. |
Note that there is usually one index chunk per stream, so the index section may contain several index chunks.
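Seeking with the index amounts to finding the last record whose timestamp is not later than the target and jumping to its offset. The following sketch shows that lookup with a binary search. It assumes big-endian fields and records sorted by timestamp; parse_index_chunk and seek_offset are hypothetical names.

```python
import bisect
import struct

_INDEX_HEADER = struct.Struct(">4sIHIHI")  # id, size, version, num_indices, stream, next
_INDEX_RECORD = struct.Struct(">HIII")     # version, timestamp, offset, packet count

def parse_index_chunk(chunk):
    """Return (stream_number, next_index_header, records) for one INDX chunk."""
    (object_id, _size, version, num_indices,
     stream_number, next_index_header) = _INDEX_HEADER.unpack_from(chunk, 0)
    if object_id != b"INDX" or version != 0:
        raise ValueError("not a version 0 Index Chunk Header")
    records = []
    pos = _INDEX_HEADER.size
    for _ in range(num_indices):
        _rec_version, timestamp, offset, packet_count = _INDEX_RECORD.unpack_from(chunk, pos)
        records.append((timestamp, offset, packet_count))
        pos += _INDEX_RECORD.size
    return stream_number, next_index_header, records

def seek_offset(records, target_ms):
    """File offset of the last indexed packet at or before target_ms."""
    timestamps = [r[0] for r in records]
    i = bisect.bisect_right(timestamps, target_ms) - 1
    return records[max(i, 0)][1]  # clamp to the first record for early targets

# Synthetic index for illustration only.
_recs = [(0, 100, 0), (5000, 9000, 120), (10000, 20000, 250)]
sample = (_INDEX_HEADER.pack(b"INDX", 20 + 14 * len(_recs), 0, len(_recs), 0, 0)
          + b"".join(_INDEX_RECORD.pack(0, *r) for r in _recs))
stream, next_header, records = parse_index_chunk(sample)
print(seek_offset(records, 7000))
```

A non-zero next_index_header would point at the index chunk of another stream, so a full reader would follow that chain until it reaches zero.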
6. RM Metadata Section
RealMedia metadata consists of a single tag that holds a series of named metadata properties describing the media file. A property can be text, an integer, or binary data. The Metadata Section consists of a Header and a Tag Body.
Metadata Section Header
It is defined as follows:
MetadataSectionHeader
{
u_int32 object_id; // The unique object ID for the Metadata Section Header ("RMMD")
u_int32 size;
}
Metadata Tag
A metadata tag is made up of multiple properties organized as a tree. Each property has a type and a value, and may contain sub-properties. It contains the following fields:
MetadataTag
{
u_int32 object_id;
u_int32 object_version;
u_int8[] properties;
}
The fields are described in the table below:
field | type | description |
---|---|---|
object_id | UINT32 | The unique object ID for the Metadata Tag ("RJMD"). |
object_version | UINT32 | The version of the Metadata Tag. |
properties | UINT8[] | The MetadataProperty structure that makes up the metadata tag (see "Metadata Property Structure" for more details). As mentioned above, the properties will be represented as one unnamed root metadata property with multiple sub-properties, each with their own optional sub-properties. These will be nested, as in a tree. |
Metadata Property Structure
It contains the following fields:
MetadataProperty
{
u_int32 size;
u_int32 type;
u_int32 flags;
u_int32 value_offset;
u_int32 subproperties_offset;
u_int32 num_subproperties;
u_int32 name_length;
u_int8[name_length] name;
u_int32 value_length;
u_int8[value_length] value;
PropListEntry[num_subproperties] subproperties_list;
MetadataProperty[num_subproperties] subproperties;
}
The fields are described in the table below:
field | type | description |
---|---|---|
size | UINT32 | The size of the MetadataProperty structure in bytes. |
type | UINT32 | The type of the value data. The data in the value array can be one of the following types: MPT_TEXT: the value is string data; MPT_TEXTLIST: the value is a separated list of strings, with the separator specified as a sub-property/type descriptor; MPT_FLAG: the value is a boolean flag of either 1 byte or 4 bytes (check the size value); MPT_ULONG: the value is a four-byte integer; MPT_BINARY: the value is a byte stream; MPT_URL: the value is string data; MPT_DATE: the value is a string representation of the date in the form YYYYmmDDHHMMSS (m = month, M = minutes); MPT_FILENAME: the value is string data; MPT_GROUPING: this property has subproperties, but its own value is empty; MPT_REFERENCE: the value is a large buffer of data, with sub-properties/type descriptors identifying the mime-type. |
flags | UINT32 | Flags describing the property. The following flags are defined and can be used in combination: MPT_READONLY: read only, cannot be modified; MPT_PRIVATE: private, do not expose to users; MPT_TYPE_DESCRIPTOR: type descriptor used to further define the type of the value. |
value_offset | UINT32 | The offset to value_length, relative to the beginning of the MetadataProperty structure. |
subproperties_offset | UINT32 | The offset to subproperties_list, relative to the beginning of the MetadataProperty structure. |
num_subproperties | UINT32 | The number of subproperties for this MetadataProperty structure. |
name_length | UINT32 | The length of the name data, including the null-terminator. |
name | UINT8[] | The name of the property (string data). The size of this member is designated by name_length. |
value_length | UINT32 | The length of the value data. |
value | UINT8[] | The value of the property (data depends on the type specified for the property). The size of this member is designated by value_length. |
subproperties_list | PropListEntry[] | The list of PropListEntry structures. The PropListEntry structure identifies the offset for each property (see "PropListEntry Structure" for more details). The size of this member is num_subproperties * sizeof(PropListEntry). |
subproperties | MetadataProperty[] | The sub-properties. Each sub-property is a MetadataProperty structure with its own size, name, value, sub-properties, and so on. The size of this member is variable. |
PropListEntry Structure
It contains the following fields:
PropListEntry
{
u_int32 offset;
u_int32 num_props_for_name;
}
The fields are described in the table below:
field | type | description |
---|---|---|
offset | UINT32 | The offset for this indexed sub-property, relative to the beginning of the containing MetadataProperty. |
num_props_for_name | UINT32 | The number of sub-properties that share the same name. For example, a lyrics property could have multiple versions as differentiated by the language sub-property type descriptor. |
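Since every offset in MetadataProperty and PropListEntry is relative to the start of the containing property, the tree can be read with a short recursive function. This is only a sketch under assumptions: big-endian fields, names stored with a trailing null as the name_length row states, and a made-up function name parse_property. The numeric values of the MPT_ constants are not given here, so the type is returned as a raw integer.

```python
import struct

_PROP_FIXED = struct.Struct(">7I")  # size .. name_length, 28 bytes in total

def parse_property(buf, base=0):
    """Recursively parse the MetadataProperty that starts at buf[base]."""
    (_size, ptype, flags, value_offset, subproperties_offset,
     num_subproperties, name_length) = _PROP_FIXED.unpack_from(buf, base)
    name_start = base + _PROP_FIXED.size
    name = bytes(buf[name_start:name_start + name_length]).rstrip(b"\0")
    # value_offset points at value_length; the value bytes follow it.
    (value_length,) = struct.unpack_from(">I", buf, base + value_offset)
    value_start = base + value_offset + 4
    value = bytes(buf[value_start:value_start + value_length])
    children = []
    for i in range(num_subproperties):
        entry_pos = base + subproperties_offset + 8 * i  # each PropListEntry is 8 bytes
        child_offset, _num_same_name = struct.unpack_from(">II", buf, entry_pos)
        children.append(parse_property(buf, base + child_offset))
    return {"name": name, "type": ptype, "flags": flags,
            "value": value, "children": children}
```

Calling parse_property on the properties member of the Metadata Tag would return the unnamed root property, with the named metadata items as its children.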
Metadata Section Footer
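The metadata section footer marks the end of the metadata section of a RealMedia file. Because it sits at the end of the RM file, its position is fixed: 140 bytes before the end of the file (the 12-byte footer plus the 128-byte ID3v1 tag that follows it). The size field of the footer gives the length of the metadata tag and can be used to locate the metadata quickly. It contains the following fields: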
MetadataSectionFooter
{
u_int32 object_id; // The unique object ID for the Metadata Section Footer ("RMJE").
u_int32 object_version; // The version of the metadata tag.
u_int32 size; // The size of the preceding metadata tag
}
ID3v1 Tag
The ID3v1 Tag is the last part of the metadata section and has a fixed length of 128 bytes. See the ID3v1 standard for its format.
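Putting the footer and the ID3v1 tag together, the metadata tag can be located from the end of the file without parsing anything else. The sketch below assumes big-endian fields and that the metadata tag ends exactly where the footer begins; find_metadata_tag is a hypothetical name.

```python
import io
import struct

FOOTER_SIZE = 12   # three u_int32 fields
ID3V1_SIZE = 128   # fixed-length ID3v1 tag after the footer

def find_metadata_tag(stream):
    """Return (tag_offset, tag_size), or None if no RMJE footer is found."""
    stream.seek(0, io.SEEK_END)
    file_size = stream.tell()
    footer_pos = file_size - (FOOTER_SIZE + ID3V1_SIZE)  # 140 bytes from the end
    if footer_pos < 0:
        return None
    stream.seek(footer_pos)
    object_id, _version, size = struct.unpack(">4sII", stream.read(FOOTER_SIZE))
    if object_id != b"RMJE":
        return None  # file has no metadata section in the expected place
    # Assumption: the tag occupies the size bytes right before the footer.
    return footer_pos - size, size

# Synthetic file tail for illustration only.
_tag = b"RJMD" + bytes(20)
demo = (b"x" * 50 + _tag + struct.pack(">4sII", b"RMJE", 0, len(_tag))
        + b"TAG" + bytes(125))
print(find_metadata_tag(io.BytesIO(demo)))
```

Returning None instead of raising keeps the helper usable on files that carry no metadata section at all.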
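7. Other notes
Overall, the rm/rmvb file structure is relatively simple.
The most official documentation is available from Helix DNA Common Components.
As far as I know, only one tool is currently available for analyzing RM/RMVB files: RM文件分析器 (RM File Analyzer).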
----------------------------------------------------------------------------------------------------------------------------
Author: Tocy e-mail: zyvj@qq.com
Copyright @2015-2020. Not for commercial use. Please cite the original URL when reposting. All rights reserved.