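Translation of the official Microsoft Media Foundation documentation (11): "Media Types & Audio Media Types"
Official English documentation: https://docs.microsoft.com/en-us/windows/desktop/medfound/media-types
Based on the 05/31/2018 version.
I am trying to put the following pages into a single post, so this post is fairly long. The video material is extensive, so it gets a post of its own.
This post currently covers the following pages: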
Media Types
About Media Types
Major Media Types
Audio Media Types
Audio Subtype GUIDs
Uncompressed Audio Media Types
AAC Media Types
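A media type describes the format of a media stream. In Media Foundation, a media type is represented by the IMFMediaType interface. Applications use media types to identify the format of a media file or media stream. Objects in the Media Foundation pipeline also use media types to negotiate the formats they will deliver or receive.
This documentation contains the following topics.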
Topic | Description |
---|---|
About Media Types | General overview of media types in Media Foundation. |
Media Type GUIDs | Lists the defined GUIDs for major types and subtypes. |
Audio Media Types | How to create media types for audio formats. |
Video Media Types | How to create media types for video formats. |
Complete and Partial Media Types | Describes the difference between complete media types and partial media types. |
Media Type Conversions | How to convert between Media Foundation media types and older format structures. |
Media Type Helper Functions | A list of functions that manipulate or get information from a media type. |
Media Type Debugging Code | Example code that shows how to view a media type while debugging. |
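About Media Types
The IMFMediaType interface inherits from IMFAttributes. The details of a media type are specified as attributes.
To create a media type object, call MFCreateMediaType. This function returns a pointer to the IMFMediaType interface. The media type initially has no attributes. To specify the details of the format, set the relevant attributes.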
Major Types and Subtypes
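Two important pieces of information for any media type are the major type and the subtype.
- The major type is a GUID that defines the overall category of the data in a media stream. Major types include video and audio, among others. To specify the major type, set the MF_MT_MAJOR_TYPE attribute. The IMFMediaType::GetMajorType method returns the value of this attribute.
- The subtype identifies the specific format. For example, when the major type is video, the subtype can be RGB-24, RGB-32, YUY2, and so on. For audio, the subtype can be PCM audio, IEEE floating-point audio, or others. The subtype provides more detail than the major type, but it does not necessarily define everything about the format. For example, a video subtype does not specify the image resolution or the frame rate. To specify the subtype, set the MF_MT_SUBTYPE attribute.
Every media type should have a major type GUID and a subtype GUID. For the complete list of GUIDs, see Media Type GUIDs.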
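Why Attributes?
Attributes have several advantages over the format structures used in earlier technologies such as DirectShow and the Windows Media Format SDK.
- It is easier to represent "unknown" or "don't care" values. For example, if you are writing a video conversion program, you may know which RGB and YUV formats you support, but not the resolution or frame rate of the video to be converted, and you may not care about them. With a structure that describes a video format, every member must hold some value (either set explicitly or a default), and using 0 as the default is common practice. If another component treats 0 as a legitimate value, this can cause errors. With attributes, you simply omit the values that are not relevant.
- Requirements change over time, and format structures were extended by appending more data to the end in order to support more formats. For example, WAVEFORMATEXTENSIBLE extends WAVEFORMATEX. This practice is error-prone, because components must cast structure pointers to other structure types. Attributes can be extended safely.
- Mutually incompatible format structures were defined. For example, DirectShow defines both VIDEOINFOHEADER and VIDEOINFOHEADER2. Attributes are set independently of each other, so this problem does not arise.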
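The first point can be illustrated without any Media Foundation headers. The sketch below is a hypothetical stand-in, not the IMFAttributes API: AttributeBag, TryGet, IsCompatible, and FixedVideoFormat are names invented for this illustration, and string keys replace the GUID keys that real attributes use. It shows how an omitted attribute can express "don't care", while a zero in a fixed struct cannot be told apart from a real value of 0.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for an attribute store. Real code would use
// IMFAttributes, whose keys are GUIDs rather than strings.
class AttributeBag {
public:
    void Set(const std::string& key, std::uint32_t value) { values_[key] = value; }

    // Returns false when the attribute was never set, so "unknown" stays
    // distinguishable from a stored value of 0. *out is left untouched then.
    bool TryGet(const std::string& key, std::uint32_t* out) const {
        assert(out != nullptr);
        const auto it = values_.find(key);
        if (it == values_.end()) return false;
        *out = it->second;
        return true;
    }

    const std::map<std::string, std::uint32_t>& All() const { return values_; }

private:
    std::map<std::string, std::uint32_t> values_;
};

// A fixed struct always carries some value in every member.
struct FixedVideoFormat {
    std::uint32_t width = 0;   // 0 here: unknown, or really zero?
    std::uint32_t height = 0;
};

// A partial description is compatible with a full one when every attribute
// it does specify has the same value in the full description.
bool IsCompatible(const AttributeBag& partial, const AttributeBag& full) {
    for (const auto& entry : partial.All()) {
        std::uint32_t value = 0;
        if (!full.TryGet(entry.first, &value) || value != entry.second) return false;
    }
    return true;
}
```

A converter that only cares about the subtype would set just that one key in its partial description; width and height are simply absent rather than zero.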
Major Media Types
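In a media type, the major type gives an overall description of the data, such as whether it is video or audio. The subtype, if present, refines this further. For example, if the major type is video, the subtype might be 32-bit RGB video. A subtype can also identify an encoding format, such as H.264 video.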
Major type and subtype are identified by GUIDs and stored in the following attributes:
Attribute | Description |
---|---|
MF_MT_MAJOR_TYPE | Major type. |
MF_MT_SUBTYPE | Subtype. |
The following major types are defined.
Major Type | Description | Subtypes |
---|---|---|
MFMediaType_Audio | Audio. | Audio Subtype GUIDs. |
MFMediaType_Binary | Binary stream. | None. |
MFMediaType_FileTransfer | A stream that contains data files. | None. |
MFMediaType_HTML | HTML stream. | None. |
MFMediaType_Image | Still image stream. | WIC GUIDs and CLSIDs. |
MFMediaType_Protected | Protected media data. | The subtype specifies the content protection scheme. |
MFMediaType_Perception | Streams from a camera sensor or processing unit that reasons and understands raw video data and provides understanding of the environment or humans in it. | None. |
MFMediaType_SAMI | Synchronized Accessible Media Interchange (SAMI) captions. | None. |
MFMediaType_Script | Script stream. | None. |
MFMediaType_Stream | Multiplexed stream or elementary stream. | Stream Subtype GUIDs. |
MFMediaType_Video | Video. | Video Subtype GUIDs. |
Third-party components can define new major types and new subtypes.
Audio Media Types
This section describes how to create and manipulate media types that describe audio data.
Topic | Description |
---|---|
Audio Subtype GUIDs | Contains a list of audio subtype GUIDs. |
Uncompressed Audio Media Types | How to create a media type that describes an uncompressed audio format. |
AAC Media Types | Describes the media types for Advanced Audio Coding (AAC) streams. |
Audio Subtype GUIDs
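The following audio subtype GUIDs are defined. To specify the subtype, set the MF_MT_SUBTYPE attribute on the media type. Unless otherwise noted, these constants are defined in the header file mfapi.h.
When these subtypes are used, set the MF_MT_MAJOR_TYPE attribute to MFMediaType_Audio.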
GUID | Description | Format Tag (FOURCC) |
---|---|---|
MEDIASUBTYPE_RAW_AAC1 | Advanced Audio Coding (AAC). This subtype is used for AAC contained in an AVI file with an audio format tag equal to 0x00FF. For more information, see AAC Decoder. Defined in wmcodecdsp.h. | WAVE_FORMAT_RAW_AAC1 (0x00FF) |
MFAudioFormat_AAC | Advanced Audio Coding (AAC). Note: The stream can contain raw AAC data or AAC data in an Audio Data Transport Stream (ADTS) stream. | WAVE_FORMAT_MPEG_HEAAC (0x1610) |
MFAudioFormat_ADTS | Not used. | WAVE_FORMAT_MPEG_ADTS_AAC (0x1600) |
MFAudioFormat_ALAC | Apple Lossless Audio Codec. Supported in Windows 10 and later. | WAVE_FORMAT_ALAC (0x6C61) |
MFAudioFormat_AMR_NB | Adaptive Multi-Rate audio. Supported in Windows 8.1 and later. | WAVE_FORMAT_AMR_NB |
MFAudioFormat_AMR_WB | Adaptive Multi-Rate Wideband audio. Supported in Windows 8.1 and later. | WAVE_FORMAT_AMR_WB |
MFAudioFormat_AMR_WP | Supported in Windows 8.1 and later. | WAVE_FORMAT_AMR_WP |
MFAudioFormat_Dolby_AC3 | Dolby Digital (AC-3). Same GUID value as MEDIASUBTYPE_DOLBY_AC3, which is defined in ksuuids.h. | None. |
MFAudioFormat_Dolby_AC3_SPDIF | Dolby AC-3 audio over Sony/Philips Digital Interface (S/PDIF). The same GUID value is also defined under other subtype names. | WAVE_FORMAT_DOLBY_AC3_SPDIF (0x0092) |
MFAudioFormat_Dolby_DDPlus | Dolby Digital Plus. Same GUID value as MEDIASUBTYPE_DOLBY_DDPLUS, which is defined in wmcodecdsp.h. | None. |
MFAudioFormat_DRM | Encrypted audio data used with secure audio path. | WAVE_FORMAT_DRM (0x0009) |
MFAudioFormat_DTS | Digital Theater Systems (DTS) audio. | WAVE_FORMAT_DTS (0x0008) |
MFAudioFormat_FLAC | Free Lossless Audio Codec. Supported in Windows 10 and later. | WAVE_FORMAT_FLAC (0xF1AC) |
MFAudioFormat_Float | Uncompressed IEEE floating-point audio. | WAVE_FORMAT_IEEE_FLOAT (0x0003) |
MFAudioFormat_Float_SpatialObjects | Uncompressed IEEE floating-point audio. | None |
MFAudioFormat_MP3 | MPEG Audio Layer-3 (MP3). | WAVE_FORMAT_MPEGLAYER3 (0x0055) |
MFAudioFormat_MPEG | MPEG-1 audio payload. | WAVE_FORMAT_MPEG (0x0050) |
MFAudioFormat_MSP1 | Windows Media Audio 9 Voice codec. | WAVE_FORMAT_WMAVOICE9 (0x000A) |
MFAudioFormat_Opus | Opus. Supported in Windows 10 and later. | WAVE_FORMAT_OPUS (0x704F) |
MFAudioFormat_PCM | Uncompressed PCM audio. | WAVE_FORMAT_PCM (1) |
MFAudioFormat_QCELP | QCELP (Qualcomm Code Excited Linear Prediction) audio. | None |
MFAudioFormat_WMASPDIF | Windows Media Audio 9 Professional codec over S/PDIF. | WAVE_FORMAT_WMASPDIF (0x0164) |
MFAudioFormat_WMAudio_Lossless | Windows Media Audio 9 Lossless codec or Windows Media Audio 9.1 codec. | WAVE_FORMAT_WMAUDIO_LOSSLESS (0x0163) |
MFAudioFormat_WMAudioV8 | Windows Media Audio 8 codec, Windows Media Audio 9 codec, or Windows Media Audio 9.1 codec. | WAVE_FORMAT_WMAUDIO2 (0x0161) |
MFAudioFormat_WMAudioV9 | Windows Media Audio 9 Professional codec or Windows Media Audio 9.1 Professional codec. | WAVE_FORMAT_WMAUDIO3 (0x0162) |
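The format tags listed in the third column of this table are used in the WAVEFORMATEX structure, and are defined in the header file mmreg.h.
Given an audio format tag, you can create an audio subtype GUID as follows:
- Start with the value MFAudioFormat_Base, which is defined in mfapi.h.
- Replace the first DWORD of this GUID with the format tag.
You can use the DEFINE_MEDIATYPE_GUID macro to define a new GUID constant that follows this pattern.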
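The two steps above can be sketched in portable C++. The Guid struct below is a local stand-in for the Windows GUID type so that the snippet compiles without Windows headers, and SubtypeFromFormatTag and kAudioFormatBase are hypothetical names. The base value is written out as {00000000-0000-0010-8000-00AA00389B71}, which is my understanding of how mfapi.h defines MFAudioFormat_Base; check your SDK headers before relying on it.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Local stand-in with the same field layout as the Windows GUID type.
struct Guid {
    std::uint32_t data1;
    std::uint16_t data2;
    std::uint16_t data3;
    std::uint8_t  data4[8];
};

inline bool operator==(const Guid& a, const Guid& b) {
    return a.data1 == b.data1 && a.data2 == b.data2 && a.data3 == b.data3 &&
           std::memcmp(a.data4, b.data4, sizeof(a.data4)) == 0;
}

// Assumed value of MFAudioFormat_Base: {00000000-0000-0010-8000-00AA00389B71}.
const Guid kAudioFormatBase = {
    0x00000000, 0x0000, 0x0010,
    {0x80, 0x00, 0x00, 0xAA, 0x00, 0x38, 0x9B, 0x71}
};

// Step 1: start from the base GUID.
// Step 2: replace the first DWORD with the format tag.
inline Guid SubtypeFromFormatTag(std::uint16_t formatTag) {
    Guid subtype = kAudioFormatBase;
    subtype.data1 = formatTag;
    return subtype;
}
```

For example, passing WAVE_FORMAT_MPEGLAYER3 (0x0055) yields a GUID whose first DWORD is 0x00000055 and whose remaining fields are those of the base value. In real code, the DEFINE_MEDIATYPE_GUID macro mentioned above is the usual way to declare such a constant.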
Uncompressed Audio Media Types
To create a complete media type that describes an uncompressed audio format, set at least the following attributes on the IMFMediaType interface pointer.
Attribute | Description |
---|---|
MF_MT_MAJOR_TYPE | Major type. Set to MFMediaType_Audio. |
MF_MT_SUBTYPE | Subtype. See Audio Subtype GUIDs. |
MF_MT_AUDIO_NUM_CHANNELS | Number of audio channels. |
MF_MT_AUDIO_SAMPLES_PER_SECOND | Number of audio samples per second. |
MF_MT_AUDIO_BLOCK_ALIGNMENT | Block alignment. |
MF_MT_AUDIO_AVG_BYTES_PER_SECOND | Average number of bytes per second. |
MF_MT_AUDIO_BITS_PER_SAMPLE | Number of bits per audio sample. |
MF_MT_ALL_SAMPLES_INDEPENDENT | Specifies whether each audio sample is independent. Set to TRUE for MFAudioFormat_PCM and MFAudioFormat_Float formats. |
In addition, the following attributes are required for some audio formats:
Attribute | Description |
---|---|
MF_MT_AUDIO_VALID_BITS_PER_SAMPLE | Number of valid bits of audio data in each audio sample. Set this attribute if the audio samples have padding—that is, if the number of valid bits in each audio sample is less than the sample size. |
MF_MT_AUDIO_CHANNEL_MASK | The assignment of audio channels to speaker positions. Set this attribute for multichannel audio streams, such as 5.1. This attribute is not required for mono or stereo audio. |
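As an illustration of MF_MT_AUDIO_CHANNEL_MASK, the sketch below builds the mask for a 5.1 layout and counts the speaker positions in it. The bit values are meant to mirror the SPEAKER_* constants from the Windows headers (ksmedia.h, as far as I know); they are redeclared locally under hypothetical k-prefixed names so the snippet compiles without Windows headers. MaskMatchesChannelCount is only a sanity check written for this example, not a rule enforced by Media Foundation.

```cpp
#include <cassert>
#include <cstdint>

// Speaker position bits, assumed to mirror the SPEAKER_* constants.
const std::uint32_t kSpeakerFrontLeft    = 0x1;
const std::uint32_t kSpeakerFrontRight   = 0x2;
const std::uint32_t kSpeakerFrontCenter  = 0x4;
const std::uint32_t kSpeakerLowFrequency = 0x8;
const std::uint32_t kSpeakerBackLeft     = 0x10;
const std::uint32_t kSpeakerBackRight    = 0x20;

// 5.1 layout: front left/right/center, LFE, back left/right.
const std::uint32_t kChannelMask5Point1 =
    kSpeakerFrontLeft | kSpeakerFrontRight | kSpeakerFrontCenter |
    kSpeakerLowFrequency | kSpeakerBackLeft | kSpeakerBackRight;

// Counts the speaker positions named in a mask.
inline std::uint32_t CountChannels(std::uint32_t mask) {
    std::uint32_t count = 0;
    for (; mask != 0; mask &= mask - 1) {  // clears the lowest set bit each pass
        ++count;
    }
    return count;
}

// True when the mask names exactly as many speakers as there are channels.
inline bool MaskMatchesChannelCount(std::uint32_t mask, std::uint32_t channels) {
    return CountChannels(mask) == channels;
}
```

With these values the 5.1 mask comes out as 0x3F, and it names six speaker positions, which would go together with MF_MT_AUDIO_NUM_CHANNELS set to 6.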
Example Code
The following code shows how to create a media type for uncompressed PCM audio.
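A minimal sketch of such a function, written for this post rather than copied from the official page. It sets the attributes from the table of required attributes above. DerivePcmValues, PcmDerived, and CreatePcmAudioType are names chosen here; the block alignment and byte rate formulas are the usual ones for PCM and assume whole bytes per sample. The Media Foundation part is guarded by _WIN32, assumes the mfplat.lib and mfuuid.lib import libraries, and expects the caller to release the returned pointer.

```cpp
#include <cassert>
#include <cstdint>

// Values derived from the three basic PCM parameters.
struct PcmDerived {
    std::uint32_t blockAlign;         // bytes per sample frame (all channels)
    std::uint32_t avgBytesPerSecond;  // blockAlign * sample rate
};

inline PcmDerived DerivePcmValues(std::uint32_t sampleRate,
                                  std::uint32_t bitsPerSample,
                                  std::uint32_t channels) {
    assert(bitsPerSample % 8 == 0);  // this sketch assumes whole bytes per sample
    PcmDerived d;
    d.blockAlign = channels * (bitsPerSample / 8);
    d.avgBytesPerSecond = d.blockAlign * sampleRate;
    return d;
}

#ifdef _WIN32
#include <windows.h>
#include <mfapi.h>
#include <mfidl.h>
#pragma comment(lib, "mfplat.lib")
#pragma comment(lib, "mfuuid.lib")

// Creates a complete media type for uncompressed PCM audio.
// On success the caller owns *ppType and must call Release() on it.
HRESULT CreatePcmAudioType(UINT32 sampleRate, UINT32 bitsPerSample,
                           UINT32 channels, IMFMediaType** ppType) {
    if (ppType == nullptr) return E_POINTER;
    *ppType = nullptr;

    const PcmDerived d = DerivePcmValues(sampleRate, bitsPerSample, channels);

    IMFMediaType* pType = nullptr;
    HRESULT hr = MFCreateMediaType(&pType);  // the new type has no attributes yet
    if (SUCCEEDED(hr)) hr = pType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Audio);
    if (SUCCEEDED(hr)) hr = pType->SetGUID(MF_MT_SUBTYPE, MFAudioFormat_PCM);
    if (SUCCEEDED(hr)) hr = pType->SetUINT32(MF_MT_AUDIO_NUM_CHANNELS, channels);
    if (SUCCEEDED(hr)) hr = pType->SetUINT32(MF_MT_AUDIO_SAMPLES_PER_SECOND, sampleRate);
    if (SUCCEEDED(hr)) hr = pType->SetUINT32(MF_MT_AUDIO_BLOCK_ALIGNMENT, d.blockAlign);
    if (SUCCEEDED(hr)) hr = pType->SetUINT32(MF_MT_AUDIO_AVG_BYTES_PER_SECOND, d.avgBytesPerSecond);
    if (SUCCEEDED(hr)) hr = pType->SetUINT32(MF_MT_AUDIO_BITS_PER_SAMPLE, bitsPerSample);
    if (SUCCEEDED(hr)) hr = pType->SetUINT32(MF_MT_ALL_SAMPLES_INDEPENDENT, TRUE);

    if (SUCCEEDED(hr)) {
        *ppType = pType;  // transfer ownership to the caller
    } else if (pType != nullptr) {
        pType->Release();
    }
    return hr;
}
#endif  // _WIN32
```

For 44.1 kHz, 16-bit stereo, the derived values are a block alignment of 4 bytes and 176,400 bytes per second.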
The next example takes an encoded audio format as input and creates a matching PCM audio type. This type can be used to configure an encoder or decoder.
AAC Media Types
This topic describes how to create media types for Advanced Audio Coding (AAC) streams in Media Foundation.
Two subtypes are defined for AAC audio:
Subtype | Description | Header |
---|---|---|
MFAudioFormat_AAC | Raw AAC or ADTS AAC. | mfapi.h |
MEDIASUBTYPE_RAW_AAC1 | Raw AAC. | wmcodecdsp.h |
1. MFAudioFormat_AAC
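For this subtype, the media type gives the sample rate and number of channels before the spectral band replication (SBR) and parametric stereo (PS) tools, if present, are applied. The effect of the SBR tool is to double the decoded sample rate relative to the core AAC-LC sample rate. The effect of the PS tool is to decode stereo from a mono core AAC-LC stream.
This subtype is equivalent to MEDIASUBTYPE_MPEG_HEAAC, which is defined in wmcodecdsp.h. See Audio Subtype GUIDs.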
2. MEDIASUBTYPE_RAW_AAC1
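This subtype is used for AAC contained in an AVI file, and corresponds to the audio format tag WAVE_FORMAT_RAW_AAC1 (0x00FF).
For this subtype, the media type gives the sample rate and number of channels after the SBR and PS tools, if present, are applied.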
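To make the difference between the two subtypes concrete, the sketch below computes the sample rate and channel count that each media type would carry for the same stream. It relies only on the two effects described above: SBR doubles the sample rate, and PS turns a mono core stream into stereo. AacStreamInfo, AudioShape, and both functions are hypothetical names for this illustration, not Media Foundation APIs.

```cpp
#include <cassert>
#include <cstdint>

// Core AAC-LC parameters plus which tools are present in the stream.
struct AacStreamInfo {
    std::uint32_t coreSampleRate;
    std::uint32_t coreChannels;
    bool hasSbr;  // spectral band replication
    bool hasPs;   // parametric stereo
};

struct AudioShape {
    std::uint32_t sampleRate;
    std::uint32_t channels;
};

// MFAudioFormat_AAC: values before SBR and PS are applied.
inline AudioShape ShapeForMfAudioFormatAac(const AacStreamInfo& s) {
    return AudioShape{s.coreSampleRate, s.coreChannels};
}

// MEDIASUBTYPE_RAW_AAC1: values after SBR and PS are applied.
inline AudioShape ShapeForRawAac1(const AacStreamInfo& s) {
    assert(!s.hasPs || s.coreChannels == 1);  // PS works on a mono core stream
    AudioShape out = {s.coreSampleRate, s.coreChannels};
    if (s.hasSbr) out.sampleRate *= 2;  // SBR doubles the decoded sample rate
    if (s.hasPs) out.channels = 2;      // PS decodes stereo from mono
    return out;
}
```

For a 22,050 Hz mono core stream with both tools present, the first function reports 22,050 Hz mono and the second reports 44,100 Hz stereo. For a plain AAC-LC stream without SBR or PS, both report the same values.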
The following media type attributes apply to AAC audio.
Attribute | Description |
---|---|
MF_MT_MAJOR_TYPE | Major type. Must be MFMediaType_Audio. |
MF_MT_SUBTYPE | Audio subtype. One of the two subtypes described above. |
MF_MT_AAC_AUDIO_PROFILE_LEVEL_INDICATION | Audio profile and level. The value of this attribute is the audioProfileLevelIndication field, as defined by ISO/IEC 14496-3. If unknown, set to zero or 0xFE ("no audio profile specified"). |
MF_MT_AUDIO_AVG_BYTES_PER_SECOND | Bit rate of the encoded AAC stream, in bytes per second. |
MF_MT_AAC_PAYLOAD_TYPE | Payload type. Applies only to MFAudioFormat_AAC. MF_MT_AAC_PAYLOAD_TYPE is optional. If this attribute is not specified, the default value 0 is used, which specifies the stream contains raw_data_block elements only. |
MF_MT_AUDIO_BITS_PER_SAMPLE | Bit depth of the decoded PCM audio. |
MF_MT_AUDIO_CHANNEL_MASK | Assignment of audio channels to speaker positions. |
MF_MT_AUDIO_NUM_CHANNELS | Number of channels, including the low frequency (LFE) channel, if present. The interpretation of this value depends on the media subtype, as described previously. |
MF_MT_AUDIO_SAMPLES_PER_SECOND | Sample rate, in samples per second. The interpretation of this value depends on the media subtype, as described previously. |
MF_MT_USER_DATA | The value of this attribute depends on the subtype. |