SciTech-AV-Audio-DAP (Digital Audio Processing)-ffmpeg: cutting AMR recordings + format conversion + volume normalization

Cutting and converting AMR recordings

  1. Cut + convert the format:
    ffmpeg -i evidence.amr -ss 01:10 -to 04:10 -f mp3 sample.mp3
    ffmpeg -i x.mp3 -ss 01:14 -to 01:27 -f mp3 sample.mp3
  2. Audio format conversion with FFmpeg
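    • To convert with ffmpeg to an 8000 Hz sample rate, mono, 8 bits per sample
      (WAV stores 8-bit PCM as unsigned, hence pcm_u8 rather than pcm_s8):
      ffmpeg -i in.mp3 -acodec pcm_u8 -ac 1 -ar 8000 -vn out.wav
    • With MPlayer you can use mencoder, which still handles the WAV details indirectly through ffmpeg's libavcodec:
      mencoder in.mp3 -oac lavc -lavcopts acodec=pcm_u8:o=ac=1,ar=8000 -of lavf -o out.wav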
  3. Volume detection and normalization:
    • Sample the volume and collect its histogram statistics:
      ffmpeg -i sample.mp3 -filter:a volumedetect -f null /dev/null > input.log 2>&1
      grep 'volumedetect' input.log

    • Loudness normalization:
      ffmpeg -i sample.mp3 -filter:a loudnorm norm.mp3

Audio normalization: making volume levels uniform

https://trac.ffmpeg.org/wiki/AudioVolume
Audio Volume Manipulation

  1. Changing volume: you may use FFmpeg's volume audio filter.

    • Set the output volume relative to the input:

      # half of the input volume
      ffmpeg -i input.wav -filter:a "volume=0.5" output.wav
      # 150% of the input volume
      ffmpeg -i input.wav -filter:a "volume=1.5" output.wav
      # You can also use decibel measures. To increase the volume by 10 dB:
      ffmpeg -i input.wav -filter:a "volume=10dB" output.wav
      # To reduce the volume, use a negative value:
      ffmpeg -i input.wav -filter:a "volume=-5dB" output.wav

    • Note: the volume filter only applies a relative adjustment; it does not set an absolute level. To set or otherwise normalize the volume of a stream, see the sections below.
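
The linear and decibel forms of the volume value are related by dB = 20 * log10(factor). The sketch below shows that conversion with awk; the function names db_to_factor and factor_to_db are illustrative helpers, not part of FFmpeg.

```shell
#!/bin/sh
# Convert a gain in dB to a linear amplitude factor: factor = 10^(dB/20).
db_to_factor() {
    awk -v db="$1" 'BEGIN { printf "%.3f\n", exp(db / 20 * log(10)) }'
}

# Convert a linear amplitude factor to dB: dB = 20 * log10(factor).
factor_to_db() {
    awk -v f="$1" 'BEGIN { printf "%.2f\n", 20 * log(f) / log(10) }'
}

db_to_factor -6      # roughly half amplitude
factor_to_db 1.5     # the dB equivalent of volume=1.5
```

This makes it easy to see, for example, that volume=0.5 and volume=-6dB are nearly the same adjustment.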

  2. Peak and RMS Normalization
    To normalize the volume to a given peak or RMS level,
    first analyze the file with the volumedetect filter
    (doc: https://ffmpeg.org/ffmpeg-filters.html#volumedetect):
    ffmpeg -i input.wav -filter:a volumedetect -f null /dev/null > input.log 2>&1
    Then read the values from the log file input.log, for example:

    [Parsed_volumedetect_0 @ 0x30000a7490] n_samples: 96000
    [Parsed_volumedetect_0 @ 0x30000a7490] mean_volume: -28.3 dB
    [Parsed_volumedetect_0 @ 0x30000a7490] max_volume: -7.5 dB
    [Parsed_volumedetect_0 @ 0x30000a7490] histogram_7db: 1
    [Parsed_volumedetect_0 @ 0x30000a7490] histogram_8db: 4
    [Parsed_volumedetect_0 @ 0x30000a7490] histogram_9db: 16
    [Parsed_volumedetect_0 @ 0x30000a7490] histogram_10db: 43
    [Parsed_volumedetect_0 @ 0x30000a7490] histogram_11db: 131
    

    The FFmpeg documentation gives the following excerpt of such output:

    mean_volume: -27 dB
    max_volume: -4 dB
    histogram_4db: 6
    histogram_5db: 62
    histogram_6db: 286
    histogram_7db: 1042
    histogram_8db: 2551
    histogram_9db: 4609
    histogram_10db: 8409
    

    It means that:

    • The mean square energy is approximately -27 dB (or 10^-2.7).
    • The largest sample is at -4 dB, or more precisely between -4 dB and -5 dB.
    • There are 6 samples at -4 dB, 62 at -5 dB, 286 at -6 dB, etc.
      In other words,
      raising the volume by +4 dB does not cause any clipping,
      raising it by +5 dB causes clipping for 6 samples, etc.

    Then calculate the required offset (target level minus measured level; for a 0 dB peak target here, 0 - (-4) = +4 dB) and apply it with the volume filter as shown above.
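
The offset calculation can be scripted. The sketch below reads max_volume from a volumedetect log and prints the gain needed to reach a target peak; peak_gain_db is a hypothetical helper name, and the line layout it assumes is the one in the log excerpts above. If the log holds several runs, the last max_volume line is used.

```shell
#!/bin/sh
# Print the gain (in dB) that moves the detected peak to a target peak.
# $1: file with volumedetect output ("-" for stdin); $2: target peak in dB (default 0).
peak_gain_db() {
    awk -v target="${2:-0}" '
        /max_volume:/ {
            # Line looks like: "[Parsed_volumedetect_0 @ ...] max_volume: -7.5 dB"
            for (i = 1; i < NF; i++)
                if ($i == "max_volume:") max = $(i + 1)
            found = 1
        }
        END {
            if (!found) exit 1          # no max_volume line in the log
            printf "%.1f\n", target - max
        }' "$1"
}

# Demo with the documentation excerpt (max_volume: -4 dB, target 0 dB).
printf 'mean_volume: -27 dB\nmax_volume: -4 dB\n' | peak_gain_db -

# Typical use (not run here):
#   gain=$(peak_gain_db input.log -1) &&
#       ffmpeg -i input.wav -filter:a "volume=${gain}dB" output.wav
```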

  3. Loudness Normalization
    If you want to normalize the (perceived) loudness of the file, use the loudnorm filter
    (doc: https://ffmpeg.org/ffmpeg-filters.html#loudnorm),
    which implements the EBU R128 algorithm:
    ffmpeg -i input.wav -filter:a loudnorm output.wav
    This is recommended for most applications, as it leads to a more uniform loudness level
    than simple peak-based normalization. However, it is recommended to run the normalization in two passes:

    • extract the measured values from the first run,
    • then use those values in a second run with linear normalization enabled.
      See the loudnorm filter documentation for more.
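
The two passes can be sketched as follows. The first pass prints its measurements as JSON (print_format=json); the helper below, with the hypothetical name loudnorm_pass2_filter, turns that log into the filter string for the second pass. The targets I=-16, TP=-1.5 and LRA=11 are example values chosen for illustration, and the JSON key names are assumed to be those loudnorm prints (input_i, input_tp, input_lra, input_thresh, target_offset).

```shell
#!/bin/sh
# Build the second-pass loudnorm filter string from a first-pass log.
# $1: log file containing the JSON block printed by loudnorm with print_format=json.
loudnorm_pass2_filter() {
    awk -F'"' '
        $2 == "input_i"       { i = $4 }
        $2 == "input_tp"      { tp = $4 }
        $2 == "input_lra"     { lra = $4 }
        $2 == "input_thresh"  { th = $4 }
        $2 == "target_offset" { off = $4 }
        END {
            # Fail if any measurement is missing from the log.
            if (i == "" || tp == "" || lra == "" || th == "" || off == "") exit 1
            printf "loudnorm=I=-16:TP=-1.5:LRA=11:measured_I=%s:measured_TP=%s:measured_LRA=%s:measured_thresh=%s:offset=%s:linear=true\n", i, tp, lra, th, off
        }' "$1"
}

# Pass 1 (measure only, not run here):
#   ffmpeg -i input.wav -filter:a loudnorm=I=-16:TP=-1.5:LRA=11:print_format=json \
#          -f null /dev/null 2> pass1.log
# Pass 2 (apply, not run here):
#   ffmpeg -i input.wav -filter:a "$(loudnorm_pass2_filter pass1.log)" output.wav
```

Both passes must use the same target values; otherwise the measured offset no longer matches the requested loudness.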
  4. Automation with ffmpeg-normalize
    To automate the normalization process with ffmpeg without manually performing two passes,
    and to normalize multiple files (including video), you can also use the ffmpeg-normalize Python program, installed via pip install ffmpeg-normalize. The script defaults to EBU R128 normalization with two passes, but peak and RMS normalization are also supported.
    For details, run ffmpeg-normalize -h or see the README file.


FFmpeg command-line arguments and parameters

FFmpeg Audio options:

-aframes number     set the number of audio frames to output
-aq quality         set audio quality (codec-specific)
-ar rate            set audio sampling rate (in Hz)
-ac channels        set number of audio channels
-an                 disable audio
-acodec codec       force audio codec ('copy' to copy stream)
-ab bitrate         audio bitrate (please use -b:a)
-af filter_graph    set audio filters

FFmpeg Time Duration

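FFmpeg accepts a time duration in two formats:

  • [-][<HH>:]<MM>:<SS>[.<m>...]
    HH is hours, MM is minutes (at most two digits), SS is seconds (at most two digits),
    m is the decimal fraction of a second;
  • [-]<S>+[.<m>...]
    S is the number of seconds, m is the decimal fraction of a second.

In both forms an optional leading - indicates a negative duration.
Examples:

-13.567:     negative 13.567 seconds
12:03:45:    12 hours, 03 minutes and 45 seconds
23.189:      23.189 seconds
00:01:00:    60 seconds
60:          60 seconds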
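
To make the two formats concrete, here is a minimal sketch that converts either form to seconds. The helper name duration_to_seconds is hypothetical, and the sketch does not handle unit suffixes such as ms or us.

```shell
#!/bin/sh
# Convert [-][HH:]MM:SS[.m...] or [-]S+[.m...] to seconds with millisecond precision.
duration_to_seconds() {
    awk -v d="$1" 'BEGIN {
        sign = 1
        if (substr(d, 1, 1) == "-") { sign = -1; d = substr(d, 2) }
        n = split(d, part, ":")            # 1 part: S; 2 parts: MM:SS; 3 parts: HH:MM:SS
        if (n < 1 || n > 3) exit 1
        secs = 0
        for (k = 1; k <= n; k++) secs = secs * 60 + part[k]
        printf "%.3f\n", sign * secs
    }'
}

duration_to_seconds 12:03:45
duration_to_seconds -13.567
duration_to_seconds 00:01:00
```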

FFmpeg Global options

These options affect the whole program instead of just one file:

  • -loglevel loglevel set logging level
  • -v loglevel set logging level
  • -report generate a report
  • -max_alloc bytes set maximum size of a single allocated block
  • -y overwrite output files
  • -n never overwrite output files
  • -ignore_unknown Ignore unknown stream types
  • -filter_threads number of non-complex filter threads
  • -filter_complex_threads number of threads for -filter_complex
  • -stats print progress report during encoding
  • -max_error_rate maximum error rate ratio of decoding errors (0.0: no errors, 1.0: 100% errors) above which ffmpeg returns an error instead of success.
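  • -frames[:stream_specifier] framecount (output,per-stream)
    Set the number of output frames: stop writing to the stream after framecount frames.
  • -f fmt (input/output), force the file format:
    • Input: the format is normally auto-detected.
    • Output: the format is guessed from the file extension (so the -f fmt option is only needed in a few cases).
    • To list every fmt that FFmpeg supports, run: ffmpeg -formats
  • -ss position (input/output), set the start time:
    • Before -i it is an input option and seeks in the input file to position.
      Note:
      • In most formats exact seeking is not possible, so ffmpeg seeks to the closest seek point before position.
      • When transcoding with -accurate_seek enabled (the default), the frames between that seek point and position are decoded and discarded.
      • When doing stream copy, or when -noaccurate_seek is used, the frames between the seek point and position are preserved.
    • After -i (before an output url) it is an output option:
      ffmpeg decodes the input but discards it until the output timestamps reach position.
  • -sseof position (input)
    Like the -ss option, but relative to the end of file (EOF): 0 is at EOF and negative values are earlier in the file.
  • -to position (input/output), set the end time.
    • Stop writing the output or reading the input once position is reached (position uses the FFmpeg time duration format).
    • -to and -t are mutually exclusive, and -t has priority.
  • -t duration (input/output), set the duration.
    • Before -i it is an input option and limits the duration of data read from the input file.
    • After -i (before an output url) it is an output option and stops writing once the output reaches duration.
    • -to and -t are mutually exclusive, and -t has priority.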
  • -fs limit_size (output), Set the file size limit, expressed in bytes.
    No further chunk of bytes is written after the limit is exceeded.
    The size of the output file is slightly more than the requested file size.
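  • -itsoffset offset (input), set the input time offset. offset uses the FFmpeg time duration format.
    The offset is added to the timestamps of the input file.
    A positive offset means that the corresponding streams are delayed by the duration given in offset.
  • -timestamp date (output), set the recording timestamp
    in the container (the file format that the output streams are stored in).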
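
Because -ss means different things before and after -i, one unambiguous way to cut a clip is to seek on the input and give the clip length with -t. The sketch below only prints such a command; clip_command is a hypothetical helper, its start and end arguments are plain seconds, and file names containing spaces would need quoting before the printed line is run.

```shell
#!/bin/sh
# Print (do not run) an ffmpeg command that cuts the range [start, end) in seconds,
# using input seeking (-ss before -i) and -t for the clip length.
clip_command() {
    start=$1 end=$2 src=$3 dst=$4
    # Clip length; fail when the end is not after the start.
    len=$(awk -v s="$start" -v e="$end" 'BEGIN { if (e <= s) exit 1; printf "%g", e - s }') || return 1
    printf 'ffmpeg -ss %s -i %s -t %s %s\n' "$start" "$src" "$len" "$dst"
}

# The 01:10 to 04:10 cut of evidence.amr, expressed in seconds.
clip_command 70 250 evidence.amr sample.mp3
```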

FFmpeg Per-file main options:

-f fmt              force format
-c codec            codec name
-codec codec        codec name
-pre preset         preset name
-map_metadata outfile[,metadata]:infile[,metadata]  set metadata information of outfile from infile
-t duration         record or transcode "duration" seconds of audio/video
-to time_stop       record or transcode stop time
-fs limit_size      set the limit file size in bytes
-ss time_off        set the start time offset
-sseof time_off     set the start time offset relative to EOF
-seek_timestamp     enable/disable seeking by timestamp with -ss
-timestamp time     set the recording timestamp ('now' to set the current time)
-metadata string=string  add metadata
-program title=string:st=number...  add program with specified streams
-target type        specify target file type ("vcd", "svcd", "dvd", "dv" or "dv50" with optional prefixes "pal-", "ntsc-" or "film-")
-apad               audio pad
-frames number      set the number of frames to output
-filter filter_graph  set stream filtergraph
-filter_script filename  read stream filtergraph description from a file
-reinit_filter      reinit filtergraph on input parameter changes
-discard            discard
-disposition        disposition

PCM formats commonly used with FFmpeg
(as listed by ffmpeg -formats; D = demuxing supported, E = muxing supported)

 DE s16be           PCM signed 16-bit big-endian
 DE s16le           PCM signed 16-bit little-endian
 DE s24be           PCM signed 24-bit big-endian
 DE s24le           PCM signed 24-bit little-endian
 DE s32be           PCM signed 32-bit big-endian
 DE s32le           PCM signed 32-bit little-endian
 DE s8              PCM signed 8-bit
 DE f32be           PCM 32-bit floating-point big-endian
 DE f32le           PCM 32-bit floating-point little-endian
 DE f64be           PCM 64-bit floating-point big-endian
 DE f64le           PCM 64-bit floating-point little-endian
 DE mulaw           PCM mu-law
 DE u16be           PCM unsigned 16-bit big-endian
 DE u16le           PCM unsigned 16-bit little-endian
 DE u24be           PCM unsigned 24-bit big-endian
 DE u24le           PCM unsigned 24-bit little-endian
 DE u32be           PCM unsigned 32-bit big-endian
 DE u32le           PCM unsigned 32-bit little-endian
 DE u8              PCM unsigned 8-bit
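
These raw formats carry no header, so the size of the data follows directly from the stream parameters: bytes = sample rate * channels * (bits per sample / 8) * seconds. A small sketch of that arithmetic, using the hypothetical helper pcm_bytes and whole-second durations:

```shell
#!/bin/sh
# Bytes of headerless PCM for a sample rate (Hz), channel count,
# bits per sample (a multiple of 8) and a duration in whole seconds.
pcm_bytes() {
    echo $(( $1 * $2 * ($3 / 8) * $4 ))
}

pcm_bytes 8000 1 8 180     # 8 kHz mono u8, three minutes
pcm_bytes 44100 2 16 60    # 44.1 kHz stereo s16le, one minute
```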

FFmpeg decoding and encoding formats

posted @ abaelhe