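Video highlight extraction: research notes
Approach 1: find the dialogue-heavy parts from the subtitles or the audio track
- Extract the audio track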
ffmpeg -i a.mp4 -map 0:a:0 a.mp3
- Extract the RMS level frame by frame:
ffmpeg -i in.mp3 -af astats=metadata=1:reset=1,ametadata=print:key=lavfi.astats.Overall.RMS_level:file=log.txt -f null -
Determining audio level peaks with ffmpeg
https://superuser.com/questions/1183663/determining-audio-level-peaks-with-ffmpeg
- Analyze the volume of the whole file:
ffmpeg -i input.wav -filter:a volumedetect -f null /dev/null
https://trac.ffmpeg.org/wiki/AudioVolume
https://ffmpeg.org/ffmpeg-filters.html#volumedetect
- Cut out a clip:
ffmpeg -ss $ss -t 00:05:00 -i $vfile.mp4 -vcodec copy -acodec copy -y $vfile.${ss//:/_}.mp4
https://stackoverflow.com/questions/21420296/how-to-extract-time-accurate-video-segments-with-ffmpeg
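The cut command above can also be assembled from Python, which is convenient when start times come from a script. This is a minimal sketch that mirrors the shell line. The names `build_cut_command` and `cut_clip` and the stem `movie` are made up for illustration. Note that with `-ss` placed before `-i` and stream copy, the cut is expected to start at a nearby keyframe rather than at the exact timestamp (see the Stack Overflow link above).

```python
import subprocess

def build_cut_command(video_stem, start, duration="00:05:00"):
    """Return the ffmpeg argument list for one stream-copied clip."""
    # Same effect as the shell expansion ${ss//:/_}: avoid colons in the file name.
    safe_start = start.replace(":", "_")
    return [
        "ffmpeg",
        "-ss", start,            # input seeking, placed before -i as in the shell command
        "-t", duration,
        "-i", f"{video_stem}.mp4",
        "-vcodec", "copy",       # no re-encoding
        "-acodec", "copy",
        "-y",                    # overwrite an existing output file
        f"{video_stem}.{safe_start}.mp4",
    ]

def cut_clip(video_stem, start, duration="00:05:00"):
    # Needs ffmpeg on PATH; raises CalledProcessError if ffmpeg fails.
    subprocess.run(build_cut_command(video_stem, start, duration), check=True)

print(build_cut_command("movie", "00:12:00"))
```

Passing the arguments as a list avoids shell quoting problems when the file name contains spaces.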
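Extract the highlight time ranges (Python 3; reads the log.txt from the RMS step on standard input):
import sys

def getv(rms):
    # Map an RMS level in dB (0 = full scale, more negative = quieter) to a 0-100 loudness score.
    return max(0, 100 - abs(rms))

def extract(diff):
    # diff holds the loudness rises of the last 5 minutes.
    # Flag a highlight when at least 3 of them are positive and
    # at least 2 of the first 3 are rises of 3 or more.
    pos = 0
    pos3 = 0
    for n, v in enumerate(diff):
        if v > 0:
            pos += 1
        if n < 3 and v >= 3:
            pos3 += 1
    if pos >= 3 and pos3 >= 2:
        return 1
    return 0

timebin = 0        # start of the current 60-second bin, in seconds
s = []             # RMS samples collected in the current bin
v = []             # loudness score of each finished bin
diff = (0,) * 5

for line in sys.stdin:
    if 'pts_time' in line:
        ts = float(line.split('pts_time:')[1])
        if ts > timebin + 60:
            if s:
                avgrms = int(sum(s) / len(s))
                # print('%.2d %.2d' % (timebin // 60, timebin % 60), avgrms, getv(avgrms), '-' * getv(avgrms))
                if v:
                    d = max(0, getv(avgrms) - v[-1])
                    diff = diff[1:] + (d,)
                    ext = extract(diff)
                    # Per-minute trace on stderr: minute, second, average RMS, change, bars.
                    print('%3d %2d %s %3d' % (timebin // 60, timebin % 60, avgrms, getv(avgrms) - v[-1]),
                          '-' * d, '*' * ext, file=sys.stderr)
                    if ext:
                        # Highlight start time as HH:MM:00 on stdout.
                        h = timebin // 3600
                        print('%.2d:%.2d:00' % (h, (timebin - 3600 * h) // 60))
                        diff = (0,) * 5
                v.append(getv(avgrms))
            timebin += 60
            s = []
    if 'RMS' in line:
        rms = float(line.split('lavfi.astats.Overall.RMS_level=')[1])
        if rms > -1000:   # skip -inf, reported for silent frames
            s.append(rms)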
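To see when the rule in the script above fires, it helps to run it on a handful of made-up per-minute averages. The sketch below restates `getv()` and `extract()` under the hypothetical names `loudness` and `is_highlight` and skips the log parsing. The dB values are invented, and unlike the script it assumes every minute has at least one sample.

```python
def loudness(avg_rms_db):
    # Same mapping as getv(): -100 dB or lower gives 0, 0 dB gives 100.
    return max(0, 100 - abs(avg_rms_db))

def is_highlight(window):
    # Same rule as extract(): at least 3 positive rises in the 5-entry window,
    # and at least 2 of the first 3 entries are rises of 3 or more.
    rises = sum(1 for d in window if d > 0)
    big_early = sum(1 for d in window[:3] if d >= 3)
    return rises >= 3 and big_early >= 2

def highlight_minutes(per_minute_rms):
    """Return the indices of the minutes at which the rule fires."""
    hits, window, prev = [], (0,) * 5, None
    for minute, rms in enumerate(per_minute_rms):
        level = loudness(rms)
        if prev is not None:
            # Only rises count; a drop in loudness is recorded as 0.
            window = window[1:] + (max(0, level - prev),)
            if is_highlight(window):
                hits.append(minute)
                window = (0,) * 5   # reset so one build-up is reported once
        prev = level
    return hits

levels = [-40, -40, -36, -32, -31, -31, -40, -40]   # invented averages, in dB
print(highlight_minutes(levels))   # prints [5]
```

The loudness rises during minutes 2 to 4, but the rule only fires at minute 5, once the two large rises have shifted into the first three slots of the window. The reported time therefore lags the start of the build-up by a few minutes.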
Debugging:
ffmpeg volumedetect returns unstable result
https://stackoverflow.com/questions/48673923/ffmpeg-volumedetect-returns-unstable-result
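When comparing `volumedetect` results across runs, it is easier to pull the numbers out of ffmpeg's log than to read them by eye. This is a minimal sketch: the function names are made up, and the sample log is only illustrative of the `mean_volume` / `max_volume` lines the filter writes to stderr.

```python
import re
import subprocess

_VOLUME_RE = re.compile(r"(mean_volume|max_volume):\s*(-?\d+(?:\.\d+)?) dB")

def parse_volumedetect(stderr_text):
    """Extract mean_volume and max_volume (in dB) from ffmpeg's log text."""
    return {key: float(value) for key, value in _VOLUME_RE.findall(stderr_text)}

def measure(path):
    # Runs the same analysis as the volumedetect command above; needs ffmpeg on PATH.
    result = subprocess.run(
        ["ffmpeg", "-i", path, "-filter:a", "volumedetect", "-f", "null", "-"],
        capture_output=True, text=True,
    )
    return parse_volumedetect(result.stderr)

# Illustrative log excerpt, not real measurements.
sample = """\
[Parsed_volumedetect_0 @ 0xa23120] n_samples: 719368
[Parsed_volumedetect_0 @ 0xa23120] mean_volume: -27.0 dB
[Parsed_volumedetect_0 @ 0xa23120] max_volume: -4.0 dB
"""
print(parse_volumedetect(sample))
```

Calling `measure()` twice on the same file and comparing the two dictionaries is a quick way to check whether the instability reproduces.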
Approach 2: Approach 1 + shot boundary detection
Installing OpenCV: https://www.cnblogs.com/yaoyaohust/p/10228888.html
Shot boundary detection: https://www.cnblogs.com/lynsyklate/p/7840881.html
Yahoo's open-source tool Hecate: https://github.com/yahoo/hecate
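For reference, one common baseline for shot boundary detection is to compare the intensity histograms of consecutive frames and mark a cut where they differ sharply. The sketch below shows that idea only. It is not necessarily the method used in the linked post or in Hecate. The bin count and threshold are arbitrary assumptions, and frames are plain lists of gray values so that it runs without OpenCV.

```python
def histogram(frame, bins=16):
    """Normalized intensity histogram of a flat list of 0-255 gray values."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // 256] += 1
    total = len(frame)
    return [c / total for c in counts]

def shot_boundaries(frames, threshold=0.5):
    """Return the indices i at which frame i starts a new shot."""
    cuts = []
    prev = None
    for i, frame in enumerate(frames):
        hist = histogram(frame)
        if prev is not None:
            # L1 distance between two normalized histograms lies in [0, 2].
            distance = sum(abs(a - b) for a, b in zip(hist, prev))
            if distance > threshold:
                cuts.append(i)
        prev = hist
    return cuts

# Tiny synthetic "frames": two dark, two bright, one dark.
dark = [10, 20, 30, 25] * 4
bright = [200, 220, 240, 230] * 4
print(shot_boundaries([dark, dark, bright, bright, dark]))   # prints [2, 4]
```

With real video, each frame would come from a decoder such as OpenCV and be converted to grayscale first. The detected cuts could then be used to move a highlight start time from Approach 1 to the nearest shot boundary.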
Approach 3: more time-consuming, technically harder methods
A brief introduction to and analysis of Baidu's BROAD-Video Highlights dataset
https://zhuanlan.zhihu.com/p/31770408
A roundup of 2017 conference papers on Temporal Action Detection
https://zhuanlan.zhihu.com/p/31501316
Video Analysis field overview: Temporal Action Detection
https://zhuanlan.zhihu.com/p/26603387
Video Analysis field overview: Action Recognition
https://zhuanlan.zhihu.com/p/26460437
Temporal Action Detection with Structured Segment Networks
From Dahua Lin's group (The Chinese University of Hong Kong)
https://github.com/yjxiong/action-detection
Based on PyTorch and DenseFlow
UntrimmedNets for Weakly Supervised Action Recognition and Detection
From Dahua Lin's group (The Chinese University of Hong Kong)
https://github.com/wanglimin/UntrimmedNet
https://github.com/yjxiong/caffe/tree/untrimmednet
Based on Caffe