
Notes on building a video and voice chat application with JavaCV + JavaFX + Netty (H264 and AAC encoding/decoding)

Functional requirements

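  • Bind a user ID
  • Invite a user to a video chat
  • Video capture and H264 encoding
  • Audio capture and AAC encoding
  • Audio/video transport over TCP
  • Video decoding and display
  • Audio decoding and playback
  • Switch the chat view
  • Turn video on/off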

Architecture

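米虫VIM is divided into three layers: a UI layer, a communication layer, and a base layer. The UI layer is built with JavaCV + JavaFX and is responsible for audio/video capture as well as encoding and decoding. The communication layer is built on Netty and follows a client/server architecture, so it is split into a server and a client. The base layer is backed by the FFmpeg API exposed through JavaCV.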
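
TCP is a byte stream without message boundaries, so the receiver of encoded audio and video needs some way to tell where one packet ends and the next begins. The sketch below shows one common approach, a length-prefixed frame with a one-byte media type. The `MediaFrame` class, the type constants, and the 5-byte header layout are assumptions made for illustration; the post does not describe the actual wire format.

```java
import java.nio.ByteBuffer;

/** Hypothetical wire format: [1 byte type][4 byte big-endian length][payload]. */
final class MediaFrame {
    static final byte TYPE_VIDEO = 1; // assumed tag for an encoded H264 packet
    static final byte TYPE_AUDIO = 2; // assumed tag for an encoded AAC packet
    static final int HEADER_SIZE = 5;

    final byte type;
    final byte[] payload;

    MediaFrame(byte type, byte[] payload) {
        this.type = type;
        this.payload = payload;
    }

    /** Serializes the frame into one byte array ready to be written to the socket. */
    byte[] encode() {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_SIZE + payload.length);
        buf.put(type).putInt(payload.length).put(payload);
        return buf.array();
    }

    /**
     * Tries to read one frame from the buffer. Returns null and leaves the
     * position untouched when the buffer does not yet hold a complete frame.
     */
    static MediaFrame decode(ByteBuffer in) {
        if (in.remaining() < HEADER_SIZE) {
            return null;
        }
        in.mark();
        byte type = in.get();
        int length = in.getInt();
        if (length < 0) {
            throw new IllegalArgumentException("negative frame length: " + length);
        }
        if (in.remaining() < length) {
            in.reset(); // incomplete frame: rewind and wait for more bytes
            return null;
        }
        byte[] payload = new byte[length];
        in.get(payload);
        return new MediaFrame(type, payload);
    }
}
```

In a Netty pipeline the boundary detection could instead be delegated to `LengthFieldBasedFrameDecoder`, with only the type byte handled by application code.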

Connecting and binding

Before starting an audio/video chat, the client has to connect to the server and bind its own ID.
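
Binding can be thought of as the server keeping a table from user ID to connection, so that an invitation addressed to an ID can be routed to the right client. A minimal sketch follows; `BindingRegistry` and its methods are hypothetical, and the type parameter `C` stands in for whatever connection object the server uses (for example a Netty `Channel`). The post does not show how the server actually stores bindings.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical server-side table of bound user IDs. */
final class BindingRegistry<C> {
    private final ConcurrentMap<String, C> byId = new ConcurrentHashMap<>();

    /** Binds the ID to the connection; returns false if the ID is already taken. */
    boolean bind(String userId, C connection) {
        return byId.putIfAbsent(userId, connection) == null;
    }

    /** Removes the binding only if it still belongs to this connection. */
    boolean unbind(String userId, C connection) {
        return byId.remove(userId, connection);
    }

    /** Finds the connection to which an invitation for this ID should be forwarded. */
    Optional<C> lookup(String userId) {
        return Optional.ofNullable(byId.get(userId));
    }
}
```

Using `putIfAbsent` and the two-argument `remove` keeps binding and unbinding atomic, so a late disconnect from an old connection cannot remove a binding that a newer connection has since taken over.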

Video chat

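Video and voice chat require a video source and an audio source to be configured first, typically a camera and a microphone. The program does not yet enumerate the devices available on the computer, so the device names have to be looked up with the FFmpeg command-line tool: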

ffmpeg -list_devices true -f dshow -i dummy
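
Once a device name is known, FFmpeg's dshow input expects it in the form `video=<name>` or `audio=<name>`, with `dshow` as the input format. The helper below is a small sketch of building those strings; `DshowInput` is a hypothetical class, and the device name in the usage note is only a placeholder for whatever the command above prints.

```java
/** Hypothetical helper that builds FFmpeg dshow input strings from device names. */
final class DshowInput {
    /** Input format name to pass alongside the input string. */
    static final String FORMAT = "dshow";

    static String video(String deviceName) {
        return "video=" + requireName(deviceName);
    }

    static String audio(String deviceName) {
        return "audio=" + requireName(deviceName);
    }

    private static String requireName(String deviceName) {
        if (deviceName == null || deviceName.trim().isEmpty()) {
            throw new IllegalArgumentException("device name must not be empty");
        }
        // No quoting is needed here: quotes are only required by the shell on the command line.
        return deviceName;
    }
}
```

With the grabbers from this post, a call would then look like `VideoGrabber.of(DshowInput.video("Integrated Camera"), DshowInput.FORMAT, options)`.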

Capture implementation

Video capture implementation:

// Imports assume the pre-1.5 JavaCV presets (org.bytedeco.javacpp.*), which match the calls used below.
// H264Sws and FFmpegException come from this project's codec and exception packages.
import java.util.Map;

import static org.bytedeco.javacpp.avcodec.*;
import static org.bytedeco.javacpp.avformat.*;
import static org.bytedeco.javacpp.avutil.*;

public class VideoGrabber {
    private AVFormatContext formatContext;
    private AVInputFormat format;
    private AVCodecContext context;
    private AVCodec codec;
    private AVFrame frame;
    private AVPacket packet;
    private H264Sws h264Sws;
    private int[] got = {0};
    private int width, height;
    private boolean isEnd;
    private int videoIndex = -1;

    public static VideoGrabber of(String input, String format, Map<String, String> dict) {
        VideoGrabber g = new VideoGrabber();
        g.formatContext = avformat_alloc_context();
        if (format != null) {
            g.format = av_find_input_format(format);
        }
        AVDictionary dictionary = new AVDictionary();
        if (dict != null) {
            dict.forEach((k, v) -> av_dict_set(dictionary, k, v, 0));
        }
        int ret = avformat_open_input(g.formatContext, input, g.format, dictionary);
        // Free the options only after opening; freeing them first would discard every option.
        av_dict_free(dictionary);
        if (ret != 0) {
            FFmpegException.asThrow(ret, "Failed to open video input");
        }
        ret = avformat_find_stream_info(g.formatContext, (AVDictionary) null);
        if (ret < 0) {
            FFmpegException.asThrow(ret, "Failed to read video stream info");
        }
        // Pick the first video stream. AVStream.codec() is deprecated in newer FFmpeg releases.
        for (int i = 0; i < g.formatContext.nb_streams(); i++) {
            if (g.formatContext.streams(i).codec().codec_type() == AVMEDIA_TYPE_VIDEO) {
                g.videoIndex = i;
                break;
            }
        }
        if (g.videoIndex == -1) {
            FFmpegException.asThrow("No video stream found");
        }
        g.context = g.formatContext.streams(g.videoIndex).codec();
        g.codec = avcodec_find_decoder(g.context.codec_id());
        if (g.codec == null) {
            FFmpegException.asThrow("No suitable video decoder found");
        }
        ret = avcodec_open2(g.context, g.codec, (AVDictionary) null);
        if (ret != 0) {
            FFmpegException.asThrow(ret, "Failed to open decoder");
        }
        g.width = g.context.width();
        g.height = g.context.height();
        g.frame = av_frame_alloc();
        g.packet = new AVPacket();
        // Convert whatever pixel format the device delivers to YUV420P for the H264 encoder.
        g.h264Sws = H264Sws.of(g.width, g.height, g.context.pix_fmt(), AV_PIX_FMT_YUV420P);
        return g;
    }

    public int width() {
        return width;
    }

    public int height() {
        return height;
    }

    public boolean isEnd() {
        return isEnd;
    }

    /** Returns the next converted frame, or null if this packet produced no video frame. */
    public AVFrame grab() {
        int ret = av_read_frame(formatContext, packet);
        if (ret < 0) {
            isEnd = true;
            return null;
        }
        try {
            if (packet.stream_index() != videoIndex) {
                return null;
            }
            ret = avcodec_decode_video2(context, frame, got, packet);
            if (ret < 0) {
                FFmpegException.asThrow(ret, "avcodec_decode_video2 failed to decode");
            }
            return got[0] != 0 ? h264Sws.scale(frame) : null;
        } finally {
            // Release the packet on every path, including packets from other streams.
            av_packet_unref(packet);
        }
    }

    public void close() {
        if (packet != null) {
            av_packet_unref(packet);
            packet = null;
        }
        if (frame != null) {
            av_frame_free(frame);
            frame = null;
        }
        if (context != null) {
            avcodec_close(context);
            context = null;
        }
        if (formatContext != null) {
            avformat_close_input(formatContext);
            formatContext = null;
        }
        if (h264Sws != null) {
            h264Sws.close();
            h264Sws = null;
        }
    }

    // Safety net only; callers should call close() explicitly.
    @Override
    protected void finalize() throws Throwable {
        try {
            close();
        } finally {
            super.finalize();
        }
    }
}
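
Because `grab()` can return null for a packet that yields no frame while `isEnd()` stays false, a caller has to keep polling until `isEnd()` reports true. The loop below sketches that pattern against a small stand-in interface so it can run without a camera; `CaptureLoop`, `FrameSource`, and `pump` are hypothetical names, not part of the project.

```java
import java.util.function.Consumer;

final class CaptureLoop {
    /** Hypothetical minimal view of a grabber: grab() may return null, isEnd() signals end of input. */
    interface FrameSource<T> {
        T grab();
        boolean isEnd();
    }

    /** Polls the source until it ends and hands every non-null frame to the sink. */
    static <T> int pump(FrameSource<T> source, Consumer<T> sink) {
        int delivered = 0;
        while (!source.isEnd()) {
            T frame = source.grab();
            if (frame == null) {
                continue; // no frame from this packet; try the next one
            }
            sink.accept(frame);
            delivered++;
        }
        return delivered;
    }
}
```

In the real client the sink would be the encoder or sender, and the loop would normally run on its own thread so that capture does not block the JavaFX UI.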

Audio capture implementation:

// Imports assume the pre-1.5 JavaCV presets (org.bytedeco.javacpp.*), which match the calls used below.
// AACSwr and FFmpegException come from this project's codec and exception packages.
import java.util.Map;

import org.bytedeco.javacpp.avcodec;
import org.bytedeco.javacpp.avformat;

import static org.bytedeco.javacpp.avcodec.*;
import static org.bytedeco.javacpp.avformat.*;
import static org.bytedeco.javacpp.avutil.*;

public class AudioGrabber {
    private avformat.AVFormatContext formatContext;
    private avformat.AVInputFormat format;
    private avcodec.AVCodecContext context;
    private avcodec.AVCodec codec;
    private AVFrame frame;
    private avcodec.AVPacket packet;
    private AACSwr aacSwr;
    private int[] got = {0};
    private int channels, sample_rate;
    private boolean isEnd;
    private int audioIndex = -1;

    public static AudioGrabber of(String input, String format, Map<String, String> dict) {
        AudioGrabber g = new AudioGrabber();
        g.formatContext = avformat_alloc_context();
        if (format != null) {
            g.format = av_find_input_format(format);
        }
        AVDictionary dictionary = new AVDictionary();
        if (dict != null) {
            dict.forEach((k, v) -> av_dict_set(dictionary, k, v, 0));
        }
        int ret = avformat_open_input(g.formatContext, input, g.format, dictionary);
        // Free the options only after opening; freeing them first would discard every option.
        av_dict_free(dictionary);
        if (ret != 0) {
            FFmpegException.asThrow(ret, "Failed to open audio input");
        }
        ret = avformat_find_stream_info(g.formatContext, (AVDictionary) null);
        if (ret < 0) {
            FFmpegException.asThrow(ret, "Failed to read audio stream info");
        }
        // Pick the first audio stream. AVStream.codec() is deprecated in newer FFmpeg releases.
        for (int i = 0; i < g.formatContext.nb_streams(); i++) {
            if (g.formatContext.streams(i).codec().codec_type() == AVMEDIA_TYPE_AUDIO) {
                g.audioIndex = i;
                break;
            }
        }
        if (g.audioIndex == -1) {
            FFmpegException.asThrow("No audio stream found");
        }
        g.context = g.formatContext.streams(g.audioIndex).codec();
        g.codec = avcodec_find_decoder(g.context.codec_id());
        if (g.codec == null) {
            FFmpegException.asThrow("No suitable audio decoder found");
        }
        ret = avcodec_open2(g.context, g.codec, (AVDictionary) null);
        if (ret != 0) {
            FFmpegException.asThrow(ret, "Failed to open decoder");
        }
        g.channels = g.context.channels();
        g.sample_rate = g.context.sample_rate();
        g.frame = av_frame_alloc();
        g.packet = new avcodec.AVPacket();
        return g;
    }

    public int channels() {
        return channels;
    }

    public int sample_rate() {
        return sample_rate;
    }

    public boolean isEnd() {
        return isEnd;
    }

    /** Returns the next decoded frame, or null if this packet produced no audio frame. */
    public AVFrame grabFrame() {
        int ret = av_read_frame(formatContext, packet);
        if (ret < 0) {
            isEnd = true;
            return null;
        }
        try {
            if (packet.stream_index() != audioIndex) {
                return null;
            }
            ret = avcodec_decode_audio4(context, frame, got, packet);
            if (ret < 0) {
                FFmpegException.asThrow(ret, "avcodec_decode_audio4 failed to decode");
            }
            // The returned frame is this grabber's own buffer and is overwritten by the next call.
            return got[0] != 0 ? frame : null;
        } finally {
            // Release the packet on every path, including packets from other streams.
            av_packet_unref(packet);
        }
    }

    // If sample_fmt is not S16, the samples need to be resampled:
    // public byte[] grab() {
    //     AVFrame frame = grabFrame();
    //     if (frame != null) {
    //         if (aacSwr == null) {
    //             aacSwr = AACSwr.of(context.channels(), context.sample_fmt(), context.sample_rate(), /*frame.nb_samples(),*/
    //                     context.channels(), AV_SAMPLE_FMT_S16, context.sample_rate());
    //         }
    //         return aacSwr.convert(frame);
    //     }
    //     return null;
    // }

    public void close() {
        if (packet != null) {
            av_packet_unref(packet);
            packet = null;
        }
        if (frame != null) {
            av_frame_free(frame);
            frame = null;
        }
        if (context != null) {
            avcodec_close(context);
            // avcodec_free_context(context);
            context = null;
        }
        if (formatContext != null) {
            avformat_close_input(formatContext);
            formatContext = null;
        }
        if (aacSwr != null) {
            aacSwr.close();
            aacSwr = null;
        }
    }

    // Safety net only; callers should call close() explicitly.
    @Override
    protected void finalize() throws Throwable {
        try {
            close();
        } finally {
            super.finalize();
        }
    }
}
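
The commented-out `grab()` above refers to converting decoded samples to S16. To illustrate what that conversion means at the sample level, here is a pure-Java sketch that turns planar float samples (one array per channel, values nominally in [-1, 1]) into interleaved signed 16-bit samples. It only demonstrates the format change and does not change the sample rate. It is not how `AACSwr` works (that class is not shown in this post), and `SampleFormat` is a hypothetical name; real code would normally leave this to FFmpeg's libswresample.

```java
/** Illustration only: converts planar float audio to interleaved signed 16-bit audio. */
final class SampleFormat {
    static short[] planarFloatToInterleavedS16(float[][] planes) {
        int channels = planes.length;
        int samples = channels == 0 ? 0 : planes[0].length;
        for (float[] plane : planes) {
            if (plane.length != samples) {
                throw new IllegalArgumentException("all channels must have the same length");
            }
        }
        short[] out = new short[channels * samples];
        for (int s = 0; s < samples; s++) {
            for (int c = 0; c < channels; c++) {
                // Clamp first so out-of-range input saturates instead of wrapping around.
                float v = Math.max(-1.0f, Math.min(1.0f, planes[c][s]));
                out[s * channels + c] = (short) Math.round(v * 32767.0f);
            }
        }
        return out;
    }
}
```

For stereo input the output order is L0, R0, L1, R1, and so on, which is the packed layout that S16 denotes in FFmpeg.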

Code structure

Client code structure (covering audio/video capture, encoding, sending, receiving, and decoding):

.
├── App.java
├── ImClient.java
├── codec
│   ├── AACDecoder.java
│   ├── AACEncoder.java
│   ├── AACSwr.java
│   ├── H264Decoder.java
│   ├── H264Encoder.java
│   ├── H264Sws.java
│   └── Rgb24Render.java
├── exception
│   └── FFmpegException.java
├── grabber
│   ├── AudioGrabber.java
│   └── VideoGrabber.java
├── handler
│   ├── PacketHandler.java
│   └── PingHandler.java
├── im
│   ├── ImContext.java
│   ├── ImDebug.java
│   └── ImListener.java
├── stream
│   ├── AudioPlayer.java
│   ├── AudioReceiver.java
│   ├── AudioSender.java
│   ├── StreamReceiver.java
│   ├── StreamSender.java
│   ├── VideoReceiver.java
│   └── VideoSender.java
└── ui
    └── MainController.java
7 directories, 25 files