1. Getting to know Google AudioSet (based on a tutorial on using Google's AudioSet dataset)
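The dataset is available in two forms:
1: Text (CSV) files that describe each segment, giving the YouTube video ID, start time, end time, and one or more labels.
2: TensorFlow Record files, called the feature dataset. Frame-level features are stored as tensorflow.SequenceExample protocol buffers, whose layout, written out in text form, is: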
context: {
  feature: {
    key  : "video_id"
    value: {
      bytes_list: {
        value: [YouTube video id string]
      }
    }
  }
  feature: {
    key  : "start_time_seconds"
    value: {
      float_list: {
        value: 6.0
      }
    }
  }
  feature: {
    key  : "end_time_seconds"
    value: {
      float_list: {
        value: 16.0
      }
    }
  }
  feature: {
    key  : "labels"
    value: {
      int64_list: {
        value: [1, 522, 11, 172]  # The meaning of the labels can be found here.
      }
    }
  }
}
feature_lists: {
  feature_list: {
    key  : "audio_embedding"
    value: {
      feature: {
        bytes_list: {
          value: [128 8bit quantized features]
        }
      }
      feature: {
        bytes_list: {
          value: [128 8bit quantized features]
        }
      }
    }
    ...  # Repeated for every second of the segment
  }
}
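As a rough illustration of how the record layout above might be read, the Python sketch below walks one serialized SequenceExample and turns it into plain Python values. It is only a sketch under stated assumptions: TensorFlow must be installed (it is imported lazily inside the function), and the names decode_frame and read_segment are hypothetical helpers made up for this note, not part of the AudioSet release. Each audio_embedding entry is treated as 128 raw bytes, one unsigned 8-bit value per feature.

```python
from typing import Dict, List


def decode_frame(raw: bytes) -> List[int]:
    """Turn one audio_embedding entry (128 quantized bytes) into ints in 0..255."""
    if len(raw) != 128:
        raise ValueError(f"expected 128 bytes, got {len(raw)}")
    return list(raw)  # iterating over bytes yields unsigned 8-bit integers


def read_segment(serialized: bytes) -> Dict[str, object]:
    """Parse one serialized tensorflow.SequenceExample into plain Python values."""
    import tensorflow as tf  # assumption: TensorFlow is available at call time

    example = tf.train.SequenceExample.FromString(serialized)
    ctx = example.context.feature
    frames = example.feature_lists.feature_list["audio_embedding"].feature
    return {
        "video_id": ctx["video_id"].bytes_list.value[0].decode("utf-8"),
        "start_time_seconds": ctx["start_time_seconds"].float_list.value[0],
        "end_time_seconds": ctx["end_time_seconds"].float_list.value[0],
        "labels": list(ctx["labels"].int64_list.value),
        # one 128-value frame per second of the segment
        "audio_embedding": [decode_frame(f.bytes_list.value[0]) for f in frames],
    }
```

To go through a whole file, each record yielded by tf.data.TFRecordDataset(path) could be passed to read_segment as record.numpy(), where path is a placeholder for a local TFRecord file.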