Audio Digital Watermarking Based on the Short-Time Fourier Transform
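Reference material on digital watermarking for audio files is relatively scarce. After a brief experiment, I implemented watermark embedding and extraction by imitating the way DWT-based algorithms embed watermarks in image files, using the short-time Fourier transform (STFT) functions in the librosa library. The procedure is simple and needs very little code. There are two drawbacks. First, the STFT coefficients of the audio are complex64 while the image pixels are uint8, so some data is inevitably lost along the way. Second, the method currently has no theoretical justification.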
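In symbols, the scheme described above can be summarized roughly as follows. The notation is introduced here for illustration only and does not come from the code: $S$ is the STFT matrix of the original audio, $W$ is the watermark image resized to the shape of $S$, $k$ is the embedding strength, and $\tilde{y}$ is the watermarked audio.

```latex
\begin{aligned}
\tilde{S} &= S + k\,W, \\
\tilde{y} &= \mathrm{ISTFT}(\tilde{S}), \\
\hat{W}   &= \frac{\mathrm{STFT}(\tilde{y}) - S}{k}.
\end{aligned}
```

Extraction is therefore non-blind: it needs the original audio to compute $S$. The estimate $\hat{W}$ equals $W$ only when $\mathrm{STFT}(\mathrm{ISTFT}(\tilde{S})) = \tilde{S}$, which does not hold in general for an arbitrarily modified coefficient matrix, so some loss is to be expected even before the cast to uint8.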
Embedding
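    import librosa
    import soundfile
    import cv2

    y, sr = librosa.load("C:\\Users\\split.wav")
    # Read the watermark image as grayscale (flag 0)
    waterImg = cv2.imread("C:\\Users\\logo.bmp", 0)

    # stft: short-time Fourier transform; a has shape (frequency bins, frames)
    a = librosa.stft(y)
    w = a.shape[0]  # rows: frequency bins
    h = a.shape[1]  # columns: frames
    # cv2.resize takes (width, height), so the result has the same shape as a
    waterImg = cv2.resize(waterImg, (h, w))
    print(type(a[0][2]))  # numpy.complex64

    key = 0.001  # embedding strength
    r_a = key * waterImg + a  # the scaled image is added to the real part

    # istft: inverse short-time Fourier transform
    b = librosa.istft(r_a)
    soundfile.write("C:\\Users\\new.wav", b, sr)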
Spectrogram comparison (key = 0.25): original audio vs. watermarked audio
Extraction
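    import librosa
    import numpy as np
    from PIL import Image

    y1, sr1 = librosa.load("C:\\Users\\split.wav")  # original audio
    y2, sr2 = librosa.load("C:\\Users\\new.wav")    # watermarked audio
    a = librosa.stft(y1)
    r_a = librosa.stft(y2)

    key = 0.001  # must match the embedding strength
    waterImg = (r_a - a) / key
    # The watermark was added to the real part, so take it explicitly before
    # the cast. Note that the cast does not clip values outside 0..255.
    waterImg = np.array(np.real(waterImg), dtype='uint8')
    waterImg = Image.fromarray(waterImg)
    # Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter
    waterImg = waterImg.resize((100, 100), Image.LANCZOS)
    waterImg = waterImg.convert("L")
    waterImg.save('C:\\Users\\57882\\Desktop\\audio.bmp')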
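The embed-then-extract round trip can be tried without any audio or image files. The following is a minimal sketch under stated assumptions, not librosa's implementation: `stft` and `istft` are small hand-written NumPy functions (centered, periodic Hann window, overlap-add inverse), `N_FFT` and `HOP` are arbitrary small values, and both the "audio" and the "watermark" are random stand-ins. It checks that the unmodified round trip is exact and measures how far the extracted watermark is from the embedded one.

```python
import numpy as np

N_FFT = 64  # frame length (hypothetical; much smaller than librosa's default)
HOP = 16    # hop length, N_FFT // 4


def hann(n_fft):
    """Periodic Hann window of length n_fft."""
    return np.hanning(n_fft + 1)[:-1]


def stft(x, n_fft=N_FFT, hop=HOP):
    """Minimal centered STFT; returns an array of shape (bins, frames)."""
    padded = np.pad(x, n_fft // 2, mode="reflect")
    n_frames = 1 + (len(padded) - n_fft) // hop
    window = hann(n_fft)
    frames = np.stack(
        [padded[i * hop:i * hop + n_fft] * window for i in range(n_frames)],
        axis=1,
    )
    return np.fft.rfft(frames, axis=0)


def istft(spec, n_fft=N_FFT, hop=HOP):
    """Overlap-add inverse of stft(), normalized by the summed squared window."""
    window = hann(n_fft)
    frames = np.fft.irfft(spec, n=n_fft, axis=0)
    n_frames = spec.shape[1]
    out = np.zeros(n_fft + hop * (n_frames - 1))
    norm = np.zeros_like(out)
    for i in range(n_frames):
        out[i * hop:i * hop + n_fft] += frames[:, i] * window
        norm[i * hop:i * hop + n_fft] += window ** 2
    out = out / np.maximum(norm, 1e-12)
    return out[n_fft // 2:-(n_fft // 2)]  # drop the reflect padding


rng = np.random.default_rng(0)
audio = rng.standard_normal(1024)  # stand-in for the audio samples
spec = stft(audio)
watermark = rng.integers(0, 256, size=spec.shape).astype(np.float64)

key = 0.001
marked_audio = istft(spec + key * watermark)          # embed, back to audio
recovered = np.real(stft(marked_audio) - spec) / key  # extract

roundtrip_ok = np.allclose(istft(spec), audio)
rel_err = np.linalg.norm(recovered - watermark) / np.linalg.norm(watermark)
print("clean round trip exact:", roundtrip_ok)
print("relative watermark error:", rel_err)
```

In this toy setup the clean round trip is exact, but the extracted watermark differs noticeably from the embedded one, because the modified coefficient matrix is generally not the STFT of any real signal. The size of the error here should not be read as a prediction for librosa or for real audio.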
Post-processing the extracted image
Binarization
    from PIL import Image

    Img_path = 'C:\\Users\\audio.bmp'
    Img = Image.open(Img_path)
    Img = Img.convert('L')

    threshold = 2  # the threshold used here is 2
    # Lookup table: gray levels below the threshold map to 0, all others to 255
    table = []
    for i in range(256):
        if i < threshold:
            table.append(0)
        else:
            table.append(255)

    # Apply the table and convert to a 1-bit image
    waterImg = Img.point(table, '1')
    waterImg.save('C:\\Users\\57882\\Desktop\\2valuelogo.bmp')
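The lookup table above amounts to a single comparison per pixel. As a small illustration, the same rule can be written in vectorized NumPy form; the 2 × 3 array `gray` is a made-up patch standing in for the extracted image.

```python
import numpy as np

threshold = 2
# Hypothetical grayscale patch standing in for the extracted image
gray = np.array([[0, 1, 2], [3, 128, 255]], dtype=np.uint8)

# Levels below the threshold become 0 (black), everything else 255 (white)
binary = np.where(gray < threshold, 0, 255).astype(np.uint8)
print(binary.tolist())  # [[0, 0, 255], [255, 255, 255]]
```

With a threshold of 2, only pixel values 0 and 1 stay black, so almost any nonzero response in the extracted image is turned white.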
Image enhancement and inversion
    import cv2
    import numpy as np


    # Inversion: convert to grayscale, then swap black and white
    def reverse(img):
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        dst = 255 - gray
        return dst


    image = cv2.imread("C:\\Users\\2valuelogo.bmp")
    # 3x3 sharpening kernel
    kernel = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], np.float32)
    dst = cv2.filter2D(image, -1, kernel=kernel)
    dst = reverse(dst)
    cv2.imwrite('C:\\Users\\newpic.bmp', dst)
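To see what the sharpening kernel and the inversion do to individual pixels, here is a NumPy-only sketch of the same two steps. It is an approximation written for illustration, not OpenCV's code: `sharpen_and_invert` is a hypothetical helper, the border is handled with NumPy's `reflect` padding (assumed to match `cv2.filter2D`'s default border mode), and results are saturated to 0..255 as an 8-bit filter output would be.

```python
import numpy as np

KERNEL = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=np.float64)


def sharpen_and_invert(gray):
    """Apply the 3x3 sharpening kernel, saturate to 0..255, then invert."""
    padded = np.pad(gray.astype(np.float64), 1, mode="reflect")
    rows, cols = gray.shape
    acc = np.zeros((rows, cols))
    # Sum the nine shifted copies of the image, weighted by the kernel
    for dy in range(3):
        for dx in range(3):
            acc += KERNEL[dy, dx] * padded[dy:dy + rows, dx:dx + cols]
    sharpened = np.clip(np.rint(acc), 0, 255).astype(np.uint8)
    return 255 - sharpened


flat = np.full((4, 4), 100, dtype=np.uint8)              # uniform gray patch
edge = np.array([[0, 0, 255, 255]] * 4, dtype=np.uint8)  # black/white edge
print(sharpen_and_invert(flat))
print(sharpen_and_invert(edge))
```

The kernel weights sum to 1, so the flat patch is unchanged by sharpening and only inverted. In this sketch a strictly 0/255 patch also passes through the sharpening step unchanged once the result is saturated, so for an already binarized image the visible change comes from the inversion.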
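Example results (watermark image: 100 × 100)
Extracted image:
After binarization (threshold = 2)
After image enhancement and inversion
Original watermark image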
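Update, 2022.3.16
This version uses a binarized watermark image and embeds the information in the imaginary part of the complex coefficients, which reduces information loss.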
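The idea of the update can be illustrated in the coefficient domain alone. This is a minimal sketch under stated assumptions: `coeffs` is a random complex64 array standing in for the STFT matrix, `binary` is a random 0/255 array standing in for the binarized watermark, and the ISTFT/STFT round trip through the audio file is left out entirely.

```python
import numpy as np

rng = np.random.default_rng(0)
shape = (6, 4)
# Stand-in for the STFT matrix of the audio
coeffs = (rng.standard_normal(shape)
          + 1j * rng.standard_normal(shape)).astype(np.complex64)
# Stand-in for the binarized watermark (pixel values 0 or 255)
binary = rng.choice(np.array([0, 255], dtype=np.uint8), size=shape)

key1, key2 = 0.025, 0.02
# Scaled pixels go into the imaginary part; the real part is left alone
marked = coeffs + key2 * (1j * key1 * binary.astype(np.float64))

# Extraction: difference, undo key2, take the imaginary part, undo key1
recovered = np.imag((marked - coeffs) / key2) / key1
recovered = np.rint(recovered).astype(np.uint8)

print(np.allclose(marked.real, coeffs.real))
print(np.array_equal(recovered, binary))
```

The two keys only ever appear as a product, so the effective embedding strength is key1 × key2 = 0.0005. In this idealized setting the watermark comes back exactly; with real audio, the inverse and forward transforms in between introduce the loss that the post-processing steps have to clean up.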
Embedding
    import librosa
    import cv2
    import soundfile
    import numpy as np

    y, sr = librosa.load('C:\\Users\\57882\\split.wav')
    # Binarized watermark image, read as grayscale
    waterImg = cv2.imread('C:\\Users\\2value.bmp', 0)
    stft = librosa.stft(y, n_fft=2048, hop_length=None, win_length=None, window='hann', center=True, pad_mode='reflect')
    w = int(stft.shape[0])
    h = int(stft.shape[1])

    key1 = 0.025
    waterImg = cv2.resize(waterImg, (h, w))
    # Put the scaled pixel values in the imaginary part; the real part is 0
    Img = np.array(waterImg, dtype='complex64')
    for i in range(w):
        for j in range(h):
            Img[i][j] = complex(0, int(waterImg[i][j]) * key1)

    key2 = 0.02  # the effective embedding strength is key1 * key2
    stft_new = key2 * Img + stft
    Y = librosa.istft(stft_new)
    soundfile.write("C:\\Users\\new1.wav", Y, sr)
Extraction
    import librosa
    import numpy as np
    from PIL import Image

    Y, Sr = librosa.load('C:\\Users\\new1.wav')   # watermarked audio
    y, sr = librosa.load('C:\\Users\\split.wav')  # original audio
    stft1 = librosa.stft(Y, n_fft=2048, hop_length=None, win_length=None, window='hann', center=True, pad_mode='reflect')
    stft2 = librosa.stft(y, n_fft=2048, hop_length=None, win_length=None, window='hann', center=True, pad_mode='reflect')

    # Both keys must match the values used for embedding
    key2 = 0.02
    key1 = 0.025
    Img = (stft1 - stft2) / key2
    Img = np.imag(Img)  # the watermark was embedded in the imaginary part
    Img = np.array((Img / key1), dtype='uint8')
    waterImg = Image.fromarray(Img)
    # Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter
    waterImg = waterImg.resize((100, 100), Image.LANCZOS)
    waterImg = waterImg.convert("L")
    waterImg.save('C:\\Users\\audionew.bmp')
Example results
After image enhancement and inversion