Real-Time Human Detection in Edge Video Streams (Part 3)
III. System Architecture
As illustrated in Fig. 1, the proposed automatic surveillance system architecture consists of three hierarchical layers: edge computing, fog computing, and cloud computing. This design pushes computing tasks closer to the data source.
● Edge Computing Layer: On-site edge computing devices perform low-level feature extraction tasks such as human detection and object tracking. Because the raw video stream is both generated and processed locally at the edge, only the extracted object features and tracking data are transmitted to nodes in the next layer up, the fog computing layer.
● Fog Computing Layer: Intermediate-level pattern recognition tasks, such as action recognition and semantic description, run on near-site fog computing nodes such as laptops, tablets, or smartphones. By analyzing the extracted object features and tracking data, these nodes construct human activity models and submit them to the cloud computing server for high-level tasks such as decision-making, alert generation, and long-term historical profile building and analysis.
● Cloud Computing Layer: This layer handles high-level tasks that are often computationally intensive, such as machine learning algorithms or decision-making reasoning. A human behavior profile is built from a thorough analysis of the extracted activity data; this profile is critical for describing interactive events. Pre-defined or trained event classification policies are then applied, and if a suspicious or anomalous event is identified, an alert is generated for the end user.
Following the proposed hierarchical architecture, we have implemented the human detection and tracking mechanisms on the edge to verify the efficiency and effectiveness of low-level processing.
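The division of labor among the three layers can be sketched as a small pipeline. This is only an illustration of the data flow described above: the names `TrackReport`, `edge_layer`, `fog_layer`, and `cloud_layer`, the "moving"/"loitering" labels, and the thresholds are all hypothetical and do not come from the system itself.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x, y, w, h), a hypothetical box format


@dataclass
class TrackReport:
    """What an edge node sends upward: track data only, never raw frames."""
    track_id: int
    boxes: List[Box]


def edge_layer(detections_per_frame: List[Box]) -> TrackReport:
    """Low level: turn per-frame detections into a compact track report."""
    return TrackReport(track_id=1, boxes=list(detections_per_frame))


def fog_layer(report: TrackReport) -> dict:
    """Intermediate level: derive a simple activity description from the track."""
    (x0, y0, _, _), (x1, y1, _, _) = report.boxes[0], report.boxes[-1]
    displacement = abs(x1 - x0) + abs(y1 - y0)
    action = "moving" if displacement > 10 else "loitering"  # hypothetical rule
    return {"track_id": report.track_id, "action": action, "frames": len(report.boxes)}


def cloud_layer(activity: dict, max_loiter_frames: int = 3) -> Optional[str]:
    """High level: apply a pre-defined policy and return an alert if it matches."""
    if activity["action"] == "loitering" and activity["frames"] >= max_loiter_frames:
        return f"ALERT: track {activity['track_id']} loitering"
    return None


boxes = [(100, 50, 40, 80), (101, 50, 40, 80), (102, 51, 40, 80)]
alert = cloud_layer(fog_layer(edge_layer(boxes)))
print(alert)
```

The point of the sketch is the shape of the messages: only the compact track report crosses from the edge to the fog layer, and only the activity summary reaches the cloud layer.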
IV. Human Detection and Tracking Algorithms (★)
We have implemented both human detection and KCF tracking algorithms, which take advantage of discriminative HOG features. The basic principle of the HOG-feature-based detection and tracking algorithms is described below.
A. HOG+SVM Based Human Detection
The HOG+SVM based human detection algorithm uses contour as the HOG feature for distinguishing human beings from non-human objects, on the assumption that people have similar contours even when they wear different clothing. The feature extraction and object detection steps are shown in Fig. 2 and discussed below.
● Gradient computation: The first step in computing the feature descriptor is to normalize the color and gamma values, after which the magnitude and orientation of the gradient are calculated.
● Weighted vote in orientation cell: The image inside each sliding detection window is divided into cells, and a histogram is created for each cell. Every pixel in a cell casts a weighted vote for an orientation-based histogram channel, according to the values calculated in the gradient computation.
● Contrast normalization: To account for changes in illumination and contrast, gradient strengths are locally normalized by grouping cells into larger, spatially connected blocks that overlap.
● HOG collection: In this step, the components of the normalized cell histograms from all block regions are concatenated into a single vector, which forms the HOG descriptor.
● Linear SVM: The HOG descriptor, which contains the extracted HOG feature vectors, is fed to a linear SVM for human/non-human classification.
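The five steps above can be sketched with NumPy. This is a simplified illustration, not the implementation used in the system: the square-root gamma compression, 8×8 cells, 2×2-cell blocks, 9 unsigned orientation bins, and hard (non-interpolated) voting are assumed, commonly used settings, and the SVM weights `w` and bias `b` are hypothetical placeholders for a trained model.

```python
import numpy as np


def hog_descriptor(img, cell=8, block=2, bins=9):
    """Simplified HOG: gradients -> cell histograms -> block normalization -> vector."""
    img = np.sqrt(img.astype(np.float64))         # gamma normalization (assumed sqrt)
    gy, gx = np.gradient(img)                     # gradients along rows and columns
    mag = np.hypot(gx, gy)                        # gradient magnitude
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0  # unsigned orientation in [0, 180)

    n_cy, n_cx = img.shape[0] // cell, img.shape[1] // cell
    bin_idx = np.minimum((ang / (180.0 / bins)).astype(int), bins - 1)
    hist = np.zeros((n_cy, n_cx, bins))
    for cy in range(n_cy):
        for cx in range(n_cx):
            sl = (slice(cy * cell, (cy + 1) * cell), slice(cx * cell, (cx + 1) * cell))
            # Each pixel votes for its orientation bin, weighted by gradient magnitude.
            hist[cy, cx] = np.bincount(
                bin_idx[sl].ravel(), weights=mag[sl].ravel(), minlength=bins
            )

    blocks = []
    for by in range(n_cy - block + 1):            # blocks overlap by one cell
        for bx in range(n_cx - block + 1):
            v = hist[by:by + block, bx:bx + block].ravel()
            blocks.append(v / np.sqrt(np.sum(v ** 2) + 1e-6))  # L2 normalization
    return np.concatenate(blocks)                 # the HOG descriptor


def linear_svm_predict(descriptor, w, b):
    """Linear SVM decision: human if w . x + b > 0 (w and b come from training)."""
    return float(descriptor @ w + b) > 0


rng = np.random.default_rng(0)
window = rng.random((128, 64))                    # stand-in for a 64x128 detection window
descriptor = hog_descriptor(window)
print(descriptor.shape)
```

With these settings a 64×128 window yields 15 × 7 blocks of 36 values each, so the descriptor has 3780 components.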
B. Kernelized Correlation Filters Tracking Algorithm
KCF was inspired by successful applications of the correlation filter in tracking [9]. Compared with more complicated approaches, correlation filters have proved competitive while requiring less computational power. KCF-based object detection can be defined as the problem of determining an object's position through template matching: a correlation with a special filter h is computed, and the maximum value is then searched for in the resulting correlated image c [18]:
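Written out from the symbol definitions that follow, and offered as a reconstruction rather than the source's exact typesetting, Eq. (1) takes the form:

```latex
\begin{equation}
c = s \circ h, \qquad (x, y)^{*} = \arg\max_{(x, y)} c(x, y) \tag{1}
\end{equation}
```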
- c : Correlated image.
- s : Image region for searching.
- h : Filter generated from the object template.
- ◦ : Operator that computes the two-dimensional correlation.
- (x,y)* : The target object position, corresponding to the maximum of the correlated image c.
[Note: argmax denotes the argument of the maximum. argmax(f(x)) is the point x (or set of points x) at which f(x) attains its maximum value.]
The correlation filter h is calculated by using ridge regression to minimize the squared error over a template t:
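One plausible form of Eq. (2), assembled from the symbol definitions that follow, is given here. Summing the squared error per channel and using a squared-norm regularizer are assumptions, so this should be read as a sketch of the ridge regression objective rather than the exact original expression:

```latex
\begin{equation}
\min_{h} \; \sum_{i=1}^{c} \left\| f(x_i) - g \right\|^{2}
  + \lambda \sum_{i=1}^{c} \left\| h_i \right\|^{2},
\qquad f(x_i) = t_i \circ h_i \tag{2}
\end{equation}
```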
- λ : Regularization parameter, as in the SVM.
- f(xi)=ti◦hi : Correlation function between the template and filter images.
- c : Channels of the two-dimensional images (not the correlated image c of Eq. (1)).
- g : Two-dimensional Gaussian distribution function, g(u,v)=exp[-(u²+v²)/(2σ²)].
The optimization problem of Eq. (2) is to find a filter h whose correlation with the object template t deviates as little as possible from the Gaussian distribution function g. To work in the frequency domain, Eq. (2) can be transformed directly into its Fourier expression:
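In terms of the symbols defined next, the standard closed-form ridge regression solution in the Fourier domain has the following shape. It is written per channel with the channel index dropped, the division is element-wise, and it is a reconstruction rather than a copy of the original Eq. (3):

```latex
\begin{equation}
H = \frac{G \odot T^{*}}{T \odot T^{*} + \lambda} \tag{3}
\end{equation}
```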
- X* : Complex conjugate of X.
- ⊙ : Element-wise product operator.
- H : Filter in the Fourier domain.
- T : Object template in the Fourier domain.
- G : Gaussian function in the Fourier domain.
Using expressions (1) and (3), the correlation image C in the Fourier domain can be calculated as follows:
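Assuming S denotes the search region s in the Fourier domain (a symbol not defined in the text), and taking the filter as H = (G ⊙ T*) / (T ⊙ T* + λ), a reconstruction of Eq. (4) is:

```latex
\begin{equation}
C = S \odot H = \frac{S \odot G \odot T^{*}}{T \odot T^{*} + \lambda} \tag{4}
\end{equation}
```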
Given expressions (1) and (4), the object tracking algorithm is finally:
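Taking the argmax of Eq. (1) over the inverse transform of the Fourier-domain correlation image gives the following reconstruction of Eq. (5), where S is assumed to be the search region in the Fourier domain:

```latex
\begin{equation}
(x, y)^{*} = \arg\max_{(x, y)} \;
  \mathcal{F}^{-1}\!\left( \frac{S \odot G \odot T^{*}}{T \odot T^{*} + \lambda} \right) \tag{5}
\end{equation}
```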
where Ƒ^-1() denotes the inverse discrete Fourier transform (DFT).
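The train-then-locate procedure can be tried out with a minimal NumPy sketch. It implements only a single-channel linear correlation filter of the form H = (G ⊙ T*) / (T ⊙ T* + λ), not the full kernelized tracker: there are no HOG channels, no kernel, no cosine window, and no model update. The function names, `sigma`, `lam`, and the synthetic random template are all assumptions made for illustration.

```python
import numpy as np


def gaussian_target(shape, sigma=2.0):
    """Desired response g: a 2-D Gaussian peaked at the window centre."""
    h, w = shape
    v, u = np.mgrid[0:h, 0:w]
    return np.exp(-((u - w // 2) ** 2 + (v - h // 2) ** 2) / (2.0 * sigma ** 2))


def train_filter(template, g, lam=1e-2):
    """Fourier-domain ridge regression: H = (G * conj(T)) / (T * conj(T) + lambda)."""
    T = np.fft.fft2(template)
    G = np.fft.fft2(g)
    return (G * np.conj(T)) / (T * np.conj(T) + lam)


def locate(search, H):
    """Correlate in the Fourier domain and return the response peak as (row, col)."""
    C = np.fft.fft2(search) * H              # correlation image in the Fourier domain
    response = np.real(np.fft.ifft2(C))      # inverse DFT back to the spatial domain
    return np.unravel_index(np.argmax(response), response.shape)


rng = np.random.default_rng(0)
template = rng.random((64, 64))              # synthetic stand-in for an object template
H = train_filter(template, gaussian_target(template.shape))

# Simulate the target moving 5 pixels down and 3 pixels left (circular shift).
search = np.roll(template, shift=(5, -3), axis=(0, 1))
row, col = locate(search, H)
dy, dx = row - 32, col - 32                  # offset of the peak from the window centre
print(dy, dx)                                # prints "5 -3"
```

Because the Gaussian target is centred in the window, the offset of the response peak from the centre is the estimated displacement of the object between frames.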