Activity Recognition from Silhouettes using Linear Systems and Model (In)validation Techniques
Paper reading notes by Duanxx
——2015-04-27
——Duanxx
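This paper works on video data. Using a system that can describe specific human activities, it provides a model validation approach to gait recognition.
First, a suitable representation is built for every frame of every video sequence:
For each frame, two ways of describing the human silhouette are considered: 1. Fourier Descriptors 2. vectors of widths.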
Then each sequence is modeled as a linear time invariant (LTI) system that captures how the frame description vectors evolve over time.
Finally, an SVM is used for activity recognition and model validation.
The main contribution of this work is an activity recognition model, together with a performance evaluation of that model in two different feature spaces.
Proposed method
In section 3.1 two different methods for representing the contours will be described. Section 3.2 will explain the system that will be used to model the activities. The classification method will be presented in section 3.4.
Contour Representation
- Silhouette Width
Given the contour of a human's image, the silhouette width is the vector that contains, for each $x$, the difference $\max\{y\} - \min\{y\}$, where $x$ denotes the horizontal coordinate and $y$ the vertical one from the pair $(x, y)$. It was first proposed in [4], the paper "Silhouette-based Human Identification from Body Shape and Gait".
- Fourier Descriptors
Representing each contour point (x, y) as a complex number z = x + iy, the Fourier coefficients of the z sequence are computed, and only the coefficients with the highest amplitudes are stored. In other words, the contour is moved into the frequency domain.
In doing so, the information carried by the initial sequences is transformed into a compact form that preserves the important description of the activities and of the individuals. The purpose of the contour representation is thus to compress the raw motion data.
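The two descriptors above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the function names, the `keep` parameter, and the toy rectangular contour are assumptions made for the example, and the width follows the per-x reading of the definition given above.

```python
import numpy as np

def silhouette_width(contour):
    """For each distinct x, return max{y} - min{y} over the contour points."""
    contour = np.asarray(contour)
    xs = np.unique(contour[:, 0])
    return np.array([
        contour[contour[:, 0] == x, 1].max() - contour[contour[:, 0] == x, 1].min()
        for x in xs
    ])

def fourier_descriptors(contour, keep=4):
    """Keep the `keep` largest-amplitude Fourier coefficients of z = x + iy.

    Returns the frequency indices (sorted) and the matching coefficients.
    """
    contour = np.asarray(contour, dtype=float)
    z = contour[:, 0] + 1j * contour[:, 1]
    coeffs = np.fft.fft(z) / len(z)          # normalized so coeffs[0] is the centroid
    largest = np.argsort(np.abs(coeffs))[::-1][:keep]
    idx = np.sort(largest)
    return idx, coeffs[idx]

# Toy contour (hypothetical data): boundary of a rectangle 4 wide and 2 tall.
rect = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (4, 1),
        (4, 2), (3, 2), (2, 2), (1, 2), (0, 2), (0, 1)]

widths = silhouette_width(rect)
idx, desc = fourier_descriptors(rect, keep=4)
```

Keeping only a handful of coefficients is what makes the representation compact: a 12-point contour is summarized here by 4 complex numbers, the first of which is simply the centroid of the shape.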
Activity Modeling
Variations in the shape of the silhouette carry distinctive information about the activity being performed by an individual. Given a time series of feature vectors representing the silhouette over a sequence of frames, the temporal changes in these feature vectors can be used for gait recognition and classification.
To this effect, the feature vector series are modeled as the output of a causal discrete linear shift invariant system. Further, given the quasi-periodic nature of human gait, the identification and model reduction method for neutrally stable systems proposed by Sznaier et al. [13] can be applied directly. Reference [13] is the paper "Finite horizon model reduction of a class of neutrally stable systems with applications to texture synthesis and recognition".
Gait as Linear Time Invariant Neutrally Stable System
The sequence of feature vector trajectories $\{y_k\}$ is assumed to be the unit impulse response of an unknown causal discrete-time linear shift invariant system. It is well known that, from an input-output point of view, such a system can be represented by its minimal state-space realization:
$$x_{k+1} = A x_k + B u_k, \qquad y_k = C x_k + D u_k \qquad (1)$$
Since human gait can be regarded as a repetitive motion, an additional constraint can be imposed: $A^{T_o} = I$, where $T_o$ is the period found using the method described in section 3.2.2 and $I$ is the identity matrix of appropriate size.
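To make the periodicity constraint concrete, here is a small sketch of a neutrally stable system whose state matrix returns to the identity after one period. The rotation matrix, the period of 8 frames, and the `B` and `C` values are assumptions chosen for illustration; they are not taken from the paper.

```python
import numpy as np

T_o = 8                                   # hypothetical gait period, in frames
theta = 2 * np.pi / T_o
# A rotation by 2*pi/T_o: its eigenvalues lie on the unit circle (neutrally stable).
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
B = np.array([[1.0], [0.0]])
C = np.array([[1.0, 0.5]])

def impulse_response(A, B, C, n):
    """Return y_k = C A^(k-1) B for k = 1..n (D is taken as zero here)."""
    x = B.copy()
    out = []
    for _ in range(n):
        out.append((C @ x).item())
        x = A @ x
    return np.array(out)

y = impulse_response(A, B, C, 2 * T_o)

# The constraint A^{T_o} = I holds, so the output repeats every T_o samples.
period_ok = np.allclose(np.linalg.matrix_power(A, T_o), np.eye(2))
repeats = np.allclose(y[:T_o], y[T_o:])
```

Because the eigenvalues sit exactly on the unit circle, the response neither decays nor grows, which is the behavior one would expect from an idealized, perfectly periodic gait.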
With the feature vector sequence expressed this way, what remains is to determine A, B, C, D, x and u in equation (1).
Taking the sequence of feature vectors over one whole period, the following block matrix is formed:
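The next step relies on the singular value decomposition (SVD). I found an English-language tutorial that is easy to follow and explains the principle and applications of the SVD from a geometric and image-based point of view. I think it is very good; the link is: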
http://www.ams.org/samplings/feature-column/fcarc-svd
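There are also Chinese blog posts that translate that article and add their own interpretation, so I will not reinvent the wheel here. Reference links: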
http://www.cnblogs.com/LeftNotEasy/archive/2011/01/19/svd-and-applications.html
http://blog.sciencenet.cn/blog-696950-699432.html
Assuming $\sigma_r > \sigma_{r+1}$, with the singular values sorted in decreasing order, the system matrices of the reduced-order model with r states are given by equations (5) to (8) of the paper. The authors set r to 15 in their experiments, an empirical choice that keeps the 15 leading singular vectors as the dominant ones.
$U_r$ ($V_r$) denotes the submatrix formed by the first r columns (rows) of $U$ ($V^T$). The two remaining symbols in those equations denote the first p × r block of $U_r$ and the first r × 1 block of $V_r^T$, respectively.
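Since equations (5) to (8) are not reproduced in these notes, the sketch below uses the classical Ho-Kalman construction as a stand-in: it builds a block Hankel matrix from an impulse response, truncates its SVD to r states, and reads the system matrices off the truncated factors. It is a swapped-in textbook technique, not necessarily the exact formulas of Sznaier et al. [13], and the function names, dimensions, and test system are assumptions.

```python
import numpy as np

def markov_parameters(A, B, C, n):
    """Return [h_1, ..., h_n] with h_k = C A^(k-1) B, each of shape (p, m)."""
    h, x = [], B.copy()
    for _ in range(n):
        h.append(C @ x)
        x = A @ x
    return h

def ho_kalman(h, rows, cols, r):
    """Rank-r realization (A, B, C) from Markov parameters via a Hankel SVD."""
    p, m = h[0].shape
    H = np.block([[h[i + j] for j in range(cols)] for i in range(rows)])
    H_shift = np.block([[h[i + j + 1] for j in range(cols)] for i in range(rows)])
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    Ur, Vtr = U[:, :r], Vt[:r, :]          # first r columns of U, first r rows of V^T
    s_half = np.sqrt(s[:r])
    # A = S^(-1/2) U_r^T H_shift V_r S^(-1/2)
    A = (Ur.T @ H_shift @ Vtr.T) / np.outer(s_half, s_half)
    C = (Ur * s_half)[:p, :]               # first p x r block of U_r S^(1/2)
    B = (s_half[:, None] * Vtr)[:, :m]     # first r x m block of S^(1/2) V_r^T
    return A, B, C, s

# Hypothetical ground-truth system: a rotation with period 8, two outputs, one input.
theta = 2 * np.pi / 8
A_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
B_true = np.array([[1.0], [0.0]])
C_true = np.array([[1.0, 0.5], [0.0, 1.0]])

h = markov_parameters(A_true, B_true, C_true, 8)
A_hat, B_hat, C_hat, sigma = ho_kalman(h, rows=4, cols=4, r=2)
h_hat = markov_parameters(A_hat, B_hat, C_hat, 8)
```

The identified matrices generally differ from the true ones by a change of state coordinates, so the meaningful comparison is between the impulse responses `h` and `h_hat`, not between `A_true` and `A_hat` entry by entry. The gap between `sigma[1]` and `sigma[2]` plays the role of the condition $\sigma_r > \sigma_{r+1}$ for r = 2.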
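Next, because gait is periodic, consider frame k and a starting frame s: the response to a signal applied at k − s is the same as the response to one applied at k − s − (m ∗ To).
Based on this, the following assumption can be made: there is no input at frame k. Under this assumption, the problem reduces to finding an initial state xo for equation (10).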
The above equation is equivalent to $Y = O x_o$.
Since a minimal realization is observable, the observability matrix $O$ has full column rank by construction, so $O^T O$ is invertible and the least-squares solution to the above equation is $x_o = O^{\dagger} Y = (O^T O)^{-1} O^T Y$.
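The initial-state recovery can be illustrated with a short NumPy sketch. The system matrices, the number of stacked frames, and the "true" initial state are hypothetical values picked for the example; the outputs are generated without input, matching the assumption made above.

```python
import numpy as np

def observability_matrix(A, C, n):
    """Stack C, CA, ..., CA^(n-1) into one tall matrix."""
    blocks, M = [], C.copy()
    for _ in range(n):
        blocks.append(M)
        M = M @ A
    return np.vstack(blocks)

theta = 2 * np.pi / 8
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
C = np.array([[1.0, 0.5]])

x_true = np.array([0.3, -1.2])            # hypothetical initial state
O = observability_matrix(A, C, 6)
Y = O @ x_true                            # stacked outputs of the unforced system

# Least-squares estimate of the initial state; both routes should agree
# because O has full column rank.
x_o, *_ = np.linalg.lstsq(O, Y, rcond=None)
x_pinv = np.linalg.pinv(O) @ Y
rank_O = np.linalg.matrix_rank(O)
```

With noisy measurements `Y` would no longer lie exactly in the column space of `O`, and the same call would return the state that minimizes the squared residual instead of an exact match.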