泡泡一分钟:Geometric and Physical Constraints for Drone-Based Head Plane Crowd Density Estimation
张宁
https://arxiv.org/abs/1803.08805
Weizhe Liu, Krzysztof Lis, Mathieu Salzmann, Pascal Fua
Abstract—State-of-the-art methods for counting people in crowded scenes rely on deep networks to estimate crowd density in the image plane. While useful for this purpose, this image-plane density has no immediate physical meaning because it is subject to perspective distortion. This is a concern in sequences acquired by drones because the viewpoint changes often. This distortion is usually handled implicitly by either learning scale-invariant features or estimating density in patches of different sizes, neither of which accounts for the fact that scale changes must be consistent over the whole scene.
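To see why a density defined in the image plane is not physically comparable across the frame, consider the pinhole relation pixels-per-meter = f / Z. The sketch below uses an illustrative focal length and a few assumed depths (not values from the paper) to show how much the apparent size of a person changes within a single tilted drone view.

```python
# Minimal sketch (assumed values, not from the paper): apparent size of a
# 1 m span on the head plane under a pinhole camera, pixels_per_meter = f / Z.
f_px = 1000.0                         # illustrative focal length in pixels
for depth_m in (20.0, 50.0, 100.0):   # nearby vs. distant people in one frame
    print(f"{depth_m:5.0f} m -> {f_px / depth_m:4.0f} px per meter")
# 20 m -> 50 px/m, 100 m -> 10 px/m: the same head covers ~25x more image
# area up close, so "people per pixel" depends on where the person stands.
```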
In this paper, we explicitly model the scale changes and reason in terms of people per square meter. We show that feeding the perspective model to the network allows us to enforce global scale consistency and that this model can be obtained on the fly from the drone sensors. In addition, it enables us to enforce physically-inspired temporal consistency constraints that do not have to be learned. This yields an algorithm that outperforms state-of-the-art methods in inferring crowd density from a moving drone camera, especially when perspective effects are strong.
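As a concrete illustration of reasoning in people per square meter, the sketch below converts an image-plane density value into a head-plane density by inverting the plane-to-image homography built from the camera intrinsics and pose, which a drone can supply from its onboard sensors. The function name, the finite-difference Jacobian, and the choice of a world frame with the head plane at z = 0 are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def head_plane_area_per_pixel(K, R, t, uv, eps=0.5):
    """Head-plane area (m^2) seen by one pixel, under a pinhole camera model.

    K    : 3x3 intrinsics matrix (pixels).
    R, t : world-to-camera rotation and translation, with the head plane
           taken as the world plane z = 0 (pose from the drone's sensors).
    uv   : (u, v) pixel coordinates at which to evaluate the scale.

    The plane-to-image homography is H = K [r1 r2 t]; its inverse maps
    pixels to metric plane coordinates, and a central-difference Jacobian
    gives the number of square meters covered by one pixel at (u, v).
    """
    H = K @ np.column_stack((R[:, 0], R[:, 1], t))   # plane -> image
    H_inv = np.linalg.inv(H)                         # image -> plane

    def to_plane(u, v):
        q = H_inv @ np.array([u, v, 1.0])
        return q[:2] / q[2]

    u, v = uv
    du = (to_plane(u + eps, v) - to_plane(u - eps, v)) / (2 * eps)
    dv = (to_plane(u, v + eps) - to_plane(u, v - eps)) / (2 * eps)
    return abs(du[0] * dv[1] - du[1] * dv[0])        # |det J|, m^2 per pixel


# Converting a network's image-plane density (people per pixel) into a
# physically meaningful density (people per square meter) at pixel (u, v):
#   density_m2 = density_px / head_plane_area_per_pixel(K, R, t, (u, v))
```

Because this per-pixel area map depends only on the camera geometry, it can be recomputed every frame from the drone's pose telemetry, which is what makes an "on the fly" perspective model and globally consistent scale possible.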