Baseline methods for multi-target tracking

Reference:

MOTChallenge 2015: Towards a Benchmark for Multi-Target Tracking
Laura Leal-Taixé, Anton Milan, Ian Reid, Stefan Roth, and Konrad Schindler

 

 

1、DP_NMS: Network flow tracking

  Tracking is formulated as a network flow problem on a graph in which each node represents a detection and each edge represents a transition between two detections. Special source and sink nodes allow spawning and absorbing trajectories.
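
The graph structure can be illustrated with a minimal sketch that extracts a single cheapest source-to-sink path by dynamic programming. This corresponds to pushing one unit of flow through the network, not to the full multi-trajectory solver. The function name `best_track` and all cost values (`det_cost`, `entry_cost`, `exit_cost`, `max_jump`) are assumptions made for illustration and do not come from the paper.

```python
import math

def best_track(frames, det_cost=-1.0, entry_cost=0.5, exit_cost=0.5, max_jump=2.0):
    """Cheapest source -> detections -> sink path in the detection graph.

    frames: one list of (x, y) detections per frame.
    Returns (total_cost, path) where path holds (frame, detection) indices.
    """
    best = {}  # node -> (cost of cheapest path from the source, predecessor)
    for t, dets in enumerate(frames):
        for i, (x, y) in enumerate(dets):
            # Edge from the source: a trajectory is spawned at this detection.
            cost, prev = entry_cost + det_cost, None
            if t > 0:
                # Transition edges from detections in the previous frame.
                for j, (px, py) in enumerate(frames[t - 1]):
                    dist = math.hypot(x - px, y - py)
                    if dist <= max_jump:
                        cand = best[(t - 1, j)][0] + dist + det_cost
                        if cand < cost:
                            cost, prev = cand, (t - 1, j)
            best[(t, i)] = (cost, prev)
    if not best:
        return 0.0, []
    # Edge to the sink: the trajectory is absorbed after its last detection.
    end = min(best, key=lambda node: best[node][0])
    total = best[end][0] + exit_cost
    path, node = [], end
    while node is not None:
        path.append(node)
        node = best[node][1]
    path.reverse()
    return total, path

frames = [[(0, 0), (5, 5)], [(0.1, 0), (5, 5.5)], [(0.2, 0)]]
cost, path = best_track(frames)
print(cost, path)
```

A negative detection cost makes it worthwhile to include confident detections, while the entry and exit costs discourage very short trajectories.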

2、CEM: Continuous energy minimization

  The target state X is represented by continuous (x, y) coordinates in all frames. The energy E(X) is made up of several components, including a data term to keep the solution close to the observed data (detections), a dynamic model to smooth the trajectories, an exclusion term to avoid collisions, a persistence term to reduce track fragmentations, and a regularizer. The resulting energy is highly non-convex and is minimized in an alternating fashion using conjugate gradient descent and deterministic jump moves.
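
To make the idea of an energy over continuous coordinates concrete, here is a heavily simplified sketch for one target and one coordinate axis. It keeps only a data term and a second-difference dynamic term, and it uses plain gradient descent in place of conjugate gradient with jump moves. The exclusion, persistence and regularizer terms are omitted, and the weight `lam`, the step size and the function names are assumptions.

```python
def energy(xs, ds, lam=1.0):
    """Data term (stay near detections) plus dynamic term (small acceleration)."""
    data = sum((x - d) ** 2 for x, d in zip(xs, ds))
    smooth = sum((xs[t - 1] - 2 * xs[t] + xs[t + 1]) ** 2
                 for t in range(1, len(xs) - 1))
    return data + lam * smooth

def gradient(xs, ds, lam=1.0):
    """Analytic gradient of energy() with respect to every coordinate."""
    grad = [2 * (x - d) for x, d in zip(xs, ds)]
    for t in range(1, len(xs) - 1):
        accel = xs[t - 1] - 2 * xs[t] + xs[t + 1]
        grad[t - 1] += 2 * lam * accel
        grad[t] -= 4 * lam * accel
        grad[t + 1] += 2 * lam * accel
    return grad

def minimize(ds, lam=1.0, step=0.02, iters=200):
    """Plain gradient descent, started from the detections themselves."""
    xs = list(ds)
    for _ in range(iters):
        grad = gradient(xs, ds, lam)
        xs = [x - step * g for x, g in zip(xs, grad)]
    return xs

detections = [0.0, 1.2, 1.8, 3.3, 3.9, 5.1]  # noisy positions along one axis
smoothed = minimize(detections)
print(energy(detections, detections), energy(smoothed, detections))
```

This toy energy is convex, so descent alone suffices. The full CEM energy is not, which is why the original method alternates continuous minimization with discrete jump moves.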

3、SMOT: Similar moving objects

The Similar Multi-Object Tracking (SMOT) approach [15] specifically targets situations where target appearance is ambiguous and rather concentrates on using the motion as a primary cue for data association. Tracklets with similar motion are linked to longer trajectories using the generalized linear assignment (GLA) formulation. The motion similarity and the underlying dynamics of a tracklet are modeled as the order of a linear regressor approximating that tracklet.
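
One way to estimate the order of a linear regressor is the numerical rank of a Hankel matrix built from the tracklet coordinates. The sketch below assumes that estimator and works on one coordinate axis only. The helper names (`hankel`, `matrix_rank`, `motion_order`) and the tolerance are illustrative choices, not taken from the text above.

```python
def hankel(seq, rows):
    """Hankel matrix whose entry (r, c) is seq[r + c]."""
    cols = len(seq) - rows + 1
    return [[seq[r + c] for c in range(cols)] for r in range(rows)]

def matrix_rank(matrix, tol=1e-8):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    m = [list(map(float, row)) for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) <= tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank

def motion_order(seq):
    """Estimated order of the linear dynamics that generate seq."""
    return matrix_rank(hankel(seq, len(seq) // 2))

a = [0, 1, 2, 3]       # constant velocity
b = [4, 5, 6, 7]       # continues the motion of a
c = [10, 8, 6, 4]      # different motion
print(motion_order(a + b), motion_order(a + c))
```

If joining two tracklets does not raise the estimated order, their motion is compatible and they are candidates for linking. A jump in order, as for `a + c`, signals different dynamics.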

4、TBD: Tracking-by-detection

This two-stage tracking-by-detection (TBD) approach [21], [56] is part of a larger traffic scene understanding framework and employs a rather simple data association technique. The first stage links overlapping detections with similar appearance in successive frames into tracklets. The second stage aims to bridge occlusions of up to 20 frames. Both stages employ the Hungarian algorithm to optimally solve the matching problem. Note that we did not re-train this baseline but rather used the original implementation and parameters provided.
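
The first stage can be sketched as an optimal one-to-one matching of bounding boxes between two successive frames, scored by overlap. To stay dependency-free, this sketch finds the optimum by brute force over permutations instead of running the Hungarian algorithm, which yields the same optimal assignment far more efficiently. The appearance cue is left out, and the `min_iou` threshold is an assumed value.

```python
from itertools import permutations

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def match(prev, curr, min_iou=0.3):
    """Return sorted (prev_index, curr_index) pairs maximizing total IoU."""
    swap = len(prev) > len(curr)
    small, large = (curr, prev) if swap else (prev, curr)
    best_score, best_perm = -1.0, ()
    # Brute-force search over all one-to-one assignments (small inputs only).
    for perm in permutations(range(len(large)), len(small)):
        score = sum(iou(small[i], large[j]) for i, j in enumerate(perm))
        if score > best_score:
            best_score, best_perm = score, perm
    pairs = []
    for i, j in enumerate(best_perm):
        if iou(small[i], large[j]) >= min_iou:  # drop non-overlapping pairs
            pairs.append((j, i) if swap else (i, j))
    return sorted(pairs)

prev_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
curr_boxes = [(21, 20, 31, 30), (1, 0, 11, 10), (50, 50, 60, 60)]
print(match(prev_boxes, curr_boxes))
```

Detections left unmatched, such as the third box in `curr_boxes`, would start new tracklets.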

5、SFM: Social forces for tracking

Most tracking systems work with the assumption that the motion model for each target is independent, but in reality, a pedestrian follows a series of social rules, i.e. is subject to social forces according to other moving targets around him/her. These have been defined in what is called the social force model (SFM) [23], [26] and have recently been applied to multiple people tracking. For the 3D benchmark we include two baselines that include a few hand-designed force terms, such as collision avoidance or group attraction. The first method (KALMANSFM) [40] includes those in an online predictive Kalman filter approach while the second (LPSFM) [30] includes the social forces in a Linear Programming framework as described in Sec. 4.2. For the 2D benchmark, we include a recent algorithm (MOTICON) [29], which learns an image-based motion context that encodes the pedestrian’s reaction to the environment, i.e., other moving objects. The motion context, created from low-level image features, leads to a much richer representation of the physical interactions between targets compared to hand-specified social force models. This allows for a more accurate prediction of the future position of each pedestrian in image space, information that is then included in a Linear Programming framework for multi-target tracking.
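
A hand-designed collision-avoidance term can be sketched as a repulsive force added to a constant-velocity prediction. The exponential force shape and the `strength` and `radius` parameters below are assumptions chosen for illustration. They are not the exact terms used by KALMANSFM or LPSFM.

```python
import math

def predict(pos, vel, others, dt=1.0, strength=1.0, radius=1.0):
    """Predict the next position of one pedestrian.

    pos, vel: (x, y) position and velocity of the pedestrian.
    others:   (x, y) positions of nearby pedestrians.
    """
    fx = fy = 0.0
    for ox, oy in others:
        dx, dy = pos[0] - ox, pos[1] - oy
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            continue  # direction is undefined for coincident positions
        # Repulsion pushes away from the neighbour and decays with distance.
        magnitude = strength * math.exp(-dist / radius)
        fx += magnitude * dx / dist
        fy += magnitude * dy / dist
    vx, vy = vel[0] + fx * dt, vel[1] + fy * dt
    return (pos[0] + vx * dt, pos[1] + vy * dt)

print(predict((0.0, 0.0), (1.0, 0.0), []))            # independent motion
print(predict((0.0, 0.0), (1.0, 0.0), [(0.0, 1.0)]))  # pushed away from neighbour
```

With no neighbours the prediction reduces to the independent constant-velocity model. A nearby pedestrian bends the predicted path away from the potential collision.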

6、TC ODAL: Tracklet confidence

Robust Online Multi-Object Tracking based on Tracklet Confidence and Online Discriminative Appearance Learning, or TC ODAL [8], is the only online method among the baselines. It proceeds in two stages. First, close detections are linked to form a set of short, reliable tracklets. This so-called local association allows one to progressively aggregate confident tracklets. In case of occlusions or missed detections, the tracklet confidence value is decreased and a global association is employed to bridge longer occlusion gaps. Both association techniques are formulated as bipartite matching and tackled with the Hungarian algorithm. Another prominent component of TC ODAL is online appearance learning. To that end, positive samples are collected from tracklets with high confidence and incremental linear discriminant analysis (ILDA) is employed to update the appearance model in an online fashion.
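
The interplay between tracklet confidence and the two association stages can be sketched as follows. The update rule, the `decay` factor and the `threshold` are hypothetical placeholders. The confidence measure defined in [8] is different and is not reproduced here.

```python
def update_confidence(conf, matched, affinity=1.0, decay=0.8):
    """Hypothetical confidence update for one tracklet in one frame.

    A matched tracklet moves towards the affinity of its match, while a
    missed detection (for example during an occlusion) lowers the confidence.
    """
    if matched:
        return min(1.0, 0.5 * conf + 0.5 * affinity)
    return conf * decay

def association_stage(conf, threshold=0.5):
    """Confident tracklets use local association, the rest global association."""
    return "local" if conf >= threshold else "global"

conf = 1.0
for frame in range(4):  # four consecutive frames without a matching detection
    conf = update_confidence(conf, matched=False)
    print(frame, round(conf, 4), association_stage(conf))
```

In this toy run the tracklet stays in local association for three missed frames and is handed to global association on the fourth.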

Posted on 2015-11-02 20:04 by 一动不动的葱头