
Tracking without bells and whistles: close reading of the English paper

no training or optimization on tracking data.

using only an object detection method to perform tracking.

takeaway

In Tracktor, tracklet regression is more essential than detection; detections are only adopted according to the regressed tracklets.

  • frame-by-frame
  • detection-based
  • tracklet regression

inspiration

  • detection head can tackle simple motion scenarios.
  • utilize the continuity of tracklets

Pipeline

Training

NO TRAINING!

We show that one can achieve state-of-the-art tracking results by training a neural network only on the task of detection.

Inference

https://www.arvindrs.com/tracking-without-bells-and-whistles/

Faster RCNN

temporal realignment

  • previous bounding box coordinates \(\mathbf{b}_{t-1}^{k}\), where k is the track identity.
  • use this box as the RoI on the current frame and perform RoI pooling to get the pooled features \(\mathbf{f}_{t}\)
    • at this point, the classification confidence has not been calculated yet.
    • the track might still be killed if the box no longer contains the object of track k
  • then perform classification and regression on the pooled \(\mathbf{f}_{t}\)

Except for the first frame, bounding box regression with the detection head is performed first,
i.e. \(\mathbf{b}_{t} = \text{ROIPooling\&Regress}(\mathbf{b}_{t-1})\).
Then detection is run on the current frame to get a set of detections \(\mathbf{d}_{t}^{i}\).
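
The order of operations above can be sketched as follows. This is only a minimal sketch: `regress` and `detect` are hypothetical callables standing in for the Faster RCNN regression/classification head and the full detector, and are not names from the paper or its code.

```python
from typing import Callable, Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def tracktor_step(
    frame: object,
    prev_boxes: Dict[int, Box],
    regress: Callable[[object, Box], Tuple[Box, float]],
    detect: Callable[[object], List[Box]],
) -> Tuple[Dict[int, Tuple[Box, float]], List[Box]]:
    """One inference step: regress the old boxes first, then detect."""
    regressed = {}
    for track_id, box in prev_boxes.items():
        # b_{t-1}^k serves as the RoI on the current frame; the (assumed)
        # head returns the regressed box b_t^k and its classification score.
        regressed[track_id] = regress(frame, box)
    # Detections d_t^i are computed afterwards, on the same frame.
    detections = detect(frame)
    return regressed, detections
```

The track identity is carried over simply by reusing the dictionary key, which is why no explicit data association is needed at this stage.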

When to kill a trajectory

According to the results of bounding box regression

  • classification score < \(\sigma_{active} = 0.5\)
    • the regressed box is no longer confident enough to maintain the object's track.
  • NMS: discard boxes with IoU > \(\lambda_{active} = 0.3\)
    • if regressed tracklets occlude each other, keep the more confident ones (w.r.t. the score from the detection head, i.e. more discriminative features)
    • and discard the regressed tracklets with lower confidence scores.
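
A minimal sketch of this kill policy, assuming boxes are `(x1, y1, x2, y2)` tuples and using a plain greedy NMS. The function names are made up for illustration, and the default thresholds are just the values from the notes above.

```python
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_active(
    regressed: Dict[int, Tuple[Box, float]],
    sigma_active: float = 0.5,
    lambda_active: float = 0.3,
) -> Dict[int, Box]:
    """Return the tracks that survive the score threshold and the NMS."""
    # 1) kill tracks whose classification score fell below sigma_active
    confident = {k: v for k, v in regressed.items() if v[1] >= sigma_active}
    # 2) greedy NMS: visit tracks by descending score and drop any box that
    #    overlaps an already kept box by more than lambda_active
    kept: Dict[int, Box] = {}
    for track_id, (box, _score) in sorted(
        confident.items(), key=lambda kv: kv[1][1], reverse=True
    ):
        if all(iou(box, other) <= lambda_active for other in kept.values()):
            kept[track_id] = box
    return kept
```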

Initialize bounding boxes

A new track is initialized when a detection \(\mathbf{d}_{t}^{i}\) has IoU < \(\lambda_{new}\) with all active tracklets \(\mathbf{b}_{t}^{1},\mathbf{b}_{t}^{2},\dots\)

i.e. the detection is far enough from every regressed tracklet.
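
The initialization rule can be sketched as a simple filter. The function name and the parameter `lambda_new` are illustrative; the notes give no value for the threshold, so it is a required argument here.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def new_track_candidates(
    detections: List[Box], active_boxes: List[Box], lambda_new: float
) -> List[Box]:
    """Keep only detections that overlap no active tracklet by lambda_new or more."""
    return [
        det
        for det in detections
        if all(iou(det, box) < lambda_new for box in active_boxes)
    ]
```

With no active tracklets (e.g. on the first frame) every detection passes the filter, so all of them start new tracks.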

Tracking Extensions

Both are aimed at improving identity preservation

  1. Motion Model
  • For sequences with a moving camera, we apply a straightforward camera motion compensation (CMC) by aligning frames via image registration using the Enhanced Correlation Coefficient (ECC) maximization as introduced in [16].
  • For sequences with comparatively low frame rates, we apply a constant velocity assumption (CVA) for all objects as in [11, 2].
  2. ReID

for deactivated trajectories.

The motion model continues to be applied to them, even though they are no longer active.
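
As a rough illustration of the constant velocity assumption, the sketch below shifts a box by the displacement of its center over the last two frames. How the velocity is actually estimated in the paper may differ; the function name and the two-frame estimate are assumptions.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def constant_velocity_predict(prev_box: Box, curr_box: Box) -> Box:
    """Predict the next box by repeating the last center displacement."""
    # velocity = difference between the two box centers
    vx = ((curr_box[0] + curr_box[2]) - (prev_box[0] + prev_box[2])) / 2.0
    vy = ((curr_box[1] + curr_box[3]) - (prev_box[1] + prev_box[3])) / 2.0
    # keep the current size and move the box by one step of that velocity
    return (curr_box[0] + vx, curr_box[1] + vy, curr_box[2] + vx, curr_box[3] + vy)
```

The shifted box would then be used as the RoI for regression instead of the raw previous box, which matters most when the frame rate is low and objects move far between frames.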

  • Siamese Network -> ReID features
    • store killed (deactivated) tracks in their non-regressed version \(\mathbf{b}_{t-1}^{k}\) for a fixed number of \(F_{\text{ReID}}\) frames.
    • To minimize the risk of false reIDs, we only consider pairs of deactivated and new bounding boxes with a sufficiently large IoU.
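
A hedged sketch of the matching step: a new box is compared only with deactivated tracks that pass the IoU gate, and the closest one in feature space is re-identified. The Euclidean distance, the thresholds `min_iou` and `max_dist`, and all names are assumptions for illustration; the expiry after \(F_{\text{ReID}}\) frames is not modelled.

```python
import math
from typing import Dict, Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def reid_match(
    new_box: Box,
    new_feat: Sequence[float],
    inactive: Dict[int, Tuple[Box, Sequence[float]]],
    min_iou: float,
    max_dist: float,
) -> Optional[int]:
    """Return the id of the best matching deactivated track, or None."""
    best_id, best_dist = None, max_dist
    for track_id, (box, feat) in inactive.items():
        # IoU gate: skip pairs that are spatially too far apart
        if iou(new_box, box) < min_iou:
            continue
        # appearance distance between the (assumed) Siamese embeddings
        dist = math.dist(new_feat, feat)
        if dist < best_dist:
            best_id, best_dist = track_id, dist
    return best_id
```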

Limitations

Without sophisticated tracking methods, it is not expected to excel in crowded and occluded, but rather only in benevolent, tracking scenarios.

  • only handles slight movement between frames
    • IDEA: this could be solved by expanding the RoI so that it still covers the target when regression is performed
  • severe ID switches at low frame rates

Experiments

How to use public detections

Using public detections, Tracktor++ can achieve SOTA

Tracktor++ has a camera motion model, which is better (and more complicated) than a Kalman filter.

Analysis

  • Tracktor demonstrates the strength of tracking-by-detection for easy scenarios
  • Complicated methods should be encouraged to focus on the complex tracking problems

occlusions/visibility

Tracktor++ achieves superior performance even for partially occluded bounding boxes with visibilities as low as 0.3.

  • This contributes to its high MOTA compared with other methods.
  • the extended version (Tracktor++) only achieves minor improvements over vanilla Tracktor.
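
For reference, MOTA in its standard CLEAR MOT form (written from the general definition, not copied from the paper) combines missed targets, false positives and identity switches:

\[
\text{MOTA} = 1 - \frac{\sum_{t}\left(\text{FN}_{t} + \text{FP}_{t} + \text{IDSW}_{t}\right)}{\sum_{t}\text{GT}_{t}}
\]

Here \(\text{GT}_{t}\) is the number of ground truth boxes in frame \(t\). Keeping partially occluded objects tracked lowers \(\text{FN}_{t}\), which is how robustness at low visibility feeds into MOTA.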

object size

only compare objects with a visibility larger than 0.9

none of the trackers exhibit a notably better performance with respect to varying object sizes.

long-term tracking/(gap length)

TODO (2022-03-16): not sure about the meaning of gap length.

ID preservation

oracle trackers

replacing parts of our algorithm with ground truth information

To this end, we analyse our performance twofold:

(i) the impact of the object detector on the killing policy and bounding box regression,

(ii) identify performance upper bounds for potential extensions to our Tracktor.

  • Oracle-Kill

  • Oracle-REG
    • match ground truth at frame \(t-1\)
    • inherit ground truth at frame \(t\)

  • Oracle-MM
    • like REG, but only inherit the center

  • Oracle-reID
    • match inactive tracks and new detections
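
The difference between Oracle-REG and Oracle-MM can be sketched as below. The IoU matching threshold of 0.5, the dictionary layout for the ground truth and all function names are assumptions made for the example.

```python
from typing import Dict, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def oracle_reg(
    box_prev: Box, gt_prev: Dict[int, Box], gt_curr: Dict[int, Box], min_iou: float = 0.5
) -> Optional[Box]:
    """Match the track to ground truth at t-1, then inherit the ground truth box at t."""
    if not gt_prev:
        return None
    best = max(gt_prev, key=lambda g: iou(box_prev, gt_prev[g]))
    if iou(box_prev, gt_prev[best]) < min_iou or best not in gt_curr:
        return None
    return gt_curr[best]


def oracle_mm(
    box_prev: Box, gt_prev: Dict[int, Box], gt_curr: Dict[int, Box], min_iou: float = 0.5
) -> Optional[Box]:
    """Like oracle_reg, but inherit only the center and keep the old box size."""
    target = oracle_reg(box_prev, gt_prev, gt_curr, min_iou)
    if target is None:
        return None
    w, h = box_prev[2] - box_prev[0], box_prev[3] - box_prev[1]
    cx, cy = (target[0] + target[2]) / 2.0, (target[1] + target[3]) / 2.0
    return (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)
```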

posted @ 2022-03-17 11:21  ZXYFrank