Tracking without bells and whistles: a close reading of the English paper

no training or optimization on tracking data.

using only an object detection method to perform tracking.

takeaway

In Tracktor, tracklet regression is more essential than detection: new detections are only adopted according to the regressed tracklets.

frame-by-frame
detection-based
tracklet regression

inspiration

detection head can tackle simple motion scenarios.
utilize the continuity of tracklets

Pipeline

Training

NO TRAINING on tracking data!

We show that one can achieve state-of-the-art tracking results by training a neural network only on the task of detection.

Inference

https://www.arvindrs.com/tracking-without-bells-and-whistles/

Faster RCNN

temporal realignment

prev. bounding box coordinates \(\mathbf{b}_{t-1}^{k}\), where k indexes the trajectory (track identity).
use them on the current frame to perform RoI pooling and get pooled features \(\mathbf{f}_{t}\)
- at this point, the classification confidence has not been calculated yet.
- the track might still be killed later, if the classifier no longer finds the object in the box
then perform classification and regression on the pooled \(\mathbf{f}_{t}\)

Except for the first frame, bounding box regression with the detection head is performed first,
i.e. get \(\mathbf{b}_{t}^{k} = \text{ROIPooling\&Regress}(\mathbf{b}_{t-1}^{k})\)
Then detection is run on the current frame to get a set of detections \(\mathbf{d}_{t}^{i}\)
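The regression step above can be sketched as decoding the offsets predicted by the head relative to the previous box. This is a minimal sketch, assuming the usual R-CNN offset parameterization (implementations often add per-coordinate scaling weights, omitted here). The function name `apply_box_deltas` and the hard-coded `deltas` are hypothetical stand-ins for the network output.

```python
import math

def apply_box_deltas(box, deltas):
    """Decode R-CNN style offsets (dx, dy, dw, dh) relative to a reference box.

    box is (x1, y1, x2, y2). In Tracktor the reference box would be the
    previous-frame box b_{t-1}; here the deltas are hard-coded (assumption).
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    dx, dy, dw, dh = deltas
    # Center shifts are expressed in units of the box size.
    new_cx = cx + dx * w
    new_cy = cy + dy * h
    # Size changes are expressed in log space.
    new_w = w * math.exp(dw)
    new_h = h * math.exp(dh)
    return (new_cx - 0.5 * new_w, new_cy - 0.5 * new_h,
            new_cx + 0.5 * new_w, new_cy + 0.5 * new_h)

prev_box = (100.0, 50.0, 140.0, 150.0)   # b_{t-1}
deltas = (0.1, 0.0, 0.0, 0.0)            # pretend head output: shift right by 10% of width
new_box = apply_box_deltas(prev_box, deltas)  # b_t
```

The box keeps its size and moves 4 pixels to the right, which is the "simple motion" case the detection head is expected to handle.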

When to kill a trajectory

According to the results of bounding box regression

classification score < \(\sigma_{active} = 0.5\)
- the regressed box is no longer confident enough to keep representing the object.
NMS: abandon IoU > \(\lambda_{active} = 0.6\)
- if regressed tracklets overlap (occlude each other), keep the more confident ones (scored by the detection head on the regressed boxes; their features are more discriminative)
- and abandon the overlapping tracklets with lower confidence scores.
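The two kill rules above can be sketched as a score filter followed by greedy NMS among the regressed tracks. This is a minimal sketch in plain Python, not the authors' code. The names `iou` and `surviving_tracks` are hypothetical, and the threshold values in the usage are illustrative placeholders.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def surviving_tracks(tracks, sigma_active, lambda_active):
    """tracks maps track id -> (regressed box, classification score).

    Returns the ids that stay active, most confident first.
    """
    # Rule 1: kill tracks whose classification score fell below sigma_active.
    alive = {k: v for k, v in tracks.items() if v[1] >= sigma_active}
    # Rule 2: greedy NMS, visiting tracks from most to least confident.
    kept = []
    for tid in sorted(alive, key=lambda k: alive[k][1], reverse=True):
        box = alive[tid][0]
        if all(iou(box, alive[other][0]) <= lambda_active for other in kept):
            kept.append(tid)
    return kept

tracks = {
    1: ((0.0, 0.0, 10.0, 10.0), 0.9),
    2: ((1.0, 0.0, 11.0, 10.0), 0.7),        # overlaps track 1 heavily (IoU ~0.82)
    3: ((50.0, 50.0, 60.0, 60.0), 0.4),      # low score
    4: ((100.0, 100.0, 110.0, 110.0), 0.8),
}
kept = surviving_tracks(tracks, sigma_active=0.5, lambda_active=0.5)
```

Here track 3 is killed by the score rule and track 2 is suppressed by the more confident track 1, so tracks 1 and 4 survive.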

Initialize bounding boxes

When a detection \(\mathbf{d}_{t}^{i}\) has IoU < \(\lambda_{new}\) with all active tracklets \(\mathbf{b}_{t}^{1},\mathbf{b}_{t}^{2},\dots\), it starts a new trajectory,

i.e. it is far enough from every regressed tracklet.
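The initialization rule above can be sketched as a filter over the detections of the current frame. This is a minimal sketch under the same box convention (x1, y1, x2, y2). The name `new_track_candidates` is hypothetical and the threshold in the usage is only illustrative.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def new_track_candidates(detections, active_boxes, iou_threshold):
    """Keep only detections not already explained by an active (regressed) track."""
    return [d for d in detections
            if all(iou(d, b) < iou_threshold for b in active_boxes)]

active_boxes = [(0.0, 0.0, 10.0, 10.0)]
detections = [
    (1.0, 1.0, 11.0, 11.0),     # covers the active track (IoU ~0.68): not a new object
    (30.0, 30.0, 40.0, 40.0),   # far from every track: starts a new trajectory
]
new_boxes = new_track_candidates(detections, active_boxes, iou_threshold=0.3)
```

With no active tracks (the first frame), every detection passes the filter and initializes a trajectory.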

Tracking Extensions

Both are aimed at improving identity preservation

Motion Model

For sequences with a moving camera, we apply a straightforward camera motion compensation (CMC) by aligning frames via image registration using the Enhanced Correlation Coefficient (ECC) maximization as introduced in [16].
For sequences with comparatively low frame rates, we apply a constant velocity assumption (CVA) for all objects as in [11, 2].
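A minimal sketch of the constant velocity assumption: before regression, shift the box by the center displacement observed between the last two frames. The notes do not give the exact formulation used in the paper, so this is one plain reading of CVA, and the function name `predict_constant_velocity` is hypothetical.

```python
def predict_constant_velocity(prev_box, curr_box):
    """Extrapolate the next box by repeating the last center displacement.

    Boxes are (x1, y1, x2, y2). The size of curr_box is kept unchanged.
    """
    prev_cx = (prev_box[0] + prev_box[2]) / 2.0
    prev_cy = (prev_box[1] + prev_box[3]) / 2.0
    curr_cx = (curr_box[0] + curr_box[2]) / 2.0
    curr_cy = (curr_box[1] + curr_box[3]) / 2.0
    # Velocity in pixels per frame, assumed constant over the next frame.
    vx, vy = curr_cx - prev_cx, curr_cy - prev_cy
    return (curr_box[0] + vx, curr_box[1] + vy,
            curr_box[2] + vx, curr_box[3] + vy)

box_t_minus_2 = (0.0, 0.0, 10.0, 20.0)
box_t_minus_1 = (4.0, 2.0, 14.0, 22.0)
predicted = predict_constant_velocity(box_t_minus_2, box_t_minus_1)
```

The predicted box would then replace \(\mathbf{b}_{t-1}^{k}\) as the starting point for regression, which matters most when the frame rate is low and the object moves far between frames.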

ReID

for deactivated trajectories.

The motion model keeps being applied to them, even though they are deactivated.

Siamese Network -> ReID features
- store killed (deactivated) tracks in their non-regressed version \(\mathbf{b}_{t-1}^{k}\) for a fixed number of \(F_{\text{ReID}}\) frames.
- To minimize the risk of false reIDs, we only consider pairs of deactivated and new bounding boxes with a sufficiently large IoU.
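The ReID step above can be sketched as matching deactivated tracks to new detections by appearance distance, gated by IoU. This is a minimal sketch with assumptions: the feature vectors stand in for Siamese network embeddings, Euclidean distance and greedy matching are choices made here for illustration, and `reid_matches`, `max_dist` and `min_iou` are hypothetical names.

```python
import math

def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def reid_matches(inactive, new_dets, max_dist, min_iou):
    """inactive maps track id -> (box, feature); new_dets is a list of (box, feature).

    Returns {detection index: re-identified track id}.
    """
    candidates = []
    for tid, (tbox, tfeat) in inactive.items():
        for j, (dbox, dfeat) in enumerate(new_dets):
            # IoU gate: only spatially plausible pairs are compared.
            if iou(tbox, dbox) < min_iou:
                continue
            dist = math.dist(tfeat, dfeat)
            if dist <= max_dist:
                candidates.append((dist, tid, j))
    matches, used_tracks = {}, set()
    # Greedy assignment, closest appearance first.
    for dist, tid, j in sorted(candidates):
        if tid in used_tracks or j in matches:
            continue
        matches[j] = tid
        used_tracks.add(tid)
    return matches

inactive = {7: ((0.0, 0.0, 10.0, 10.0), (1.0, 0.0))}
new_dets = [
    ((2.0, 0.0, 12.0, 10.0), (0.9, 0.1)),         # close in space and appearance
    ((100.0, 100.0, 110.0, 110.0), (1.0, 0.0)),   # same appearance, but far away
]
matches = reid_matches(inactive, new_dets, max_dist=0.5, min_iou=0.2)
```

The second detection has an identical feature but fails the IoU gate, which is the false-reID case the gate is meant to prevent.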

Limitations

Without sophisticated tracking methods, it is not expected to excel in crowded and occluded, but rather only in benevolent, tracking scenarios.

Only handles slight movement between frames
- 『IDEA [This could be solved by expanding the ROI so that it still covers the target when regression is done]』
- Severe ID switches at low frame rates

Experiments

How to use public detections

Using public detections, Tracktor++ achieves state-of-the-art (SOTA) results.

Tracktor++ has a camera motion model, which is better/more complicated than a Kalman filter.

Analysis

Tracktor: demonstrates the strength of tracking-by-detection in easy scenarios
Complicated methods should be encouraged to focus on the complex tracking problems

occlusions/visibility

Tracktor++ achieves superior performance even for partially occluded bounding boxes with visibilities as low as 0.3.

This contributes to its high MOTA compared with other methods.
The extended version (Tracktor++) only achieves minor improvements over vanilla Tracktor.

object size

only compare objects with a visibility larger than 0.9

none of the trackers exhibit a notably better performance with respect to varying object sizes.

long-term tracking/(gap length)

TODO 『2022-03-16 [not sure about the meaning of gap length]?』

ID preservation

oracle trackers

replacing parts of our algorithm with ground truth information

To this end, we analyse our performance twofold:

(i) the impact of the object detector on the killing policy and bounding box regression,

(ii) identify performance upper bounds for potential extensions to our Tracktor.

Oracle-Kill

Oracle-REG
- match ground truth at frame \(t-1\)
- inherit ground truth at frame \(t\)

Oracle-MM
- like REG, but only inherit the center

Oracle-reID
- match inactive tracks and new detections
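The difference between Oracle-REG and Oracle-MM can be sketched as follows: REG inherits the whole ground-truth box, MM only moves the tracked box onto the ground-truth center. This is a minimal sketch. The function names are hypothetical, and keeping the tracked box size in the MM case is an assumption read from "only inherit the center".

```python
def oracle_reg(track_box, gt_box):
    """Oracle-REG: the track inherits the full ground-truth box at frame t."""
    return gt_box

def oracle_mm(track_box, gt_box):
    """Oracle-MM: keep the tracked box size, move its center to the ground-truth center."""
    w = track_box[2] - track_box[0]
    h = track_box[3] - track_box[1]
    gt_cx = (gt_box[0] + gt_box[2]) / 2.0
    gt_cy = (gt_box[1] + gt_box[3]) / 2.0
    return (gt_cx - w / 2.0, gt_cy - h / 2.0, gt_cx + w / 2.0, gt_cy + h / 2.0)

track_box = (0.0, 0.0, 10.0, 20.0)   # box of a track matched to ground truth at t-1
gt_box = (4.0, 4.0, 20.0, 30.0)      # ground truth of the same object at t
reg_box = oracle_reg(track_box, gt_box)
mm_box = oracle_mm(track_box, gt_box)
```

In this reading, Oracle-MM isolates the effect of a perfect motion model, since the box extent still has to come from the tracker itself.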

posted @ 2022-03-17 11:21 ZXYFrank
