
MOT20 Evaluation Metrics Explained + The Ideas Behind the CLEAR MOT Metrics (in English)

  • Tracking of multiple, partially occluded humans based on static body part detection.

CLEAR metrics

The MOT20 evaluation is inspired by the paper "Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics".

The paper sets out two main requirements.

  1. The distance between an object and a hypothesis should not exceed a threshold \(T\)

The distance measure can be, for example:

  • IoU (overlap)
  • Euclidean distance (centroid tracking)

The threshold should be task-dependent and not pre-defined for general cases.
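The two distance measures above can be sketched as follows. This is a minimal illustration, assuming boxes are given as `(x, y, w, h)` tuples; the function names and the default thresholds are hypothetical and should be chosen per task, as the text says.

```python
import math


def iou(a, b):
    """Intersection over Union of two boxes given as (x, y, w, h)."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    iw = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def centroid_distance(a, b):
    """Euclidean distance between the centers of two (x, y, w, h) boxes."""
    acx, acy = a[0] + a[2] / 2, a[1] + a[3] / 2
    bcx, bcy = b[0] + b[2] / 2, b[1] + b[3] / 2
    return math.hypot(acx - bcx, acy - bcy)


def is_valid_pair(a, b, min_iou=0.5):
    """An object-hypothesis pair is valid if the overlap reaches the threshold."""
    return iou(a, b) >= min_iou
```

Note that IoU is a similarity (larger is better), so the threshold test is `>=`, whereas the centroid distance is a true distance and would be tested with `<`.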

  2. Consistent tracking

Construct a list of object-hypothesis mappings

\[M_{t} = \{(o_{i},h_{j})\}^{t} \]

Count a mismatch error only once, at the frame where the object-hypothesis mapping changes, and consider the correspondences in the intermediate segments correct. This matters especially when many objects are being tracked and mismatches are frequent.

This is already common practice in MOT.

Track reinitialization is needed after fragmentation.

If \(h_{1}\) and \(h_{2}\) (the hypotheses with IDs 1 and 2) are both valid choices for \(o_{i}\) (distance \(<T\)),

then the pair already recorded in the mapping from the previous frame (\(M_{t-1}\)), i.e. the hypothesis that was matched to \(o_{i}\) before, is kept, even if the other hypothesis is closer.
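The consistency rule can be sketched as a per-frame update. This is a simplified illustration, not the paper's exact procedure: the names `update_mapping` and `last_known` are hypothetical, and the leftover objects are matched greedily by distance here, whereas the paper uses an optimal (Munkres) assignment for that step.

```python
def update_mapping(last_known, distances, threshold):
    """Build the mapping for the current frame.

    last_known: {object_id: hypothesis_id}, the last known match per object.
    distances:  {(object_id, hypothesis_id): distance} for the current frame.
    Returns (mapping, mismatches) for this frame.
    """
    mapping, used = {}, set()

    # Step 1: keep every earlier pair that is still valid (distance < T),
    # even if another hypothesis is now closer to the object.
    for obj, hyp in last_known.items():
        d = distances.get((obj, hyp))
        if d is not None and d < threshold:
            mapping[obj] = hyp
            used.add(hyp)

    # Step 2 (assumption: greedy instead of Munkres): match what is left,
    # closest pairs first.
    candidates = sorted((d, o, h) for (o, h), d in distances.items() if d < threshold)
    mismatches = 0
    for _, obj, hyp in candidates:
        if obj in mapping or hyp in used:
            continue
        mapping[obj] = hyp
        used.add(hyp)
        # A mismatch is counted once, at the frame where the mapping changes.
        if obj in last_known and last_known[obj] != hyp:
            mismatches += 1
    return mapping, mismatches
```

A caller would carry the state forward with `last_known.update(mapping)` after each frame, so that a mismatch is still detected when the object was unmatched for a few frames in between.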


Target/Annotations of MOT20

We define the target class of CVPR19 as all upright walking people that are reachable along the viewing ray without a physical obstacle, i.e. reflections and people behind a transparent wall or window are excluded.

Distractor classes: distractor, static person, reflection, person on vehicle.

That is, a method is neither penalized nor rewarded for tracking or not tracking those similar classes.

Distractor cancellation

Since a detector is likely to fire in those cases, we do not want to penalize a tracker with a set of false positives for properly following that set of detections, e.g. those of a person on a bicycle.

Likewise, we do not want to penalize with false negatives a tracker that is based on motion cues and therefore does not track a sitting person.
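A rough sketch of how this cancellation could work on one frame, assuming box matching has already been done. The function name, the input layout, and the `"pedestrian"` label in the usage note are hypothetical; the distractor class names are taken from the list above.

```python
# Class names as listed in the text; the exact label strings are an assumption.
DISTRACTOR_CLASSES = {"distractor", "static person", "reflection", "person on vehicle"}


def cancel_distractors(hypothesis_ids, annotation_classes, matches):
    """Remove distractors from both sides before counting errors.

    hypothesis_ids:     hypothesis ids present in the frame.
    annotation_classes: {annotation_id: class_name} for all annotated boxes.
    matches:            {hypothesis_id: annotation_id} from box matching.
    Returns (kept_hypotheses, target_annotations).
    """
    kept = []
    for hyp in hypothesis_ids:
        matched = matches.get(hyp)
        # A hypothesis that follows a distractor is dropped: it is not an FP.
        if matched is not None and annotation_classes[matched] in DISTRACTOR_CLASSES:
            continue
        kept.append(hyp)
    # Distractor annotations are dropped too: missing them is not an FN.
    targets = [a for a, c in annotation_classes.items() if c not in DISTRACTOR_CLASSES]
    return kept, targets
```

For example, with annotations `{"a1": "pedestrian", "a2": "person on vehicle"}` and matches `{"h1": "a1", "h2": "a2"}`, hypothesis `h2` is removed and only `a1` remains as a target, so the tracker is neither rewarded nor penalized for `h2`.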

Metrics in Object Detection

Confusion Matrix

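Each label reads as (is the test result actually correct) | (test result):

  • TP: predicted positive, and actually positive
  • FP: predicted positive, but not actually positive
  • FN: predicted negative, but not actually negative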


Confusion Matrix of Object Detection

Evaluating Object Detection Models: Guide to Performance Metrics

Intersection over Union, also referred to as the Jaccard Index

  • TP: Target Successfully Detected
  • FP: Distractor Wrongly Detected
    • False detection
      • caused by a wrong class
      • caused by an inaccurate regressed box
  • FN: Target Failed to be Detected
    • Missed detection

Tracker-to-target assignment Prerequisites

Hereinafter, hypothesis means a proposed track.

Within a single frame, a track is simply a bounding box.

Bounding Box Matching

+ Prerequisite 1: Divide the hypotheses into `TP`, `FP`, `FN`

The threshold is IoU = 0.5;
see Section 4.1.2 of the paper.

This is the same as object detection in a single frame.

- TP,FP,FN
- FAF(False Alarm, i.e. FP, per Frame)
- FPPI(False Positives Per Image)
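The counts above can be sketched per frame as follows. This is an illustration under stated assumptions: the IoU values are precomputed into `iou_matrix[g][h]`, the function names are hypothetical, and the one-to-one matching is done greedily by descending IoU rather than with an optimal assignment.

```python
def count_frame_errors(iou_matrix, num_gt, num_hyp, min_iou=0.5):
    """Return (TP, FP, FN) for one frame.

    iou_matrix[g][h] is the IoU between ground-truth box g and hypothesis h.
    """
    pairs = sorted(
        ((iou_matrix[g][h], g, h) for g in range(num_gt) for h in range(num_hyp)),
        reverse=True,
    )
    matched_gt, matched_hyp = set(), set()
    for overlap, g, h in pairs:
        if overlap < min_iou:
            break  # all remaining pairs are below the threshold
        if g in matched_gt or h in matched_hyp:
            continue  # each box may be used at most once
        matched_gt.add(g)
        matched_hyp.add(h)
    tp = len(matched_gt)
    return tp, num_hyp - tp, num_gt - tp


def false_alarms_per_frame(frames):
    """FAF: total false positives divided by the number of frames.

    frames is a list of (iou_matrix, num_gt, num_hyp) tuples.
    """
    total_fp = sum(count_frame_errors(m, g, h)[1] for m, g, h in frames)
    return total_fp / len(frames)
```

The box counts are passed explicitly so that a frame with hypotheses but no ground truth (an empty matrix) still contributes its false positives.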

Tracklet Matching

+ Prerequisite 2: a true object should be recovered at most once,
+ and that one hypothesis cannot account for more than one target. 

For the following, we assume that each ground truth trajectory has one unique start and one unique end point, i.e. that it is not fragmented.
In other words, when a target leaves the field of view and then reappears, it is treated as an unseen target with a new ID.

Tracklet matching is not performed greedily on single frames.

A method that finds twice as many trajectories will almost certainly produce more identity switches.

IDSW should therefore not be considered alone when assessing overall performance.

For that reason, IDSW/Recall is introduced.

- IDSW; IDSW/Recall(Normalized)
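A small sketch of the normalization, assuming recall is computed as TP / (TP + FN). The function name and the numbers in the usage note are made up for illustration.

```python
def normalized_idsw(idsw, tp, fn):
    """ID switches divided by recall, where recall = TP / (TP + FN)."""
    recall = tp / (tp + fn)
    if recall == 0:
        raise ValueError("recall is zero; normalized IDSW is undefined")
    return idsw / recall
```

With hypothetical numbers, a tracker with 100 switches at recall 0.8 scores about 125, while one with only 60 switches at recall 0.4 scores about 150. The second tracker has fewer raw switches but ranks worse once its lower recall is taken into account.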

Tracker-to-target assignment in detail

Tracking of Multiple, Partially Occluded Humans based on Static Body Part Detection

In this part, let the red track have ID 1 and the blue track have ID 2; the target is called T.

ID Switch without fragmentation

  • At frame 2, T is matched to red track 1 for the first time, so it should keep ID 1 for as long as it is visible.
  • At frame 5, T is matched to blue track 2, ID switch occurs.

ID Switch with fragmentation

  • At frame 3, fragmentation occurs due to an FP (false detection)
  • At frame 5, T is matched to blue track, changing its ID from 1 to 2. ID switch occurs.

Error due to Propagation

  • At frame 1, the matching is reasonably good.
    • Along this part of the sequence, the track above keeps the same ID 1 until frame 5.
    • The track below keeps the same ID 2 until frame 2.
  • There are 5 FN and 4 FP (blue hypotheses)
    • because the track above passes over the closer blue track
    • and takes the red track instead, leaving the track below with no hypothesis to match.

There is NO fragmentation or ID switch in this part of sequence.

- Note: fragmentations and ID switches are counted
- when THE NEW track appears

Note that no fragmentations are counted in frames 3 and 6 because tracking of those targets is not resumed at a later point (in this part of sequence).
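The counting rules in this section can be sketched for a single target. The input format is an assumption: one entry per frame holding the ID of the hypothesis matched to the target, or `None` where the target is untracked. The function name is hypothetical.

```python
def count_idsw_and_fm(track_ids):
    """Count ID switches and fragmentations for one ground-truth target."""
    idsw = 0
    fm = 0
    last_id = None   # last known hypothesis ID for this target
    in_gap = False   # True while untracked after having been tracked
    for tid in track_ids:
        if tid is None:
            if last_id is not None:
                in_gap = True
            continue
        # ID switch: the new match differs from the last known one.
        if last_id is not None and tid != last_id:
            idsw += 1
        # Fragmentation: counted only when tracking is resumed after a gap.
        if in_gap:
            fm += 1
            in_gap = False
        last_id = tid
    return idsw, fm


# ID switch without fragmentation: T gets ID 1 at frame 2, ID 2 at frame 5.
print(count_idsw_and_fm([None, 1, 1, 1, 2, 2]))        # (1, 0)
# ID switch with fragmentation: tracking is lost, then resumed as ID 2.
print(count_idsw_and_fm([1, 1, None, None, 2, 2]))     # (1, 1)
# Tracking is lost and never resumed: no fragmentation is counted.
print(count_idsw_and_fm([1, 1, None, None, None, None]))  # (0, 0)
```

Both events are registered at the frame where the new track appears, which is why a gap that is never closed contributes nothing.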

Interrupted GT trajectory


Metrics

Prerequisite: Concatenate all sequences

Drawback of MOTA

MOTA is normalized by (and therefore sensitive to) the total number of ground-truth boxes, \(\#\text{GT}\).
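For reference, the usual definition of MOTA, restated here from the CLEAR MOT formulation rather than from the notes above, sums the three error types over all frames \(t\) and divides by the total number of ground-truth boxes:

\[\text{MOTA} = 1 - \frac{\sum_{t}\left(\text{FN}_{t} + \text{FP}_{t} + \text{IDSW}_{t}\right)}{\sum_{t}\text{GT}_{t}} \]

Because the denominator counts only ground-truth boxes, the score has no lower bound: it becomes negative whenever the number of errors exceeds \(\#\text{GT}\), and the same absolute number of errors weighs more on a sequence with few annotated targets.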

metrics

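  • MOTA: overall tracking accuracy (combines FP, FN and IDSW from matching)
  • MOTP: localization precision of the matched boxes (reflects the detector's box regression)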
  • MT, PT, ML: Mostly Tracked, Partially Tracked, Mostly Lost
  • FM: fragmentations
    • FM / Recall.
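The MT / PT / ML labels can be sketched per ground-truth trajectory. This assumes the commonly used thresholds of the MOTChallenge benchmarks (tracked for at least 80% of the life span is MT, less than 20% is ML, everything else PT); the function name is hypothetical.

```python
def classify_trajectory(tracked_frames, total_frames, mt_ratio=0.8, ml_ratio=0.2):
    """Label one ground-truth trajectory as 'MT', 'PT' or 'ML'.

    tracked_frames: frames in which the target was matched to any hypothesis
                    (the hypothesis ID does not need to stay the same).
    total_frames:   life span of the target in frames.
    """
    ratio = tracked_frames / total_frames
    if ratio >= mt_ratio:
        return "MT"
    if ratio < ml_ratio:
        return "ML"
    return "PT"
```

Since only coverage matters here, a target can be mostly tracked while still suffering ID switches, which is why these labels are reported alongside IDSW and FM rather than instead of them.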

Average Rank

  • For each tracker, compute its rank under each metric.
    • If there are \(N\) metrics, the rank vector has dimensionality of \(N\)
  • Average the rank vector.
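The average rank can be sketched as below. The tracker names and scores in the example are made up, and ties are simply broken by input order here, which is an assumption rather than a documented rule.

```python
def average_ranks(scores, higher_is_better):
    """Average each tracker's rank over all metrics.

    scores:           {tracker: {metric: value}}
    higher_is_better: {metric: bool}, e.g. True for MOTA, False for IDSW.
    """
    trackers = list(scores)
    ranks = {t: [] for t in trackers}
    for metric, descending in higher_is_better.items():
        ordered = sorted(trackers, key=lambda t: scores[t][metric], reverse=descending)
        for position, tracker in enumerate(ordered, start=1):
            ranks[tracker].append(position)  # one entry of the rank vector
    return {t: sum(r) / len(r) for t, r in ranks.items()}


# Hypothetical scores for three trackers and two metrics.
example = {
    "A": {"MOTA": 60, "IDSW": 300},
    "B": {"MOTA": 55, "IDSW": 100},
    "C": {"MOTA": 50, "IDSW": 200},
}
print(average_ranks(example, {"MOTA": True, "IDSW": False}))
# {'A': 2.0, 'B': 1.5, 'C': 2.5}
```

Tracker A wins on MOTA but is last on IDSW, so B ends up with the best average rank.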


posted @ 2022-04-08 15:57  ZXYFrank