Factors Affecting SiamFC Networks
To investigate the underlying reason, we analyze the
Siamese network architecture and identify that the receptive
field size of neurons, network stride and feature padding
are three important factors affecting tracking accuracy. In
particular, the receptive field determines the image region
used in computing a feature. A larger receptive field provides
greater image context, while a small one may not
capture the structure of target objects. The network stride
affects the degree of localization precision, especially for
small-sized objects. Meanwhile, it controls the size of output
feature maps, which affects feature discriminability and
detection accuracy. Moreover, for a fully convolutional
architecture [2], the feature padding for convolutions induces
a potential position bias in model training, such that when
an object moves near the search range boundary, it has a
very low probability of being predicted as the target. These
three factors together prevent Siamese trackers from benefiting
from current deeper and more sophisticated network
architectures.
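
To make the interplay between receptive field, stride and output feature size concrete, the following minimal sketch accumulates all three through a stack of convolution and pooling layers. The layer list is an assumption for illustration (an AlexNet-style backbone without padding, with hypothetical input sizes of 127 and 255 pixels); it is not specified in the text above, and the helper names are ours.

```python
from typing import List, Tuple

# Each layer is (kernel_size, stride, padding).
Layer = Tuple[int, int, int]

# ASSUMPTION: an AlexNet-style, padding-free stack used only for illustration.
ASSUMED_LAYERS: List[Layer] = [
    (11, 2, 0),  # conv1
    (3, 2, 0),   # pool1
    (5, 1, 0),   # conv2
    (3, 2, 0),   # pool2
    (3, 1, 0),   # conv3
    (3, 1, 0),   # conv4
    (3, 1, 0),   # conv5
]


def receptive_field_and_stride(layers: List[Layer]) -> Tuple[int, int]:
    """Return (receptive field, total stride) of one output neuron."""
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent neurons
    for kernel, stride, _ in layers:
        rf += (kernel - 1) * jump  # each layer widens the field by (k - 1) jumps
        jump *= stride             # strides multiply through the stack
    return rf, jump


def output_size(input_size: int, layers: List[Layer]) -> int:
    """Return the spatial size of the final feature map for a square input."""
    size = input_size
    for kernel, stride, padding in layers:
        size = (size + 2 * padding - kernel) // stride + 1
    return size


if __name__ == "__main__":
    rf, stride = receptive_field_and_stride(ASSUMED_LAYERS)
    print("receptive field:", rf, "total stride:", stride)
    for n in (127, 255):  # hypothetical exemplar and search image sizes
        print("input", n, "-> feature map", output_size(n, ASSUMED_LAYERS))
```

Under these assumptions, a larger total stride shrinks the output feature map and coarsens the grid on which the target can be localized, while each added layer enlarges the receptive field. Setting a nonzero padding in the layer tuples keeps the feature map larger, at the cost of the position bias discussed above.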