TensorFlow Faster RCNN Source Code Walkthrough (TFFRCNN) (16) rpn_msr/generate_anchors.py
This post is part of a series of code notes on the CharlesShang/TFFRCNN source code on GitHub.
---------------Personal study notes---------------
----------------Author: Jiang (疆)--------------
------Original post published on cnblogs (博客园)------
1.generate_anchors(base_size=16, ratios=[0.5, 1, 2],scales=2**np.arange(3, 6))
Generates the 9 base anchors at position (0,0) of the scaled image (i.e. the image that is actually fed to the network) and returns them. Called by proposal_layer(...) in rpn_msr/proposal_layer_tf.py.
import numpy as np

# ratios=[0.5, 1, 2] are the height:width aspect ratios 1:2, 1:1, 2:1
# scales=2**np.arange(3, 6) evaluates to (8, 16, 32)
def generate_anchors(base_size=16, ratios=[0.5, 1, 2],
                     scales=2**np.arange(3, 6)):
    """
    Generate anchor (reference) windows by enumerating aspect ratios X
    scales wrt a reference (0, 0, 15, 15) window.
    """
    # Build the base anchor [0 0 15 15]
    base_anchor = np.array([1, 1, base_size, base_size]) - 1
    # Enumerate the aspect ratios to get one anchor per ratio:
    # [[-3.5  2.  18.5 13. ]
    #  [ 0.   0.  15.  15. ]
    #  [ 2.5 -3.  12.5 18. ]]
    ratio_anchors = _ratio_enum(base_anchor, ratios)
    # Enumerate the 3 scales for each ratio anchor and stack into a (9, 4) array.
    # range runs on both Python 2 and 3 (the repo's Python 2 code uses xrange).
    anchors = np.vstack([_scale_enum(ratio_anchors[i, :], scales)
                         for i in range(ratio_anchors.shape[0])])
    return anchors
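To see what the function returns with its default arguments, the whole computation can be condensed into a few vectorized NumPy lines. This is only a sketch for checking the numbers: `base_anchors_sketch` is a hypothetical name, not a function in the repo, and it assumes the base anchor is the square [0, 0, base_size - 1, base_size - 1].

```python
import numpy as np

def base_anchors_sketch(base_size=16, ratios=(0.5, 1, 2), scales=(8, 16, 32)):
    """Hypothetical compact re-derivation of the 9 base anchors (not the repo's code)."""
    ratios = np.asarray(ratios, dtype=float)
    scales = np.asarray(scales, dtype=float)
    ctr = 0.5 * (base_size - 1)  # center of [0, 0, 15, 15] is 7.5 on both axes
    ws = np.round(np.sqrt(base_size * base_size / ratios))  # one width per ratio
    hs = np.round(ws * ratios)                              # matching heights
    # Every per-ratio width/height times every scale, flattened ratio-major so
    # the row order matches the np.vstack in generate_anchors.
    ws = (ws[:, None] * scales[None, :]).ravel()
    hs = (hs[:, None] * scales[None, :]).ravel()
    # Columns are x1, y1, x2, y2 around the shared center.
    return np.stack([ctr - 0.5 * (ws - 1), ctr - 0.5 * (hs - 1),
                     ctr + 0.5 * (ws - 1), ctr + 0.5 * (hs - 1)], axis=1)

anchors = base_anchors_sketch()
print(anchors.shape)  # (9, 4)
print(anchors)
```

The first three rows are the 1:2 anchors at scales 8, 16, 32, the next three the square anchors, and the last three the 2:1 anchors. The first row, for example, is [-84, -40, 99, 55].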
2._whctrs(anchor)
Computes and returns the width, height and center coordinates of an anchor. Called by _ratio_enum(...) and _scale_enum(...).
# Get the width, height and center coordinates of an anchor
def _whctrs(anchor):
    """
    Return width, height, x center, and y center for an anchor (window).
    """
    w = anchor[2] - anchor[0] + 1
    h = anchor[3] - anchor[1] + 1
    x_ctr = anchor[0] + 0.5 * (w - 1)
    y_ctr = anchor[1] + 0.5 * (h - 1)
    return w, h, x_ctr, y_ctr
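The "+ 1" in the width and height comes from treating the corner coordinates as inclusive pixel indices: the window [0, 0, 15, 15] covers pixels 0 through 15, so it is 16 pixels wide. A small sketch of the same arithmetic on two concrete anchors (`whctrs_sketch` is a hypothetical stand-in so the snippet runs on its own):

```python
import numpy as np

def whctrs_sketch(anchor):
    # Same arithmetic as _whctrs: corners are inclusive, hence the +1.
    w = anchor[2] - anchor[0] + 1
    h = anchor[3] - anchor[1] + 1
    x_ctr = anchor[0] + 0.5 * (w - 1)
    y_ctr = anchor[1] + 0.5 * (h - 1)
    return w, h, x_ctr, y_ctr

# Base anchor: 16 x 16, centered at (7.5, 7.5).
w0, h0, xc0, yc0 = whctrs_sketch(np.array([0, 0, 15, 15]))
# 1:2 ratio anchor: 23 x 12, with the same center.
w1, h1, xc1, yc1 = whctrs_sketch(np.array([-3.5, 2.0, 18.5, 13.0]))
print(w0, h0, xc0, yc0)
print(w1, h1, xc1, yc1)
```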
3._mkanchors(ws,hs,x_ctr,y_ctr)
Converts anchor widths, heights and a center point into top-left and bottom-right corner coordinates. Called by _ratio_enum(...) and _scale_enum(...).
# Turn anchor widths, heights and a center point into top-left / bottom-right corners
def _mkanchors(ws, hs, x_ctr, y_ctr):
    """
    Given a vector of widths (ws) and heights (hs) around a center
    (x_ctr, y_ctr), output a set of anchors (windows).
    """
    # When called from _ratio_enum: ws = (23 16 11), hs = (12 16 22)
    # ws and hs both have shape (3,); np.newaxis turns them into (3, 1)
    ws = ws[:, np.newaxis]
    hs = hs[:, np.newaxis]
    # Top-left and bottom-right corners of the 3 anchors
    anchors = np.hstack((x_ctr - 0.5 * (ws - 1),
                         y_ctr - 0.5 * (hs - 1),
                         x_ctr + 0.5 * (ws - 1),
                         y_ctr + 0.5 * (hs - 1)))
    # Result when called from _ratio_enum:
    # [[-3.5  2.  18.5 13. ]
    #  [ 0.   0.  15.  15. ]
    #  [ 2.5 -3.  12.5 18. ]]
    return anchors
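The np.newaxis step is what makes np.hstack produce one row per anchor. The sketch below uses the widths and heights quoted in the comments above and contrasts the result with what happens when the extra axis is left out; the variable names are chosen for illustration only.

```python
import numpy as np

ws = np.array([23., 16., 11.])
hs = np.array([12., 16., 22.])
x_ctr = y_ctr = 7.5

ws_col = ws[:, np.newaxis]  # shape (3,) -> (3, 1)
hs_col = hs[:, np.newaxis]
# Four (3, 1) columns placed side by side give a (3, 4) array: x1, y1, x2, y2.
anchors = np.hstack((x_ctr - 0.5 * (ws_col - 1),
                     y_ctr - 0.5 * (hs_col - 1),
                     x_ctr + 0.5 * (ws_col - 1),
                     y_ctr + 0.5 * (hs_col - 1)))
print(ws_col.shape, anchors.shape)  # (3, 1) (3, 4)

# Without the new axis, hstack just concatenates four 1-D arrays end to end.
flat = np.hstack((x_ctr - 0.5 * (ws - 1), y_ctr - 0.5 * (hs - 1),
                  x_ctr + 0.5 * (ws - 1), y_ctr + 0.5 * (hs - 1)))
print(flat.shape)  # (12,)
```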
4._ratio_enum(anchor,ratios)
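_enum stands for "enumerate". Starting from the base anchor ([0, 0, 15, 15] in the scaled image), the function computes its size (area), divides that by each aspect ratio and takes the square root to get three widths; each height is then the width times the ratio, rounded. Using the base anchor's center as the center, these 3 width/height pairs yield 3 anchors, which are returned. Called by generate_anchors(...).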
# anchor is the base anchor [0, 0, 15, 15]; ratios is [0.5, 1, 2]
def _ratio_enum(anchor, ratios):
    """
    Enumerate a set of anchors for each aspect ratio wrt an anchor.
    """
    # Width, height and center of the base anchor (on the scaled image)
    w, h, x_ctr, y_ctr = _whctrs(anchor)
    # Base size (area): 16 * 16 = 256
    size = w * h
    # Area divided by each aspect ratio: (512, 256, 128)
    size_ratios = size / ratios
    ws = np.round(np.sqrt(size_ratios))  # (23 16 11)
    hs = np.round(ws * ratios)           # (12 16 22)
    # The 3 anchors at position (0,0)
    anchors = _mkanchors(ws, hs, x_ctr, y_ctr)
    return anchors
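Working through the numbers shows that the rounding makes the area and the aspect ratio only approximately exact. The following is a minimal sketch of the same arithmetic for the default arguments, with `areas` and `actual_ratios` added purely for illustration:

```python
import numpy as np

ratios = np.array([0.5, 1, 2])
size = 16 * 16                       # area of the base anchor: 256
size_ratios = size / ratios          # 512, 256, 128
ws = np.round(np.sqrt(size_ratios))  # sqrt is about 22.63, 16, 11.31 -> 23, 16, 11
# 23 * 0.5 = 11.5 rounds to 12 (np.round sends halves to the nearest even integer).
hs = np.round(ws * ratios)           # 12, 16, 22

areas = ws * hs                      # 276, 256, 242: close to 256, not equal
actual_ratios = hs / ws              # about 0.52, 1.0, 2.0
print(ws, hs)
print(areas, actual_ratios)
```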
5._scale_enum(anchor,scales)
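_enum stands for "enumerate". The function takes one of the 3 anchors produced by _ratio_enum(...), gets its center and width/height, and multiplies the width and height by each of the 3 scales while keeping the center fixed, returning 3 anchors. generate_anchors(...) calls it once per ratio anchor, which gives the final 9 base anchors at position (0,0) of the scaled image.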
def _scale_enum(anchor, scales):
    """
    Enumerate a set of anchors for each scale wrt an anchor.
    """
    # anchor is one row of the _ratio_enum(...) result:
    # [[-3.5  2.  18.5 13. ]
    #  [ 0.   0.  15.  15. ]
    #  [ 2.5 -3.  12.5 18. ]]
    # Width, height and center of that anchor
    w, h, x_ctr, y_ctr = _whctrs(anchor)
    ws = w * scales  # scales = (8, 16, 32)
    hs = h * scales
    # 3 anchors at position (0,0), one per scale (9 in total over the 3 calls)
    anchors = _mkanchors(ws, hs, x_ctr, y_ctr)
    return anchors
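Note that the scale multiplies the width and height, not the area, so a scale of 8 turns the 23 x 12 anchor into a 184 x 96 one. A small sketch for that single anchor, with its width, height and center written out by hand rather than taken from _whctrs, confirms that the center does not move:

```python
import numpy as np

scales = 2 ** np.arange(3, 6)              # 8, 16, 32
w, h, x_ctr, y_ctr = 23.0, 12.0, 7.5, 7.5  # the 1:2 anchor [-3.5, 2, 18.5, 13]

ws = w * scales  # 184, 368, 736
hs = h * scales  # 96, 192, 384
# One row per scale: x1, y1, x2, y2 around the unchanged center.
anchors = np.column_stack((x_ctr - 0.5 * (ws - 1), y_ctr - 0.5 * (hs - 1),
                           x_ctr + 0.5 * (ws - 1), y_ctr + 0.5 * (hs - 1)))
# Midpoint of opposite corners recovers the center of every scaled anchor.
centers = 0.5 * (anchors[:, :2] + anchors[:, 2:])
print(anchors)
print(centers)
```

These three rows are the first three rows of the array returned by generate_anchors() with default arguments.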
6.Main entry point
if __name__ == '__main__':
    import time
    t = time.time()  # current timestamp
    a = generate_anchors()
    print(time.time() - t)  # elapsed time in seconds
    print(a)
    from IPython import embed; embed()  # drop into an interactive IPython shell