hirotaka-hachiya.hatenablog.com
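I trained the Faster R-CNN that I set up in the previous post to detect signboards (kanban) for the Tsukuba Challenge.
I followed the "Training on your own data" section of the page below.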
https://www.cs.gunma-u.ac.jp/~nagai/wiki/index.php?py-faster-rcnn%20%B3%D0%A4%A8%BD%F1%A4%AD
1) Prepare your own training data
data/kanban/results/set1/2007/Main
data/kanban/set1/Annotations/*.xml
data/kanban/set1/ImageSets/Main/{trainval,test}.txt
data/kanban/set1/PNGImages/*.png
In PNGImages, I placed 200 signboard (kanban) images and 200 images without a signboard (other), all generated automatically with Unity, as shown below.
> ls data/kanban/set1/PNGImages/
kanban_00000001.png  kanban_00000134.png  other_00000068.png
kanban_00000002.png  kanban_00000135.png  other_00000069.png
kanban_00000003.png  kanban_00000136.png  other_00000070.png
...
> eog data/kanban/set1/PNGImages/kanban_00000003.png
> eog data/kanban/set1/PNGImages/other_00000003.png
In Annotations, I placed one xml file per image containing the label (kanban or other) and the bounding box coordinates, as shown below.
> cat data/kanban/set1/Annotations/kanban_00000003.xml
<annotation>
  <filename>kanban_00000003.png</filename>
  <object>
    <name>kanban</name>
    <pose>Unspecified</pose>
    <truncated>1</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>183</xmin>
      <ymin>108</ymin>
      <xmax>293</xmax>
      <ymax>277</ymax>
    </bndbox>
  </object>
</annotation>
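To sanity-check annotation files before training, it can help to read them back. The following is a minimal sketch that parses the annotation shown above with the standard library; the function name parse_annotation is my own and is not part of py-faster-rcnn.

```python
import xml.etree.ElementTree as ET

# Same structure as kanban_00000003.xml shown above.
SAMPLE_XML = """<annotation>
  <filename>kanban_00000003.png</filename>
  <object>
    <name>kanban</name>
    <pose>Unspecified</pose>
    <truncated>1</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>183</xmin>
      <ymin>108</ymin>
      <xmax>293</xmax>
      <ymax>277</ymax>
    </bndbox>
  </object>
</annotation>"""


def parse_annotation(xml_text):
    """Hypothetical helper: return (filename, [(label, (xmin, ymin, xmax, ymax)), ...])."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        coords = tuple(int(box.findtext(tag))
                       for tag in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((obj.findtext("name"), coords))
    return filename, objects


filename, objects = parse_annotation(SAMPLE_XML)
print(filename, objects)
```

Running a check like this over every file in Annotations would catch missing tags or non-integer coordinates early.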
In ImageSets/Main/, I placed trainval.txt and test.txt, each containing a randomly selected list of image names, as shown below.
> cat data/kanban/set1/ImageSets/Main/trainval.txt
other_00000018
other_00000167
other_00000038
other_00000160
...
> cat data/kanban/set1/ImageSets/Main/test.txt
other_00000063
kanban_00000107
kanban_00000063
Incidentally, trainval.txt and test.txt were generated automatically with the following script.
import os

import numpy as np

sourcedir = 'data/kanban/set1/PNGImages/'
targetdir = 'data/kanban/set1/ImageSets/Main'
trainfile = 'trainval.txt'
testfile = 'test.txt'
trainRatio = 0.8

# get file list
filelist = np.array(os.listdir(sourcedir))

# random permutation
numfile = filelist.shape[0]
randindex = np.random.permutation(numfile)

# number of files used for training (cast to int so it is a proper count)
numtrain = int(np.floor(numfile * trainRatio))

# target files
trainpath = os.path.join(targetdir, trainfile)
testpath = os.path.join(targetdir, testfile)

# write image names (without extension) to the target files
with open(trainpath, 'w') as ftrain, open(testpath, 'w') as ftest:
    for cnt, index in enumerate(randindex, start=1):
        name = os.path.splitext(filelist[index])[0]
        if cnt <= numtrain:
            ftrain.write(name + "\n")
        else:
            ftest.write(name + "\n")
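If a reproducible split is wanted, the same 80/20 idea can be sketched with a fixed seed. This is only an illustration: split_ids is a hypothetical helper, and the image IDs are assumed to run from 1 to 200 for each class, which the listing above does not confirm.

```python
import random


def split_ids(ids, train_ratio=0.8, seed=0):
    """Hypothetical helper: shuffle IDs with a fixed seed and split them."""
    ids = sorted(ids)                      # make the result independent of listing order
    random.Random(seed).shuffle(ids)       # seeded, so the split is repeatable
    n_train = int(len(ids) * train_ratio)
    return ids[:n_train], ids[n_train:]


# Assumed ID range (1..200 per class); adjust to the real file names.
ids = (["kanban_%08d" % i for i in range(1, 201)] +
       ["other_%08d" % i for i in range(1, 201)])
trainval, test = split_ids(ids)
print(len(trainval), len(test))
```

With 400 images and a ratio of 0.8 this gives 320 trainval and 80 test images, which matches the "1/80" shown in the evaluation log in step 8.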
2) Create lib/datasets/kanban.py
Copy lib/datasets/pascal_voc.py and modify it for signboard detection.
> cp lib/datasets/pascal_voc.py lib/datasets/kanban.py
> vi lib/datasets/kanban.py
...
class kanban(imdb):
    # setname added by hachiya
    def __init__(self, image_set, setname, year, devkit_path=None):
        imdb.__init__(self, image_set + '_' + setname)
        self._year = year
        # added by hachiya
        self._setname = setname
        self._image_set = image_set
        self._devkit_path = self._get_default_path() if devkit_path is None \
                            else devkit_path
        self._data_path = os.path.join(self._devkit_path, self._setname)
        # modified by hachiya
        self._classes = ('__background__', # always index 0
                         'other', 'kanban' )
        self._class_to_ind = dict(zip(self.classes, xrange(self.num_classes)))
        # modified by hachiya from jpg to png
        self._image_ext = '.png'
...
    def image_path_from_index(self, index):
        """
        Construct an image path from the image's "index" identifier.
        """
        # modified by hachiya to load images from PNGImages
        image_path = os.path.join(self._data_path, 'PNGImages',
                                  index + self._image_ext)
...
    def _get_default_path(self):
        """
        Return the default path where PASCAL VOC is expected to be installed.
        """
        # modified by hachiya, root dir of kanban data
        return os.path.join(cfg.DATA_DIR, 'kanban')

    def gt_roidb(self):
        """
        Return the database of ground-truth regions of interest.

        This function loads/saves from/to a cache file to speed up future calls.
        """
        # modified by hachiya
        cache_file = os.path.join(self.cache_path,
                                  'kanban_' + self.name + '_gt_roidb.pkl')
...
    def _get_voc_results_file_template(self):
        # VOCdevkit/results/VOC2007/Main/<comp_id>_det_test_aeroplane.txt
        filename = self._get_comp_id() + '_det_' + self._image_set + '_{:s}.txt'
        path = os.path.join(
            self._devkit_path,
            'results',
            # modified by hachiya
            self._setname,
            self._year,
            'Main',
            filename)
...
    def _do_python_eval(self, output_dir = 'output'):
        annopath = os.path.join(
            self._devkit_path,
            # modified by hachiya
            self._setname,
            'Annotations',
            '{:s}.xml')
        imagesetfile = os.path.join(
            self._devkit_path,
            # modified by hachiya
            self._setname,
            'ImageSets',
            'Main',
            self._image_set + '.txt')
3) Add the dataset to lib/datasets/factory.py
> vi lib/datasets/factory.py
...
# added by hachiya
from datasets.kanban import kanban

# Set up kanban
for setname in ['set1','set2','set3']:
    for split in ['trainval','test']:
        name = 'kanban_{}_{}'.format(setname, split)
        __sets[name] = (lambda split=split, setname=setname:
                        kanban(split, setname, '2007'))
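The split=split, setname=setname default arguments in the lambda matter: they bind the loop values at the time each lambda is created. The sketch below illustrates this with a stand-in registry (make_registry is hypothetical, and the lambdas return a tuple instead of constructing a kanban object).

```python
def make_registry(bind_defaults):
    """Hypothetical stand-in for the __sets dictionary in factory.py."""
    registry = {}
    for setname in ["set1", "set2", "set3"]:
        for split in ["trainval", "test"]:
            name = "kanban_{}_{}".format(setname, split)
            if bind_defaults:
                # Loop values are captured now, as default arguments.
                registry[name] = (lambda split=split, setname=setname:
                                  (split, setname))
            else:
                # Loop variables are looked up later, when the lambda is called.
                registry[name] = lambda: (split, setname)
    return registry


good = make_registry(True)
bad = make_registry(False)
print(good["kanban_set1_trainval"]())  # values bound at definition time
print(bad["kanban_set1_trainval"]())   # values left over from the last iteration
```

Without the default arguments, every entry would end up describing the last combination of the loops, so all names would resolve to the same dataset.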
4) Add a kanban case to experiments/scripts/faster_rcnn_end2end.sh
> vi experiments/scripts/faster_rcnn_end2end.sh
...
  # added by hachiya
  kanban)
    TRAIN_IMDB="kanban_set1_trainval"
    TEST_IMDB="kanban_set1_test"
    PT_DIR="kanban"
    ITERS=70000
    ;;
5) Create run.sh
> vi run.sh
#!/bin/bash

GPU=0
#NET=ZF
NET=VGG_CNN_M_1024
#NET=VGG16
DATASET=kanban
#DATASET=pascal_voc
#DATASET=pascal_voc_2012
#DATASET=coco
EXPDIR=hachiya
HOST=`hostname`

(time ./experiments/scripts/faster_rcnn_end2end.sh $GPU $NET $DATASET --set EXP_DIR $EXPDIR) 2>&1
6) Prepare the model files for kanban
Pascal VOC had 21 classes, whereas this task has 3 (__background__, other, kanban), so I changed num_classes and num_output in train.prototxt and test.prototxt as follows.
> cp -rp models/pascal_voc models/kanban
> vi models/kanban/VGG_CNN_M_1024/faster_rcnn_end2end/train.prototxt
name: "VGG_CNN_M_1024"
layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 3" #used to be 21
  }
}
...
layer {
  name: 'roi-data'
  type: 'Python'
  bottom: 'rpn_rois'
  bottom: 'gt_boxes'
  top: 'rois'
  top: 'labels'
  top: 'bbox_targets'
  top: 'bbox_inside_weights'
  top: 'bbox_outside_weights'
  python_param {
    module: 'rpn.proposal_target_layer'
    layer: 'ProposalTargetLayer'
    param_str: "'num_classes': 3" #used to be 21
  }
}
...
  inner_product_param {
    num_output: 3 #used to be 21
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "bbox_pred"
  type: "InnerProduct"
  bottom: "fc7"
  top: "bbox_pred"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 12 #used to be 84
    weight_filler {
      type: "gaussian"
      std: 0.001
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
> vi models/kanban/VGG_CNN_M_1024/faster_rcnn_end2end/test.prototxt
...
  inner_product_param {
    num_output: 3 # used to be 21
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "bbox_pred"
  type: "InnerProduct"
  bottom: "fc7"
  top: "bbox_pred"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 12 # used to be 84
    weight_filler {
      type: "gaussian"
      std: 0.001
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
...
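The numbers in the prototxt edits follow directly from the class count: the classification output has one unit per class (background included), and bbox_pred has four box coordinates per class. A small sketch of that arithmetic, with head_sizes as a hypothetical helper name:

```python
def head_sizes(num_classes):
    """Hypothetical helper: num_output values implied by the class count."""
    return {
        "cls_score": num_classes,      # one score per class, background included
        "bbox_pred": 4 * num_classes,  # four box coordinates per class
    }


CLASSES = ('__background__', 'other', 'kanban')
print(head_sizes(len(CLASSES)))  # sizes for this kanban setup
print(head_sizes(21))            # sizes for Pascal VOC (20 classes + background)
```

The same rule applies if the class list changes later, for example when dropping the other class.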
I also edited solver.prototxt so that train_net points to the kanban model.
train_net: "models/kanban/VGG_CNN_M_1024/faster_rcnn_end2end/train.prototxt"
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 50000
display: 20
average_loss: 100
momentum: 0.9
weight_decay: 0.0005
# We disable standard caffe solver snapshotting and implement our own snapshot
# function
snapshot: 0
# We still use the snapshot prefix, though
snapshot_prefix: "vgg_cnn_m_1024_faster_rcnn"
7) Edit lib/fast_rcnn/config.py
> vi lib/fast_rcnn/config.py
...
# Model directory
# modified by hachiya
#__C.MODELS_DIR = osp.abspath(osp.join(__C.ROOT_DIR, 'models', 'pascal_voc'))
__C.MODELS_DIR = osp.abspath(osp.join(__C.ROOT_DIR, 'models', 'kanban'))
...
8) Run run.sh
> ./run.sh
...
Evaluating detections
Writing other VOC results file
Writing kanban VOC results file
VOC07 metric? Yes
Reading annotation for 1/80
Saving cached annotations to /home/hachiya/Works/DeepNet/py-faster-rcnn/data/kanban/annotations_cache/annots.pkl
AP for other = 0.0524
AP for kanban = 1.0000
Mean AP = 0.5262
~~~~~~~~
Results:
0.052
1.000
0.526
~~~~~~~~

--------------------------------------------------------------
Results computed with the **unofficial** Python eval code.
Results should be very close to the official MATLAB eval code.
Recompute with `./tools/reval.py --matlab ...` for your paper.
-- Thanks, The Management
--------------------------------------------------------------

real 0m7.559s
user 0m6.611s
sys 0m0.937s

real 120m27.721s
user 107m52.652s
sys 12m19.880s
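Training finished in about two hours. On the test set, the Average Precision was 100% for kanban but only about 5% for other. My guess is that other cannot be learned as an object class because it is nothing more than background imagery, so next I will try training with just two classes: kanban and the __background__ class that Faster R-CNN already provides.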
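For reference, the "VOC07 metric" mentioned in the log is, as far as I understand it, the 11-point interpolated Average Precision. The following is a minimal pure-Python sketch of that metric; voc07_ap is a hypothetical name, the recall/precision values are made up for illustration, and the evaluation code in the repository may differ in detail.

```python
def voc07_ap(recall, precision):
    """Sketch of 11-point interpolated AP (assumed form of the VOC07 metric).

    recall and precision are parallel lists, one entry per ranked detection.
    """
    ap = 0.0
    for i in range(11):
        t = i / 10.0  # recall thresholds 0.0, 0.1, ..., 1.0
        # Best precision among points whose recall reaches the threshold.
        candidates = [p for r, p in zip(recall, precision) if r >= t]
        ap += (max(candidates) if candidates else 0.0) / 11.0
    return ap


# Illustrative values only, not taken from the experiment above.
perfect = voc07_ap([0.5, 1.0], [1.0, 1.0])
half = voc07_ap([0.5, 0.5], [1.0, 0.5])
print(perfect, half)
```

A detector that never reaches high recall scores zero at the upper thresholds, which is one way a class can end up with a very low AP.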
9) Create tools/demo_kanban.py
> cp tools/demo.py tools/demo_kanban.py
> vi tools/demo_kanban.py
...
CLASSES = ('__background__', 'other', 'kanban')

NETS = {'vgg16': ('VGG_CNN_M_1024',
                  'VGG16_faster_rcnn_final_kanban.caffemodel'),
        'zf': ('ZF',
               'ZF_faster_rcnn_final.caffemodel')}
...
if __name__ == '__main__':
    cfg.TEST.HAS_RPN = True  # Use RPN for proposals

    args = parse_args()

    #prototxt = os.path.join(cfg.MODELS_DIR, NETS[args.demo_net][0],
    #                        'faster_rcnn_alt_opt', 'faster_rcnn_test.pt')
    prototxt = os.path.join(cfg.MODELS_DIR, NETS[args.demo_net][0],
                            'faster_rcnn_end2end', 'test.prototxt')
...
10) Copy the trained model to the expected location and run demo_kanban.py
> cp output/hachiya/trainval_set1/vgg_cnn_m_1024_faster_rcnn_iter_70000.caffemodel data/faster_rcnn_models/VGG16_faster_rcnn_final_kanban.caffemodel
> ./tools/demo_kanban.py > /tmp/output.txt 2>&1
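Note: when I used models/kanban/VGG_CNN_M_1024/faster_rcnn_alt_opt as the model instead of faster_rcnn_end2end, I got the error below, a mismatch in blob output sizes caused by the class count changing from 21 to 3. Editing the pt files under models/kanban/VGG_CNN_M_1024/ did not resolve it.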
I0517 20:29:52.864132 13909 net.cpp:380] loss_bbox -> loss_bbox
F0517 20:29:52.864150 13909 smooth_L1_loss_layer.cpp:28] Check failed: bottom[0]->channels() == bottom[1]->channels() (84 vs. 12)