fcn (2) pascalcontext-fcn32s training


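This post is an attempt to train fcn.berkeleyvision.org/pascalcontext-fcn32s.

I keep feeling that voc-fcn32s must have a grudge against me @@

So I'm trying a different model. Once this one converges even once, I'll go back and try the dataset that voc-fcn32s uses!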

1. github : fcn.berkeleyvision.org

    • https://github.com/shelhamer/fcn.berkeleyvision.org.git 
2. dataset: downloaded four files

  • ├── data
    │   ├── pascal-context
    │   │   ├── 59_context_labels
    │   │   ├── 59_context_labels.tar.gz
    │   │   ├── 59_labels.txt    ---> (copied from classes-59.txt; too lazy to change the code)
    │   │   ├── classes-400.txt
    │   │   ├── classes-59.txt
    │   │   ├── labels.txt           ---> (labels.txt with label 0 added: 0: background)
    │   │   ├── README.md
    │   │   ├── trainval
    │   │   ├── trainval.tar.gz
    │   │   ├── VOCdevkit
    │   │   ├── VOCdevkit_08-May-2010.tar
    │   │   └── VOCtrainval_03-May-2010.tar
  • dataset VOC2010
    • http://host.robots.ox.ac.uk/pascal/VOC/voc2010/#devkit
    •  VOCtrainval_03-May-2010.tar 
    • VOCdevkit_08-May-2010.tar
    •  $ tar xvf VOCdevkit_08-May-2010.tar
  • trainval and labels
  • vgg 16 
    • caffemodel: I used vgg16-fcn.caffemodel and vgg_ilsvrc_16_layers_deploy.prototxt
      ├── ilsvrc-nets
      │   ├── README.md
      │   ├── vgg16-fcn.caffemodel
      │   └── vgg_ilsvrc_16_layers_deploy.prototxt
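
The unpacking in step 2 can also be scripted. This is only a minimal sketch: it assumes the four archives sit in data/pascal-context/ exactly as in the tree above, and the helper name `extract_archives` is made up for illustration.

```python
import os
import tarfile

# Archive names taken from the directory listing above.
ARCHIVES = [
    "VOCtrainval_03-May-2010.tar",
    "VOCdevkit_08-May-2010.tar",
    "trainval.tar.gz",
    "59_context_labels.tar.gz",
]


def extract_archives(data_dir, archives=ARCHIVES):
    """Extract every listed archive found in data_dir.

    Returns the names of the archives that were not found, so a missing
    download shows up before training starts.
    """
    missing = []
    for name in archives:
        path = os.path.join(data_dir, name)
        if not os.path.isfile(path):
            missing.append(name)
            continue
        # Mode "r:*" lets tarfile detect plain .tar and .tar.gz by itself.
        with tarfile.open(path, "r:*") as tar:
            tar.extractall(data_dir)
    return missing
```

Calling `extract_archives("data/pascal-context")` from the repository root should return an empty list once all four downloads are in place.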

3.  Copy the following files to pascal-context-fcn32s/
  • .
    ├── infer.py
    ├── net.py
    ├── pascalcontext_layers.py
    ├── score.py
    ├── snapshot   ---> create snapshot/train
    ├── solve.py
    ├── solver.prototxt
    ├── surgery.py
    ├── train.prototxt
    └── val.prototxt
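
Before starting a long run it is worth checking that the working directory really looks like the listing in step 3. The sketch below creates snapshot/train and reports missing files. As far as I know Caffe does not create the snapshot directory on its own, but treat that as an assumption. The helper name `prepare_workdir` is made up for illustration.

```python
import os

# File names taken from the listing in step 3.
EXPECTED = [
    "infer.py", "net.py", "pascalcontext_layers.py", "score.py",
    "solve.py", "solver.prototxt", "surgery.py",
    "train.prototxt", "val.prototxt",
]


def prepare_workdir(workdir, expected=EXPECTED):
    """Create snapshot/train under workdir and list the expected files that are missing."""
    os.makedirs(os.path.join(workdir, "snapshot", "train"), exist_ok=True)
    return [name for name in expected
            if not os.path.isfile(os.path.join(workdir, name))]
```

An empty list from `prepare_workdir("pascal-context-fcn32s")` means every file from the listing is present.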
4.  Modify the following files
  • .
    ├── infer.py
    ├── net.py
    ├── pascalcontext_layers.py
    ├── score.py
    ├── snapshot  
    ├── solve.py
    ├── solver.prototxt
    ├── surgery.py
    ├── train.prototxt
    └── val.prototxt
  • ├── deploy.prototxt
4.1
solve.py
-weights = '../ilsvrc-nets/vgg16-fcn.caffemodel'
+#weights = '../ilsvrc-nets/vgg16-fcn.caffemodel'
+vgg_weights = '../ilsvrc-nets/vgg16-fcn.caffemodel'
+vgg_proto= '../ilsvrc-nets/vgg_ilsvrc_16_layers_deploy.prototxt'
+

# init
-caffe.set_device(int(sys.argv[1]))
+#caffe.set_device(int(sys.argv[1]))
+caffe.set_device(0)
caffe.set_mode_gpu()

solver = caffe.SGDSolver('solver.prototxt')
-solver.net.copy_from(weights)
+#solver.net.copy_from(weights)
+vgg_net=caffe.Net(vgg_proto,vgg_weights, caffe.TRAIN)
+surgery.transplant(solver.net, vgg_net)
+del vgg_net

# surgeries
interp_layers = [k for k in solver.net.params.keys() if 'up' in k]
surgery.interp(solver.net, interp_layers)

# scoring
-val = np.loadtxt('../data/pascal/VOC2010/ImageSets/Main/val.txt', dtype=str)
+val = np.loadtxt('/home/zoey/fcn.berkeleyvision.org/data/pascal-context/VOCdevkit/VOC2010/ImageSets/Main/val.txt', dtype=str)
+
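
The diff above replaces `solver.net.copy_from(weights)` with `surgery.transplant(solver.net, vgg_net)`. To show the idea without needing Caffe, here is a simplified stand-in written with plain numpy arrays. It is not the repository's surgery.py. The function name `transplant_params` and the dict-of-arrays layout are assumptions made for this sketch.

```python
import numpy as np


def transplant_params(new_params, old_params):
    """Copy arrays from old_params into new_params wherever layer names match.

    Both arguments map a layer name to a list of numpy arrays (weights, bias).
    An array whose shape differs but whose size is equal is copied flat, which
    is how fully connected weights can fill a convolution of the same size.
    Returns the source layer names that have no counterpart in new_params.
    """
    dropped = []
    for name, blobs in old_params.items():
        if name not in new_params:
            dropped.append(name)
            continue
        for i, blob in enumerate(blobs):
            if i >= len(new_params[name]):
                break  # the target layer has fewer blobs (for example no bias)
            target = new_params[name][i]
            if target.size != blob.size:
                raise ValueError("size mismatch in layer %s, blob %d" % (name, i))
            target.flat = blob.ravel()  # in-place copy that ignores the shape
    return dropped
```

Layers that exist only in the target (for example a newly added scoring layer) are left untouched and keep their initial values.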




4.2
solver.prototxt




train_net: "train.prototxt"
test_net: "val.prototxt"
test_iter: 5105
# make test net, but don't invoke it from the solver itself
test_interval: 999999999
display: 20
average_loss: 20
lr_policy: "fixed"
# lr for unnormalized softmax
base_lr: 1e-10
# high momentum
momentum: 0.99
# no gradient accumulation
iter_size: 1
max_iter: 300000
weight_decay: 0.0005
snapshot: 4000
snapshot_prefix: "snapshot/train"
test_initialization: false
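
The comment "lr for unnormalized softmax" explains the tiny base_lr of 1e-10. With an unnormalized loss the per-pixel losses are summed instead of averaged, so the loss and its gradient grow with the number of pixels. The numpy sketch below only illustrates that scaling. It is not Caffe's implementation, and the function name `softmax_xent` is made up.

```python
import numpy as np


def softmax_xent(scores, labels, normalize):
    """Softmax cross-entropy for scores of shape (C, H, W) and int labels of shape (H, W).

    normalize=True averages over pixels, normalize=False sums over them.
    """
    shifted = scores - scores.max(axis=0, keepdims=True)  # numerical stability
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=0, keepdims=True))
    rows, cols = np.indices(labels.shape)
    per_pixel = -log_prob[labels, rows, cols]  # loss at the true class of each pixel
    return per_pixel.mean() if normalize else per_pixel.sum()
```

For the same scores, the summed loss equals the averaged loss times H*W. That is consistent with the training log reporting a loss in the tens of thousands while the network still trains.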




4.3
net.py




@@ -15,8 +15,8 @@ def fcn(split):
n = caffe.NetSpec()
n.data, n.label = L.Python(module='pascalcontext_layers',
layer='PASCALContextSegDataLayer', ntop=2,
- param_str=str(dict(voc_dir='../../data/pascal',
- context_dir='../../data/pascal-context', split=split,
+ param_str=str(dict(voc_dir='../data/pascal-context/VOCdevkit',
+ context_dir='../data/pascal-context', split=split,
seed=1337)))
* voc_dir points to the VOC2010 path: ImageSets/Main/*.txt and the JPEG images
* context_dir points to the directory that contains trainval/*.mat
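
A quick way to catch path mistakes is to rebuild the same `param_str` and print the files it should resolve to. The sketch below assumes the layout implied by this post: voc_dir plus VOC2010/ImageSets/Main for the split file, and context_dir plus trainval for the .mat labels. The JPEGImages folder name comes from the standard VOC devkit layout. The function names are hypothetical, and the real logic lives in pascalcontext_layers.py.

```python
import ast
import os


def build_param_str(voc_dir, context_dir, split, seed=1337):
    """Serialize the data-layer arguments the same way net.py does."""
    return str(dict(voc_dir=voc_dir, context_dir=context_dir,
                    split=split, seed=seed))


def resolve_paths(param_str, idx):
    """Parse param_str and return the files expected for image id idx."""
    params = ast.literal_eval(param_str)  # parses the dict literal without eval()
    voc2010 = os.path.join(params["voc_dir"], "VOC2010")
    return {
        "split_file": os.path.join(voc2010, "ImageSets", "Main",
                                   params["split"] + ".txt"),
        "image": os.path.join(voc2010, "JPEGImages", idx + ".jpg"),
        "label": os.path.join(params["context_dir"], "trainval", idx + ".mat"),
    }
```

Running `os.path.isfile` on each returned value, from the directory where solve.py is started, shows quickly whether the relative '../data/...' paths are correct.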




4.4
deploy.prototxt




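Copied from the voc-fcn8s layout, so delete the extra layers to match the train.prototxt here.

You can also adapt it from train.prototxt yourself: remove the drop (dropout) and loss layers. There was one more change that I've forgotten, so I'll leave it at that for now.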


5. Result



6. nvidia
aaa:~/fcn.berkeleyvision.org$ nvidia-smi
Wed Nov  6 13:51:50 2019       ---> starting time
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 435.21       Driver Version: 435.21       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 00000000:01:00.0 Off |                  N/A |
| 40%   64C    P2   175W / 198W |   4153MiB /  8116MiB |     82%      Default |
+-------------------------------+----------------------+----------------------+
                                                                              
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     10334      C   python                                      4141MiB |
+-----------------------------------------------------------------------------+

I1108 03:27:42.271994 10334 sgd_solver.cpp:284] Snapshotting solver state to binary proto file snapshot/train/solver_iter_400000.solverstate
>>> 2019-11-08 03:27:42.929233 Begin seg tests
>>> 2019-11-08 03:41:03.580019 Iteration 400000 loss 76704.01230231168
>>> 2019-11-08 03:41:03.629054 Iteration 400000 overall accuracy 0.9488732847493926
>>> 2019-11-08 03:41:03.665190 Iteration 400000 mean accuracy 0.14218947967705742
>>> 2019-11-08 03:41:03.668212 Iteration 400000 mean IU 0.11892353497429191
>>> 2019-11-08 03:41:03.668304 Iteration 400000 fwavacc 0.9079830134090507

real    2270m19.042s
user    1587m14.698s
sys    518m50.233s
---> It ran for a day and a half... which is a bit odd, because the blog I referenced said 7-8 days. (11/29: 32s, 16s, 8s -> 3 runs * 1.5 days each, 7~8 days)
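
The four scores in the log (overall accuracy, mean accuracy, mean IU, fwavacc) are standard segmentation metrics derived from a confusion matrix. The sketch below is my own numpy rewrite of those standard definitions, not a copy of the repository's score.py. It helps to see how overall accuracy can sit near 0.95 while mean IU stays near 0.12.

```python
import numpy as np


def fast_hist(gt, pred, n):
    """Confusion matrix with ground truth on the rows and predictions on the columns."""
    keep = (gt >= 0) & (gt < n)
    return np.bincount(n * gt[keep].astype(int) + pred[keep],
                       minlength=n * n).reshape(n, n)


def seg_metrics(hist):
    """Compute the four scores from an n x n confusion matrix."""
    with np.errstate(divide="ignore", invalid="ignore"):
        overall_acc = np.diag(hist).sum() / hist.sum()
        per_class_acc = np.diag(hist) / hist.sum(1)
        iu = np.diag(hist) / (hist.sum(1) + hist.sum(0) - np.diag(hist))
    freq = hist.sum(1) / hist.sum()
    return {
        "overall accuracy": overall_acc,
        "mean accuracy": np.nanmean(per_class_acc),  # classes absent from gt are skipped
        "mean IU": np.nanmean(iu),
        "fwavacc": (freq[freq > 0] * iu[freq > 0]).sum(),
    }
```

As a toy case, take 100 pixels where 95 belong to class 0 and 5 to class 1, and predict class 0 everywhere. Overall accuracy is 0.95, but the mean accuracy is only 0.5 because class 1 is never hit. Whether something like this is behind the numbers above is only a guess. It is one pattern that produces a high overall accuracy together with a low mean IU.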




