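FCN (Part 2): pascalcontext-fcn32s training
This post is about training fcn.berkeleyvision.org/pascalcontext-fcn32s.
I keep feeling that voc-fcn32s must have a grudge against me @@
So I'm trying a different model. Once training converges even once, I'll go back and try the dataset used by voc-fcn32s!
1. GitHub: fcn.berkeleyvision.org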
- https://github.com/shelhamer/fcn.berkeleyvision.org.git
├── data
│ ├── pascal-context
│ │ ├── 59_context_labels
│ │ ├── 59_context_labels.tar.gz
│ │ ├── 59_labels.txt ---> (copied from classes-59.txt; too lazy to change the code)
│ │ ├── classes-400.txt
│ │ ├── classes-59.txt
│ │ ├── labels.txt ---> (added label 0 to labels.txt --> 0: background)
│ │ ├── README.md
│ │ ├── trainval
│ │ ├── trainval.tar.gz
│ │ ├── VOCdevkit
│ │ ├── VOCdevkit_08-May-2010.tar
│ │ └── VOCtrainval_03-May-2010.tar
- dataset VOC2010
- http://host.robots.ox.ac.uk/pascal/VOC/voc2010/#devkit
- VOCtrainval_03-May-2010.tar
- VOCdevkit_08-May-2010.tar
- $ tar xvf VOCdevkit_08-May-2010.tar
- trainval and labels
- http://www.cs.stanford.edu/~roozbeh/pascal-context/
- 59_context_labels.tar.gz
- trainval.tar.gz
- $ tar zxvf trainval.tar.gz
- vgg 16
- caffemodel: I used vgg16-fcn.caffemodel and vgg_ilsvrc_16_layers_deploy.prototxt
├── ilsvrc-nets
│ ├── README.md
│ ├── vgg16-fcn.caffemodel
│ └── vgg_ilsvrc_16_layers_deploy.prototxt
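Before training, it can help to confirm that the directories match the trees above. The following is a minimal sketch and not part of the FCN repo: `prepare_label_files` and `missing_paths` are hypothetical helpers, and the `0: background` line format is assumed from the note next to labels.txt.

```python
import shutil
from pathlib import Path

# Paths taken from the directory trees above, relative to the repo root.
EXPECTED = [
    "data/pascal-context/59_context_labels",
    "data/pascal-context/59_labels.txt",
    "data/pascal-context/classes-59.txt",
    "data/pascal-context/labels.txt",
    "data/pascal-context/trainval",
    "data/pascal-context/VOCdevkit",
    "ilsvrc-nets/vgg16-fcn.caffemodel",
    "ilsvrc-nets/vgg_ilsvrc_16_layers_deploy.prototxt",
]


def prepare_label_files(context_dir):
    """Create 59_labels.txt and make sure labels.txt starts with label 0."""
    context_dir = Path(context_dir)
    # 59_labels.txt is a plain copy of classes-59.txt.
    shutil.copyfile(context_dir / "classes-59.txt",
                    context_dir / "59_labels.txt")
    labels = context_dir / "labels.txt"
    text = labels.read_text()
    # Assumed line format: "<index>: <name>". Skip if label 0 is present.
    if not text.startswith("0:"):
        labels.write_text("0: background\n" + text)


def missing_paths(repo_root):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED if not (root / p).exists()]
```

Calling `missing_paths(".")` from the repo root returns an empty list when everything listed above is in place.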
3. Copy the following files into pascal-context-fcn32s/
- .
├── infer.py
├── net.py
├── pascalcontext_layers.py
├── score.py
├── snapshot ---> create snapshot/train
├── solve.py
├── solver.prototxt
├── surgery.py
├── train.prototxt
└── val.prototxt
- ├── deploy.prototxt ---> (added, see 4.4)
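Step 3 can also be scripted. This is a minimal sketch under the assumption that the missing files sit in some other directory of the checkout (for example the repo root); `stage_files` is a hypothetical helper, and the file names come from the listing above.

```python
import shutil
from pathlib import Path

# File names taken from the listing above.
FILES = ["infer.py", "net.py", "pascalcontext_layers.py", "score.py",
         "solve.py", "solver.prototxt", "surgery.py",
         "train.prototxt", "val.prototxt"]


def stage_files(src_dir, dst_dir, names=FILES):
    """Copy files missing from dst_dir and create snapshot/train there.

    Returns the names that were found in neither directory.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    # The listing above says to create snapshot/train before training.
    (dst / "snapshot" / "train").mkdir(parents=True, exist_ok=True)
    not_found = []
    for name in names:
        if (dst / name).exists():
            continue  # already in place, leave it untouched
        if (src / name).exists():
            shutil.copy2(src / name, dst / name)
        else:
            not_found.append(name)
    return not_found
```

Something like `stage_files("..", ".")` run inside pascal-context-fcn32s/ would then report whatever still has to be fetched by hand.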
4.1 solve.py
-weights = '../ilsvrc-nets/vgg16-fcn.caffemodel'
+#weights = '../ilsvrc-nets/vgg16-fcn.caffemodel'
+vgg_weights = '../ilsvrc-nets/vgg16-fcn.caffemodel'
+vgg_proto= '../ilsvrc-nets/vgg_ilsvrc_16_layers_deploy.prototxt'
+
 # init
-caffe.set_device(int(sys.argv[1]))
+#caffe.set_device(int(sys.argv[1]))
+caffe.set_device(0)
 caffe.set_mode_gpu()

 solver = caffe.SGDSolver('solver.prototxt')
-solver.net.copy_from(weights)
+#solver.net.copy_from(weights)
+vgg_net=caffe.Net(vgg_proto,vgg_weights, caffe.TRAIN)
+surgery.transplant(solver.net, vgg_net)
+del vgg_net

 # surgeries
 interp_layers = [k for k in solver.net.params.keys() if 'up' in k]
 surgery.interp(solver.net, interp_layers)

 # scoring
-val = np.loadtxt('../data/pascal/VOC2010/ImageSets/Main/val.txt', dtype=str)
+val = np.loadtxt('/home/zoey/fcn.berkeleyvision.org/data/pascal-context/VOCdevkit/VOC2010/ImageSets/Main/val.txt', dtype=str)
+
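The main change in solve.py swaps `copy_from` for `surgery.transplant`. As a rough illustration of the idea (copy parameters between layers that share a name, and copy flat when only the shapes differ, as happens when a fully connected layer is reinterpreted as a convolution), here is a NumPy-only sketch. It works on plain dicts of arrays rather than Caffe nets, so `transplant_params` is a hypothetical stand-in and not the code in surgery.py.

```python
import numpy as np


def transplant_params(new_params, old_params):
    """Copy arrays from old_params into new_params, matched by layer name.

    Both arguments map layer name -> list of arrays (weights, bias).
    Arrays with equal element counts but different shapes are copied flat.
    Returns the names of old layers that have no counterpart.
    """
    skipped = []
    for name, old_blobs in old_params.items():
        if name not in new_params:
            skipped.append(name)
            continue
        new_blobs = new_params[name]
        for i, old in enumerate(old_blobs[:len(new_blobs)]):
            if old.size != new_blobs[i].size:
                raise ValueError("size mismatch in layer %r" % name)
            # Assigning through .flat keeps the target's own shape.
            new_blobs[i].flat = old.flat
    return skipped


# Toy example: a (2, 4) fully connected weight becomes a (2, 1, 2, 2) kernel.
old = {"fc6": [np.arange(8.0).reshape(2, 4), np.ones(2)],
       "fc8": [np.zeros((3, 2))]}
new = {"fc6": [np.zeros((2, 1, 2, 2)), np.zeros(2)]}
skipped = transplant_params(new, old)
```

After the call, `new["fc6"][0]` still has shape (2, 1, 2, 2) but holds the values 0 to 7, and `skipped` is `["fc8"]` because the new net has no layer with that name.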
4.2 solver.prototxt
train_net: "train.prototxt"
test_net: "val.prototxt"
test_iter: 5105
# make test net, but don't invoke it from the solver itself
test_interval: 999999999
display: 20
average_loss: 20
lr_policy: "fixed"
# lr for unnormalized softmax
base_lr: 1e-10
# high momentum
momentum: 0.99
# no gradient accumulation
iter_size: 1
max_iter: 300000
weight_decay: 0.0005
snapshot: 4000
snapshot_prefix: "snapshot/train"
test_initialization: false
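The tiny `base_lr` and the high momentum work together. With the usual momentum update (v = momentum * v - lr * grad, then w = w + v), a constant gradient settles at a step of lr * grad / (1 - momentum), so a momentum of 0.99 scales each step by roughly 100. A minimal sketch of that arithmetic follows; the gradient magnitude is made up for illustration, and weight decay is left out.

```python
def steady_state_step(base_lr, momentum, grad, iters=5000):
    """Iterate v <- momentum * v - base_lr * grad with a constant gradient."""
    v = 0.0
    for _ in range(iters):
        v = momentum * v - base_lr * grad
    return v


base_lr, momentum = 1e-10, 0.99   # values from solver.prototxt
grad = 1e5                        # hypothetical gradient magnitude
step = steady_state_step(base_lr, momentum, grad)
closed_form = -base_lr * grad / (1.0 - momentum)
```

Both `step` and `closed_form` come out at about -1e-3, which is 100 times the plain `base_lr * grad` of 1e-5. The comment in the file ties the small learning rate to the unnormalized softmax loss, which would also fit the very large loss values printed in the training log.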
4.3 net.py
@@ -15,8 +15,8 @@ def fcn(split):
     n = caffe.NetSpec()
     n.data, n.label = L.Python(module='pascalcontext_layers',
             layer='PASCALContextSegDataLayer', ntop=2,
-            param_str=str(dict(voc_dir='../../data/pascal',
-                context_dir='../../data/pascal-context', split=split,
+            param_str=str(dict(voc_dir='../data/pascal-context/VOCdevkit',
+                context_dir='../data/pascal-context', split=split,
                 seed=1337)))
* voc_dir points to the VOC2010 path: ImageSets/Main/*.txt, JPEG
* context_dir points to trainval/*.mat
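`param_str` reaches the Python data layer as a plain string, which is why the dict is wrapped in `str()` and has to be parsed back inside the layer. The sketch below shows that round trip and how the two directories could map to files. The `VOC2010/ImageSets/Main` part matches the val.txt path used in solve.py; `resolve_paths`, the image ID, and the one-`.mat`-per-image naming are assumptions for illustration, and `ast.literal_eval` is used here as a safer stand-in for `eval`.

```python
import ast


def resolve_paths(param_str, idx):
    """Parse a param_str and build the file paths for one image ID."""
    params = ast.literal_eval(param_str)
    voc2010 = "{}/VOC2010".format(params["voc_dir"])
    return {
        "split_file": "{}/ImageSets/Main/{}.txt".format(voc2010,
                                                        params["split"]),
        # Assumed naming: one .mat label file per image ID under trainval/.
        "label_mat": "{}/trainval/{}.mat".format(params["context_dir"], idx),
    }


param_str = str(dict(voc_dir="../data/pascal-context/VOCdevkit",
                     context_dir="../data/pascal-context",
                     split="val", seed=1337))
paths = resolve_paths(param_str, "2008_000002")  # hypothetical image ID
```

If the paths built this way do not exist on disk relative to the directory you launch solve.py from, the relative `../data/...` prefixes are the first thing to check.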
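4.4 deploy.prototxt
I copied the layout from voc-fcn8s and deleted the extra layers so that it matches this model's train.prototxt. You can also derive it yourself from train.prototxt: remove the dropout and loss layers. There is one more change that I've forgotten, so that's it for now.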
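Removing the dropout and loss layers by hand is easy to get wrong, so here is a small text-level sketch of that step. It covers only the two removals named above and does not attempt the forgotten third change. `strip_layers` is a hypothetical helper; it assumes every layer opens with `layer {` on one line and that the layer's own `type:` appears before any nested `type:` field.

```python
import re


def strip_layers(prototxt, drop_types=("Dropout", "SoftmaxWithLoss")):
    """Return prototxt text without the layer blocks of the given types."""
    out, block, depth = [], [], 0
    for line in prototxt.splitlines():
        if depth == 0 and not line.strip().startswith("layer"):
            out.append(line)          # text outside any layer block
            continue
        block.append(line)
        depth += line.count("{") - line.count("}")
        if depth == 0:                # the layer block just closed
            text = "\n".join(block)
            match = re.search(r'type:\s*"([^"]+)"', text)
            if not (match and match.group(1) in drop_types):
                out.append(text)
            block = []
    return "\n".join(out)


SAMPLE = """name: "demo"
layer {
  name: "fc6"
  type: "Convolution"
  bottom: "pool5"
  top: "fc6"
  convolution_param {
    num_output: 4096
  }
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "score"
  bottom: "label"
  top: "loss"
}"""
deploy_text = strip_layers(SAMPLE)
```

On the sample, only the `name:` line and the `fc6` convolution block survive. Dropping an in-place dropout layer like `drop6` leaves the blob names unchanged, so the remaining layers still connect.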
5. Result
6. nvidia
aaa:~/fcn.berkeleyvision.org$ nvidia-smi
Wed Nov 6 13:51:50 2019 ---> starting time
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 435.21 Driver Version: 435.21 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 1080 Off | 00000000:01:00.0 Off | N/A |
| 40% 64C P2 175W / 198W | 4153MiB / 8116MiB | 82% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 10334 C python 4141MiB |
+-----------------------------------------------------------------------------+
I1108 03:27:42.271994 10334 sgd_solver.cpp:284] Snapshotting solver state to binary proto file snapshot/train/solver_iter_400000.solverstate
>>> 2019-11-08 03:27:42.929233 Begin seg tests
>>> 2019-11-08 03:41:03.580019 Iteration 400000 loss 76704.01230231168
>>> 2019-11-08 03:41:03.629054 Iteration 400000 overall accuracy 0.9488732847493926
>>> 2019-11-08 03:41:03.665190 Iteration 400000 mean accuracy 0.14218947967705742
>>> 2019-11-08 03:41:03.668212 Iteration 400000 mean IU 0.11892353497429191
>>> 2019-11-08 03:41:03.668304 Iteration 400000 fwavacc 0.9079830134090507
real 2270m19.042s
user 1587m14.698s
sys 518m50.233s
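---> It ran for about a day and a half... which seems a bit odd, since the reference blog says 7-8 days. (11/29: 32s, 16s, 8s -> 3 runs * a day and a half each, 7~8 days)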
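The four scores in the log (overall accuracy, mean accuracy, mean IU, fwavacc) all derive from one confusion matrix. The sketch below follows the standard definitions of these metrics; it is an illustration written for this post, not a copy of score.py, and the toy labels are made up.

```python
import numpy as np


def fast_hist(gt, pred, n):
    """Confusion matrix: rows are ground-truth classes, columns predictions."""
    keep = (gt >= 0) & (gt < n)
    return np.bincount(n * gt[keep].astype(int) + pred[keep],
                       minlength=n ** 2).reshape(n, n)


def seg_metrics(hist):
    """Overall accuracy, mean accuracy, mean IU and fwavacc from a histogram."""
    hist = hist.astype(float)
    diag = np.diag(hist)
    # Classes absent from the ground truth give 0/0; nanmean skips them.
    with np.errstate(divide="ignore", invalid="ignore"):
        per_class_acc = diag / hist.sum(1)
        iu = diag / (hist.sum(1) + hist.sum(0) - diag)
    freq = hist.sum(1) / hist.sum()
    return {
        "overall_acc": diag.sum() / hist.sum(),
        "mean_acc": np.nanmean(per_class_acc),
        "mean_iu": np.nanmean(iu),
        "fwavacc": (freq[freq > 0] * iu[freq > 0]).sum(),
    }


# Toy data: 90% of the pixels are class 0 and the prediction is always 0.
gt = np.array([0] * 90 + [1] * 5 + [2] * 5)
pred = np.zeros_like(gt)
metrics = seg_metrics(fast_hist(gt, pred, 3))
```

On this toy input the overall accuracy is 0.9 and fwavacc is 0.81, while mean accuracy is about 0.33 and mean IU is 0.3. The log shows the same kind of gap (0.949 overall accuracy against 0.119 mean IU), which may mean the net is mostly predicting one dominant class, although the log alone does not prove it.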
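[References]
- caffe随记(八)---使用caffe训练FCN的pascalcontext-fcn32s模型(pascal-context数据集) (Caffe notes (8): training FCN's pascalcontext-fcn32s model with Caffe, pascal-context dataset)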
- http://host.robots.ox.ac.uk/pascal/VOC/voc2010/#devkit
- VOCtrainval_03-May-2010.tar
- VOCdevkit_08-May-2010.tar