An example of training DeepLab on Cityscapes with the TensorFlow semantic segmentation API

This article walks through an example of training DeepLab on Cityscapes with the TensorFlow semantic segmentation API. It is shared here as a practical reference; follow along below.

Installation: set up the DeepLab code under tensorflow/models/research/deeplab following the official installation guide.

Training on Cityscapes:

Pitfalls encountered:

1. Environment:

- TensorFlow 1.8 + CUDA 9.0 + cuDNN 7.0 + Anaconda3 + Python 3.5 (a minimal setup sketch follows after this list)

- The latest TensorFlow 1.12 or 1.10 does not work: training fails with an error that the convolution algorithm cannot be obtained (convolution algorithm...).
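A minimal sketch of recreating this environment with conda; the environment name deeplab is an assumption, and tensorflow-gpu 1.8.0 is the release built against CUDA 9.0 and cuDNN 7:

# Create and activate a Python 3.5 environment, then install the matching TF build.
conda create -n deeplab python=3.5
source activate deeplab
pip install tensorflow-gpu==1.8.0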

2. Dataset conversion

# Exit immediately if a command exits with a non-zero status.
set -e
CURRENT_DIR=$(pwd)
WORK_DIR="."
# Root path for Cityscapes dataset.
CITYSCAPES_ROOT="${WORK_DIR}/cityscapes"
# Create training labels.
python "${CITYSCAPES_ROOT}/cityscapesscripts/preparation/createTrainIdLabelImgs.py"
# Build TFRecords of the dataset.
# First, create output directory for storing TFRecords.
OUTPUT_DIR="${CITYSCAPES_ROOT}/tfrecord"
mkdir -p "${OUTPUT_DIR}"
BUILD_SCRIPT="${CURRENT_DIR}/build_cityscapes_data.py"
echo "Converting Cityscapes dataset..."
python "${BUILD_SCRIPT}" \
  --cityscapes_root="${CITYSCAPES_ROOT}" \
  --output_dir="${OUTPUT_DIR}"

- First, install the cityscapesScripts module into the current conda environment; it needs to support Python 3.5 (see the sketch after this list).

- By default, cityscapesscripts/preparation/createTrainIdLabelImgs.py converts the json annotations under all of the test, train, and val folders of gtFine into *_labelTrainIds.png label images; however, a dozen or so json files under the test folder have a broken encoding and make the script error out every time, so remove those files.

- Then run build_cityscapes_data.py to convert the images and labels into TFRecord format.
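A minimal sketch of the preparation steps mentioned above; the package name cityscapesscripts and the CITYSCAPES_DATASET environment variable follow the cityscapesScripts project conventions, and the dataset path is an assumption to adapt:

# Install cityscapesScripts into the active Python 3.5 conda environment.
pip install cityscapesscripts
# Point the label-conversion script at the dataset root containing leftImg8bit/ and gtFine/.
export CITYSCAPES_DATASET=/home/rjw/tf-models/research/deeplab/datasets/cityscapes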

3. Training code for Cityscapes

- Put the training command into a shell script: train_deeplab_cityscapes.sh

#!/bin/bash
# CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --backbone resnet --lr 0.01 --workers 4 --epochs 40 --batch-size 16 --gpu-ids 0,1,2,3 --checkname deeplab-resnet --eval-interval 1 --dataset coco

PATH_TO_INITIAL_CHECKPOINT='/home/rjw/tf-models/research/deeplab/pretrain_models/deeplabv3_cityscapes_train/model.ckpt'
PATH_TO_TRAIN_DIR='/home/rjw/tf-models/research/deeplab/datasets/cityscapes/exp/train_on_train_set/train/'
PATH_TO_DATASET='/home/rjw/tf-models/research/deeplab/datasets/cityscapes/tfrecord'
WORK_DIR='/home/rjw/tf-models/research/deeplab'
# From tensorflow/models/research/
python "${WORK_DIR}"/train.py \
  --logtostderr \
  --training_number_of_steps=40000 \
  --train_split="train" \
  --model_variant="xception_65" \
  --atrous_rates=6 \
  --atrous_rates=12 \
  --atrous_rates=18 \
  --output_stride=16 \
  --decoder_output_stride=4 \
  --train_crop_size=513 \
  --train_crop_size=513 \
  --train_batch_size=1 \
  --fine_tune_batch_norm=False \
  --dataset="cityscapes" \
  --tf_initial_checkpoint=${PATH_TO_INITIAL_CHECKPOINT} \
  --train_logdir=${PATH_TO_TRAIN_DIR} \
  --dataset_dir=${PATH_TO_DATASET}
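While the script runs, training progress can be monitored with TensorBoard pointed at the train log directory (a usage sketch; the port is arbitrary):

# Summaries are written to train_logdir by train.py and can be viewed in a browser.
tensorboard --logdir='/home/rjw/tf-models/research/deeplab/datasets/cityscapes/exp/train_on_train_set/train/' --port=6006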

Parameter analysis:

training_number_of_steps: number of training iterations;

train_crop_size: crop size of the training images; my GPU only has 8 GB of memory, so I set it to 513;

train_batch_size: training batch size; again limited by the hardware, it stays at 1;

fine_tune_batch_norm=False: whether to fine-tune the batch norm parameters. The official recommendation is to set this to False whenever the training batch size is smaller than 12. This setting matters; otherwise training crashes around step 2000.

tf_initial_checkpoint: the pre-trained initial checkpoint, here the previously downloaded ../research/deeplab/backbone/deeplabv3_cityscapes_train/model.ckpt.index

train_logdir: directory where the training weights are saved; note that it was already created when the project directories were set up, here "../research/deeplab/exp/train_on_train_set/train/" (a sketch of creating this layout follows below)

dataset_dir: path to the dataset, i.e. the TFRecords directory created earlier, here "../dataset/cityscapes/tfrecord"
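For reference, a minimal sketch of creating the experiment directory layout used by these scripts (the layout itself is taken from the paths above):

# Create the train/eval/vis subdirectories under the experiment directory.
EXP_DIR='/home/rjw/tf-models/research/deeplab/datasets/cityscapes/exp/train_on_train_set'
mkdir -p "${EXP_DIR}/train" "${EXP_DIR}/eval" "${EXP_DIR}/vis"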

4. Evaluation

- Evaluation script:

#!/bin/bash
# CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --backbone resnet --lr 0.01 --workers 4 --epochs 40 --batch-size 16 --gpu-ids 0,1,2,3 --checkname deeplab-resnet --eval-interval 1 --dataset coco
PATH_TO_INITIAL_CHECKPOINT='/home/rjw/tf-models/research/deeplab/pretrain_models/deeplabv3_cityscapes_train/'
PATH_TO_CHECKPOINT='/home/rjw/tf-models/research/deeplab/datasets/cityscapes/exp/train_on_train_set/train/'
PATH_TO_EVAL_DIR='/home/rjw/tf-models/research/deeplab/datasets/cityscapes/exp/train_on_train_set/eval/'
PATH_TO_DATASET='/home/rjw/tf-models/research/deeplab/datasets/cityscapes/tfrecord'
WORK_DIR='/home/rjw/tf-models/research/deeplab'
# From tensorflow/models/research/
python "${WORK_DIR}"/eval.py \
  --logtostderr \
  --eval_split="val" \
  --model_variant="xception_65" \
  --atrous_rates=6 \
  --atrous_rates=12 \
  --atrous_rates=18 \
  --output_stride=16 \
  --decoder_output_stride=4 \
  --eval_crop_size=1025 \
  --eval_crop_size=2049 \
  --dataset="cityscapes" \
  --checkpoint_dir=${PATH_TO_INITIAL_CHECKPOINT} \
  --eval_logdir=${PATH_TO_EVAL_DIR} \
  --dataset_dir=${PATH_TO_DATASET}

- Result: model.ckpt-40000 is the model obtained after 40,000 training iterations on top of the initial model. Evaluating with the initial model afterwards, miou_1.0 is still quite low, and I am not sure whether some parameter is set incorrectly.

- Note: if you use the officially provided checkpoint, the downloaded archive does not contain a checkpoint index file, so you need to add one by hand (a sketch follows below); the pretrained model does not ship with a checkpoint file.
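A minimal sketch of writing that checkpoint file next to the downloaded model.ckpt.* files; the directory matches the PATH_TO_INITIAL_CHECKPOINT used above:

# Tell TensorFlow's evaluation loop which checkpoint prefix to load from this directory.
cat > /home/rjw/tf-models/research/deeplab/pretrain_models/deeplabv3_cityscapes_train/checkpoint <<EOF
model_checkpoint_path: "model.ckpt"
all_model_checkpoint_paths: "model.ckpt"
EOF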

INFO:tensorflow:Restoring parameters from /home/rjw/tf-models/research/deeplab/datasets/cityscapes/exp/train_on_train_set/train/model.ckpt-40000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Starting evaluation at 2018-12-18-07:13:08
INFO:tensorflow:Evaluation [50/500]
INFO:tensorflow:Evaluation [100/500]
INFO:tensorflow:Evaluation [150/500]
INFO:tensorflow:Evaluation [200/500]
INFO:tensorflow:Evaluation [250/500]
INFO:tensorflow:Evaluation [300/500]
INFO:tensorflow:Evaluation [350/500]
INFO:tensorflow:Evaluation [400/500]
INFO:tensorflow:Evaluation [450/500]
miou_1.0[0.478293568]
INFO:tensorflow:Waiting for new checkpoint at /home/rjw/tf-models/research/deeplab/pretrain_models/deeplabv3_cityscapes_train/
INFO:tensorflow:Found new checkpoint at /home/rjw/tf-models/research/deeplab/pretrain_models/deeplabv3_cityscapes_train/model.ckpt
INFO:tensorflow:Graph was finalized.
2018-12-18 15:18:05.210957: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-12-18 15:18:05.211047: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-18 15:18:05.211077: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929]      0
2018-12-18 15:18:05.211100: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0:   N
2018-12-18 15:18:05.211645: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9404 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
INFO:tensorflow:Restoring parameters from /home/rjw/tf-models/research/deeplab/pretrain_models/deeplabv3_cityscapes_train/model.ckpt
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Starting evaluation at 2018-12-18-07:18:06
INFO:tensorflow:Evaluation [50/500]
INFO:tensorflow:Evaluation [100/500]
INFO:tensorflow:Evaluation [150/500]
INFO:tensorflow:Evaluation [200/500]
INFO:tensorflow:Evaluation [250/500]
INFO:tensorflow:Evaluation [300/500]
INFO:tensorflow:Evaluation [350/500]
INFO:tensorflow:Evaluation [400/500]
INFO:tensorflow:Evaluation [450/500]
miou_1.0[0.496331513]

5. Visualization

- Generate the segmentation result images in the vis directory:

#!/bin/bash
# CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --backbone resnet --lr 0.01 --workers 4 --epochs 40 --batch-size 16 --gpu-ids 0,1,2,3 --checkname deeplab-resnet --eval-interval 1 --dataset coco

PATH_TO_CHECKPOINT='/home/rjw/tf-models/research/deeplab/datasets/cityscapes/exp/train_on_train_set/train/'
PATH_TO_VIS_DIR='/home/rjw/tf-models/research/deeplab/datasets/cityscapes/exp/train_on_train_set/vis/'
PATH_TO_DATASET='/home/rjw/tf-models/research/deeplab/datasets/cityscapes/tfrecord'
WORK_DIR='/home/rjw/tf-models/research/deeplab'

# From tensorflow/models/research/
python "${WORK_DIR}"/vis.py \
  --logtostderr \
  --vis_split="val" \
  --model_variant="xception_65" \
  --atrous_rates=6 \
  --atrous_rates=12 \
  --atrous_rates=18 \
  --output_stride=16 \
  --decoder_output_stride=4 \
  --vis_crop_size=1025 \
  --vis_crop_size=2049 \
  --dataset="cityscapes" \
  --colormap_type="cityscapes" \
  --checkpoint_dir=${PATH_TO_CHECKPOINT} \
  --vis_logdir=${PATH_TO_VIS_DIR} \
  --dataset_dir=${PATH_TO_DATASET}
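Once vis.py finishes, the rendered predictions can be inspected directly in the vis log directory; the segmentation_results subfolder name is an assumption based on DeepLab's vis.py defaults:

# Browse the colored prediction images written by vis.py (subfolder name assumed).
ls '/home/rjw/tf-models/research/deeplab/datasets/cityscapes/exp/train_on_train_set/vis/segmentation_results' | head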

Thanks for reading! This concludes the walkthrough of training DeepLab on Cityscapes with the TensorFlow semantic segmentation API. Hopefully the content above is helpful; if you found it useful, feel free to share it with others.
