The big FCIS + MXNet pitfall (finally working)
1 cuda
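As far as I could find, MXNet supports CUDA only up to cu10.
An RTX 4090 needs at least CUDA 11, so the two cannot be combined. The build fails with an error about compute_89, and the GPU version cannot be used.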
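The mismatch above can be written down as a small lookup. This is only a sketch: the table of minimum toolkit versions is recalled from NVIDIA release notes and is an assumption to verify, and `toolkit_can_target` is a hypothetical helper name.

```python
# Minimum CUDA toolkit able to compile for a given compute capability.
# Assumption: values recalled from NVIDIA release notes, verify before relying on them.
MIN_CUDA_FOR_SM = {
    61: (8, 0),   # Pascal: GTX 1080, Titan Xp
    70: (9, 0),   # Volta
    75: (10, 0),  # Turing
    80: (11, 0),  # Ampere (A100)
    86: (11, 1),  # Ampere (RTX 30 series)
    89: (11, 8),  # Ada (RTX 4090), the compute_89 from the error
}


def toolkit_can_target(cuda_version, sm):
    """Return True if a toolkit (major, minor) can compile for compute_<sm>."""
    return tuple(cuda_version) >= MIN_CUDA_FOR_SM[sm]


print(toolkit_can_target((10, 0), 89))  # RTX 4090 with a cu10 toolchain
print(toolkit_can_target((8, 0), 61))   # GTX 1080 / Titan Xp with CUDA 8
```

Under these assumptions the first check fails and the second passes, which matches what happened on the two machines.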
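After copying in the extra FCIS library files,
I used the CPU version (mxnet-master). The build finished and I installed it with Python 3: under Python 3.7 the mxnet library installed and could initialize a matrix. It cannot be installed under Python 2 (Python 3 below 3.7, and 3.8, do not work either):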
SyntaxError: invalid syntax
  File "/usr/local/lib/python2.7/dist-packages/mxnet-2.0.0-py2.7.egg/mxnet/symbol/symbol.py", line 80
    return f'<{self.__class__.__name__} group [{name}]>'
    ^
.......
Unsupported Python version
==========================
This version of Requests requires at least Python 3.8, but
you're trying to install it on Python 2.7. To resolve this,
consider upgrading to a supported Python version.

If you can't upgrade your Python version, you'll need to
pin to an older version of Requests (<2.32.0).
error: Setup script exited with 1
So MXNet could only be installed with Python 3.
I converted FCIS from Python 2 to Python 3, but running fcis/demo.py failed with:
$ python fcis/demo.py
Traceback (most recent call last):
  File "fcis/demo.py", line 30, in <module>
    from core.tester import im_detect, Predictor
  File "/home/FCISpy3/fcis/core/tester.py", line 24, in <module>
    from nms.nms import py_nms_wrapper
  File "/home/FCISpy3/fcis/../lib/nms/nms.py", line 3, in <module>
    from cpu_nms import cpu_nms
ImportError: dynamic module does not define module export function (PyInit_cpu_nms)
I could not solve the PyInit_cpu_nms problem; these issues have no answers:
https://github.com/msracver/FCIS/issues/66
https://github.com/msracver/FCIS/issues/154
GG
So MXNet and FCIS do not work with Python 3 either. Do I still have to use Python 2?
But with old MXNet versions, even CMake does not get through...
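A note on the ImportError above: CPython 3 looks for a function named PyInit_<module> in a compiled extension, while CPython 2 looks for init<module>. One plausible reading is that cpu_nms had been compiled by a Python 2 interpreter; I did not verify this, so treat it as a guess. The helper name below is hypothetical.

```python
def expected_init_symbol(module_name, python_major):
    """Name of the init function CPython looks up in a compiled extension."""
    if python_major >= 3:
        return "PyInit_" + module_name
    return "init" + module_name


print(expected_init_symbol("cpu_nms", 3))  # the symbol named in the ImportError
print(expected_init_symbol("cpu_nms", 2))  # what a Python 2 build exports
```

If that guess is right, rebuilding the extension with the Python 3 interpreter would be the thing to try.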
2
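FCIS uses Python 2.
MXNet > 1.6 supports only Python 3.
MXNet <= 1.6 supports Python 2.
So an old MXNet version has to be downloaded. The 998378a commit from the demo instructions did not work for me at this point; it seemed to have a bug.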
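The version rule above as a tiny check. A sketch only: `supports_python2` is a hypothetical helper, and it encodes nothing more than the "<= 1.6" rule stated in these notes.

```python
def supports_python2(mxnet_version):
    """Rule of thumb from the notes: MXNet <= 1.6 still supports Python 2."""
    major_minor = tuple(int(part) for part in mxnet_version.split(".")[:2])
    return major_minor <= (1, 6)


for version in ("0.10.1", "1.0.0", "1.6.0", "2.0.0"):
    print(version, supports_python2(version))
```

Comparing (major, minor) tuples avoids the trap of comparing version strings, where "0.10.1" would sort before "0.9".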
I followed the official README to install and run, and the demo failed with:
AttributeError: 'module' object has no attribute 'ChannelOperator'
I did not know which MXNet version to use.
Tutorial: https://blog.csdn.net/xiangxianghehe/article/details/78971383
It uses MXNet 0.10.1 or 1.0.0.
But with the current tagged versions CMake fails:
CMake Error at CMakeLists.txt:165 (include):
  include could not find requested file:

    mshadow/cmake/Utils.cmake

The submodule is not checked out;
the repository has to be cloned with git clone --recursive.
CMake Error at CMakeLists.txt:589 (add_library):
  No SOURCES given to target: mxnet
Found: https://github.com/apache/mxnet/issues/10708
An older CMake is needed (3.10 or below); versions above 3.11 give this error.
So I installed 3.10.0.
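Before building, the CMake version can be checked against that limit. This is a sketch under the assumption that `cmake --version` prints a first line of the form "cmake version X.Y.Z"; the function names and the `newest_ok` parameter are made up for illustration.

```python
import re
import subprocess


def parse_cmake_version(text):
    """Extract (major, minor, patch) from the output of `cmake --version`."""
    match = re.search(r"cmake version (\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError("no CMake version found in: %r" % text)
    return tuple(int(group) for group in match.groups("0"))


def cmake_ok_for_old_mxnet(version, newest_ok=(3, 10)):
    """True if the (major, minor) part is at or below the limit noted above."""
    return version[:2] <= newest_ok


def installed_cmake_version():
    """Ask the cmake found on PATH for its version (raises if cmake is missing)."""
    output = subprocess.check_output(["cmake", "--version"])
    return parse_cmake_version(output.decode("utf-8"))


print(cmake_ok_for_old_mxnet(parse_cmake_version("cmake version 3.10.0")))
print(cmake_ok_for_old_mxnet(parse_cmake_version("cmake version 3.11.4")))
```

On the build machine, `cmake_ok_for_old_mxnet(installed_cmake_version())` gives the answer for the installed CMake.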
3
3.1
mxnet version = 1.0.0
$ git clone --recursive https://github.com/dmlc/mxnet.git
$ cd mxnet
$ git checkout 25720d0
$ git submodule init
$ git submodule update
3.2
Copy the FCIS operator files into the MXNet source tree.
3.3
$ make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas
Error and fix: [https://blog.csdn.net/tcjy1000/article/details/134042714]
src/operator/tensor/elemwise_binary_broadcast_op_logic.cc:141:1: internal compiler error: Segmentation fault
  141 | } // namespace mxnet
      | ^
$ ulimit -a
$ ulimit -n 65535
3.4 The parallel build is buggy; use fewer CPU cores
$ make -j4 USE_OPENCV=1 USE_BLAS=openblas
ok
3.5
python install
$ sudo python setup.py install
Error:
==========================
Unsupported Python version
==========================
This version of Requests requires at least Python 3.8, but
you're trying to install it on Python 2.7. To resolve this,
consider upgrading to a supported Python version.

If you can't upgrade your Python version, you'll need to
pin to an older version of Requests (<2.32.0).
error: Setup script exited with 1
Not a big problem. I skipped the install and used the mxnet/python path directly.
3.6
$ python fcis/demo.py
result:
ImportError: libcudart.so.12: cannot open shared object file: No such file or directory
3.8
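The Makefile cannot be used directly since there is no (matching version of) CUDA; regenerate the Makefile with CMake and rebuild.
Create a build directory.
Same problem: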
Traceback (most recent call last):
  File "fcis/demo.py", line 30, in <module>
    from core.tester import im_detect, Predictor
  File "/home/wys/work/farmland/dl/autooutlining/FCIS/fcis/core/tester.py", line 23, in <module>
    from nms.nms import py_nms_wrapper
  File "/home/wys/work/farmland/dl/autooutlining/FCIS/fcis/../lib/nms/nms.py", line 4, in <module>
    from gpu_nms import gpu_nms
ImportError: libcudart.so.12: cannot open shared object file: No such file or directory
MXNet has a CPU version.
But does FCIS have to use the GPU?
demo.py needs the gpu_nms library; without CUDA, GG.
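The ImportError above comes from the dynamic loader: gpu_nms was linked against libcudart.so.12 and that file could not be found at import time. A quick way to see which runtime libraries the loader can find is to try loading them with ctypes. This is a sketch; `can_load` is a hypothetical helper and the second library name is only an example of what a CUDA 8 install might provide.

```python
from __future__ import print_function

import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can find and open the library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


# libcudart.so.12 is the name from the ImportError; libcudart.so.8.0 is an
# illustrative guess at the runtime name shipped with CUDA 8.
for name in ("libcudart.so.12", "libcudart.so.8.0"):
    print(name, can_load(name))
```

If both come back False, no CUDA runtime is visible to the loader, and adjusting LD_LIBRARY_PATH or installing a matching CUDA would be the next thing to look at.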
4
conda, Python 2
pip install Cython==0.27.3
pip install opencv-python==3.4.0.14
(3.2.0.6 was also noted for opencv-python)
pip install easydict==1.6
pip install hickle==3.4.9
git clone https://github.com/msracver/FCIS.git
sh ./init.sh
Following the official demo instructions.
MXNet version used:
git clone --recursive https://github.com/dmlc/mxnet.git
cd mxnet
git checkout 998378a
git submodule init
git submodule update
cd ..
cp -r FCIS/fcis/operator_cxx/* mxnet/src/operator/contrib/
cd mxnet
# CPU-only build
make -j4 USE_OPENCV=1 USE_BLAS=openblas
# GPU build, all cores
make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=1
# GPU build, 4 jobs
make -j4 USE_OPENCV=1 USE_BLAS=openblas USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=1
I did not install the MXNet Python package into the system.
Instead I modified fcis/demo.py
and added:
import sys
sys.path.append('/home/wys/work/farmland/dl/autooutlining/mxnet/python')
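A slightly more defensive variant of the two lines above, as a sketch. The helper name `use_local_mxnet` is made up; it puts the directory at the front of sys.path so a local build wins over any installed copy, and fails early if the path is wrong.

```python
import os
import sys


def use_local_mxnet(mxnet_python_dir):
    """Put a locally built mxnet/python directory at the front of sys.path."""
    path = os.path.abspath(mxnet_python_dir)
    if not os.path.isdir(path):
        raise IOError("MXNet python directory not found: %s" % path)
    if path not in sys.path:
        sys.path.insert(0, path)
    return path


# Usage in fcis/demo.py, before `import mxnet` (path as in the lines above):
# use_local_mxnet('/home/wys/work/farmland/dl/autooutlining/mxnet/python')
# import mxnet
# print(mxnet.__file__)  # confirms which copy was picked up
```

The demo itself prints a "use mxnet at ..." line, which serves the same purpose as the last commented line.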
Then go into the FCIS directory:
(conda, Python 2) $ python fcis/demo.py
Result:
(outlining27) wys@wys-PC:~/work/farmland/dl/autooutlining/FCIS$ python fcis/demo.py
/home/wys/work/farmland/dl/autooutlining/FCIS/fcis/config/config.py:173: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  exp_config = edict(yaml.load(f))
use mxnet at /home/wys/work/farmland/dl/autooutlining/mxnet/python/mxnet/__init__.pyc
{'BINARY_THRESH': 0.4,'CLASS_AGNOSTIC': True,'MASK_SIZE': 21,'MXNET_VERSION': 'mxnet','SCALES': [(600, 1000)],'TEST': {'BATCH_IMAGES': 1,'CXX_PROPOSAL': False,'HAS_RPN': True,'ITER': 2,'MASK_MERGE_THRESH': 0.5,'MIN_DROP_SIZE': 2,'NMS': 0.3,'PROPOSAL_MIN_SIZE': 2,'PROPOSAL_NMS_THRESH': 0.7,'PROPOSAL_POST_NMS_TOP_N': 2000,'PROPOSAL_PRE_NMS_TOP_N': 20000,'RPN_MIN_SIZE': 2,'RPN_NMS_THRESH': 0.7,'RPN_POST_NMS_TOP_N': 300,'RPN_PRE_NMS_TOP_N': 6000,'USE_GPU_MASK_MERGE': True,'USE_MASK_MERGE': True,'test_epoch': 8},'TRAIN': {'ASPECT_GROUPING': True,'BATCH_IMAGES': 1,'BATCH_ROIS': -1,'BATCH_ROIS_OHEM': 128,'BBOX_MEANS': [0.0, 0.0, 0.0, 0.0],'BBOX_NORMALIZATION_PRECOMPUTED': True,'BBOX_REGRESSION_THRESH': 0.5,'BBOX_STDS': [0.2, 0.2, 0.5, 0.5],'BBOX_WEIGHTS': array([1., 1., 1., 1.]),'BG_THRESH_HI': 0.5,'BG_THRESH_LO': 0,'BINARY_THRESH': 0.4,'CONVNEW3': True,'CXX_PROPOSAL': False,'ENABLE_OHEM': True,'END2END': True,'FG_FRACTION': 0.25,'FG_THRESH': 0.5,'FLIP': True,'GAP_SELECT_FROM_ALL': False,'IGNORE_GAP': False,'LOSS_WEIGHT': [1.0, 10.0, 1.0],'RESUME': False,'RPN_ALLOWED_BORDER': 0,'RPN_BATCH_SIZE': 256,'RPN_BBOX_WEIGHTS': [1.0, 1.0, 1.0, 1.0],'RPN_CLOBBER_POSITIVES': False,'RPN_FG_FRACTION': 0.5,'RPN_MIN_SIZE': 2,'RPN_NEGATIVE_OVERLAP': 0.3,'RPN_NMS_THRESH': 0.7,'RPN_POSITIVE_OVERLAP': 0.7,'RPN_POSITIVE_WEIGHT': -1.0,'RPN_POST_NMS_TOP_N': 300,'RPN_PRE_NMS_TOP_N': 6000,'SHUFFLE': True,'begin_epoch': 0,'end_epoch': 8,'lr': 0.0005,'lr_step': '5.33','model_prefix': 'e2e','momentum': 0.9,'warmup': True,'warmup_lr': 5e-05,'warmup_step': 250,'wd': 0.0005},'dataset': {'NUM_CLASSES': 81,'dataset': 'coco','dataset_path': './data/coco','image_set': 'train2014+valminusminival2014','proposal': 'rpn','root_path': './data','test_image_set': 'test-dev2015'},'default': {'frequent': 20, 'kvstore': 'device'},'gpus': '0','network': {'ANCHOR_RATIOS': [0.5, 1, 2],'ANCHOR_SCALES': [4, 8, 16, 32],'FIXED_PARAMS': ['conv1','bn_conv1','res2','bn2','gamma','beta'],'FIXED_PARAMS_SHARED': 
['conv1','bn_conv1','res2','bn2','res3','bn3','res4','bn4','gamma','beta'],'IMAGE_STRIDE': 0,'NUM_ANCHORS': 12,'PIXEL_MEANS': array([103.06, 115.9 , 123.15]),'RCNN_FEAT_STRIDE': 16,'RPN_FEAT_STRIDE': 16,'pretrained': './model/pretrained_model/resnet_v1_101','pretrained_epoch': 0},'output_path': '../output/fcis','symbol': 'resnet_v1_101_fcis'}
[18:22:41] src/c_api/c_api_ndarray.cc:133: GPU support is disabled. Compile MXNet with USE_CUDA=1 to enable GPU support.
[18:22:41] /home/wys/work/farmland/dl/autooutlining/mxnet/dmlc-core/include/dmlc/./logging.h:304: [18:22:41] src/c_api/c_api_ndarray.cc:390: Operator _zeros is not implemented for GPU.

Stack trace returned 10 entries:
[bt] (0) /home/wys/work/farmland/dl/autooutlining/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3e) [0x7fd5241911de]
[bt] (1) /home/wys/work/farmland/dl/autooutlining/mxnet/python/mxnet/../../lib/libmxnet.so(_Z20ImperativeInvokeImplRKN5mxnet7ContextERKN4nnvm9NodeAttrsEPSt6vectorINS_7NDArrayESaIS8_EESB_+0x32b) [0x7fd525003b0b]
[bt] (2) /home/wys/work/farmland/dl/autooutlining/mxnet/python/mxnet/../../lib/libmxnet.so(MXImperativeInvoke+0x1ff) [0x7fd525004a0f]
[bt] (3) /home/wys/anaconda3/envs/outlining27/lib/python2.7/lib-dynload/../../libffi.so.8(+0xa052) [0x7fd5295a5052]
[bt] (4) /home/wys/anaconda3/envs/outlining27/lib/python2.7/lib-dynload/../../libffi.so.8(+0x8925) [0x7fd5295a3925]
[bt] (5) /home/wys/anaconda3/envs/outlining27/lib/python2.7/lib-dynload/../../libffi.so.8(ffi_call+0xde) [0x7fd5295a406e]
[bt] (6) /home/wys/anaconda3/envs/outlining27/lib/python2.7/lib-dynload/_ctypes.so(_ctypes_callproc+0x4de) [0x7fd5295befae]
[bt] (7) /home/wys/anaconda3/envs/outlining27/lib/python2.7/lib-dynload/_ctypes.so(+0xa253) [0x7fd5295b6253]
[bt] (8) /home/wys/anaconda3/envs/outlining27/bin/../lib/libpython2.7.so.1.0(PyObject_Call+0x52) [0x7fd57e38a822]
[bt] (9) /home/wys/anaconda3/envs/outlining27/bin/../lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x2954) [0x7fd57e423274]

Traceback (most recent call last):
  File "fcis/demo.py", line 152, in <module>
    main()
  File "fcis/demo.py", line 83, in main
    arg_params=arg_params, aux_params=aux_params)
  File "/home/wys/work/farmland/dl/autooutlining/FCIS/fcis/core/tester.py", line 35, in __init__
    self._mod.bind(provide_data, provide_label, for_training=False)
  File "/home/wys/work/farmland/dl/autooutlining/FCIS/fcis/core/module.py", line 845, in bind
    for_training, inputs_need_grad, force_rebind=False, shared_module=None)
  File "/home/wys/work/farmland/dl/autooutlining/FCIS/fcis/core/module.py", line 402, in bind
    state_names=self._state_names)
  File "/home/wys/work/farmland/dl/autooutlining/FCIS/fcis/core/DataParallelExecutorGroup.py", line 184, in __init__
    self.bind_exec(data_shapes, label_shapes, shared_group)
  File "/home/wys/work/farmland/dl/autooutlining/FCIS/fcis/core/DataParallelExecutorGroup.py", line 284, in bind_exec
    shared_group))
  File "/home/wys/work/farmland/dl/autooutlining/FCIS/fcis/core/DataParallelExecutorGroup.py", line 598, in _bind_ith_exec
    context, self.logger)
  File "/home/wys/work/farmland/dl/autooutlining/FCIS/fcis/core/DataParallelExecutorGroup.py", line 576, in _get_or_reshape
    arg_arr = nd.zeros(arg_shape, context, dtype=arg_type)
  File "/home/wys/work/farmland/dl/autooutlining/mxnet/python/mxnet/ndarray.py", line 1047, in zeros
    return _internal._zeros(shape=shape, ctx=ctx, dtype=dtype, **kwargs)
  File "<string>", line 15, in _zeros
  File "/home/wys/work/farmland/dl/autooutlining/mxnet/python/mxnet/_ctypes/ndarray.py", line 72, in _imperative_invoke
    c_array(ctypes.c_char_p, [c_str(str(val)) for val in vals])))
  File "/home/wys/work/farmland/dl/autooutlining/mxnet/python/mxnet/base.py", line 85, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [18:22:41] src/c_api/c_api_ndarray.cc:390: Operator _zeros is not implemented for GPU.

Stack trace returned 10 entries:
[bt] (0) /home/wys/work/farmland/dl/autooutlining/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3e) [0x7fd5241911de]
[bt] (1) /home/wys/work/farmland/dl/autooutlining/mxnet/python/mxnet/../../lib/libmxnet.so(_Z20ImperativeInvokeImplRKN5mxnet7ContextERKN4nnvm9NodeAttrsEPSt6vectorINS_7NDArrayESaIS8_EESB_+0x32b) [0x7fd525003b0b]
[bt] (2) /home/wys/work/farmland/dl/autooutlining/mxnet/python/mxnet/../../lib/libmxnet.so(MXImperativeInvoke+0x1ff) [0x7fd525004a0f]
[bt] (3) /home/wys/anaconda3/envs/outlining27/lib/python2.7/lib-dynload/../../libffi.so.8(+0xa052) [0x7fd5295a5052]
[bt] (4) /home/wys/anaconda3/envs/outlining27/lib/python2.7/lib-dynload/../../libffi.so.8(+0x8925) [0x7fd5295a3925]
[bt] (5) /home/wys/anaconda3/envs/outlining27/lib/python2.7/lib-dynload/../../libffi.so.8(ffi_call+0xde) [0x7fd5295a406e]
[bt] (6) /home/wys/anaconda3/envs/outlining27/lib/python2.7/lib-dynload/_ctypes.so(_ctypes_callproc+0x4de) [0x7fd5295befae]
[bt] (7) /home/wys/anaconda3/envs/outlining27/lib/python2.7/lib-dynload/_ctypes.so(+0xa253) [0x7fd5295b6253]
[bt] (8) /home/wys/anaconda3/envs/outlining27/bin/../lib/libpython2.7.so.1.0(PyObject_Call+0x52) [0x7fd57e38a822]
[bt] (9) /home/wys/anaconda3/envs/outlining27/bin/../lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x2954) [0x7fd57e423274]
This shows that FCIS needs MXNet built with GPU support. GG.
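The failing call in the traceback is nd.zeros on a GPU context, so the same call makes a cheap pre-flight check before starting the demo. A sketch only: `mxnet_gpu_usable` is a hypothetical helper, it returns None when MXNet cannot be imported at all, and I have not run it against the old commit used here.

```python
def mxnet_gpu_usable(gpu_id=0):
    """True/False if MXNet can allocate on the GPU, None if MXNet is missing."""
    try:
        import mxnet as mx
    except (ImportError, OSError):
        return None  # MXNet itself is not importable in this environment
    try:
        # Same kind of allocation that failed in the traceback above;
        # asnumpy() forces the operation to actually execute.
        mx.nd.zeros((1,), ctx=mx.gpu(gpu_id)).asnumpy()
    except mx.base.MXNetError:
        return False
    return True


print(mxnet_gpu_usable())
```

With the CPU-only build from this section the expected answer would be False, for the same reason the demo failed.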
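So go straight to:
5 Try renting a server
I found a server that still has CUDA 8.
Compiled MXNet the same way as in section 4, no problems.
To test FCIS, use Python 2 in conda.
In fcis/demo.py, add import sys and sys.path.append("../../mxnet/python") so that MXNet does not have to be installed system-wide.
The server has no graphical display:
install xvfb, a virtual display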
sudo apt install xvfb
In FCIS:
(python2) ~/FCIS/fcis$ xvfb-run python demo.py
and
in the file lib/utils/show_masks.py:
def show_masks(im, detections, masks, class_names, cfg, scale=1.0, show=False):
change the default show=True to False.
ok
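Whether the xvfb-run wrapper is needed depends on whether a display exists. A small sketch that picks the command from the DISPLAY environment variable; `demo_command` is a hypothetical helper and assumes xvfb is installed as above.

```python
import os


def demo_command(environ=None):
    """Build the demo command, wrapped in xvfb-run when no display is set."""
    environ = os.environ if environ is None else environ
    base = ["python", "demo.py"]
    if environ.get("DISPLAY"):
        return base
    return ["xvfb-run"] + base


print(demo_command({}))                 # headless server
print(demo_command({"DISPLAY": ":0"}))  # desktop machine

# To launch it from the fcis directory:
# import subprocess
# subprocess.call(demo_command())
```

This keeps one launch script usable on both the desktop machine and the rented server.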
Finally got it working.
Then I added some output images and files of my own.
Demo result:
(py2) root@I1c1d239d9105001488:/home/FCIS/fcis# xvfb-run python demo.py
/home/FCIS/fcis/config/config.py:173: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  exp_config = edict(yaml.load(f))
use mxnet at ../../mxnet/python/mxnet/__init__.pyc
(configuration dump identical to the one printed in section 4, omitted)
(426, 640)
testing COCO_test2015_000000000275.jpg 0.1835s
(427, 640)
testing COCO_test2015_000000001412.jpg 0.2058s
(427, 640)
testing COCO_test2015_000000073428.jpg 0.1579s
(428, 640)
testing COCO_test2015_000000393281.jpg 0.1787s
done
(py2) root@I1c1d239d9105001488:/home/FCIS/fcis#
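The per-image timings in this log can be summarized with a few lines of Python. A sketch: the log text is copied from the output above, and `parse_timings` is a hypothetical helper that assumes each line has the form "testing <file> <seconds>s".

```python
import re

LOG = """\
testing COCO_test2015_000000000275.jpg 0.1835s
testing COCO_test2015_000000001412.jpg 0.2058s
testing COCO_test2015_000000073428.jpg 0.1579s
testing COCO_test2015_000000393281.jpg 0.1787s
"""


def parse_timings(log_text):
    """Return a list of (image_name, seconds) from 'testing <file> <sec>s' lines."""
    pattern = re.compile(r"^testing (\S+) ([0-9.]+)s$", re.MULTILINE)
    return [(name, float(seconds)) for name, seconds in pattern.findall(log_text)]


timings = parse_timings(LOG)
mean = sum(seconds for _, seconds in timings) / len(timings)
print("%d images, mean %.4f s per image" % (len(timings), mean))
```

Lines that do not match the pattern (image shapes, "done", the shell prompt) are simply ignored.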
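Never touching an ancient pitfall like this again.
The difficulty of reproducing it (pointless difficulty) is out of proportion to its usefulness.
Key points:
1. CUDA 8 is required; GPU: Titan Xp or 1080
2. MXNet version = 998378a (the version from the official demo)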