Faster R-CNN on GPU runs out of memory
0 votes
/ 09 June 2018

I am using this Faster R-CNN implementation: https://github.com/lev-kusanagi/Faster-RCNN_TF

The demo runs fine. I am working on a project where I send images from my robot to the pretrained model. After roughly 15 images have been sent, I get this error:

/usr/local/lib/python2.7/dist-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Initializing frcnn...
2018-06-09 12:46:20.343027: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-06-09 12:46:20.456905: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-06-09 12:46:20.457885: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties: 
name: GeForce 840M major: 5 minor: 0 memoryClockRate(GHz): 1.124
pciBusID: 0000:03:00.0
totalMemory: 1.96GiB freeMemory: 1.84GiB
2018-06-09 12:46:20.457924: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-09 12:46:25.081980: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-09 12:46:25.082022: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929]      0 
2018-06-09 12:46:25.082039: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0:   N 
2018-06-09 12:46:25.082268: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1605 MB memory) -> physical GPU (device: 0, name: GeForce 840M, pci bus id: 0000:03:00.0, compute capability: 5.0)
Tensor("Placeholder:0", shape=(?, ?, ?, 3), dtype=float32)
Tensor("conv5_3/conv5_3:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("rpn_conv/3x3/rpn_conv/3x3:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("rpn_cls_score/rpn_cls_score:0", shape=(?, ?, ?, 18), dtype=float32)
Tensor("rpn_cls_prob:0", shape=(?, ?, ?, ?), dtype=float32)
Tensor("rpn_cls_prob_reshape:0", shape=(?, ?, ?, 18), dtype=float32)
Tensor("rpn_bbox_pred/rpn_bbox_pred:0", shape=(?, ?, ?, 36), dtype=float32)
Tensor("Placeholder_1:0", shape=(?, 3), dtype=float32)
Tensor("conv5_3/conv5_3:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("rois:0", shape=(?, 5), dtype=float32)
[<tf.Tensor 'conv5_3/conv5_3:0' shape=(?, ?, ?, 512) dtype=float32>, <tf.Tensor 'rois:0' shape=(?, 5) dtype=float32>]
Tensor("fc7/fc7:0", shape=(?, 4096), dtype=float32)


Loaded network VGGnet_fast_rcnn_iter_25000.ckpt
2018-06-09 12:46:41.637686: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 3.23GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2018-06-09 12:46:41.861576: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 791.02MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2018-06-09 12:46:42.118830: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.32GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2018-06-09 12:46:42.440887: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 3.09GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2018-06-09 12:46:42.635119: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.19GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2018-06-09 12:46:42.927540: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.59GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2018-06-09 12:46:43.155943: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 627.19MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2018-06-09 12:46:43.449477: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 848.25MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2018-06-09 12:46:43.780302: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 610.59MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
Starting naoqi session...
2018-06-09 12:46:51.216501: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.03GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
Detection took 2.971s for 50 object proposals
Detection took 0.790s for 50 object proposals
Detection took 0.803s for 50 object proposals
Detection took 0.794s for 50 object proposals
Detection took 0.793s for 50 object proposals
Detection took 0.793s for 50 object proposals
Detection took 0.790s for 50 object proposals
Detection took 0.803s for 50 object proposals
Detection took 0.798s for 50 object proposals
Detection took 0.788s for 50 object proposals
Detection took 0.797s for 50 object proposals
Detection took 0.798s for 50 object proposals
Detection took 0.793s for 50 object proposals
Detection took 0.802s for 50 object proposals
Detection took 0.805s for 50 object proposals
Detection took 0.795s for 50 object proposals
Detection took 0.798s for 50 object proposals
out of memory
invalid argument
an illegal memory access was encountered
2018-06-09 12:47:51.523140: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:650] failed to record completion event; therefore, failed to create inter-stream dependency
2018-06-09 12:47:51.523143: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:650] failed to record completion event; therefore, failed to create inter-stream dependency
2018-06-09 12:47:51.541009: E tensorflow/stream_executor/stream.cc:309] Error recording event in stream: error recording CUDA event on stream 0x44bca20: CUDA_ERROR_ILLEGAL_ADDRESS; not marking stream as bad, as the Event object may be at fault. Monitor for further errors.
2018-06-09 12:47:51.541009: I tensorflow/stream_executor/stream.cc:4737] stream 0x44bc950 did not memcpy host-to-device; source: 0x7f024c40f800
2018-06-09 12:47:51.541164: E tensorflow/stream_executor/cuda/cuda_event.cc:49] Error polling for event status: failed to query event: CUDA_ERROR_ILLEGAL_ADDRESS
2018-06-09 12:47:51.541197: F tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc:208] Unexpected Event status: 1
Aborted (core dumped)

Is there a solution to this problem other than getting a better graphics card? Is there a way to free the memory after an image has been annotated, or is something wrong in my code, and if so, where should I look for the problem?
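One commonly suggested mitigation is to stop TensorFlow from pre-allocating (nearly) all GPU memory at session creation and instead let it grow allocations on demand, optionally with a hard cap. Below is a minimal sketch, assuming the demo creates its own `tf.Session` somewhere (where exactly that happens in this repo is an assumption); the repo targets TF 1.x, so the snippet goes through `tf.compat.v1` so it also runs under TensorFlow 2.x:

```python
# Sketch: rein in TensorFlow's GPU memory pre-allocation.
# Assumption: you can reach the place where the demo builds its Session.
import tensorflow as tf

tf1 = tf.compat.v1

config = tf1.ConfigProto()
# Allocate GPU memory incrementally instead of grabbing it all up front.
config.gpu_options.allow_growth = True
# Optionally cap the fraction of the 2 GB card TensorFlow may ever use.
config.gpu_options.per_process_gpu_memory_fraction = 0.8

sess = tf1.Session(config=config)
```

Note that `allow_growth` only changes *when* memory is claimed, not the model's peak demand, so if a single forward pass genuinely needs more than the card has, it will still fail.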

Graphics card information:

Sat Jun  9 13:41:59 2018       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.26                 Driver Version: 396.26                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce 840M        Off  | 00000000:03:00.0 Off |                  N/A |
| N/A   42C    P5    N/A /  N/A |    164MiB /  2004MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

Demo code: https://pastebin.com/uny48BQG The same error occurs after running this code. I put about 200 images in the folder, but it crashed after 30 images.
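Since Faster R-CNN's activation memory scales with the input resolution, another cheap check is to cap the size of each frame before feeding it to the network. The sketch below uses the usual py-faster-rcnn test scales (shorter side ~600 px, longer side at most 1000 px) as defaults; whether this particular repo uses the same values, and the helper name `cap_image`, are assumptions. Pure NumPy is used to avoid an OpenCV dependency:

```python
import numpy as np

def cap_image(img, target_size=600, max_size=1000):
    """Scale img so its shorter side is about target_size pixels,
    clamped so the longer side never exceeds max_size."""
    h, w = img.shape[:2]
    scale = target_size / min(h, w)
    if scale * max(h, w) > max_size:
        scale = max_size / max(h, w)
    new_h, new_w = int(round(h * scale)), int(round(w * scale))
    # Nearest-neighbour resize via integer index arrays.
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return img[rows][:, cols], scale

# Example: a 3000x4000 robot frame is scaled down to 600x800.
frame = np.zeros((3000, 4000, 3), dtype=np.uint8)
small, s = cap_image(frame)
print(small.shape)  # (600, 800, 3)
```

If the crash only appears after a specific image in the folder, logging each image's shape before detection would also reveal whether one unusually large frame is triggering the allocation failure.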

...