Object detection performance issues with TensorFlow 2.1.0 and TensorFlow Hub
May 4, 2020

While working through object detection documentation and samples found online, using a model trained on the OpenImagesV4 dataset, I am seeing poor performance in how quickly detections are processed. The code I am using is shown below. It is a stripped-down version of the detection pipeline so that I can get an understanding of the performance metrics. The camera stream works fine without any detection. As soon as detection is added, the feed lags by roughly 20 seconds. I have seen this done in TF 1.14 with the legacy object detection approach based on tf.Graph(), on a different model, with almost zero lag. So my question is: where can the performance of the stream be improved, or where is the hang-up in this stripped-down version? The GPU is being used for processing, but its utilization only peaks at about 6%. My original thought was to introduce multithreading for the detection step, but I am not sure how to go about it, or whether it is even necessary.
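For reference, here is a minimal sketch of what moving detection onto a worker thread could look like. It is an illustration under stated assumptions, not a tested fix for this setup: the class name LatestFrameDetector and the detect_fn parameter are hypothetical, and detect_fn stands in for the detector(converted_img) call. The idea is that the capture loop keeps running at camera speed and only hands over the newest frame, so frames that arrive while a detection is in progress are dropped instead of queuing up.

```python
import threading
import time


class LatestFrameDetector:
    """Runs a slow detector in a background thread on the newest frame only.

    Hypothetical helper for illustration; detect_fn stands in for the real
    model call.
    """

    def __init__(self, detect_fn):
        self._detect_fn = detect_fn
        self._lock = threading.Lock()
        self._new_frame = threading.Event()
        self._stop = threading.Event()
        self._frame = None    # newest frame waiting for detection
        self._result = None   # most recent finished detection
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()
        return self

    def submit(self, frame):
        """Called by the capture loop; overwrites any frame not yet processed."""
        with self._lock:
            self._frame = frame
            self._new_frame.set()

    def latest_result(self):
        """Returns the newest detection result, or None if none is ready yet."""
        with self._lock:
            return self._result

    def stop(self):
        self._stop.set()
        self._new_frame.set()  # wake the worker so it can exit
        self._thread.join()

    def _run(self):
        while not self._stop.is_set():
            self._new_frame.wait()
            if self._stop.is_set():
                break
            with self._lock:
                frame = self._frame
                self._frame = None
                self._new_frame.clear()
            if frame is None:
                continue
            result = self._detect_fn(frame)  # slow call runs outside the lock
            with self._lock:
                self._result = result


if __name__ == "__main__":
    def fake_detect(frame):
        # Stand-in for the real detector; sleeps to simulate a slow model.
        time.sleep(0.05)
        return {"frame_id": frame}

    worker = LatestFrameDetector(fake_detect).start()
    for frame_id in range(20):       # stand-in for the cap.read() loop
        worker.submit(frame_id)      # returns immediately
        time.sleep(0.01)
    time.sleep(0.2)                  # let the last detection finish
    print("Latest result:", worker.latest_result())
    worker.stop()
```

In the real loop, submit() would be called with each preprocessed frame and latest_result() would be read before drawing. Threading alone does not make a single detection faster; it only keeps the display from waiting on it.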

Software

  • TensorFlow version: 2.1.0
  • CUDA 10.1
  • cuDNN 7

Hardware

  • CPU: Intel i7-4820K
  • GPU: GeForce GTX 1660 (6 GB)
  • Memory: 16 GB

import cv2
import time
import gc
from datetime import datetime
import tensorflow as tf
import tensorflow_hub as hub

low_res_vid_source = "http://192.168.1.85:14238/videostream.cgi?loginuse=####&loginpas=######"
hi_res_vid_source = "rtsp://####:####@192.168.1.85:10554/tcp/av0_0"
cap = cv2.VideoCapture(low_res_vid_source)

#Low Res (640): Hi Res (1280)
width = cap.get(3)

#Low Res (480): Hi Res (720)
height = cap.get(4)

print("Dimensions: Width: ", width, "Height: ", height)
#Remote Loading
#module_handle = "https://tfhub.dev/google/faster_rcnn/openimages_v4/inception_resnet_v2/1"

#Local Loading
module_handle = "C://Users//Isaiah//tf2//Tutorial Sets//Expert//HubCache//ddd04e3eaa283f2b3ae566e084863074d12b403a"
detector = hub.load(module_handle).signatures['default']

def LoadStream():
   ret, frame = cap.read()
   if not ret:
      # No frame was returned (stream hiccup or closed); skip this iteration
      return
   image_resize_val = (1280, 720)
   frame = cv2.resize(frame, image_resize_val)

   ## Average Calculation Time of Conversion Of Pixel Normalization = 0.018950 Seconds
   frame = frame / 255

   ## Average Calculation Time of Conversion Of Image Data Type      = 0.001999 Seconds
   converted_img = tf.image.convert_image_dtype(frame, tf.float32)[tf.newaxis, ...]

   ## Average Calculation Time of Loading Results From Detector      = 1.7 Seconds
   time_start = time.time()
   results = detector(converted_img)
   time_end = time.time()
   print("Detection Took: ", time_end - time_start)
   cv2.imshow('camera feed', frame)


while True:
   LoadStream()

   if cv2.waitKey(1) & 0xFF == ord('q'):
      cv2.destroyAllWindows()
      break

The output from the Conda environment for this code is shown below, and nothing in it really seems to stand out:

(tf2-gpu) C:\Users\Isaiah\tf2\Tutorial Sets\Expert\Camera_Feed>python Camera_Feed_Raw.py
2020-05-03 16:52:36.567941: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
Dimensions: Width:  640.0 Height:  360.0
2020-05-03 16:54:52.037826: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-05-03 16:54:52.253465: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:03:00.0 name: GeForce GTX 1660 computeCapability: 7.5
coreClock: 1.815GHz coreCount: 22 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 178.86GiB/s
2020-05-03 16:54:52.260714: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-05-03 16:54:52.272442: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-03 16:54:52.282134: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-05-03 16:54:52.287729: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-05-03 16:54:52.300130: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-05-03 16:54:52.307647: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-05-03 16:54:52.326362: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-03 16:54:52.331006: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-05-03 16:54:52.334046: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX
2020-05-03 16:54:52.626783: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:03:00.0 name: GeForce GTX 1660 computeCapability: 7.5
coreClock: 1.815GHz coreCount: 22 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 178.86GiB/s
2020-05-03 16:54:52.633826: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-05-03 16:54:52.638740: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-03 16:54:52.642777: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-05-03 16:54:52.647763: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-05-03 16:54:52.651710: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-05-03 16:54:52.656789: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-05-03 16:54:52.660852: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-03 16:54:52.667018: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-05-03 16:54:53.626966: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-03 16:54:53.630823: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0
2020-05-03 16:54:53.633295: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N
2020-05-03 16:54:53.638096: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4630 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1660, pci bus id: 0000:03:00.0, compute capability: 7.5)
2020-05-03 16:57:25.429470: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-03 16:57:26.697611: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-03 16:57:29.627538: W tensorflow/stream_executor/gpu/redzone_allocator.cc:312] Internal: Invoking GPU asm compilation is supported on Cuda non-Windows platforms only
Relying on driver to perform ptx compilation. This message will be only logged once.
Detection Took:  58.80091857910156
Detection Took:  1.747373104095459
Detection Took:  1.7253808975219727
Detection Took:  1.736377477645874
Detection Took:  1.7273805141448975
Detection Took:  1.7343783378601074
Detection Took:  1.742375373840332
Detection Took:  1.7413759231567383
Detection Took:  1.7293803691864014
Detection Took:  1.7283804416656494
Detection Took:  1.7403762340545654
Detection Took:  1.7323787212371826
Detection Took:  1.7373778820037842
Detection Took:  1.7323782444000244
...
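
To put the timings above in perspective, here is a quick back-of-the-envelope calculation. It treats the first call as one-off warm-up and averages the remaining calls to estimate the steady-state detection rate. The camera frame rate used at the end is purely an assumption for illustration (the stream's real frame rate is not shown in the output), and whether unread frames actually pile up depends on how the capture backend buffers the stream.

```python
from statistics import mean

# Per-call detection times in seconds, copied from the log output above.
timings = [
    58.80091857910156,
    1.747373104095459,
    1.7253808975219727,
    1.736377477645874,
    1.7273805141448975,
    1.7343783378601074,
    1.742375373840332,
    1.7413759231567383,
    1.7293803691864014,
    1.7283804416656494,
    1.7403762340545654,
    1.7323787212371826,
    1.7373778820037842,
    1.7323782444000244,
]

# The first call is far slower than the rest, so treat it as warm-up.
warmup, steady = timings[0], timings[1:]

steady_mean = mean(steady)        # average seconds per detection
detection_fps = 1.0 / steady_mean  # detections per second

# ASSUMPTION: hypothetical camera frame rate, not taken from the post.
assumed_camera_fps = 25.0
frames_per_detection = assumed_camera_fps * steady_mean

print(f"Warm-up call:            {warmup:.2f} s")
print(f"Steady-state mean:       {steady_mean:.3f} s per detection")
print(f"Detection throughput:    {detection_fps:.2f} frames/s")
print(f"Frames arriving per detection at {assumed_camera_fps:.0f} fps: "
      f"{frames_per_detection:.0f}")
```

At well under one detection per second, a loop that runs detection on every frame it reads cannot keep up with a live camera, which is consistent with the growing delay described in the question.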