While working through some object detection documentation and examples found online that use the OpenImagesV4 model, I am getting less than satisfactory performance in how fast detection events are processed. The code I am using is shown below; it is a stripped-down version of the detection so that I can understand the performance metrics. The camera stream is processed fine when no detection is used. As soon as detection is added, the feed lags by around 20 seconds.

I have seen this done in TF 1.14 using the older object detection approach with tf.Graph() functions, with near-zero delay, on a different model. So my question really is: where can the performance of the stream flow be improved, or where is the hang-up in this stripped-down version? It uses the GPU for processing, but GPU usage only spikes to about 6%. My initial thought was to introduce threading into the detection process, but I am not sure how to go about it, or whether it is even necessary.
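For reference, the threading idea mentioned above could look roughly like the following. This is only a sketch under assumptions: `LatestFrameWorker` and `detect_fn` are hypothetical names, `detect_fn` stands for any callable that takes a frame and returns a result, and the sketch assumes that the detector call releases the GIL while it runs. It does not make a single detection faster; it only keeps the capture and display loop from waiting on it, and it drops frames that arrive while a detection is still in progress.

```python
import threading


class LatestFrameWorker:
    """Runs a slow detector in a background thread, always on the newest frame."""

    def __init__(self, detect_fn):
        self._detect_fn = detect_fn      # hypothetical: any callable frame -> result
        self._lock = threading.Lock()
        self._pending = None             # newest frame that has not been processed yet
        self._result = None              # most recent detection result
        self._new_frame = threading.Event()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, frame):
        """Overwrite any waiting frame, so stale frames are dropped."""
        with self._lock:
            self._pending = frame
            self._new_frame.set()

    def latest_result(self):
        """Return the most recent result, or None if nothing has finished yet."""
        with self._lock:
            return self._result

    def stop(self):
        """Ask the worker to exit and wait for it."""
        self._stop.set()
        self._new_frame.set()            # wake the worker if it is waiting
        self._thread.join()

    def _run(self):
        while not self._stop.is_set():
            self._new_frame.wait()
            if self._stop.is_set():
                break
            # Take the newest frame and clear the flag in one locked step.
            with self._lock:
                frame, self._pending = self._pending, None
                self._new_frame.clear()
            if frame is None:
                continue
            result = self._detect_fn(frame)   # slow call happens outside the lock
            with self._lock:
                self._result = result
```

In a loop like the one in this question, the capture code would call `worker.submit(converted_img)` on every frame and draw whatever `worker.latest_result()` currently returns, instead of calling the detector inline.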
Software
- TensorFlow version: 2.1.0
- CUDA 10.1
- cuDNN 7
Hardware
- CPU: Intel i7-4820K
- GPU: GeForce GTX 1660 (6 GB)
- Memory: 16 GB
import cv2
import time
import gc
from datetime import datetime
import tensorflow as tf
import tensorflow_hub as hub
low_res_vid_source = "http://192.168.1.85:14238/videostream.cgi?loginuse=####&loginpas=######"
hi_res_vid_source = "rtsp://####:####@192.168.1.85:10554/tcp/av0_0"
cap = cv2.VideoCapture(low_res_vid_source)
#Low Res (640): Hi Res (1280)
width = cap.get(3)
#Low Res (480): Hi Res (720)
height = cap.get(4)
print("Dimensions: Width: ", width, "Height: ", height)
#Remote Loading
#module_handle = "https://tfhub.dev/google/faster_rcnn/openimages_v4/inception_resnet_v2/1"
#Local Loading
module_handle = "C://Users//Isaiah//tf2//Tutorial Sets//Expert//HubCache//ddd04e3eaa283f2b3ae566e084863074d12b403a"
detector = hub.load(module_handle).signatures['default']
def LoadStream():
    ret, frame = cap.read()
    image_resize_val = (1280, 720)
    frame = cv2.resize(frame, image_resize_val)
    ## Average Calculation Time of Conversion Of Pixel Normalization = 0.018950 Seconds
    frame = frame / 255
    ## Average Calculation Time of Conversion Of Image Data Type = 0.001999 Seconds
    converted_img = tf.image.convert_image_dtype(frame, tf.float32)[tf.newaxis, ...]
    ## Average Calculation Time of Loading Results From Detector = 1.7 Seconds
    time_start = time.time()
    results = detector(converted_img)
    time_end = time.time()
    print("Detection Took: ", time_end - time_start)
    cv2.imshow('camera feed', frame)

while True:
    LoadStream()
    if cv2.waitKey(1) & 0xFF == ord('q'):
        cv2.destroyAllWindows()
        break
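The per-stage averages noted in the comments above could also be collected by a small helper instead of hand-written `time.time()` pairs. The following is a minimal sketch; `StageTimer`, `stage` and `summary` are hypothetical names and not part of OpenCV or TensorFlow. It skips the first sample of each stage by default, on the assumption that the first call includes one-off setup cost.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Collects wall-clock durations (in seconds) per named stage."""

    def __init__(self):
        self.samples = defaultdict(list)

    @contextmanager
    def stage(self, name):
        """Time the body of a `with` block and store it under `name`."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.samples[name].append(time.perf_counter() - start)

    def summary(self, skip_first=1):
        """Mean duration per stage, ignoring the first `skip_first` samples.

        If a stage has no samples left after skipping, all of its samples are used.
        """
        report = {}
        for name, values in self.samples.items():
            steady = values[skip_first:] or values
            report[name] = sum(steady) / len(steady)
        return report
```

Inside `LoadStream`, each step would then be wrapped as `with timer.stage("detect"): results = detector(converted_img)`, and `timer.summary()` could be printed when the loop exits.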
The output from the Conda environment for this code is shown below; nothing in it really seems to stick out.
(tf2-gpu) C:\Users\Isaiah\tf2\Tutorial Sets\Expert\Camera_Feed>python Camera_Feed_Raw.py
2020-05-03 16:52:36.567941: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
Dimensions: Width: 640.0 Height: 360.0
2020-05-03 16:54:52.037826: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-05-03 16:54:52.253465: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:03:00.0 name: GeForce GTX 1660 computeCapability: 7.5
coreClock: 1.815GHz coreCount: 22 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 178.86GiB/s
2020-05-03 16:54:52.260714: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-05-03 16:54:52.272442: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-03 16:54:52.282134: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-05-03 16:54:52.287729: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-05-03 16:54:52.300130: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-05-03 16:54:52.307647: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-05-03 16:54:52.326362: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-03 16:54:52.331006: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-05-03 16:54:52.334046: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX
2020-05-03 16:54:52.626783: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:03:00.0 name: GeForce GTX 1660 computeCapability: 7.5
coreClock: 1.815GHz coreCount: 22 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 178.86GiB/s
2020-05-03 16:54:52.633826: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-05-03 16:54:52.638740: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-03 16:54:52.642777: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-05-03 16:54:52.647763: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-05-03 16:54:52.651710: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-05-03 16:54:52.656789: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-05-03 16:54:52.660852: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-03 16:54:52.667018: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-05-03 16:54:53.626966: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-03 16:54:53.630823: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0
2020-05-03 16:54:53.633295: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N
2020-05-03 16:54:53.638096: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4630 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1660, pci bus id: 0000:03:00.0, compute capability: 7.5)
2020-05-03 16:57:25.429470: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-03 16:57:26.697611: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-03 16:57:29.627538: W tensorflow/stream_executor/gpu/redzone_allocator.cc:312] Internal: Invoking GPU asm compilation is supported on Cuda non-Windows platforms only
Relying on driver to perform ptx compilation. This message will be only logged once.
Detection Took: 58.80091857910156
Detection Took: 1.747373104095459
Detection Took: 1.7253808975219727
Detection Took: 1.736377477645874
Detection Took: 1.7273805141448975
Detection Took: 1.7343783378601074
Detection Took: 1.742375373840332
Detection Took: 1.7413759231567383
Detection Took: 1.7293803691864014
Detection Took: 1.7283804416656494
Detection Took: 1.7403762340545654
Detection Took: 1.7323787212371826
Detection Took: 1.7373778820037842
Detection Took: 1.7323782444000244
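To put the timings above into perspective, the arithmetic can be written out. The values are copied from the log; the camera frame rate is not stated anywhere in this question, so `ASSUMED_CAMERA_FPS` is purely an illustrative assumption. The first call is treated separately because it is far slower than the rest, presumably due to one-off initialization.

```python
# "Detection Took" values copied from the log above, in seconds.
timings = [
    58.80091857910156,
    1.747373104095459,
    1.7253808975219727,
    1.736377477645874,
    1.7273805141448975,
    1.7343783378601074,
    1.742375373840332,
    1.7413759231567383,
    1.7293803691864014,
    1.7283804416656494,
    1.7403762340545654,
    1.7323787212371826,
    1.7373778820037842,
    1.7323782444000244,
]

first_call = timings[0]       # includes one-off startup cost
steady = timings[1:]          # every call after the first
mean_steady = sum(steady) / len(steady)
detections_per_second = 1.0 / mean_steady

# Assumption for illustration only: the real camera frame rate is unknown.
ASSUMED_CAMERA_FPS = 25.0
frames_arriving_per_detection = ASSUMED_CAMERA_FPS * mean_steady

print(f"first call:            {first_call:.1f} s")
print(f"steady-state mean:     {mean_steady:.3f} s per detection")
print(f"detection throughput:  {detections_per_second:.2f} per second")
print(f"frames per detection:  {frames_arriving_per_detection:.0f} (at the assumed frame rate)")
```

At roughly 1.73 s per call, the loop handles a bit under 0.6 frames per second, so any camera delivering frames faster than that gets ahead of a loop that runs detection on every frame.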