Question

I am trying to do some image processing on a GPU (GeForce GTX 1050).

I am using TensorFlow and comparing it with the same implementation on the CPU.

Here are the two functions:

CPU:

import time
import numpy as np

def contrast_stretch(img, out_min=0.0, out_max=255.0):
    """
    Performs a simple contrast stretch of the given image, in order to remove
    extreme outliers.
    """
    # Time only the percentile computation.
    t1 = time.time()
    in_min = np.percentile(img, 0.05)
    in_max = np.percentile(img, 99.95)
    t2 = time.time()
    print("time taken cpu", t2 - t1)

    # Rescale [in_min, in_max] to [out_min, out_max].
    out = img - in_min
    out *= (out_max - out_min) / (in_max - in_min)
    out += out_min

    # Clamp outliers to the output range.
    out[out < out_min] = out_min
    out[out > out_max] = out_max

    return out
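
As a side note, the CPU stretch can be written more compactly. This is only a sketch, assuming NumPy is available; the name contrast_stretch_compact and the test image are hypothetical and not part of the original code. It asks np.percentile for both cut-offs in a single call and uses np.clip for the clamping step, and it leaves out the timing.

```python
import numpy as np


def contrast_stretch_compact(img, out_min=0.0, out_max=255.0):
    """Hypothetical compact variant of the CPU contrast stretch."""
    # One call returns both cut-offs (0.05th and 99.95th percentiles).
    in_min, in_max = np.percentile(img, [0.05, 99.95])
    scale = (out_max - out_min) / (in_max - in_min)
    # Work on a float copy so the input image is left untouched.
    out = (np.asarray(img, dtype=np.float64) - in_min) * scale + out_min
    # Clamp values that fall outside the output range.
    return np.clip(out, out_min, out_max)


# Made-up test image: a simple ramp of values 0..1000.
img = np.arange(1001, dtype=np.float64).reshape(7, 143)
stretched = contrast_stretch_compact(img)
print(stretched.min(), stretched.max())
```

The values below the low cut-off and above the high cut-off end up pinned to out_min and out_max, which is the same behaviour as the two boolean-mask assignments in the longer version.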

TensorFlow version:

import time
import tensorflow as tf

def contrast_stretch(img, out_min=0.0, out_max=255.0):
    """
    Performs a simple contrast stretch of the given image, in order to remove
    extreme outliers.
    """
    # Time only the percentile computation (TensorFlow 1.x contrib API).
    t1 = time.time()

    in_min = tf.contrib.distributions.percentile(img, 0.05, name="percentile")
    in_max = tf.contrib.distributions.percentile(img, 99.95, name="percentile")

    t2 = time.time()
    print("time_taken gpu ", t2 - t1)

    # JAD: should I use tf to subtract scalars?

    out = tf.add(tf.multiply(tf.subtract(img, in_min), (out_max - out_min) / (in_max - in_min)), out_min)
    out = tf.clip_by_value(out, out_min, out_max)
    return out

And here is the program output:

time taken cpu 0.009064197540283203

time taken cpu 0.014983415603637695

time taken cpu 0.011657476425170898

time_taken gpu  2.475447654724121

time_taken gpu  0.7083957195281982

Observations: the CPU takes far less time (roughly 70 times less). Also, the first call to contrib.percentile takes longer than the others.
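
Because the first call is slower than the later ones, it can help to report the first call separately from the steady-state time. Below is a minimal sketch of such a timing helper using only the standard library. The names time_calls and work are hypothetical, and the sketch assumes that fn() does not return until its result is actually ready, which may not hold for code that launches GPU work asynchronously.

```python
import statistics
import time


def time_calls(fn, repeats=5):
    """Call fn() `repeats` times.

    Returns (first_call_seconds, median_seconds_of_remaining_calls).
    """
    if repeats < 2:
        raise ValueError("repeats must be at least 2")
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()  # assumption: fn blocks until its result is ready
        durations.append(time.perf_counter() - start)
    # The first call often includes one-time setup, so keep it separate.
    return durations[0], statistics.median(durations[1:])


def work():
    # Hypothetical stand-in for the function being measured.
    return sum(i * i for i in range(100_000))


first, steady = time_calls(work)
print("first call:", first, "steady state:", steady)
```

Comparing the steady-state figure rather than a single call gives a fairer CPU versus GPU comparison, since one-time initialization is kept out of the number.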

NOTE: I also compared the following two functions, where the TensorFlow implementation is faster (as expected).

CPU:

def normalize(im):
    t1 = time.time()
    b,g,r = cv2.split(im)

    mean_b = np.mean(b)
    mean_g = np.mean(g)
    mean_r = np.mean(r)

    var_b = np.std(b)
    var_g = np.std(g)
    var_r = np.std(r)

    b = np.divide(np.subtract(b, mean_b), var_b)
    g = np.divide(np.subtract(g, mean_g), var_g)
    r = np.divide(np.subtract(r, mean_r), var_r)
    t2 = time.time()
    print ("cpu numpy time ", t2-t1)
    img = cv2.merge((b,g,r))
    return img

Tensorflow:

def normalize(tf_image):
  float_caster = tf.cast(tf_image, tf.float64, name='caster')
  t1 = time.time()
  b, g, r = tf.split(float_caster, 3, axis=2, name="splitter")

  mean_b, var_b = tf.nn.moments(b, axes=[0, 1], name="moment")
  mean_g, var_g = tf.nn.moments(g, axes=[0, 1])
  mean_r, var_r = tf.nn.moments(r, axes=[0, 1])

  b_norm = tf.math.divide(tf.math.subtract(b, mean_b), tf.math.sqrt(var_b))
  g_norm = tf.math.divide(tf.math.subtract(g, mean_g), tf.math.sqrt(var_g))
  r_norm = tf.math.divide(tf.math.subtract(r, mean_r), tf.math.sqrt(var_r))
  t2 = time.time()
  print ("tensorflow time ", t2-t1)
  return b_norm, g_norm, r_norm

And here is the result:

cpu numpy time  0.029398679733276367

tensorflow time  0.014415740966796875

This shows some speed gains from using the GPU, but I have no idea why tf.contrib.distributions.percentile takes forever...

tensorflow.contrib percentile is much slower than numpy.percentile, whether it runs on the CPU or the GPU
