Question

I'm working on a text recognition project. There is a chance that the text is rotated 180 degrees. I tried tesseract-ocr in the terminal, but had no luck. Is there a way to detect and correct this? An example of the text is shown below.

tesseract input.png output

nathancy · Answer 1 · May 15, 2019

One simple way to determine whether text is rotated 180 degrees is to use the observation that text tends to skew toward the bottom. Here is a strategy:

Convert the image to grayscale
Apply a Gaussian blur
Threshold the image
Find the ROIs for the top and bottom halves of the thresholded image
Count the non-zero array elements in each half

Thresholded image

Find the ROIs for the top and bottom halves

Next, we split the image into its top and bottom sections

For each half, we count the non-zero array elements using cv2.countNonZero(). This gives us

('top', 4035)
('bottom', 3389)

Comparing the values for the two halves: if the top half has more pixels than the bottom half, the image is rotated 180 degrees. If it has fewer, the image is correctly oriented.
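The comparison rule can be sketched as a small standalone function. This is only an illustration, not the answer's own code: it uses np.count_nonzero in place of cv2.countNonZero so that it runs on a plain NumPy array, and both the function name is_upside_down and the margin parameter are hypothetical additions.

```python
import numpy as np

def is_upside_down(thresh, margin=0.0):
    """Return True if the top half holds more foreground pixels than the bottom.

    `thresh` is assumed to be a 2-D binary image where text pixels are non-zero.
    `margin` is a hypothetical tolerance (not in the original answer): the top
    count must exceed the bottom count by this fraction before we report a flip.
    """
    mid = thresh.shape[0] // 2  # integer row index, required for slicing
    top = int(np.count_nonzero(thresh[:mid]))
    bottom = int(np.count_nonzero(thresh[mid:]))
    return top > bottom * (1.0 + margin)

# Synthetic 4x4 "page": three ink pixels in the top half, one in the bottom half.
demo = np.zeros((4, 4), dtype=np.uint8)
demo[0, :3] = 255
demo[3, 0] = 255
print(is_upside_down(demo))        # top-heavy, so reported as flipped
print(is_upside_down(demo[::-1]))  # same page turned over vertically
```

A non-zero margin may help when the two counts are nearly equal, since the heuristic carries little information in that case.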

Now that we have detected that the image is upside down, we can rotate it using this function

def rotate(image, angle):
    # Obtain the dimensions of the image
    (height, width) = image.shape[:2]
    (cX, cY) = (width / 2, height / 2)

    # Grab the rotation components of the matrix
    matrix = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
    cos = np.abs(matrix[0, 0])
    sin = np.abs(matrix[0, 1])

    # Find the new bounding dimensions of the image
    new_width = int((height * sin) + (width * cos))
    new_height = int((height * cos) + (width * sin))

    # Adjust the rotation matrix to take into account translation
    matrix[0, 2] += (new_width / 2) - cX
    matrix[1, 2] += (new_height / 2) - cY

    # Perform the actual rotation and return the image
    return cv2.warpAffine(image, matrix, (new_width, new_height))

Rotating the image

rotated = rotate(original_image, 180)
cv2.imshow("rotated", rotated)

which gives us the correct result
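For the specific 180-degree case, a lighter alternative to the general rotate() helper is to reverse both image axes. The sketch below is not part of the answer; rotate_180 is a hypothetical name, and it assumes the image is a NumPy array as returned by cv2.imread. OpenCV's cv2.rotate(image, cv2.ROTATE_180) should give the same result.

```python
import numpy as np

def rotate_180(image):
    # Reverse rows and columns; a colour channel axis, if present, is untouched.
    return image[::-1, ::-1].copy()

# Tiny 2x3 array standing in for an image.
demo = np.arange(6).reshape(2, 3)
print(rotate_180(demo))
```

Because no interpolation is involved, applying the function twice returns the original pixels exactly.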

These are the pixel counts when the image is correctly oriented

('top', 3209)
('bottom', 4206)

Full code

import numpy as np
import cv2

def rotate(image, angle):
    # Obtain the dimensions of the image
    (height, width) = image.shape[:2]
    (cX, cY) = (width / 2, height / 2)

    # Grab the rotation components of the matrix
    matrix = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
    cos = np.abs(matrix[0, 0])
    sin = np.abs(matrix[0, 1])

    # Find the new bounding dimensions of the image
    new_width = int((height * sin) + (width * cos))
    new_height = int((height * cos) + (width * sin))

    # Adjust the rotation matrix to take into account translation
    matrix[0, 2] += (new_width / 2) - cX
    matrix[1, 2] += (new_height / 2) - cY

    # Perform the actual rotation and return the image
    return cv2.warpAffine(image, matrix, (new_width, new_height))

image = cv2.imread("1.PNG")
original_image = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blurred, 110, 255, cv2.THRESH_BINARY_INV)[1]
cv2.imshow("thresh", thresh)

x, y, w, h = 0, 0, image.shape[1], image.shape[0]

# Integer division keeps the coordinates usable as slice indices in Python 3
top_half = ((x, y), (x + w, y + h // 2))
bottom_half = ((x, y + h // 2), (x + w, y + h))

top_x1,top_y1 = top_half[0]
top_x2,top_y2 = top_half[1]
bottom_x1,bottom_y1 = bottom_half[0]
bottom_x2,bottom_y2 = bottom_half[1]

# Split into top/bottom ROIs
top_image = thresh[top_y1:top_y2, top_x1:top_x2]
bottom_image = thresh[bottom_y1:bottom_y2, bottom_x1:bottom_x2]

cv2.imshow("top_image", top_image)
cv2.imshow("bottom_image", bottom_image)

# Count non-zero array elements
top_pixels = cv2.countNonZero(top_image)
bottom_pixels = cv2.countNonZero(bottom_image)

print('top', top_pixels)
print('bottom', bottom_pixels)

# Rotate if upside down
if top_pixels > bottom_pixels:
    rotated = rotate(original_image, 180)
    cv2.imshow("rotated", rotated)

cv2.waitKey(0)

user898678 · Answer 2 · May 15, 2019

tesseract input.png - --psm 0 -c min_characters_to_try=10

Warning. Invalid resolution 0 dpi. Using 70 instead.
Page number: 0
Orientation in degrees: 180
Rotate: 180
Orientation confidence: 0.74
Script: Latin
Script confidence: 1.67
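To act on this output from a script, the "Key: value" lines can be parsed into a dictionary. The following is a minimal sketch under stated assumptions: parse_osd and detect_rotation are hypothetical helper names, and detect_rotation assumes that tesseract is on the PATH and writes the OSD report to standard output when the output base is "-". Only the parser is exercised here, using the sample output shown above.

```python
import re
import subprocess

def parse_osd(text):
    """Parse 'Key: value' lines from OSD output into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^([^:]+):\s*(.+)$", line.strip())
        if match:
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields

def detect_rotation(image_path):
    """Hypothetical helper: run the command from the answer, return 'Rotate' in degrees."""
    result = subprocess.run(
        ["tesseract", image_path, "-", "--psm", "0",
         "-c", "min_characters_to_try=10"],
        capture_output=True, text=True, check=True,
    )
    # Assumption: the OSD report arrives on stdout, warnings on stderr.
    return int(parse_osd(result.stdout)["Rotate"])

sample = """Page number: 0
Orientation in degrees: 180
Rotate: 180
Orientation confidence: 0.74
Script: Latin
Script confidence: 1.67
"""
osd = parse_osd(sample)
print(osd["Rotate"], osd["Orientation confidence"])
```

A caller might apply the rotation only when the reported orientation confidence is above some threshold of its own choosing.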

How to detect whether text is rotated 180 degrees or flipped upside down
