I am working with Tesseract and have created a box file for the characters in an image. When I ran tesseract from the command line, it detected the characters and recorded the position of each detected character in the box file.
Here is the command-line output:
/Desktop $ tesseract spa.arial.first_page.tif spa.arial.box nobatch box.train .stderr
read_params_file: Can't open .stderr
Tesseract Open Source OCR Engine v4.0.0-146-gc39a with Leptonica
Page 1
Detected 74 diacritics
row xheight=2, but median xheight = 17.4815
row xheight=2.5, but median xheight = 17.4815
row xheight=91, but median xheight = 17.4815
row xheight=2.5, but median xheight = 17.4815
row xheight=3, but median xheight = 17.4815
row xheight=61.875, but median xheight = 17.4815
row xheight=23, but median xheight = 17.4815
row xheight=3, but median xheight = 17.4815
row xheight=3, but median xheight = 17.4815
row xheight=12.8333, but median xheight = 17.4815
row xheight=15.1282, but median xheight = 17.4815
row xheight=3.5, but median xheight = 17.4815
row xheight=3.5, but median xheight = 17.4815
row xheight=3.5, but median xheight = 17.4815
row xheight=628, but median xheight = 17.4815
row xheight=415.5, but median xheight = 17.4815
row xheight=4, but median xheight = 17.4815
row xheight=630, but median xheight = 17.4815
FAIL!
APPLY_BOXES: boxfile line 7/A ((286,1979),(325,2002)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 11/U ((199,1943),(239,1967)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 14/R ((298,1943),(323,1967)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 16/M ((325,1943),(360,1967)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 1611/a ((849,451),(875,480)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 1617/5 ((947,457),(973,480)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 1622/. ((1038,457),(1042,460)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 1839/a ((679,280),(705,303)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 1860/u ((1030,274),(1063,304)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 1865/p ((1113,274),(1133,304)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 1876/a ((1303,275),(1329,302)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 1879/, ((1362,275),(1365,282)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 1886/c ((1467,278),(1494,301)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 1889/d ((1542,277),(1551,300)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 1892/h ((1569,277),(1595,300)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 1895/c ((619,245),(645,268)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 1910/n ((888,245),(920,262)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 1911/l ((941,245),(949,267)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 1913/e ((981,239),(997,267)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: Unlabelled word at :Bounding box=(133,887)->(1631,893)
APPLY_BOXES: Unlabelled word at :Bounding box=(132,569)->(1631,575)
APPLY_BOXES: Unlabelled word at :Bounding box=(132,484)->(1631,491)
APPLY_BOXES: Unlabelled word at :Bounding box=(1408,418)->(1470,479)
APPLY_BOXES: Unlabelled word at :Bounding box=(132,413)->(1630,420)
APPLY_BOXES: Unlabelled word at :Bounding box=(1238,346)->(1415,400)
APPLY_BOXES: Unlabelled word at :Bounding box=(1408,359)->(1476,425)
APPLY_BOXES: Unlabelled word at :Bounding box=(133,341)->(1628,348)
APPLY_BOXES: Unlabelled word at :Bounding box=(133,205)->(137,1461)
APPLY_BOXES: Unlabelled word at :Bounding box=(598,203)->(602,1034)
APPLY_BOXES: Unlabelled word at :Bounding box=(133,200)->(1629,208)
APPLY_BOXES: Unlabelled word at :Bounding box=(1628,200)->(1633,1460)
Found 1698 good blobs.
Leaving 59 unlabelled blobs in 0 words.
21 remaining unlabelled words deleted.
Generated training data for 353 words
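To get at the boxes that failed, the FAILURE lines in the log above can be parsed for their coordinates. This is only a minimal sketch: the regular expression is written against the line format shown in this log, and `parse_failures` is a hypothetical helper name, not part of Tesseract.

```python
import re

# Matches the failure lines as they appear in the log above, e.g.
# APPLY_BOXES: boxfile line 7/A ((286,1979),(325,2002)): FAILURE! ...
FAILURE_RE = re.compile(
    r"APPLY_BOXES: boxfile line (\d+)/(\S+) "
    r"\(\((\d+),(\d+)\),\((\d+),(\d+)\)\): FAILURE!"
)


def parse_failures(log_text):
    """Return (box_file_line, character, (x1, y1, x2, y2)) per failed box."""
    failures = []
    for line in log_text.splitlines():
        match = FAILURE_RE.search(line)
        if match:
            line_no = int(match.group(1))
            char = match.group(2)
            coords = tuple(int(v) for v in match.groups()[2:])
            failures.append((line_no, char, coords))
    return failures


sample = (
    "APPLY_BOXES: boxfile line 7/A ((286,1979),(325,2002)): "
    "FAILURE! Couldn't find a matching blob"
)
print(parse_failures(sample))
```

The coordinates come out in the same order as they appear in the log, so they can be compared directly with the corresponding lines of the box file.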
I want to draw a box around each detected blob. I searched for how to do this but could not find a reference. Can anyone help me draw the blobs on the image that the box file was created from?
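What I have in mind is roughly the following: read the box file directly and draw one rectangle per line. This is a sketch, not a tested solution. It assumes each box-file line has the form `symbol left bottom right top page` with the origin at the bottom-left corner of the image, so the y values must be flipped for OpenCV. The function names and all paths are hypothetical.

```python
def parse_box_line(line):
    """Parse one 'symbol left bottom right top page' line into a tuple."""
    symbol, left, bottom, right, top, page = line.rsplit(" ", 5)
    return symbol, int(left), int(bottom), int(right), int(top), int(page)


def to_top_left(box, image_height):
    """Convert a bottom-left-origin box to OpenCV's top-left-origin corners."""
    _, left, bottom, right, top, _ = box
    return (left, image_height - top), (right, image_height - bottom)


def draw_box_file(image_path, box_path, out_path):
    """Draw every box from box_path onto the image and save the result."""
    import cv2  # imported here so the helpers above work without OpenCV

    img = cv2.imread(image_path)  # for a multi-page TIFF this reads one page only
    if img is None:
        raise FileNotFoundError(image_path)
    height = img.shape[0]
    with open(box_path, encoding="utf-8") as box_file:
        for line in box_file:
            line = line.rstrip("\n")
            if not line.strip():
                continue  # skip blank lines
            pt1, pt2 = to_top_left(parse_box_line(line), height)
            cv2.rectangle(img, pt1, pt2, (0, 255, 0), 1)
    cv2.imwrite(out_path, img)


# Example with the first failing box from the log and an assumed image height.
box = parse_box_line("A 286 1979 325 2002 0")
print(to_top_left(box, 2200))
```

A call such as `draw_box_file("page.tif", "page.box", "page_boxes.png")` would then write an annotated copy of the page; those file names are placeholders for the real image and box file.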
I tried the Python code below to draw boxes around the text using pytesseract:
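import cv2
import pytesseract

file = '/home/Desktop/second_page.png'
img = cv2.imread(file)
h, w, _ = img.shape

# Each line of the result is "char left bottom right top page",
# with the origin at the bottom-left corner of the image.
boxes = pytesseract.image_to_boxes(img)
for b in boxes.splitlines():
    b = b.split(' ')
    # Flip the y coordinates because OpenCV's origin is the top-left corner.
    img = cv2.rectangle(img, (int(b[1]), h - int(b[2])), (int(b[3]), h - int(b[4])), (0, 255, 0), 2)

# Use the path as the window title (the original referenced an undefined name, filename).
cv2.imshow(file, img)
cv2.waitKey(0)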
Output: