I created an AutoML dataset for custom label detection, and it works great. But this API only returns the label names. My labels mark text, so I also need what is written inside each label. How can I combine these two components so that both pieces of information end up in a single JSON output?
For example, here is the AutoML output from my own model (this is the format I need, as JSON):
payload {
  annotation_spec_id: "4732177099668848640"
  image_object_detection {
    bounding_box {
      normalized_vertices {
        x: 0.029733024537563324
        y: 0.2874366343021393
      }
      normalized_vertices {
        x: 0.33260250091552734
        y: 0.3122401535511017
      }
    }
    score: 0.8710569143295288
  }
  display_name: "f_name"
}
payload {
  annotation_spec_id: "4732177099668848640"
  image_object_detection {
    bounding_box {
      normalized_vertices {
        x: 0.038460467010736465
        y: 0.8654949069023132
      }
      normalized_vertices {
        x: 0.3702985942363739
        y: 0.8889270424842834
      }
    }
    score: 0.8308634757995605
  }
  display_name: "price"
}
payload {
  annotation_spec_id: "4732177099668848640"
  image_object_detection {
    bounding_box {
      normalized_vertices {
        x: 0.026972321793437004
        y: 0.20658300817012787
      }
      normalized_vertices {
        x: 0.43270379304885864
        y: 0.23540039360523224
      }
    }
    score: 0.8028228878974915
  }
  display_name: "ingds"
}
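The fields visible in this output (display_name, score, and the two normalized_vertices per box) can be copied into plain dictionaries, which are easy to serialize as JSON later. The following is only a minimal sketch: detections_from_response is a hypothetical helper name, and the SimpleNamespace object is a stand-in that mimics the attribute layout shown above so the snippet runs without calling the API.

```python
from types import SimpleNamespace


def detections_from_response(response):
    """Flatten an AutoML-style prediction response into plain dicts."""
    detections = []
    for item in response.payload:
        detection = item.image_object_detection
        detections.append({
            "display_name": item.display_name,
            "score": detection.score,
            "normalized_vertices": [
                {"x": v.x, "y": v.y}
                for v in detection.bounding_box.normalized_vertices
            ],
        })
    return detections


# Hypothetical stand-in for a real response, using values from the output above.
fake_response = SimpleNamespace(payload=[SimpleNamespace(
    display_name="price",
    image_object_detection=SimpleNamespace(
        score=0.8308634757995605,
        bounding_box=SimpleNamespace(normalized_vertices=[
            SimpleNamespace(x=0.038460467010736465, y=0.8654949069023132),
            SimpleNamespace(x=0.3702985942363739, y=0.8889270424842834),
        ]),
    ),
)])

detections = detections_from_response(fake_response)
print(detections)
```

With a real response, the same function would be called on the object returned by the prediction client, assuming it exposes the same attributes as the printed output.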
Google Vision OCR API output (I don't need this format, but I do need this information):
Texts:
"Serinleten İçecekler / Drinks
Su / Water
t4.00
Maden Suyu / Soda water
+5.00
Ayran / Yogurt Drink
+8.00
Meşrubat Çeşitleri / Fizzy Drinks
t8.00
Sikma Portakal Suyu
Meyve Suları / Fruit Juices
+8.00
Ice Tea / Ice Tea
+8.00
Şalgam Suyu / Turnip Juice
t8.00
Portakal Suyu / Fresh Orange
t15.00
Sıkma Nar Suyu Pomegranate Juice
t15.00
Komposto / Compote
t15.00
Limonata
Limonata / Fresh Lemonade
t12.00
"
bounds: (12,21),(451,21),(451,679),(12,679)
"Serinleten"
bounds: (44,27),(168,26),(168,42),(44,43)
"İçecekler"
bounds: (178,22),(291,21),(291,45),(178,46)
"/"
bounds: (301,23),(308,23),(308,41),(301,41)
"Drinks"
bounds: (318,28),(371,28),(371,39),(318,39)
"Su"
bounds: (16,94),(46,94),(46,105),(16,105)
"/"
bounds: (48,91),(56,91),(56,108),(48,108)
"Water"
bounds: (58,91),(89,91),(89,108),(58,108)
"t4.00"
bounds: (285,92),(328,92),(328,103),(285,103)
"Maden"
bounds: (15,148),(72,148),(72,159),(15,159)
"Suyu"
bounds: (79,148),(119,148),(119,160),(79,160)
"/"
bounds: (126,148),(131,148),(131,159),(126,159)
"Soda"
bounds: (137,151),(167,151),(167,159),(137,159)
"water"
bounds: (173,152),(208,152),(208,159),(173,159)
"+5.00"
bounds: (287,148),(330,148),(330,159),(287,159)
"Ayran"
bounds: (15,203),(63,203),(63,217),(15,217)
"/"
bounds: (70,203),(75,203),(75,215),(70,215)
"Yogurt"
bounds: (81,207),(122,207),(122,217),(81,217)
"Drink"
bounds: (128,207),(161,207),(161,215),(128,215)
"+8.00"
bounds: (288,205),(331,205),(331,217),(288,217)
"Meşrubat"
bounds: (15,258),(94,259),(94,275),(15,274)
"Çeşitleri"
bounds: (101,259),(169,260),(169,275),(101,274)
"/"
bounds: (175,260),(180,260),(180,272),(175,272)
"Fizzy"
bounds: (187,263),(218,263),(218,274),(187,274)
"Drinks"
bounds: (224,263),(263,263),(263,273),(224,273)
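In this output the first entry is the whole text block, and every entry after it is a single word with its own pixel bounds. A minimal sketch of collecting only the word-level entries into plain Python structures follows; words_from_annotations and fake_annotation are hypothetical names, and the stand-in objects only imitate the description and bounding_poly.vertices attributes seen above.

```python
from types import SimpleNamespace


def words_from_annotations(text_annotations):
    """Return word-level entries, skipping the first full-text annotation."""
    words = []
    for annotation in list(text_annotations)[1:]:
        words.append({
            "text": annotation.description,
            "vertices": [(v.x, v.y) for v in annotation.bounding_poly.vertices],
        })
    return words


def fake_annotation(text, points):
    """Hypothetical stand-in that mimics one OCR annotation object."""
    return SimpleNamespace(
        description=text,
        bounding_poly=SimpleNamespace(
            vertices=[SimpleNamespace(x=x, y=y) for x, y in points]
        ),
    )


# Sample values copied from the output above (the full-text entry is shortened).
annotations = [
    fake_annotation("Su / Water", [(12, 21), (451, 21), (451, 679), (12, 679)]),
    fake_annotation("Su", [(16, 94), (46, 94), (46, 105), (16, 105)]),
    fake_annotation("/", [(48, 91), (56, 91), (56, 108), (48, 108)]),
    fake_annotation("Water", [(58, 91), (89, 91), (89, 108), (58, 108)]),
]

words = words_from_annotations(annotations)
print(words)
```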
Here is my code:
from google.cloud import automl_v1beta1
from google.cloud import vision


def get_prediction(content, project_id, model_id):
    """Send the image bytes to the AutoML model and return its response."""
    prediction_client = automl_v1beta1.PredictionServiceClient()
    name = 'projects/{}/locations/us-central1/models/{}'.format(project_id, model_id)
    payload = {'image': {'image_bytes': content}}
    params = {}
    response = prediction_client.predict(name, payload, params)
    return response


def detect_text(content):
    """Run Vision OCR on the image bytes, print each text with its bounds, and return the annotations."""
    client = vision.ImageAnnotatorClient()
    # vision.types.Image is the pre-2.0 client spelling; newer releases expose vision.Image.
    image = vision.types.Image(content=content)
    response = client.text_detection(image=image)
    texts = response.text_annotations
    print('Texts: ')
    for text in texts:
        print('\n"{}"'.format(text.description))
        vertices = ['({},{})'.format(vertex.x, vertex.y)
                    for vertex in text.bounding_poly.vertices]
        print('bounds: {}'.format(','.join(vertices)))
    return texts


if __name__ == '__main__':
    file_path = "testMenu.jpg"
    project_id = "*********"
    model_id = "********"
    with open(file_path, 'rb') as ff:
        content = ff.read()
    print(get_prediction(content, project_id, model_id))
    # detect_text prints its results itself, so its return value is not printed again.
    detect_text(content)
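One possible way to combine the two outputs is to scale each AutoML box from normalized to pixel coordinates, collect the OCR words whose centers fall inside it, and emit the label together with the joined text as JSON. The sketch below is an illustration under several assumptions, not a confirmed solution: all function names are hypothetical, the inputs are plain dicts and tuples rather than real API objects, the 480x950 image size is a made-up value (the real width and height would have to be read from the image), and words are ordered left to right, which only suits single-line labels.

```python
import json


def to_pixel_box(normalized_vertices, width, height):
    """Scale the two normalized corners to a pixel box (x0, y0, x1, y1)."""
    xs = [v["x"] * width for v in normalized_vertices]
    ys = [v["y"] * height for v in normalized_vertices]
    return min(xs), min(ys), max(xs), max(ys)


def word_center(vertices):
    """Return the average of a word's corner points."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def merge_labels_with_text(detections, words, width, height):
    """Attach to each detection the OCR words whose centers lie inside its box."""
    merged = []
    for det in detections:
        x0, y0, x1, y1 = to_pixel_box(det["normalized_vertices"], width, height)
        inside = []
        for word in words:
            cx, cy = word_center(word["vertices"])
            if x0 <= cx <= x1 and y0 <= cy <= y1:
                inside.append((cx, word["text"]))
        inside.sort()  # left-to-right order; assumes a single text line per label
        merged.append({
            "label": det["display_name"],
            "score": det["score"],
            "text": " ".join(text for _, text in inside),
        })
    return merged


# Sample values copied from the outputs above.
detections = [{
    "display_name": "ingds",
    "score": 0.8028228878974915,
    "normalized_vertices": [
        {"x": 0.026972321793437004, "y": 0.20658300817012787},
        {"x": 0.43270379304885864, "y": 0.23540039360523224},
    ],
}]
words = [
    {"text": "Ayran", "vertices": [(15, 203), (63, 203), (63, 217), (15, 217)]},
    {"text": "/", "vertices": [(70, 203), (75, 203), (75, 215), (70, 215)]},
    {"text": "Yogurt", "vertices": [(81, 207), (122, 207), (122, 217), (81, 217)]},
    {"text": "Drink", "vertices": [(128, 207), (161, 207), (161, 215), (128, 215)]},
    {"text": "+8.00", "vertices": [(288, 205), (331, 205), (331, 217), (288, 217)]},
]

# Assumed image size, for illustration only.
result = merge_labels_with_text(detections, words, width=480, height=950)
print(json.dumps(result, ensure_ascii=False, indent=2))
```

Under the assumed image size, the "ingds" entry picks up the four words of the "Ayran / Yogurt Drink" line and leaves out the price to its right. Matching on word centers rather than full containment tolerates boxes that clip a word slightly; an overlap-ratio test would be a stricter alternative.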