I want to classify documents into 4 categories (locations) based on 3 columns: bukrs, a 4-digit code; lifnr, a vocabulary value; and waers, also a vocabulary value, using a LinearClassifier. Then I want to save the model, serve it, and feed it bukrs, lifnr and waers values to get a prediction.
My training data looks like this:
bukrs;lifnr;waers;location
5280;1004008999;EUR;0
5280;1004009000;EUR;2
5280;1004003061;EUR;1
...
I can successfully train the model and save it, which produces a saved_model.pb file and a variables folder.
So far so good.
I checked that the saved model itself is readable like this:
saved_model_cli show --dir 1561324458 --all
which gives me:
MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:
signature_def['classification']:
The given SavedModel SignatureDef contains the following input(s):
inputs['inputs'] tensor_info:
dtype: DT_STRING
shape: (-1)
name: input_example_tensor:0
The given SavedModel SignatureDef contains the following output(s):
outputs['classes'] tensor_info:
dtype: DT_STRING
shape: (-1, 4)
name: head/Tile:0
outputs['scores'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 4)
name: head/predictions/probabilities:0
Method name is: tensorflow/serving/classify
signature_def['predict']:
The given SavedModel SignatureDef contains the following input(s):
inputs['examples'] tensor_info:
dtype: DT_STRING
shape: (-1)
name: input_example_tensor:0
The given SavedModel SignatureDef contains the following output(s):
outputs['all_class_ids'] tensor_info:
dtype: DT_INT32
shape: (-1, 4)
name: head/predictions/Tile:0
outputs['all_classes'] tensor_info:
dtype: DT_STRING
shape: (-1, 4)
name: head/predictions/Tile_1:0
outputs['class_ids'] tensor_info:
dtype: DT_INT64
shape: (-1, 1)
name: head/predictions/ExpandDims:0
outputs['classes'] tensor_info:
dtype: DT_STRING
shape: (-1, 1)
name: head/predictions/str_classes:0
outputs['logits'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 4)
name: linear/linear_model/linear/linear_model/linear/linear_model/weighted_sum:0
outputs['probabilities'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 4)
name: head/predictions/probabilities:0
Method name is: tensorflow/serving/predict
signature_def['serving_default']:
The given SavedModel SignatureDef contains the following input(s):
inputs['inputs'] tensor_info:
dtype: DT_STRING
shape: (-1)
name: input_example_tensor:0
The given SavedModel SignatureDef contains the following output(s):
outputs['classes'] tensor_info:
dtype: DT_STRING
shape: (-1, 4)
name: head/Tile:0
outputs['scores'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 4)
name: head/predictions/probabilities:0
Method name is: tensorflow/serving/classify
which looks fine to me.
Here is my entire Python training script:
from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
import os
import pandas as pd
import re
import seaborn as sns
from tensorflow import feature_column
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
import itertools
from itertools import islice
#read data
dataframe = pd.read_csv('invoices_classed2.csv', sep=';',header=0)
dataframe.head()
#cut in sets
train, test = train_test_split(dataframe, test_size=0.3)
train, val = train_test_split(train, test_size=0.3)
#print metrics
print(len(train), 'train examples')
print(len(val), 'validation examples')
print(len(test), 'test examples')
# A utility method to create a tf.data dataset from a Pandas Dataframe
labels = pd.Series()
def df_to_dataset(dataframe, shuffle=False, batch_size=32):
    dataframe = dataframe.copy()
    labels = dataframe.pop('location')
    ds = tf.data.Dataset.from_tensor_slices((dict(dataframe), labels))
    if shuffle:
        ds = ds.shuffle(buffer_size=len(dataframe))
    ds = ds.batch(batch_size)
    return ds
# A utility method to create a tf.data dataset from a Pandas Dataframe and use it as functional variable
def make_input_fn(dataframe=None, n_epochs=None, shuffle=False, batch_size=32):
    def input_fn():
        internal_dataframe = dataframe.copy()
        labels = internal_dataframe.pop('location')
        ds = tf.data.Dataset.from_tensor_slices((dict(internal_dataframe), labels))
        if shuffle:
            ds = ds.shuffle(buffer_size=len(internal_dataframe))
        ds = ds.repeat(n_epochs)
        ds = ds.batch(batch_size)
        return ds
    return input_fn
#building feature columns
bukrs = feature_column.numeric_column("bukrs")
lifnr = feature_column.categorical_column_with_vocabulary_list(
'lifnr',['1004000409','1004003061','1004008999','1004009001','1004009000','1004003768','1004009002'])
lifnr_one_hot = feature_column.indicator_column(lifnr)
waers = feature_column.categorical_column_with_vocabulary_list(
'waers', ['EUR', 'GBP', 'USD','JPY','CZK','HUF'])
waers_one_hot = feature_column.indicator_column(waers)
actual_feature_columns = []
actual_feature_columns.append(bukrs)
actual_feature_columns.append(lifnr_one_hot)
actual_feature_columns.append(waers_one_hot)
#making datasets
train_ds = make_input_fn(train)
val_ds = make_input_fn(val)
test_ds = make_input_fn(test)
print ('####################creating model####################')
linear_est = tf.estimator.LinearClassifier(feature_columns=actual_feature_columns,n_classes=4,model_dir="C:\\Users\\70D4867\\Desktop\\invoicemodel")
print ('####################Train model####################')
#Train model.
linear_est.train(train_ds,max_steps=10000)
print ('####################Evaluation####################')
# Evaluation.
result = linear_est.evaluate(val_ds, steps=1000)
print ('####################printing result####################')
print(result)
print ('####################Done evaluating####################')
for key in sorted(result):
    print (key, result[key])
print ('####################predictions####################')
y_generator = linear_est.predict(test_ds)
print ('####################slice predictions####################')
predictions = list(itertools.islice(y_generator,len(test)))
print ('####################predictions output####################')
final_preds = []
template = ('\nPrediction is "{}" ({:.1f}%)')
i = 0
for pred in (predictions):
    final_preds.append(pred['class_ids'][0])
    class_id = pred['class_ids'][0]
    probability = pred['probabilities'][class_id]
    i = i + 1
expected = []
for index, row in test.iterrows():
    expected.append(row['location'])
print ('####################Test Results####################')
print(classification_report(expected,final_preds))
print ('####################Saving Model####################')
feature_spec = tf.feature_column.make_parse_example_spec(actual_feature_columns)
print(feature_spec)
my_serving_input_receiver_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec)
linear_est.export_saved_model(export_dir_base="invoicemodel\\1",serving_input_receiver_fn=my_serving_input_receiver_fn)
But when I try to get a prediction from the model like this:
saved_model_cli run --dir invoicemodel\1\1561324458 --tag_set serve --signature_def predict --input_examples 'examples=[{"bukrs": 5280, "lifnr": "1004003930", "waers": "EUR"}]'
I would expect something like:
[0]
Instead I get this error:
NameError: name 'bukrs' is not defined
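The NameError suggests that the value after `examples=` is evaluated as a Python expression and that the double quotes around the keys were lost before it got there (some shells, Windows cmd among them, do not treat single quotes as quoting characters). The sketch below reproduces that effect in plain Python; it does not call saved_model_cli, and both strings are hypothetical stand-ins for what the tool might receive. The feature values are wrapped in lists on the assumption that the CLI expects a list per feature.

```python
# Sketch: how a Python-expression argument behaves with and without inner quotes.
# Both strings are hypothetical stand-ins for what the CLI might receive.
with_quotes = '[{"bukrs": [5280.0], "lifnr": ["1004003930"], "waers": ["EUR"]}]'
without_quotes = '[{bukrs: [5280.0], lifnr: [1004003930], waers: [EUR]}]'


def evaluate(expression):
    """Evaluate the expression; return the error text if a name is undefined."""
    try:
        return eval(expression)
    except NameError as err:
        return "NameError: {}".format(err)


parsed = evaluate(with_quotes)     # a list holding one feature dictionary
failed = evaluate(without_quotes)  # the bare word bukrs is treated as a variable

print(parsed)
print(failed)
```

If this is the cause, swapping the quoting (double quotes outside, single quotes inside) or escaping the inner double quotes may be enough to get past this error.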
I also tried feeding it a .npy file. I created the file from a portion of my training data:
bukrs;lifnr;waers
5280;1004008999;EUR
5280;1004009000;EUR
5280;1004003061;EUR
...
like this:
csv_fn = "invoices_classed_npy.csv"
file = pd.read_csv(csv_fn)
np.save('invoices_classed_npy.npy', file, allow_pickle = True);
But when I tried:
saved_model_cli run --dir .\invoicemodel\1\1561324458 --tag_set serve --signature_def classification --inputs 'inputs="invoices_classed_npy.npy"'
I expected
[1],[2],[0]
I got:
ValueError: Cannot feed value of shape (55276, 1) for Tensor
'input_example_tensor:0', which has shape '(None,)'
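The shape (55276, 1) hints that every row was read as a single column, which is what happens when a semicolon-separated file is parsed with the default comma delimiter. The sketch below shows the effect with the standard-library csv module rather than pandas, using a few of the sample rows from above; the helper name `read_rows` is made up for illustration.

```python
import csv
import io

# A few sample rows in the same semicolon-separated layout as the file above.
sample = (
    "bukrs;lifnr;waers\n"
    "5280;1004008999;EUR\n"
    "5280;1004009000;EUR\n"
    "5280;1004003061;EUR\n"
)


def read_rows(text, delimiter):
    """Parse the text with the given delimiter and return the data rows."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    next(reader)  # skip the header line
    return list(reader)


comma_rows = read_rows(sample, ",")      # each row collapses into one field
semicolon_rows = read_rows(sample, ";")  # each row splits into three fields

print(len(comma_rows), len(comma_rows[0]))
print(len(semicolon_rows), len(semicolon_rows[0]))
```

Note that fixing the delimiter alone probably would not make the call succeed: the signature lists `input_example_tensor:0` as a one-dimensional DT_STRING tensor, which most likely means it wants serialized tf.Example records rather than raw column values.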
OK, so I also tried serving it in a container:
docker run -t --rm -p 8501:8501 \
    -v "/data/container/tensorflow/model:/models/saved_model" \
    -e MODEL_NAME=saved_model \
    tensorflow/serving
and I got:
/usr/bin/tf_serving_entrypoint.sh: line 3: 6 Illegal instruction (core dumped) tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$@"
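The crash above happens before any request is served, so the following does not address it. For reference, once a TensorFlow Serving container is running, its REST API takes classify requests as JSON posted to `/v1/models/<name>:classify`. This is a minimal sketch of building such a request, assuming the model name `saved_model` and port 8501 from the docker command above, a server on `localhost`, and the `classification` signature shown earlier; the helper `build_classify_body` is hypothetical and nothing is actually sent.

```python
import json

# Assumptions taken from the docker command above; localhost is a guess.
MODEL_NAME = "saved_model"
REST_PORT = 8501
url = "http://localhost:{}/v1/models/{}:classify".format(REST_PORT, MODEL_NAME)


def build_classify_body(rows, signature_name="classification"):
    """Return the JSON text for a classify request with one dict per example."""
    return json.dumps({"signature_name": signature_name, "examples": rows})


# bukrs is a numeric column, so it is sent as a float; the others are strings.
body = build_classify_body(
    [{"bukrs": 5280.0, "lifnr": "1004003930", "waers": "EUR"}]
)

print(url)
print(body)
```

The body can then be posted to the URL with any HTTP client, for example curl with `-d`.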
What am I doing wrong? What is the correct way to get predictions from my model?