Invoking a SageMaker endpoint randomly returns an error
April 15, 2020

Env:

  • XGBoost model trained on data from Excel.
  • The model is saved and deployed to SageMaker with MLflow.
  • The SageMaker model and the SageMaker endpoint are both up and running.

Invocation: I call the model with new data through a REST request (Postman) and through boto3.sagemaker:

client.invoke_endpoint(
    EndpointName="xxxxxxx",
    Body=data,
    ContentType="application/json",
    Accept="string",
)
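For reference, here is a minimal sketch of how the request body could be built and sent. It assumes the MLflow scoring server in the container accepts a pandas DataFrame serialized as split-oriented JSON for `application/json`, which I believe holds for MLflow 1.x but should be checked against the installed version. The helper names `build_split_payload` and `invoke`, the string column labels, and the `Accept` value are illustrative assumptions, not part of the original call.

```python
import json


def build_split_payload(rows, columns=None):
    """Serialize rows as pandas 'split'-oriented JSON (assumed input format)."""
    if not rows:
        raise ValueError("at least one row is required")
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same number of features")
    if columns is None:
        # Assumption: features are labeled "0", "1", ... as in the traceback.
        columns = [str(i) for i in range(width)]
    elif len(columns) != width:
        raise ValueError("column count does not match row width")
    return json.dumps({"columns": columns, "data": rows})


def invoke(client, endpoint_name, rows):
    """Send rows to the endpoint and decode the JSON response.

    `client` is expected to behave like boto3.client("sagemaker-runtime").
    """
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        Body=build_split_payload(rows),
        ContentType="application/json",
        Accept="application/json",  # assumption; the original call used "string"
    )
    return json.loads(response["Body"].read())
```

Passing the client in as an argument keeps the helper usable from both Lambda and a local test, and the width check fails early on the caller's side if rows of different lengths are mixed.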

Problem: SageMaker seems to fail randomly (about 50% of the time) with the following exception:

{
"ErrorCode": "CLIENT_ERROR_FROM_MODEL",
"LogStreamArn": "arn:aws:logs:eu-central-1:xxxxxxxxxx:log-group:/aws/sagemaker/Endpoints/xxxxxxxx",
"Message": "Received client error (400) from mfs-xxxxxxxxx-model-dztwabjotscyoc-zx7lzyfg with message \"{\"error_code\": \"BAD_REQUEST\", \"message\": \"Encountered an unexpected error while evaluating the model. Verify that the serialized input Dataframe is compatible with the model for inference.\", \"stack_trace\": \"Traceback (most recent call last):\\n  File \\\"/miniconda/envs/custom_env/lib/python3.6/site-packages/mlflow/pyfunc/scoring_server/__init__.py\\\", line 196, in transformation\\n    raw_predictions = model.predict(data)\\n  File \\\"/miniconda/envs/custom_env/lib/python3.6/site-packages/mlflow/xgboost.py\\\", line 198, in predict\\n    return self.xgb_model.predict(xgb.DMatrix(dataframe))\\n  File \\\"/miniconda/envs/custom_env/lib/python3.6/site-packages/xgboost/core.py\\\", line 1443, in predict\\n    self._validate_features(data)\\n  File \\\"/miniconda/envs/custom_env/lib/python3.6/site-packages/xgboost/core.py\\\", line 1862, in _validate_features\\n    data.feature_names))\\nValueError: feature_names mismatch: ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40', '41', '42', '43', '44', '45', '46', '47', '48', '49', '50', '51', '52', '53', '54', '55', '56', '57', '58', '59', '60', '61', '62', '63', '64', '65', '66', '67', '68', '69', '70', '71', '72', '73', '74', '75', '76', '77', '78', '79', '80', '81'] ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40', '41', '42', '43', '44', '45', '46', '47', '48', '49', '50', '51', '52', '53', '54', '55', '56', '57', '58', '59', '60', '61', '62', '63', '64', '65', '66', '67', '68', '69', '70', '71', '72', '73', '74', '75', '76', '77', '78', '79', '80', '81', '82', '83', '84', '85']\\ntraining data did not have the following fields: 83, 85, 84, 82\\n\"}\". See https://eu-central-1.console.aws.amazon.com/cloudwatch/home?region=eu-central-1#logEventViewer:group=/aws/sagemaker/Endpoints/xxxxxxxxx in account xxxxxxxfor more information.",
"OriginalMessage": "{\"error_code\": \"BAD_REQUEST\", \"message\": \"Encountered an unexpected error while evaluating the model. Verify that the serialized input Dataframe is compatible with the model for inference.\", \"stack_trace\": \"Traceback (most recent call last):\\n  File \\\"/miniconda/envs/custom_env/lib/python3.6/site-packages/mlflow/pyfunc/scoring_server/__init__.py\\\", line 196, in transformation\\n    raw_predictions = model.predict(data)\\n  File \\\"/miniconda/envs/custom_env/lib/python3.6/site-packages/mlflow/xgboost.py\\\", line 198, in predict\\n    return self.xgb_model.predict(xgb.DMatrix(dataframe))\\n  File \\\"/miniconda/envs/custom_env/lib/python3.6/site-packages/xgboost/core.py\\\", line 1443, in predict\\n    self._validate_features(data)\\n  File \\\"/miniconda/envs/custom_env/lib/python3.6/site-packages/xgboost/core.py\\\", line 1862, in _validate_features\\n    data.feature_names))\\nValueError: feature_names mismatch: ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40', '41', '42', '43', '44', '45', '46', '47', '48', '49', '50', '51', '52', '53', '54', '55', '56', '57', '58', '59', '60', '61', '62', '63', '64', '65', '66', '67', '68', '69', '70', '71', '72', '73', '74', '75', '76', '77', '78', '79', '80', '81'] ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40', '41', '42', '43', '44', '45', '46', '47', '48', '49', '50', '51', '52', '53', '54', '55', '56', '57', '58', '59', '60', '61', '62', '63', '64', '65', '66', '67', '68', '69', '70', '71', '72', '73', '74', '75', '76', '77', '78', '79', '80', '81', '82', '83', '84', '85']\\ntraining data did not have the following fields: 83, 85, 84, 82\\n\"}",
"OriginalStatusCode": 400
}
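The two bracketed lists in the `ValueError` are hard to compare by eye, so here is a small sketch that summarizes them. It assumes XGBoost prints the model's feature names first and the names found in the incoming data second, which is consistent with the trailing "training data did not have the following fields" line. The function name `summarize_mismatch` is hypothetical.

```python
import ast
import re

# Matches one flat Python-style list literal such as ['0', '1', '2'].
_LIST_LITERAL = re.compile(r"\[[^\[\]]*\]")


def summarize_mismatch(message):
    """Return feature counts and differences from a 'feature_names mismatch' error.

    Returns None if the message does not contain the two expected lists.
    """
    marker = "feature_names mismatch:"
    start = message.find(marker)
    if start == -1:
        return None
    lists = _LIST_LITERAL.findall(message[start + len(marker):])
    if len(lists) < 2:
        return None
    expected = ast.literal_eval(lists[0])  # assumed: names the model was trained with
    received = ast.literal_eval(lists[1])  # assumed: names found in the request data
    return {
        "expected_count": len(expected),
        "received_count": len(received),
        "unexpected": sorted(set(received) - set(expected)),
        "missing": sorted(set(expected) - set(received)),
    }
```

Run against the message above, this should report 82 expected features ('0' through '81') and 86 received ('0' through '85'), with '82' to '85' as the unexpected ones.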

After this exception, I click "Send" again in Postman (with exactly the same request body) and get a successful response:

[
0.9989840388298035
]

The problem is exactly the same (and just as random) when using boto3.sagemaker from AWS Lambda or when testing locally.
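To put a number on "random", a rough sketch like the following could repeat the same request and count the outcomes. The helper name `tally_outcomes` is hypothetical. `InvokedProductionVariant` is a field of the `invoke_endpoint` response, but whether it helps here depends on how the endpoint is configured, which the question does not show.

```python
from collections import Counter


def tally_outcomes(call, attempts=20):
    """Call `call()` repeatedly and count successes and failures.

    `call` is any zero-argument function that performs one request,
    for example a lambda wrapping client.invoke_endpoint(...).
    """
    outcomes = Counter()
    for _ in range(attempts):
        try:
            response = call()
        except Exception as exc:  # broad on purpose: this only counts failures
            outcomes["error:" + type(exc).__name__] += 1
        else:
            # Record which production variant served the successful request.
            variant = response.get("InvokedProductionVariant", "unknown")
            outcomes["ok:" + variant] += 1
    return outcomes
```

If the successes always come from one variant and the failures never do, that would point at the endpoint configuration rather than at the request body.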

UPDATE 1: When I load the exported model with mlflow.xgboost.load_model() and run a prediction, it works every time. Here is the code:

    import mlflow
    from mlflow import xgboost
    from xgboost import DMatrix
    import numpy as np
    import pandas as pd
    base_path = "src/models/"


    model_path_name_a = f"{base_path}/model_a"
    model_path_name_b = f"{base_path}/model_b"
    # xgboost.log_model(xgb_model_a, artifact_path='https://mlflow-server-tracking.s3.eu-central-1.amazonaws.com')
    xgb_model_a = xgboost.load_model(model_path_name_a)
    xgb_model_b = xgboost.load_model(model_path_name_b)

    X_test = DMatrix(
        np.array(
            [
                [7.60000e+01, 2.57000e+02, 9.25200e+03, 2.00400e+03, 0.00000e+00, 1.00000e+00, 0.00000e+00, 0.00000e+00,
                 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00,
                 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00,
                 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00,
                 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00,
                 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00,
                 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00,
                 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00,
                 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00,
                 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 1.00000e+00, 0.00000e+00, 0.00000e+00,
                 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00]
            ]
        )
    )


    ############# Prediction Model A ################
    y_predA = xgb_model_a.predict(X_test)
    print(f"y_predA: {y_predA}")

    ############# Prediction Model B ################
    y_predB = xgb_model_b.predict(X_test)
    print(f"y_predB: {y_predB}")

Does this mean the problem lies in the SageMaker inference / endpoint service?

...