I want to set up an MLflow tracking server with external metrics and artifact storage. I have the following Docker containers inside a Docker network: mlflow-server, postgres, sftp-mlflow, and python-client. I managed to set up postgres and connect it to the mlflow server and the client:
mlflow server --backend-store-uri postgresql://postgres:<pass>@mlflow_db:5432/mlflow_db --default-artifact-root sftp://sftp:<pass>@sftp-mlflow:22 -h 0.0.0.0 -p 8000
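As a sanity check on the artifact URI in that command, here is a minimal sketch that splits it into its components with Python's standard `urllib.parse`. The password `secret` is a placeholder I am assuming in place of `<pass>`. This only shows how the standard library reads the string; I have not confirmed that MLflow parses it the same way.

```python
from urllib.parse import urlparse

# "secret" is a placeholder standing in for <pass>; it is an assumption for illustration.
ARTIFACT_URI = "sftp://sftp:secret@sftp-mlflow:22"

parts = urlparse(ARTIFACT_URI)

# Collect the pieces an SFTP client would need: user, host, port, remote path.
uri_info = {
    "scheme": parts.scheme,
    "username": parts.username,
    "hostname": parts.hostname,
    "port": parts.port,
    "path": parts.path,
}
print(uri_info)
```

Note that the path component is empty, so the URI does not name a directory on the SFTP server. Whether that matters to MLflow is something I have not verified.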
However, I cannot get the artifact storage to work. I have tried several SFTP images.
I also followed a guide, but the artifact storage still does not work =(
On the client side I have:
import mlflow
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
remote_server_uri = "http://mlflow-server:8000" # set to your server URI
mlflow.set_tracking_uri(remote_server_uri)
# plotting (code omitted here; it creates the figure `fig`)
fig.savefig("test.png")
ARTIFACT_URI = "sftp://sftp:<pass>@sftp-mlflow:22"
EXPERIMENT_NAME = "test"
mlflow.create_experiment(EXPERIMENT_NAME, artifact_location=ARTIFACT_URI)
mlflow.set_experiment(EXPERIMENT_NAME)
with mlflow.start_run():
    mlflow.log_param("a", 1)
    mlflow.log_metric("b", 2)
    mlflow.log_artifact("test.png")
When I run this code, I get:
2020/08/06 01:05:19 ERROR mlflow.utils.rest_utils: API request to http://mlflow-server:8000/api/2.0/mlflow/experiments/create failed with code 500 != 200, retrying up to 0 more times. API response body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>500 Internal Server Error</title>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.</p>
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/run.py", line 24, in <module>
    mlflow.create_experiment(EXPERIMENT_NAME, artifact_location=ARTIFACT_URI)
  File "/usr/local/lib/python3.8/site-packages/mlflow/tracking/fluent.py", line 357, in create_experiment
    return MlflowClient().create_experiment(name, artifact_location)
  File "/usr/local/lib/python3.8/site-packages/mlflow/tracking/client.py", line 164, in create_experiment
    return self._tracking_client.create_experiment(name, artifact_location)
  File "/usr/local/lib/python3.8/site-packages/mlflow/tracking/_tracking_service/client.py", line 126, in create_experiment
    return self.store.create_experiment(
  File "/usr/local/lib/python3.8/site-packages/mlflow/store/tracking/rest_store.py", line 54, in create_experiment
    response_proto = self._call_endpoint(CreateExperiment, req_body)
  File "/usr/local/lib/python3.8/site-packages/mlflow/store/tracking/rest_store.py", line 32, in _call_endpoint
    return call_endpoint(self.get_host_creds(), endpoint, method, json_body, response_proto)
  File "/usr/local/lib/python3.8/site-packages/mlflow/utils/rest_utils.py", line 142, in call_endpoint
    response = http_request(
  File "/usr/local/lib/python3.8/site-packages/mlflow/utils/rest_utils.py", line 86, in http_request
    raise MlflowException("API request to %s failed to return code 200 after %s tries" %
mlflow.exceptions.MlflowException: API request to http://mlflow-server:8000/api/2.0/mlflow/experiments/create failed to return code 200 after 3 tries
I can connect to the SFTP storage from both the mlflow-server container and the Python client using sftp: sftp -P 22 sftp@sftp-mlflow
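A similar reachability check can be sketched in Python with only the standard library, which may be convenient to run inside each container. The helper name `port_is_open` is my own, not part of MLflow. It only tests whether the TCP port accepts connections; it does not exercise SFTP authentication or host key verification, so it is a weaker check than the `sftp` command above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection resolves the host name and connects;
        # the socket is closed again when the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Inside the Docker network, calling `port_is_open("sftp-mlflow", 22)` from the mlflow-server and client containers should return `True` if the SFTP port is reachable from both.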