Continuous machine learning pipeline broken by installing TensorFlow requirements
2 votes
/ July 10, 2020

I am running Continuous Machine Learning (CML, https://cml.dev/) on my own GitLab server. My goal is to test a basic CML pipeline using a Python script as an example.

My .gitlab-ci.yml file is a basic one:

stages:
  - cml_run

cml:
  stage: cml_run
  image: dvcorg/cml-py3:latest
  script:
    - pip3 install -r requirements.txt
    - python train.py

    - cat metrics.txt >> report.md
    - cml-publish confusion_matrix.png --md >> report.md
    - cml-send-comment report.md

Installation succeeds for pandas, sklearn, and Keras from requirements.txt, but the pipeline breaks while installing TensorFlow:

  $ pip3 install -r requirements.txt
 Collecting pandas
   Downloading pandas-1.0.5-cp36-cp36m-manylinux1_x86_64.whl (10.1 MB)
 Collecting sklearn
   Downloading sklearn-0.0.tar.gz (1.1 kB)
 Collecting keras
   Downloading Keras-2.4.3-py2.py3-none-any.whl (36 kB)
 Collecting tensorflow
   Downloading tensorflow-2.2.0-cp36-cp36m-manylinux2010_x86_64.whl (516.2 MB)
 ERROR: Exception:
 Traceback (most recent call last):
   File "/usr/local/lib/python3.6/dist-packages/pip/_internal/cli/base_command.py", line 188, in _main
     status = self.run(options, args)
   File "/usr/local/lib/python3.6/dist-packages/pip/_internal/cli/req_command.py", line 185, in wrapper
     return func(self, options, args)
   File "/usr/local/lib/python3.6/dist-packages/pip/_internal/commands/install.py", line 333, in run
     reqs, check_supported_wheels=not options.target_dir
   File "/usr/local/lib/python3.6/dist-packages/pip/_internal/resolution/legacy/resolver.py", line 179, in resolve
     discovered_reqs.extend(self._resolve_one(requirement_set, req))
   File "/usr/local/lib/python3.6/dist-packages/pip/_internal/resolution/legacy/resolver.py", line 362, in _resolve_one
     abstract_dist = self._get_abstract_dist_for(req_to_install)
   File "/usr/local/lib/python3.6/dist-packages/pip/_internal/resolution/legacy/resolver.py", line 314, in _get_abstract_dist_for
     abstract_dist = self.preparer.prepare_linked_requirement(req)
   File "/usr/local/lib/python3.6/dist-packages/pip/_internal/operations/prepare.py", line 469, in prepare_linked_requirement
     hashes=hashes,
   File "/usr/local/lib/python3.6/dist-packages/pip/_internal/operations/prepare.py", line 259, in unpack_url
     hashes=hashes,
   File "/usr/local/lib/python3.6/dist-packages/pip/_internal/operations/prepare.py", line 130, in get_http_url
     link, downloader, temp_dir.path, hashes
   File "/usr/local/lib/python3.6/dist-packages/pip/_internal/operations/prepare.py", line 281, in _download_http_url
     for chunk in download.chunks:
   File "/usr/local/lib/python3.6/dist-packages/pip/_internal/cli/progress_bars.py", line 166, in iter
     for x in it:
   File "/usr/local/lib/python3.6/dist-packages/pip/_internal/network/utils.py", line 39, in response_chunks
     decode_content=False,
   File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/urllib3/response.py", line 564, in stream
     data = self.read(amt=amt, decode_content=decode_content)
   File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/urllib3/response.py", line 507, in read
     data = self._fp.read(amt) if not fp_closed else b""
   File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/cachecontrol/filewrapper.py", line 65, in read
     self._close()
   File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/cachecontrol/filewrapper.py", line 52, in _close
     self.__callback(self.__buf.getvalue())
   File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/cachecontrol/controller.py", line 309, in cache_response
     cache_url, self.serializer.dumps(request, response, body=body)
   File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/cachecontrol/serialize.py", line 72, in dumps
     return b",".join([b"cc=4", msgpack.dumps(data, use_bin_type=True)])
   File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/msgpack/__init__.py", line 35, in packb
     return Packer(**kwargs).pack(o)
   File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/msgpack/fallback.py", line 936, in pack
     self._pack(obj)
   File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/msgpack/fallback.py", line 920, in _pack
     len(obj), dict_iteritems(obj), nest_limit - 1
   File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/msgpack/fallback.py", line 1021, in _pack_map_pairs
     self._pack(v, nest_limit - 1)
   File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/msgpack/fallback.py", line 920, in _pack
     len(obj), dict_iteritems(obj), nest_limit - 1
   File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/msgpack/fallback.py", line 1021, in _pack_map_pairs
     self._pack(v, nest_limit - 1)
   File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/msgpack/fallback.py", line 865, in _pack
     return self._buffer.write(obj)
 MemoryError

Any ideas on how to solve this problem with the CML pipeline on GitLab?

1 Answer

2 votes
/ July 10, 2020

It is hard to diagnose without knowing more about your .gitlab-ci.yml file. But judging by the MemoryError message, it looks like the runner does not have enough memory to install TensorFlow on top of your project's other dependencies.

You can try installing TF with the --no-cache-dir flag.
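
As a rough sketch, this is how the flag could be added to the .gitlab-ci.yml from the question. The traceback fails inside pip's vendored cachecontrol and msgpack code, which is serializing the downloaded 516.2 MB wheel for the HTTP cache, so turning the cache off should avoid that step. Only the pip3 line changes. This assumes the runner has enough memory for the rest of the installation, which the question does not confirm.

```yaml
stages:
  - cml_run

cml:
  stage: cml_run
  image: dvcorg/cml-py3:latest
  script:
    # --no-cache-dir disables pip's cache, so the downloaded wheel
    # is not serialized in memory for caching
    - pip3 install --no-cache-dir -r requirements.txt
    - python train.py

    - cat metrics.txt >> report.md
    - cml-publish confusion_matrix.png --md >> report.md
    - cml-send-comment report.md
```

If the job still fails with MemoryError, the runner's memory limit itself is probably too low for TensorFlow and would need to be raised on the runner side.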

...