TensorFlow unit tests crash
16 January 2019

I am running the TensorFlow unit tests as described in the contribution guidelines. On my local macOS machine the Docker image failed to build because of a problem with apt-key. So I set up a Docker on Ubuntu Server virtual machine (Standard A1: 1 core, 1.75 GiB memory) in Azure and ran the following commands:

git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow/
tensorflow/tools/ci_build/ci_build.sh CPU bazel test //tensorflow/...

My first attempt crashed; the output ended with:

Analyzing: 7629 targets (561 packages loaded, 37218 targets configured)
java.lang.OutOfMemoryError: GC overhead limit exceeded
Dumping heap to /home/mmorin/tensorflow/bazel-ci_build-cache/.cache/bazel/_bazel_mmorin/eab0d61a99b6696edb3d2aff87b585e8/java_pid25131.hprof ...
Heap dump file created [617886369 bytes in 9.053 secs]
Internal error thrown during build. Printing stack trace: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at com.google.devtools.build.lib.analysis.configuredtargets.RuleConfiguredTarget.<init>(RuleConfiguredTarget.java:115)
    at com.google.devtools.build.lib.analysis.configuredtargets.RuleConfiguredTarget.<init>(RuleConfiguredTarget.java:142)
    at com.google.devtools.build.lib.analysis.RuleConfiguredTargetBuilder.build(RuleConfiguredTargetBuilder.java:174)
    at com.google.devtools.build.lib.rules.python.PyBinary.create(PyBinary.java:53)
    at com.google.devtools.build.lib.rules.python.PyBinary.create(PyBinary.java:36)
    at com.google.devtools.build.lib.analysis.ConfiguredTargetFactory.createRule(ConfiguredTargetFactory.java:323)
    at com.google.devtools.build.lib.analysis.ConfiguredTargetFactory.createConfiguredTarget(ConfiguredTargetFactory.java:207)
    at com.google.devtools.build.lib.skyframe.SkyframeBuildView.createConfiguredTarget(SkyframeBuildView.java:636)
    at com.google.devtools.build.lib.skyframe.ConfiguredTargetFunction.createConfiguredTarget(ConfiguredTargetFunction.java:783)
    at com.google.devtools.build.lib.skyframe.ConfiguredTargetFunction.compute(ConfiguredTargetFunction.java:326)
    at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:422)
    at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:368)
    at java.base/java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(Unknown Source)
    at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
    at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.localPopAndExec(Unknown Source)
    at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)

INFO: Elapsed time: 873.635s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (561 packages loaded, 37254 targets configured)
Internal error thrown during build. Printing stack trace: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at com.google.devtools.build.lib.analysis.configuredtargets.RuleConfiguredTarget.<init>(RuleConfiguredTarget.java:115)
    at com.google.devtools.build.lib.analysis.configuredtargets.RuleConfiguredTarget.<init>(RuleConfiguredTarget.java:142)
    at com.google.devtools.build.lib.analysis.RuleConfiguredTargetBuilder.build(RuleConfiguredTargetBuilder.java:174)
    at com.google.devtools.build.lib.rules.python.PyBinary.create(PyBinary.java:53)
    at com.google.devtools.build.lib.rules.python.PyBinary.create(PyBinary.java:36)
    at com.google.devtools.build.lib.analysis.ConfiguredTargetFactory.createRule(ConfiguredTargetFactory.java:323)
    at com.google.devtools.build.lib.analysis.ConfiguredTargetFactory.createConfiguredTarget(ConfiguredTargetFactory.java:207)
    at com.google.devtools.build.lib.skyframe.SkyframeBuildView.createConfiguredTarget(SkyframeBuildView.java:636)
    at com.google.devtools.build.lib.skyframe.ConfiguredTargetFunction.createConfiguredTarget(ConfiguredTargetFunction.java:783)
    at com.google.devtools.build.lib.skyframe.ConfiguredTargetFunction.compute(ConfiguredTargetFunction.java:326)
    at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:422)
    at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:368)
    at java.base/java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(Unknown Source)
    at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
    at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.localPopAndExec(Unknown Source)
    at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at com.google.devtools.build.lib.analysis.configuredtargets.RuleConfiguredTarget.<init>(RuleConfiguredTarget.java:115)
    at com.google.devtools.build.lib.analysis.configuredtargets.RuleConfiguredTarget.<init>(RuleConfiguredTarget.java:142)
    at com.google.devtools.build.lib.analysis.RuleConfiguredTargetBuilder.build(RuleConfiguredTargetBuilder.java:174)
    at com.google.devtools.build.lib.rules.python.PyBinary.create(PyBinary.java:53)
    at com.google.devtools.build.lib.rules.python.PyBinary.create(PyBinary.java:36)
    at com.google.devtools.build.lib.analysis.ConfiguredTargetFactory.createRule(ConfiguredTargetFactory.java:323)
    at com.google.devtools.build.lib.analysis.ConfiguredTargetFactory.createConfiguredTarget(ConfiguredTargetFactory.java:207)
    at com.google.devtools.build.lib.skyframe.SkyframeBuildView.createConfiguredTarget(SkyframeBuildView.java:636)
    at com.google.devtools.build.lib.skyframe.ConfiguredTargetFunction.createConfiguredTarget(ConfiguredTargetFunction.java:783)
    at com.google.devtools.build.lib.skyframe.ConfiguredTargetFunction.compute(ConfiguredTargetFunction.java:326)
    at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:422)
    at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:368)
    at java.base/java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(Unknown Source)
    at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
    at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.localPopAndExec(Unknown Source)
    at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
GC overhead limit exceeded

ERROR: bazel ran out of memory and crashed.
FAILED: Build did NOT complete successfully (561 packages loaded, 37254 targets configured)

My second attempt reached a higher number of configured targets, but it appears to be stuck after running for an hour: the count of configured targets now goes up only one at a time:

Analyzing: 7629 targets (561 packages loaded, 35150 targets configured)
Analyzing: 7629 targets (561 packages loaded, 35279 targets configured)
Analyzing: 7629 targets (561 packages loaded, 35326 targets configured)
Analyzing: 7629 targets (561 packages loaded, 35340 targets configured)
Analyzing: 7629 targets (561 packages loaded, 35345 targets configured)
Analyzing: 7629 targets (561 packages loaded, 35346 targets configured)
Analyzing: 7629 targets (561 packages loaded, 35347 targets configured)
Analyzing: 7629 targets (561 packages loaded, 35347 targets configured)
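The slowdown in the progress lines above can be put into numbers with a short script. This is only a rough sketch: the function name `configured_deltas` is made up for illustration, and the pattern assumes the `Analyzing: ... N targets configured)` line format shown in this post.

```shell
#!/bin/sh
# Print how many targets were configured between successive
# "Analyzing:" progress lines read from standard input.
# configured_deltas is a hypothetical helper, not part of Bazel.
configured_deltas() {
  awk 'match($0, /[0-9]+ targets configured/) {
         # The matched text starts with the number, so "+ 0" extracts it.
         n = substr($0, RSTART, RLENGTH) + 0
         if (seen) print n - prev
         prev = n
         seen = 1
       }'
}

# Usage with the progress lines quoted in this post.
configured_deltas <<'EOF'
Analyzing: 7629 targets (561 packages loaded, 35150 targets configured)
Analyzing: 7629 targets (561 packages loaded, 35279 targets configured)
Analyzing: 7629 targets (561 packages loaded, 35326 targets configured)
Analyzing: 7629 targets (561 packages loaded, 35340 targets configured)
Analyzing: 7629 targets (561 packages loaded, 35345 targets configured)
Analyzing: 7629 targets (561 packages loaded, 35346 targets configured)
Analyzing: 7629 targets (561 packages loaded, 35347 targets configured)
Analyzing: 7629 targets (561 packages loaded, 35347 targets configured)
EOF
```

For the lines above this prints 129, 47, 14, 5, 1, 1 and 0, one value per line, so progress drops from over a hundred targets per update to none.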

How long should the TensorFlow unit tests take to run? Has anyone managed to run them in Azure, and if so, which image and machine size did you use?
