Open MPI Version: v4.0.0
Output of ompi_info | head on both machines:
mpiuser@s2:~$ ssh s1 ompi_info | head
Package: Open MPI mpiuser@s1 Distribution
Open MPI: 4.0.0
Open MPI repo revision: v4.0.0
Open MPI release date: Nov 12, 2018
Open RTE: 4.0.0
Open RTE repo revision: v4.0.0
Open RTE release date: Nov 12, 2018
OPAL: 4.0.0
OPAL repo revision: v4.0.0
OPAL release date: Nov 12, 2018
mpiuser@s2:~$ ompi_info | head
Package: Open MPI mpiuser@s2 Distribution
Open MPI: 4.0.0
Open MPI repo revision: v4.0.0
Open MPI release date: Nov 12, 2018
Open RTE: 4.0.0
Open RTE repo revision: v4.0.0
Open RTE release date: Nov 12, 2018
OPAL: 4.0.0
OPAL repo revision: v4.0.0
OPAL release date: Nov 12, 2018
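The comparison above was done by eye. A minimal sketch of the same check as a script follows, assuming `ompi_info` is on the PATH of the non-interactive ssh shell, as it is in the output shown. The helper names `ompi_version` and `compare_with` are hypothetical and are not part of Open MPI.

```shell
#!/bin/sh
# Hypothetical helper: read ompi_info output on stdin and print the value of
# the "Open MPI:" field. The pattern does not match "Open MPI repo revision:"
# or the "Package:" line.
ompi_version() {
    awk -F': *' '/^ *Open MPI:/ { print $2; exit }'
}

# Hypothetical helper: compare the local Open MPI version with the version
# reported by a remote host over non-interactive ssh.
# $1 is the remote host name (for example s1 from this report).
compare_with() {
    local_v=$(ompi_info | ompi_version)
    remote_v=$(ssh "$1" ompi_info | ompi_version)
    if [ "$local_v" = "$remote_v" ]; then
        echo "match: $local_v"
    else
        echo "mismatch: local=$local_v remote=$remote_v"
    fi
}
```

Run from s2, this would be invoked as `compare_with s1`. It compares only the version string, so it cannot tell apart two builds of the same release.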
Both are installed using a common shared network.
When running the command on s1 (master):
mpiuser@s1:/disk3/cloud/openmpi-4.0.0/examples$ mpirun -n 2 ./hello
Hello, world, I am 1 of 2, (Open MPI v4.0.0, package: Open MPI mpiuser@s1 Distribution, ident: 4.0.0, repo rev: v4.0.0, Nov 12, 2018, 112)
Hello, world, I am 0 of 2, (Open MPI v4.0.0, package: Open MPI mpiuser@s1 Distribution, ident: 4.0.0, repo rev: v4.0.0, Nov 12, 2018, 112)
When running the command separately on s2 (slave):
mpiuser@s2:~/cloud$ mpirun -n 2 ./hello
Hello, world, I am 0 of 2, (Open MPI v4.0.0, package: Open MPI mpiuser@s2 Distribution, ident: 4.0.0, repo rev: v4.0.0, Nov 12, 2018, 113)
Hello, world, I am 1 of 2, (Open MPI v4.0.0, package: Open MPI mpiuser@s2 Distribution, ident: 4.0.0, repo rev: v4.0.0, Nov 12, 2018, 113)
Output of the hwloc check on s2:
mpiuser@s2:~/cloud/openmpi-4.0.0$ dpkg -l | grep hwloc
mpiuser@s2:~/cloud/openmpi-4.0.0$
Output of the hwloc check on s1:
mpiuser@s1:/disk3/cloud/openmpi-4.0.0/examples$ dpkg -l | grep hwloc
mpiuser@s1:/disk3/cloud/openmpi-4.0.0/examples$
Both machines run Ubuntu 16.04.5 LTS, but running the command across both hosts produces the following error:
mpiuser@s1:/disk3/cloud/openmpi-4.0.0/examples$ mpirun -host s1,s2 ./hello
[s2:26283] [[40517,0],1] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file grpcomm_direct.c at line 355
--------------------------------------------------------------------------
An internal error has occurred in ORTE:
[[40517,0],1] FORCE-TERMINATE AT Data unpack would read past end of buffer:-26 - error grpcomm_direct.c(359)
This is something that should be reported to the developers.
--------------------------------------------------------------------------