RuntimeError: Attempting to deserialize object on a CUDA device, but torch.cuda.is_available() is False; DataLoader error; and setting pin_memory = False
0 votes
/ 29 May 2020

I am a beginner trying to evaluate this paper on a video object segmentation network.

I am following the instructions at https://github.com/seoungwugoh/STM

It states the requirements as follows:

python 3.6
pytorch 1.0.1.post2
numpy, opencv, pillow

I was unable to install that version of pytorch, so I installed pytorch 1.5 from conda-forge,

and I run this command, on either Windows 10 or Ubuntu 16.04, using Anaconda:

(STMVOS) oneworld@oneworld:~/Documents/VideoObjectSegmentation/STMVOS$ python eval_DAVIS.py -g '1' -s val -y 16 -D ../DAVISSemiSupervisedTrainVal480

after running pip install matplotlib and pip install tqdm ...

I get the following error message:

Space-time Memory Networks: initialized.
STM : Testing on DAVIS
Loading weights: STM_weights.pth
Traceback (most recent call last):
  File "eval_DAVIS.py", line 111, in <module>
    model.load_state_dict(torch.load(pth_path))
  File "/home/oneworld/anaconda3/envs/STMVOS/lib/python3.8/site-packages/torch/serialization.py", line 593, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/oneworld/anaconda3/envs/STMVOS/lib/python3.8/site-packages/torch/serialization.py", line 773, in _legacy_load
    result = unpickler.load()
  File "/home/oneworld/anaconda3/envs/STMVOS/lib/python3.8/site-packages/torch/serialization.py", line 729, in persistent_load
    deserialized_objects[root_key] = restore_location(obj, location)
  File "/home/oneworld/anaconda3/envs/STMVOS/lib/python3.8/site-packages/torch/serialization.py", line 178, in default_restore_location
    result = fn(storage, location)
  File "/home/oneworld/anaconda3/envs/STMVOS/lib/python3.8/site-packages/torch/serialization.py", line 154, in _cuda_deserialize
    device = validate_cuda_device(location)
  File "/home/oneworld/anaconda3/envs/STMVOS/lib/python3.8/site-packages/torch/serialization.py", line 138, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

My graphics card driver, and the system and packages, are as follows:

(STMVOS) oneworld@oneworld:~/Documents/VideoObjectSegmentation/STMVOS$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.64.00    Driver Version: 440.64.00    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 00000000:01:00.0  On |                  N/A |
| 26%   34C    P8    10W / 151W |    392MiB /  8118MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1247      G   /usr/lib/xorg/Xorg                           229MiB |
|    0      2239      G   compiz                                       126MiB |
|    0      9385      G   /usr/lib/firefox/firefox                       2MiB |
|    0     11686      G   /proc/self/exe                                30MiB |
+-----------------------------------------------------------------------------+

I also tried this:

(STMVOS) oneworld@oneworld:~/Documents/VideoObjectSegmentation/STMVOS$ python -c 'import torch; print(torch.rand(2,3).cuda())'

tensor([[0.9178, 0.8239, 0.4761], [0.9429, 0.8877, 0.0097]], device='cuda:0')

This shows that CUDA is working here.

(STMVOS) oneworld@oneworld:~/Documents/VideoObjectSegmentation/STMVOS$ conda info
    active environment : STMVOS
    active env location : /home/oneworld/anaconda3/envs/STMVOS
            shell level : 1
       user config file : /home/oneworld/.condarc
 populated config files : 
          conda version : 4.8.2
    conda-build version : 3.18.11
         python version : 3.7.6.final.0
       virtual packages : __cuda=10.2
                          __glibc=2.23
       base environment : /home/oneworld/anaconda3  (writable)
           channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/r/linux-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /home/oneworld/anaconda3/pkgs
                          /home/oneworld/.conda/pkgs
       envs directories : /home/oneworld/anaconda3/envs
                          /home/oneworld/.conda/envs
               platform : linux-64
             user-agent : conda/4.8.2 requests/2.22.0 CPython/3.7.6 Linux/4.4.0-179-generic ubuntu/16.04.6 glibc/2.23
                UID:GID : 1000:1000
             netrc file : None
           offline mode : False
(STMVOS) oneworld@oneworld:~/Documents/VideoObjectSegmentation/STMVOS$ conda list

# packages in environment at /home/oneworld/anaconda3/envs/STMVOS:

# Name  Version  Build  Channel
_libgcc_mutex 0.1 main
blas 1.0 mkl
bzip2 1.0.8 h516909a_2 conda-forge
ca-certificates 2020.4.5.1 hecc5488_0 conda-forge
cairo 1.16.0 hcf35c78_1003 conda-forge
certifi 2020.4.5.1 py38_0
cudatoolkit 10.2.89 hfd86e86_1
cycler 0.10.0 pypi_0 pypi
dbus 1.13.6 he372182_0 conda-forge
expat 2.2.9 he1b5a44_2 conda-forge
ffmpeg 4.2.3 h167e202_0 conda-forge
fontconfig 2.13.1 h86ecdb6_1001 conda-forge
freetype 2.9.1 h8a8886c_1
gettext 0.19.8.1 hc5be6a0_1002 conda-forge
giflib 5.2.1 h516909a_2 conda-forge
glib 2.64.3 h6f030ca_0 conda-forge
gmp 6.2.0 he1b5a44_2 conda-forge
gnutls 3.6.5 hd3a4fd2_1002 conda-forge
graphite2 1.3.13 he1b5a44_1001 conda-forge
gst-plugins-base 1.14.5 h0935bb2_2 conda-forge
gstreamer 1.14.5 h36ae1b5_2 conda-forge
harfbuzz 2.4.0 h9f30f68_3 conda-forge
hdf5 1.10.6 nompi_h3c11f04_100 conda-forge
icu 64.2 he1b5a44_1 conda-forge
intel-openmp 2020.1 217
jasper 1.900.1 h07fcdf6_1006 conda-forge
jpeg 9c h14c3975_1001 conda-forge
kiwisolver 1.2.0 pypi_0 pypi
lame 3.100 h14c3975_1001 conda-forge
ld_impl_linux-64 2.33.1 h53a641e_7
libblas 3.8.0 15_mkl conda-forge
libcblas 3.8.0 15_mkl conda-forge
libclang 9.0.1 default_hde54327_0 conda-forge
libedit 3.1.20181209 hc058e9b_0
libffi 3.2.1 he1b5a44_1007 conda-forge
libgcc-ng 9.1.0 hdf63c60_0
libgfortran-ng 7.3.0 hdf63c60_0
libiconv 1.15 h516909a_1006 conda-forge
liblapack 3.8.0 15_mkl conda-forge
liblapacke 3.8.0 15_mkl conda-forge
libllvm9 9.0.1 he513fc3_1 conda-forge
libopencv 4.2.0 py38_6 conda-forge
libpng 1.6.37 hbc83047_0
libstdcxx-ng 9.1.0 hdf63c60_0
libtiff 4.1.0 h2733197_0
libuuid 2.32.1 h14c3975_1000 conda-forge
libwebp 1.0.2 h56121f0_5 conda-forge
libxcb 1.13 h14c3975_1002 conda-forge
libxkbcommon 0.10.0 he1b5a44_0 conda-forge
libxml2 2.9.10 hee79883_0 conda-forge
matplotlib 3.2.1 pypi_0 pypi
mkl 2020.1 217
mkl-service 2.3.0 py38he904b0f_0
mkl_fft 1.0.15 py38ha843d7b_0
mkl_random 1.1.1 py38h0573a6f_0
ncurses 6.2 he6710b0_1
nettle 3.4.1 h1bed415_1002 conda-forge
ninja 1.9.0 py38hfd86e86_0
nspr 4.25 he1b5a44_0 conda-forge
nss 3.47 he751ad9_0 conda-forge
numpy 1.18.1 py38h4f9e942_0
numpy-base 1.18.1 py38hde5b4d6_1
olefile 0.46 py_0
opencv 4.2.0 py38_6 conda-forge
openh264 2.1.1 h8b12597_0 conda-forge
openssl 1.1.1g h516909a_0 conda-forge
pcre 8.44 he1b5a44_0 conda-forge
pillow 7.1.2 py38hb39fc2d_0
pip 20.0.2 py38_3
pixman 0.38.0 h516909a_1003 conda-forge
pthread-stubs 0.4 h14c3975_1001 conda-forge
py-opencv 4.2.0 py38h23f93f0_6 conda-forge
pyparsing 2.4.7 pypi_0 pypi
python 3.8.1 h0371630_1
python-dateutil 2.8.1 pypi_0 pypi
python_abi 3.8 1_cp38 conda-forge
pytorch 1.5.0 py3.8_cuda10.2.89_cudnn7.6.5_0 pytorch
qt 5.12.5 hd8c4c69_1 conda-forge
readline 7.0 h7b6447c_5
setuptools 46.4.0 py38_0
six 1.14.0 py38_0
sqlite 3.31.1 h62c20be_1
tk 8.6.8 hbc83047_0
torchvision 0.6.0 py38_cu102 pytorch
tqdm 4.46.0 pypi_0 pypi
wheel 0.34.2 py38_0
x264 1!152.20180806 h14c3975_0 conda-forge
xorg-kbproto 1.0.7 h14c3975_1002 conda-forge
xorg-libice 1.0.10 h516909a_0 conda-forge
xorg-libsm 1.2.3 h84519dc_1000 conda-forge
xorg-libx11 1.6.9 h516909a_0 conda-forge
xorg-libxau 1.0.9 h14c3975_0 conda-forge
xorg-libxdmcp 1.1.3 h516909a_0 conda-forge
xorg-libxext 1.3.4 h516909a_0 conda-forge
xorg-libxrender 0.9.10 h516909a_1002 conda-forge
xorg-renderproto 0.11.1 h14c3975_1002 conda-forge
xorg-xextproto 7.3.0 h14c3975_1002 conda-forge
xorg-xproto 7.0.31 h14c3975_1007 conda-forge
xz 5.2.5 h7b6447c_0
zlib 1.2.11 h7b6447c_3
zstd 1.3.7 h0b5b093_0

The code where it gets stuck in eval_DAVIS.py is the following:

print('Loading weights:', pth_path)
model.load_state_dict(torch.load(pth_path))

I am using Ubuntu 16.04, although I have tried a similar setup on Windows 10 and got the same error messages.

Any help is much appreciated.

Kind regards

OneWorld

4 Answers

0 votes
/ 08 June 2020

Because of the suggestion in the Python error:

if __name__ == '__main__':
    freeze_support()

I added this line

if __name__ == '__main__':

above the line

for seq, V in enumerate(Testloader):

and indented that line and everything else below it.

Then everything worked through to the end of [bike-packing].

However, before [blackswan] it asked for scipy to be installed,

so I ran conda install scipy

and reran it, and it started going through the rest: [bmx-trees], [breakdance], etc.

The resulting eval_DAVIS.py file looked like this ...

from __future__ import division
import torch
from torch.autograd import Variable
from torch.utils import data

import torch.nn as nn
import torch.nn.functional as F
import torch.nn.init as init
import torch.utils.model_zoo as model_zoo
from torchvision import models

# general libs
import cv2
import matplotlib.pyplot as plt
from PIL import Image
import numpy as np
import math
import time
import tqdm
import os
import argparse
import copy


### My libs
from dataset import DAVIS_MO_Test
from model import STM


torch.set_grad_enabled(False) # Volatile

# def get_arguments():
#     parser = argparse.ArgumentParser(description="SST")
#     parser.add_argument("-g", type=str, help="0; 0,1; 0,3; etc", required=True)
#     parser.add_argument("-s", type=str, help="set", required=True)
#     parser.add_argument("-y", type=int, help="year", required=True)
#     parser.add_argument("-viz", help="Save visualization", action="store_true")
#     parser.add_argument("-D", type=str, help="path to data",default='/local/DATA')
#     return parser.parse_args()

# args = get_arguments()

# GPU = args.g
# YEAR = args.y
# SET = args.s
# VIZ = args.viz
# DATA_ROOT = args.D

GPU = '0'
YEAR = '17'
SET = 'val'
VIZ = 'store_true'
DATA_ROOT = '..\\DAVIS2017SemiSupervisedTrainVal480'

# Model and version
MODEL = 'STM'
print(MODEL, ': Testing on DAVIS')

os.environ['CUDA_VISIBLE_DEVICES'] = GPU
if torch.cuda.is_available():
    print('using Cuda devices, num:', torch.cuda.device_count())

if VIZ:
    print('--- Produce mask overaid video outputs. Evaluation will run slow.')
    print('--- Require FFMPEG for encoding, Check folder ./viz')


palette = Image.open(DATA_ROOT + '/Annotations/480p/blackswan/00000.png').getpalette()

def Run_video(Fs, Ms, num_frames, num_objects, Mem_every=None, Mem_number=None):
    # initialize storage tensors
    if Mem_every:
        to_memorize = [int(i) for i in np.arange(0, num_frames, step=Mem_every)]
    elif Mem_number:
        to_memorize = [int(round(i)) for i in np.linspace(0, num_frames, num=Mem_number+2)[:-1]]
    else:
        raise NotImplementedError

    Es = torch.zeros_like(Ms)
    Es[:,:,0] = Ms[:,:,0]

    for t in tqdm.tqdm(range(1, num_frames)):
        # memorize
        with torch.no_grad():
            prev_key, prev_value = model(Fs[:,:,t-1], Es[:,:,t-1], torch.tensor([num_objects])) 

        if t-1 == 0: # 
            this_keys, this_values = prev_key, prev_value # only prev memory
        else:
            this_keys = torch.cat([keys, prev_key], dim=3)
            this_values = torch.cat([values, prev_value], dim=3)

        # segment
        with torch.no_grad():
            logit = model(Fs[:,:,t], this_keys, this_values, torch.tensor([num_objects]))
        Es[:,:,t] = F.softmax(logit, dim=1)

        # update
        if t-1 in to_memorize:
            keys, values = this_keys, this_values

    pred = np.argmax(Es[0].cpu().numpy(), axis=0).astype(np.uint8)
    return pred, Es



Testset = DAVIS_MO_Test(DATA_ROOT, resolution='480p', imset='20{}/{}.txt'.format(YEAR,SET), single_object=(YEAR==16))
Testloader = data.DataLoader(Testset, batch_size=1, shuffle=False, num_workers=2, pin_memory=True)

model = nn.DataParallel(STM())
if torch.cuda.is_available():
    model.cuda()
model.eval() # turn-off BN

pth_path = 'STM_weights.pth'
print('Loading weights:', pth_path)
model.load_state_dict(torch.load(pth_path)) # , map_location=torch.device('cpu')

code_name = '{}_DAVIS_{}{}'.format(MODEL,YEAR,SET)
print('Start Testing:', code_name)

if torch.cuda.is_available() == False:
    print("********** CUDA is NOT available just before line of error **********")
else:
    print("********** CUDA is available, and working fine just before line of error ***********")

if __name__ == '__main__':

    for seq, V in enumerate(Testloader):
        Fs, Ms, num_objects, info = V
        seq_name = info['name'][0]
        num_frames = info['num_frames'][0].item()
        print('[{}]: num_frames: {}, num_objects: {}'.format(seq_name, num_frames, num_objects[0][0]))

        pred, Es = Run_video(Fs, Ms, num_frames, num_objects, Mem_every=5, Mem_number=None)

        # Save results for quantitative eval ######################
        test_path = os.path.join('./test', code_name, seq_name)
        if not os.path.exists(test_path):
            os.makedirs(test_path)
        for f in range(num_frames):
            img_E = Image.fromarray(pred[f])
            img_E.putpalette(palette)
            img_E.save(os.path.join(test_path, '{:05d}.png'.format(f)))

        if VIZ:
            from helpers import overlay_davis
            # visualize results #######################
            viz_path = os.path.join('./viz/', code_name, seq_name)
            if not os.path.exists(viz_path):
                os.makedirs(viz_path)

            for f in range(num_frames):
                pF = (Fs[0,:,f].permute(1,2,0).numpy() * 255.).astype(np.uint8)
                pE = pred[f]
                canvas = overlay_davis(pF, pE, palette)
                canvas = Image.fromarray(canvas)
                canvas.save(os.path.join(viz_path, 'f{}.jpg'.format(f)))

            vid_path = os.path.join('./viz/', code_name, '{}.mp4'.format(seq_name))
            frame_path = os.path.join('./viz/', code_name, seq_name, 'f%d.jpg')
            os.system('ffmpeg -framerate 10 -i {} {} -vcodec libx264 -crf 10  -pix_fmt yuv420p  -nostats -loglevel 0 -y'.format(frame_path, vid_path))

However ...

Eventually I got an out-of-memory error:

[car-shadow]: num_frames: 40, num_objects: 1
100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 39/39 [00:09<00:00,  3.98it/s]
Traceback (most recent call last):
  File "eval_DAVIS.py", line 129, in <module>
    for seq, V in enumerate(Testloader):
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\site-packages\torch\utils\data\dataloader.py", line 345, in __next__
    data = self._next_data()
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\site-packages\torch\utils\data\dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\site-packages\torch\utils\data\dataloader.py", line 881, in _process_data
    data.reraise()
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\site-packages\torch\_utils.py", line 395, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in pin memory thread for device 0.
Original Traceback (most recent call last):
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\site-packages\torch\utils\data\_utils\pin_memory.py", line 31, in _pin_memory_loop
    data = pin_memory(data)
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\site-packages\torch\utils\data\_utils\pin_memory.py", line 55, in pin_memory
    return [pin_memory(sample) for sample in data]
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\site-packages\torch\utils\data\_utils\pin_memory.py", line 55, in <listcomp>
    return [pin_memory(sample) for sample in data]
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\site-packages\torch\utils\data\_utils\pin_memory.py", line 47, in pin_memory
    return data.pin_memory()
RuntimeError: cuda runtime error (2) : out of memory at ..\aten\src\THC\THCCachingHostAllocator.cpp:278

so I changed the Testloader from pin_memory=True to False, at around line 108 in eval_DAVIS.py,

Testloader = data.DataLoader(Testset, batch_size=1, shuffle=False, num_workers=2, pin_memory=False)

and re-ran it.

Everything now seems to work fine.
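
An alternative I did not try, but which should avoid both the pinned-memory crash and the Windows worker-process problem at the same time (a sketch, not something from the STM repo), is to disable worker processes entirely:

# Sketch: with num_workers=0 all data loading happens in the main process, so no
# worker processes are spawned (no freeze_support()/bootstrapping issue on Windows)
# and no separate pin-memory thread tries to allocate pinned host memory.
Testloader = data.DataLoader(Testset, batch_size=1, shuffle=False,
                             num_workers=0, pin_memory=False)

The trade-off is somewhat slower data loading, since frames are then read and decoded in the same process that runs the network.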

0 votes
/ 01 June 2020

So, following the error and the recommendation that Python threw out:

RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU

I tried editing the code at line 111 in eval_DAVIS.py from this

model.load_state_dict(torch.load(pth_path))

to this

model.load_state_dict(torch.load(pth_path, map_location=torch.device('cpu')))

and then re-ran the code:

(STMVOS) C:\Users\OneWorld\Documents\DeepLearning\VideoObjectSegmentation\STMVOS>python eval_DAVIS.py -g '0' -s val -y 17 -D C:\Users\OneWorld\Documents\DeepLearning\VideoObjectSegmentation\DAVIS2017SemiSupervisedTrainVal480

which gets past loading the weights:

Space-Time Memory Networks: initialized.
STM : Testing on DAVIS
Loading weights: STM_weights.pth
Start Testing: STM_DAVIS_17val
Space-Time Memory Networks: initialized.
STM : Testing on DAVIS
Space-Time Memory Networks: initialized.
STM : Testing on DAVIS
Loading weights: STM_weights.pth
Loading weights: STM_weights.pth

However, when it starts testing, the following error appears:

Start Testing: STM_DAVIS_17val
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\multiprocessing\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\multiprocessing\spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\OneWorld\Documents\DeepLearning\VideoObjectSegmentation\STMVOS\eval_DAVIS.py", line 117, in <module>
    for seq, V in enumerate(Testloader):
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\site-packages\torch\utils\data\dataloader.py", line 279, in __iter__
    return _MultiProcessingDataLoaderIter(self)
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\site-packages\torch\utils\data\dataloader.py", line 719, in __init__
    w.start()
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\multiprocessing\context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\multiprocessing\context.py", line 326, in _Popen
    return Popen(process_obj)
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\multiprocessing\popen_spawn_win32.py", line 45, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\multiprocessing\spawn.py", line 154, in get_preparation_data
    _check_not_importing_main()
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\multiprocessing\spawn.py", line 134, in _check_not_importing_main
    raise RuntimeError('''
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
Start Testing: STM_DAVIS_17val
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\multiprocessing\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\multiprocessing\spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\OneWorld\Documents\DeepLearning\VideoObjectSegmentation\STMVOS\eval_DAVIS.py", line 117, in <module>
    for seq, V in enumerate(Testloader):
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\site-packages\torch\utils\data\dataloader.py", line 279, in __iter__
    return _MultiProcessingDataLoaderIter(self)
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\site-packages\torch\utils\data\dataloader.py", line 719, in __init__
    w.start()
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\multiprocessing\context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\multiprocessing\context.py", line 326, in _Popen
    return Popen(process_obj)
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\multiprocessing\popen_spawn_win32.py", line 45, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\multiprocessing\spawn.py", line 154, in get_preparation_data
    _check_not_importing_main()
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\multiprocessing\spawn.py", line 134, in _check_not_importing_main
    raise RuntimeError('''
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
Traceback (most recent call last):
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\site-packages\torch\utils\data\dataloader.py", line 761, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\multiprocessing\queues.py", line 108, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "eval_DAVIS.py", line 117, in <module>
    for seq, V in enumerate(Testloader):
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\site-packages\torch\utils\data\dataloader.py", line 345, in __next__
    data = self._next_data()
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\site-packages\torch\utils\data\dataloader.py", line 841, in _next_data
    idx, data = self._get_data()
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\site-packages\torch\utils\data\dataloader.py", line 808, in _get_data
    success, data = self._try_get_data()
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\site-packages\torch\utils\data\dataloader.py", line 774, in _try_get_data
    raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str))
RuntimeError: DataLoader worker (pid(s) 2412, 15788) exited unexpectedly

That was using Anaconda, so the error below is from just using the Windows command console and pip:

(env) C:\Users\OneWorld\Documents\DeepLearning\VideoObjectSegmentation\STMVOS>python eval_DAVIS.py -g '0' -s val -y 17 -D C:\Users\OneWorld\Documents\DeepLearning\VideoObjectSegmentation\DAVIS2017SemiSupervisedTrainVal480
Space-time Memory Networks: initialized.
STM : Testing on DAVIS
Loading weights: STM_weights.pth
Start Testing: STM_DAVIS_17val
Space-time Memory Networks: initialized.
STM : Testing on DAVIS
Space-time Memory Networks: initialized.
STM : Testing on DAVIS
Loading weights: STM_weights.pth
Loading weights: STM_weights.pth
Start Testing: STM_DAVIS_17val
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\multiprocessing\spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    run_name="__mp_main__")
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\OneWorld\Documents\DeepLearning\VideoObjectSegmentation\STMVOS\eval_DAVIS.py", line 117, in <module>
    for seq, V in enumerate(Testloader):
  File "C:\Users\OneWorld\Documents\DeepLearning\VideoObjectSegmentation\STMVOS\env\lib\site-packages\torch\utils\data\dataloader.py", line 279, in __iter__
    return _MultiProcessingDataLoaderIter(self)
  File "C:\Users\OneWorld\Documents\DeepLearning\VideoObjectSegmentation\STMVOS\env\lib\site-packages\torch\utils\data\dataloader.py", line 719, in __init__
    w.start()
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\multiprocessing\popen_spawn_win32.py", line 46, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
    _check_not_importing_main()
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
    is not going to be frozen to produce an executable.''')
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
Start Testing: STM_DAVIS_17val
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\multiprocessing\spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    run_name="__mp_main__")
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\OneWorld\Documents\DeepLearning\VideoObjectSegmentation\STMVOS\eval_DAVIS.py", line 117, in <module>
    for seq, V in enumerate(Testloader):
  File "C:\Users\OneWorld\Documents\DeepLearning\VideoObjectSegmentation\STMVOS\env\lib\site-packages\torch\utils\data\dataloader.py", line 279, in __iter__
    return _MultiProcessingDataLoaderIter(self)
  File "C:\Users\OneWorld\Documents\DeepLearning\VideoObjectSegmentation\STMVOS\env\lib\site-packages\torch\utils\data\dataloader.py", line 719, in __init__
    w.start()
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\multiprocessing\popen_spawn_win32.py", line 46, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
    _check_not_importing_main()
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
    is not going to be frozen to produce an executable.''')
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
Traceback (most recent call last):
  File "C:\Users\OneWorld\Documents\DeepLearning\VideoObjectSegmentation\STMVOS\env\lib\site-packages\torch\utils\data\dataloader.py", line 761, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "C:\Users\OneWorld\AppData\Local\Programs\Python\Python37\lib\multiprocessing\queues.py", line 105, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "eval_DAVIS.py", line 117, in <module>
    for seq, V in enumerate(Testloader):
  File "C:\Users\OneWorld\Documents\DeepLearning\VideoObjectSegmentation\STMVOS\env\lib\site-packages\torch\utils\data\dataloader.py", line 345, in __next__
    data = self._next_data()
  File "C:\Users\OneWorld\Documents\DeepLearning\VideoObjectSegmentation\STMVOS\env\lib\site-packages\torch\utils\data\dataloader.py", line 841, in _next_data
    idx, data = self._get_data()
  File "C:\Users\OneWorld\Documents\DeepLearning\VideoObjectSegmentation\STMVOS\env\lib\site-packages\torch\utils\data\dataloader.py", line 808, in _get_data
    success, data = self._try_get_data()
  File "C:\Users\OneWorld\Documents\DeepLearning\VideoObjectSegmentation\STMVOS\env\lib\site-packages\torch\utils\data\dataloader.py", line 774, in _try_get_data
    raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str))
RuntimeError: DataLoader worker (pid(s) 11448, 16644) exited unexpectedly

I also put this code in a small file called CUDATest.py, to check that torch would perform a simple matrix multiplication.

# testing CUDA
import torch
device = torch.cuda.current_device()

n = 10
# 1D inputs, same as torch.dot
a = torch.rand(n).to(device)
b = torch.rand(n).to(device)
result = torch.matmul(a, b) # torch.Size([])

print("matmul result = ", result)

I ran the code as follows:

(env)C:\Users\OneWorld\Documents\DeepLearning\VideoObjectSegmentation\STMVOS>python CUDATest.py

The result was as follows:

matmul result =  tensor(2.4603, device='cuda:0')

This suggests that my CUDA and PyTorch are working fine.
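
For completeness, a device-agnostic way of loading the weights (a sketch, assuming the same model and pth_path variables as in eval_DAVIS.py) avoids having to edit that line depending on the machine:

import torch

# Map the checkpoint onto the GPU when CUDA is actually usable, otherwise onto
# the CPU, so the same eval script can run on both kinds of machine.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.load_state_dict(torch.load(pth_path, map_location=device))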

0 votes
/ 08 June 2020

I changed my python version from 3.8 to 3.6, using conda-forge to install it and to reinstall matplotlib.

I ran the eval_DAVIS.py code in debug mode in MS VS Code instead, commenting out the command-line argument parsing as shown below:

# def get_arguments():
#     parser = argparse.ArgumentParser(description="SST")
#     parser.add_argument("-g", type=str, help="0; 0,1; 0,3; etc", required=True)
#     parser.add_argument("-s", type=str, help="set", required=True)
#     parser.add_argument("-y", type=int, help="year", required=True)
#     parser.add_argument("-viz", help="Save visualization", action="store_true")
#     parser.add_argument("-D", type=str, help="path to data",default='/local/DATA')
#     return parser.parse_args()

# args = get_arguments()

# GPU = args.g
# YEAR = args.y
# SET = args.s
# VIZ = args.viz
# DATA_ROOT = args.D

GPU = '0'
YEAR = '17'
SET = 'val'
VIZ = 'store_true'
DATA_ROOT = '..\\DAVIS2017SemiSupervisedTrainVal480'

Above the line

for seq, V in enumerate(Testloader):

I wrote this to check whether there was a problem with CUDA availability:

if torch.cuda.is_available() == False:
    print("********** CUDA is NOT available just before line of error **********")
else:
    print("********** CUDA is available, and working fine just before line of error ***********")

This produces the following terminal log:

Space-time Memory Networks: initialized.
STM : Testing on DAVIS
using Cuda devices, num: 1
--- Produce mask overaid video outputs. Evaluation will run slow.
--- Require FFMPEG for encoding, Check folder ./viz
Loading weights: STM_weights.pth
Start Testing: STM_DAVIS_17val
********** CUDA is available, and working fine just before line of error ***********
Space-time Memory Networks: initialized.
STM : Testing on DAVIS
using Cuda devices, num: 1
--- Produce mask overaid video outputs. Evaluation will run slow.
--- Require FFMPEG for encoding, Check folder ./viz
Space-time Memory Networks: initialized.
STM : Testing on DAVIS
using Cuda devices, num: 1
--- Produce mask overaid video outputs. Evaluation will run slow.
--- Require FFMPEG for encoding, Check folder ./viz
Loading weights: STM_weights.pth
Loading weights: STM_weights.pth
Start Testing: STM_DAVIS_17val
********** CUDA is available, and working fine just before line of error ***********
Start Testing: STM_DAVIS_17val
********** CUDA is available, and working fine just before line of error ***********

It reaches this line of code

for seq, V in enumerate(Testloader):

and throws the following error message:

Exception has occurred: RuntimeError

        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
  File "C:\Users\OneWorld\Documents\DeepLearning\VideoObjectSegmentation\STMVOS\eval_DAVIS.py", line 127, in <module>
    for seq, V in enumerate(Testloader):
  File "<string>", line 1, in <module>

So this got rid of the CUDA error, without needing to switch the code over to the CPU.

However, it still raises the freeze_support() error ...

and the logs show a DataLoader error:

Traceback (most recent call last):
  File "eval_DAVIS.py", line 127, in <module>
    for seq, V in enumerate(Testloader):
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\site-packages\torch\utils\data\dataloader.py", line 345, in __next__
    data = self._next_data()
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\site-packages\torch\utils\data\dataloader.py", line 841, in _next_data
    idx, data = self._get_data()
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\site-packages\torch\utils\data\dataloader.py", line 798, in _get_data
    success, data = self._try_get_data()
  File "C:\Users\OneWorld\anaconda3\envs\STMVOS\lib\site-packages\torch\utils\data\dataloader.py", line 774, in _try_get_data
    raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str))
RuntimeError: DataLoader worker (pid(s) 15916, 1232) exited unexpectedly
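
For reference, the structure the freeze_support() message is asking for (a sketch of the idea, using the Testloader and the evaluation loop already defined in eval_DAVIS.py) looks like this; it is essentially what the pin_memory answer above ended up doing:

def main():
    # The DataLoader with num_workers=2 spawns worker processes on Windows; they
    # re-import eval_DAVIS.py, so anything that must run only once (in particular
    # iterating the Testloader) has to live under the __main__ guard.
    for seq, V in enumerate(Testloader):
        ...  # per-sequence evaluation, as in the loop shown earlier

if __name__ == '__main__':
    main()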

0 votes
/ 29 May 2020

I have just created a README.md file for running this project successfully; it is here: Install PyTorch via pip to run the STM Paper. I tested on Windows 10 with CUDA version 10.1. Just follow that README.md step by step and you should be good to go.

Your PyTorch install command may differ depending on your system configuration; get the install command as shown in the image below:

[image: PyTorch installation via pip]

Your requirements.txt file should look like this:

[image: requirements.txt file]

NOTE: I did not do anything with [path/to/DAVIS] or anything else. You should be able to run the eval_DAVIS.py script without installation errors, and that is all I tested. You should also be fine on Ubuntu; just use the corresponding command from the README.md.

Happy coding!

...