Getting URLError while running Python code that fetches URLs on Ubuntu - PullRequest
1 vote
/ February 9, 2020

Below is the stack trace from the terminal on Ubuntu. Even my Anaconda takes far too long to open (about 20 minutes).

  Traceback (most recent call last):
  File "/home/narendra/anaconda3/lib/python3.7/urllib/request.py", line 1317, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "/home/narendra/anaconda3/lib/python3.7/http/client.py", line 1244, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/home/narendra/anaconda3/lib/python3.7/http/client.py", line 1290, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/home/narendra/anaconda3/lib/python3.7/http/client.py", line 1239, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/home/narendra/anaconda3/lib/python3.7/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/home/narendra/anaconda3/lib/python3.7/http/client.py", line 966, in send
    self.connect()
  File "/home/narendra/anaconda3/lib/python3.7/http/client.py", line 1406, in connect
    super().connect()
  File "/home/narendra/anaconda3/lib/python3.7/http/client.py", line 938, in connect
    (self.host,self.port), self.timeout, self.source_address)
  File "/home/narendra/anaconda3/lib/python3.7/socket.py", line 727, in create_connection
    raise err
  File "/home/narendra/anaconda3/lib/python3.7/socket.py", line 716, in create_connection
    sock.connect(sa)
TimeoutError: [Errno 110] Connection timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "TASK_1.py", line 23, in <module>
    response = urllib.request.urlopen(line,context=gcontext)
  File "/home/narendra/anaconda3/lib/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/home/narendra/anaconda3/lib/python3.7/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/home/narendra/anaconda3/lib/python3.7/urllib/request.py", line 543, in _open
    '_open', req)
  File "/home/narendra/anaconda3/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/home/narendra/anaconda3/lib/python3.7/urllib/request.py", line 1360, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/home/narendra/anaconda3/lib/python3.7/urllib/request.py", line 1319, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 110] Connection timed out>
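
The second traceback shows `urllib` wrapping the low-level `TimeoutError` in a `URLError`; the original exception stays available as the `reason` attribute. Below is a minimal sketch of inspecting it. The helper name `describe_url_error` is hypothetical, and the error is constructed by hand here rather than produced by a real request:

```python
import errno
import urllib.error


def describe_url_error(exc):
    """Return a short description of why a URLError was raised."""
    reason = exc.reason
    # reason is usually an OSError subclass (such as TimeoutError),
    # but it can also be a plain string.
    if isinstance(reason, OSError) and reason.errno == errno.ETIMEDOUT:
        return "connection timed out"
    return f"other failure: {reason}"


# Simulate the error from the traceback without touching the network.
simulated = urllib.error.URLError(
    TimeoutError(errno.ETIMEDOUT, "Connection timed out")
)
print(describe_url_error(simulated))
```

On Linux `errno.ETIMEDOUT` is 110, which matches the `[Errno 110]` in the traceback: the TCP connection to the host was never established, so the failure happens before any TLS or HTTP exchange.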

Below is my code.

This code saves URL data to files: it picks URLs one by one from url.txt, then downloads the full page content of each URL.

import urllib.request, urllib.error, urllib.parse
import io
import ssl
#localhost, 127.0.0.0/8, ::1, 10.0.0.0/8
# Read url.txt line by line using readline().
file1 = open("url.txt", "r")
count = 0
# A bare SSLContext() does not verify certificates.
gcontext = ssl.SSLContext()
for i in range(18):
    count += 1
    # Get the next line from the file.
    line = file1.readline()
    # An empty result means the end of the file is reached.
    if not line:
        break
    # Strip the trailing newline before passing the URL to urlopen.
    response = urllib.request.urlopen(line.strip(), context=gcontext)
    # read() returns bytes; decode them before writing to a text file.
    webContent = response.read().decode('utf-8', errors='replace')
    with io.open("file_" + str(i) + ".txt", 'w', encoding='utf-8') as f:
        f.write(webContent)
file1.close()
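
One way to keep a single unreachable host from aborting the whole run is to pass an explicit `timeout` and catch the error per URL. The following is only a sketch under assumptions: the function name `fetch_pages`, the 10-second default, and the injectable `opener` parameter are hypothetical choices, and it uses `ssl.create_default_context()` (which verifies certificates) in place of the bare `ssl.SSLContext()` from the question. A timeout will not fix an unreachable host or a misconfigured proxy; it only makes the failure faster and easier to report.

```python
import ssl
import urllib.error
import urllib.request


def fetch_pages(urls, timeout=10, opener=urllib.request.urlopen):
    """Fetch each URL; collect failures instead of stopping at the first one."""
    context = ssl.create_default_context()
    pages, failures = {}, {}
    for raw in urls:
        url = raw.strip()
        if not url:
            continue  # skip blank lines
        try:
            with opener(url, timeout=timeout, context=context) as response:
                body = response.read()  # bytes
            pages[url] = body.decode("utf-8", errors="replace")
        except OSError as exc:
            # URLError is a subclass of OSError, so this also covers
            # connection timeouts like the one in the traceback.
            failures[url] = str(exc)
    return pages, failures
```

It could be called as `pages, failures = fetch_pages(open("url.txt"))`, after which each entry of `pages` can be written to its own file and `failures` printed to see which hosts timed out.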