I'm trying to build a web crawler with Scrapy. My spider's code looks like this:
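import scrapy


class SpiSpider(scrapy.Spider):
    name = 'spi'
    start_urls = ['http://www.quotes.toscrape.com/']

    def parse(self, response):
        # '::text' is the pseudo-element Scrapy uses to select text nodes
        titles = response.css('title::text').extract()
        # A callback should yield a dict or item, not a tuple
        yield {'title': titles}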
When I tried to run it, I got the following errors:
2020-02-20 16:10:30 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2020-02-20 16:10:32 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET http://www.quotes.toscrape.com/robots.txt> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
2020-02-20 16:10:32 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET http://www.quotes.toscrape.com/robots.txt> (failed 2 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
2020-02-20 16:10:33 [scrapy.downloadermiddlewares.retry] DEBUG: Gave up retrying <GET http://www.quotes.toscrape.com/robots.txt> (failed 3 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
2020-02-20 16:10:33 [scrapy.downloadermiddlewares.robotstxt] ERROR: Error downloading <GET http://www.quotes.toscrape.com/robots.txt>: [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
Traceback (most recent call last):
  File "c:\users\adi\appdata\local\programs\python\python37-32\lib\site-packages\scrapy\core\downloader\middleware.py", line 44, in process_request
    defer.returnValue((yield download_func(request=request, spider=spider)))
twisted.web._newclient.ResponseNeverReceived: [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
2020-02-20 16:10:34 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET http://www.quotes.toscrape.com/> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
2020-02-20 16:10:35 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET http://www.quotes.toscrape.com/> (failed 2 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
2020-02-20 16:10:36 [scrapy.downloadermiddlewares.retry] DEBUG: Gave up retrying <GET http://www.quotes.toscrape.com/> (failed 3 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
2020-02-20 16:10:36 [scrapy.core.scraper] ERROR: Error downloading <GET http://www.quotes.toscrape.com/>
Traceback (most recent call last):
  File "c:\users\adi\appdata\local\programs\python\python37-32\lib\site-packages\scrapy\core\downloader\middleware.py", line 44, in process_request
    defer.returnValue((yield download_func(request=request, spider=spider)))
twisted.web._newclient.ResponseNeverReceived: [<twisted.python.failure.Failure twisted.internet.error.ConnectionDone: Connection was closed cleanly.>]
I tried changing the user agent and routing requests through a proxy, but neither solved the problem.
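For reference, this is roughly how a user agent and a proxy can be set in Scrapy. It is only a sketch: the user-agent string, the proxy address, and the helper name `build_request_kwargs` are placeholders made up for illustration, not the exact values I used. `USER_AGENT` and `ROBOTSTXT_OBEY` are standard Scrapy settings, and the built-in HttpProxyMiddleware reads the proxy from `request.meta['proxy']`.

```python
# Sketch only: the user-agent string, proxy address, and helper name are
# placeholders chosen for illustration.

# Settings that could go in settings.py or in the spider's custom_settings.
CUSTOM_SETTINGS = {
    "USER_AGENT": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "ROBOTSTXT_OBEY": True,
}

PROXY_URL = "http://127.0.0.1:8080"  # placeholder proxy address


def build_request_kwargs(url, proxy=None):
    """Return keyword arguments suitable for scrapy.Request(**kwargs)."""
    kwargs = {"url": url, "meta": {}}
    if proxy:
        # Scrapy's HttpProxyMiddleware picks the proxy up from meta["proxy"].
        kwargs["meta"]["proxy"] = proxy
    return kwargs


kwargs = build_request_kwargs("http://www.quotes.toscrape.com/", proxy=PROXY_URL)
print(kwargs)
```

Inside the spider this would presumably be wired in by setting `custom_settings = CUSTOM_SETTINGS` on the class and yielding `scrapy.Request(callback=self.parse, **kwargs)` from `start_requests`.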