I am getting this error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/twisted/internet/defer.py", line 653, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/usr/local/lib/python3.5/dist-packages/scrapy/commands/parse.py", line 195, in callback
    items, requests = self.run_callback(response, cb)
  File "/usr/local/lib/python3.5/dist-packages/scrapy/commands/parse.py", line 117, in run_callback
    for x in iterate_spider_output(cb(response)):
  File "/code/stack_gpl/stack_gpl/spiders/get_products.py", line 21, in parse_categories
    print(text)
UnicodeEncodeError: 'ascii' codec can't encode character '\u2013' in position 8: ordinal not in range(128)
2020-05-01 17:06:17 [scrapy.core.engine] INFO: Closing spider (finished)
I have already searched online for help, but none of the solutions I tried worked, including using .encode on the text.
Steps to reproduce this error:
scrapy startproject stack_gpl
cd stack_gpl
scrapy genspider get_products example.com
Copy this file to: ./stack_gpl/stack_gpl/spiders/get_products.py
# -*- coding: utf-8 -*-
import scrapy
from scrapy.shell import inspect_response


class GetProductsSpider(scrapy.Spider):
    name = 'get_products'
    allowed_domains = ['example.com']
    start_urls = ['https://example.com/']

    def parse(self, response):
        pass

    def parse_categories(self, response):
        # inspect_response(response, self)
        category_name = response.request.meta['category_product']
        products = response.xpath("//div[@id='content']//a/h3[@title]")
        for product in products:
            text = product.xpath(".//text()").get()
            print(text)
Run: scrapy parse --spider=get_products -c parse_categories --meta='{"category_product": "Themeforest"}' https://example.com/vendor/themeforest/
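The traceback points at print(text), not at Scrapy itself, so the failure can probably be reproduced without the spider. Below is a minimal sketch, assuming (not confirmed for this environment) that the output stream Python writes to is ASCII-encoded, which can be checked with sys.stdout.encoding. The helper write_with_encoding and the sample string are hypothetical and only stand in for stdout and for a product title containing an en dash (U+2013).

```python
import io


def write_with_encoding(text, encoding, errors="strict"):
    """Print `text` to an in-memory stream that encodes like a console
    with the given encoding, and return the bytes that were written.

    Hypothetical helper: it stands in for sys.stdout so the behaviour
    can be observed without changing the real terminal settings.
    """
    buffer = io.BytesIO()
    stream = io.TextIOWrapper(buffer, encoding=encoding, errors=errors, newline="\n")
    print(text, file=stream)
    stream.flush()
    data = buffer.getvalue()
    stream.detach()  # keep the wrapper from closing the buffer later
    return data


# Hypothetical product title containing an en dash (U+2013).
sample = "Themes \u2013 WordPress"

# An ASCII-encoded stream cannot represent U+2013, so print() fails.
try:
    write_with_encoding(sample, "ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# A UTF-8 stream accepts the same string unchanged.
utf8_bytes = write_with_encoding(sample, "utf-8")

# With an ASCII stream, a lenient error handler escapes the character
# instead of raising.
escaped_bytes = write_with_encoding(sample, "ascii", errors="backslashreplace")
```

If that assumption holds, the encoding of stdout is what matters, not the string itself, which may be why calling .encode on the text did not help. Setting the PYTHONIOENCODING environment variable to utf-8 before running scrapy parse is one commonly used way to change the stream encoding; I have not verified that it resolves this particular case.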