import requests
from bs4 import BeautifulSoup
from collections import OrderedDict


def info(novelname):
    # Build the novel URL from its title: spaces become hyphens.
    response = requests.get(
        'https://m.wuxiaworld.co/{}/'.format(novelname.replace(' ', '-')),
        # Send browser-like headers in a fixed order.
        headers=OrderedDict(
            (
                ("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7"),
                ("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"),
                ("Accept-Language", "en-US,en;q=0.5"),
                ("Accept-Encoding", "gzip, deflate"),
                ("Connection", "keep-alive"),
                ("Upgrade-Insecure-Requests", "1")
            )
        )
    )
    if response.status_code == 200:
        # html5lib parses the page the way a browser would.
        soup = BeautifulSoup(response.content, 'html5lib')
        # Print the text of every <p class="review"> element.
        for textp in soup.find_all('p', attrs={'class': 'review'}):
            print(textp.text.strip())


info('Castle of Black Iron')
The problem is your HTML parser ... using html5lib gives us:
Description
After the Catastrophe, every rule in the world was rewritten.
In the Age of Black Iron, steel, iron, steam engines and fighting force became the crux in which human beings depended on to survive.
A commoner boy by the name Zhang Tie was selected by the gods of fortune and was gifted a small tree which could constantly produce various marvelous fruits. At the same time, Zhang Tie was thrown into the flames of war, a three-hundred-year war between the humans and monsters on the vacant continent. Using crystals to tap into the potentials of the human body, one must cultivate to become stronger.
The thrilling legends of mysterious clans, secrets of Oriental fantasies, numerous treasures and legacies in the underground world — All in the Castle of Black Iron!
Citadel of Black Iron
黑铁之堡
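
To see why the parser choice matters, here is a minimal sketch that feeds the same deliberately broken snippet to several parsers and prints the tree each one builds. The snippet `<a></p>` and the helper name `render` are illustrative assumptions, not taken from the page being scraped; `lxml` and `html5lib` are optional packages, so the loop skips any that are not installed.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Deliberately invalid markup (illustrative): a stray </p> inside an unclosed <a>.
BROKEN_HTML = "<a></p>"


def render(markup, parser):
    """Parse markup with the named parser and return the repaired tree as text."""
    return str(BeautifulSoup(markup, parser))


if __name__ == "__main__":
    for parser in ("html.parser", "lxml", "html5lib"):
        try:
            print("{:12} {}".format(parser, render(BROKEN_HTML, parser)))
        except FeatureNotFound:
            # Raised by bs4 when the requested parser library is missing.
            print("{:12} (not installed)".format(parser))
```

Each parser repairs invalid markup differently: the built-in `html.parser` simply drops the stray closing tag, while `html5lib` rebuilds a full document the way a browser would. On a real page with malformed HTML, that difference can decide whether `find_all` sees the elements you are looking for.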