Extracting attribute values of a nested tag from HTML in Python - PullRequest
0 votes
/ June 30, 2019
<div class="book-cover-image">
<img alt="NOT IN MY BACKYARD – Solid Waste Mgmt in Indian Cities" class="img-responsive" src="https://cdn.downtoearth.org.in/library/medium/2016-05-23/0.42611000_1463993925_book-cover.jpg" title="NOT IN MY BACKYARD – Solid Waste Mgmt in Indian Cities"/>
</div>

I need to extract the title value from every div tag like this one. What is the best way to do this? Please advise.

I am trying to get the titles of all the books listed on this page.

This is what I have tried so far:

import requests 
from bs4 import BeautifulSoup as bs


url1 ="https://www.downtoearth.org.in/books"
page1 = requests.get(url1, verify=False)

#print(page1.content)

soup1= bs(page1.content, 'html.parser')
class_names = soup1.find_all('div',{'class':'book-cover-image'} )

for class_name in class_names:
    title_text = class_name.text
    print(class_name)
    print(title_text)

Answers [ 2 ]

2 votes
/ June 30, 2019

To get the title attribute of every book cover, you can use the CSS selector .book-cover-image img[title] (it selects all <img> tags that have a title attribute and are nested under a tag with the class book-cover-image):

import requests
from bs4 import BeautifulSoup

url = 'https://www.downtoearth.org.in/books'
soup = BeautifulSoup(requests.get(url).text, 'lxml')

for i, img in enumerate(soup.select('.book-cover-image img[title]'), 1):
    print('{:>4}\t{}'.format(i, img['title']))

Prints:

   1    State of India’s Environment 2019: In Figures (eBook)                           
   2    Victim Africa (eBook)                                                           
   3    Frames of change - Heartening tales that define new India                       
   4    STATE OF INDIA’S ENVIRONMENT 2019                                               
   5    State of India’s Environment In Figures 2018 (eBook)                            
   6    Getting to know about environment                                               
   7    CLIMATE CHANGE NOW - The Story of Carbon Colonisation                           
   8    Climate change - For the young and curious                                      
   9    Conflicts of Interest: My Journey through India’s Green Movement                
  10    Body Burden: Lifestyle Diseases                                                 
  11    STATE OF INDIA’S ENVIRONMENT 2018                                               
  12    DROUGHT BUT WHY? How India can fight the scourge by abandoning drought relief   
  13    SOE 2017 (Print version) and SOE 2017 in Figures (Digital version) combo offer  
  14    State of India's Environment 2017 In Figures (eBook)                            
  15    Environment Reader for Universities                                             
  16    Not in My Backyard  (Book & DVD combo offer)                                    
  17    The Crow, Honey Hunter and the Kitchen Garden                                   
  18    BIOSCOPE OF PIU & POM                                                           
  19    SOE 2017 and Food book combo offer                                              
  20    FIRST FOOD: Culture of Taste                                                    
  21    Annual State Of India’s Environment - SOE 2017                                  
  22    An 8-million-year-old mysterious date with monsoon  (e-book)                    
  23    Why I Should be Tolerant                                                        
  24    NOT IN MY BACKYARD – Solid Waste Mgmt in Indian Cities  
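
The attempt in the question prints empty text because .text only returns the text nodes inside the div, while the title is stored in an attribute of the nested <img>. Here is a minimal offline sketch of the difference, assuming the snippet from the question as input and the built-in 'html.parser' backend instead of a live request:

```python
from bs4 import BeautifulSoup

# Markup copied from the question, so no network access is needed.
html = """
<div class="book-cover-image">
<img alt="NOT IN MY BACKYARD – Solid Waste Mgmt in Indian Cities" class="img-responsive" src="https://cdn.downtoearth.org.in/library/medium/2016-05-23/0.42611000_1463993925_book-cover.jpg" title="NOT IN MY BACKYARD – Solid Waste Mgmt in Indian Cities"/>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", class_="book-cover-image")

# The div contains only whitespace and an <img>, so its text is empty.
print(repr(div.text.strip()))

# Attributes are read from the tag itself; .get() returns None if missing.
img = div.find("img")
print(img.get("title"))
```

Using img.get('title') rather than img['title'] avoids a KeyError if some cover image happens to lack a title.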

1 vote
/ June 30, 2019

You can also do this with XPath, like so:

import requests
from lxml import html

url1 ="https://www.downtoearth.org.in/books"
res = requests.get(url1, verify=False)
tree = html.fromstring(res.text)
d = tree.xpath("//div[@class='book-cover-image']//img/@title")
for title in d:
    print(title)

Output

State of India’s Environment 2019: In Figures (eBook)
Victim Africa (eBook)
Frames of change - Heartening tales that define new India
STATE OF INDIA’S ENVIRONMENT 2019
State of India’s Environment In Figures 2018 (eBook)
Getting to know about environment
CLIMATE CHANGE NOW - The Story of Carbon Colonisation
Climate change - For the young and curious
Conflicts of Interest: My Journey through India’s Green Movement
Body Burden: Lifestyle Diseases
STATE OF INDIA’S ENVIRONMENT 2018
DROUGHT BUT WHY? How India can fight the scourge by abandoning drought relief
SOE 2017 (Print version) and SOE 2017 in Figures (Digital version) combo offer
State of India's Environment 2017 In Figures (eBook)
Environment Reader for Universities
Not in My Backyard  (Book & DVD combo offer)
The Crow, Honey Hunter and the Kitchen Garden
BIOSCOPE OF PIU & POM
SOE 2017 and Food book combo offer
FIRST FOOD: Culture of Taste
Annual State Of India’s Environment - SOE 2017
An 8-million-year-old mysterious date with monsoon  (e-book) 
Why I Should be Tolerant
NOT IN MY BACKYARD – Solid Waste Mgmt in Indian Cities
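
If neither BeautifulSoup nor lxml is available, roughly the same extraction can be sketched with the standard library alone. This is a swapped-in technique rather than what the answers above use: the CoverTitleParser class and the second sample div are hypothetical, added only for illustration, and the input is the snippet from the question rather than the live page.

```python
from html.parser import HTMLParser

# First div is copied from the question; the second one is a made-up
# counter-example to show that images outside a cover div are skipped.
SAMPLE_HTML = """
<div class="book-cover-image">
<img alt="NOT IN MY BACKYARD – Solid Waste Mgmt in Indian Cities" class="img-responsive" src="https://cdn.downtoearth.org.in/library/medium/2016-05-23/0.42611000_1463993925_book-cover.jpg" title="NOT IN MY BACKYARD – Solid Waste Mgmt in Indian Cities"/>
</div>
<div class="other"><img title="ignored"/></div>
"""


class CoverTitleParser(HTMLParser):
    """Collects the title of every <img> nested in a div.book-cover-image."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._div_stack = []  # one bool per open <div>: is it a cover div?

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            classes = (attrs.get("class") or "").split()
            self._div_stack.append("book-cover-image" in classes)
        elif tag == "img" and any(self._div_stack) and attrs.get("title"):
            self.titles.append(attrs["title"])

    def handle_endtag(self, tag):
        if tag == "div" and self._div_stack:
            self._div_stack.pop()


parser = CoverTitleParser()
parser.feed(SAMPLE_HTML)
for title in parser.titles:
    print(title)
```

For a real page you would feed the downloaded response text to the parser instead of SAMPLE_HTML. A dedicated library is still likely to cope better with badly broken markup.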