I have HTML like this:
1.
<a href="/title/tt0111161/?ref_=adv_li_tt">The Shawshank Redemption</a> <span class="lister-item-year text-muted unbold">(1994)</span>
How do I extract "The Shawshank Redemption" from the "a" tag using Beautiful Soup?
A simple lookup gives you the text:
    from bs4 import BeautifulSoup

    data = '''
    <a href="/title/tt0111161/?ref_=adv_li_tt">The Shawshank Redemption</a>
    <span class="lister-item-year text-muted unbold">(1994)</span>
    '''

    soup = BeautifulSoup(data, 'html.parser')

    # soup.a, find('a') and find_all('a') all reach the same <a> tag here
    print(soup.a.text)
    print(soup.find('a').text)
    for a in soup.find_all('a'):
        print(a.text)

    # .get_text() returns the same string as .text
    print(soup.a.get_text())
    print(soup.find('a').get_text())
    for a in soup.find_all('a'):
        print(a.get_text())
Something like this will work:
    from bs4 import BeautifulSoup

    st = r"""<a href="/title/tt0111161/?ref_=adv_li_tt">The Shawshank Redemption</a> <span class="lister-item-year text-muted unbold">(1994)</span>"""

    # 'html5lib' must be installed separately (pip install html5lib)
    soup = BeautifulSoup(st, 'html5lib')
    a = soup.find_all('a')  # list of every <a> tag in the document
    print(a[0].text)        # text of the first link
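If the real page might contain no link at all, or you also want the URL and the year, a slightly more defensive variant could look like the sketch below. The helper name `extract_title` is made up for illustration, and the `lister-item-year` class is taken from the snippet in the question, so it is an assumption that the rest of the page uses the same markup.

```python
from bs4 import BeautifulSoup

html = '''
<a href="/title/tt0111161/?ref_=adv_li_tt">The Shawshank Redemption</a>
<span class="lister-item-year text-muted unbold">(1994)</span>
'''


def extract_title(markup):
    """Hypothetical helper: return (title, href, year), or None if there is no <a> tag."""
    soup = BeautifulSoup(markup, 'html.parser')

    link = soup.find('a')
    if link is None:
        # find() returns None when nothing matches, so guard before using .text
        return None

    # Assumes the year sits in a <span> with the class seen in the question
    year_tag = soup.find('span', class_='lister-item-year')
    year = year_tag.get_text(strip=True).strip('()') if year_tag else None

    return link.get_text(strip=True), link.get('href'), year


print(extract_title(html))
# ('The Shawshank Redemption', '/title/tt0111161/?ref_=adv_li_tt', '1994')
```

Checking for `None` matters because `soup.find('a').text` raises `AttributeError` when the tag is missing, which is a common surprise when the scraped page layout changes.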