import requests
from bs4 import BeautifulSoup
import pandas as pd
import re
start_url = 'https://www.example.com'
downloaded_html = requests.get(start_url)
soup = BeautifulSoup(downloaded_html.text, "lxml")
full_header = soup.select('div.reference-image')
full_header
Output of the code above:
[<div class="reference-image"><img src="Content/image/all/reference/c101.jpg"/></div>,
<div class="reference-image"><img src="Content/image/all/reference/c102.jpg"/></div>,
<div class="reference-image"><img src="Content/image/all/reference/c102.jpg"/></div>]
I would like to extract the value of each img src attribute, as shown below:
["Content/image/all/reference/c101.jpg",
"Content/image/all/reference/c102.jpg",
"Content/image/all/reference/c102.jpg"]
How can I extract it?
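One possible approach, sketched under a few assumptions: narrow the CSS selector so it matches the img tags themselves, then read the src attribute of each one. The snippet below parses the markup copied from the output above instead of downloading the page, so sample_html and image_sources are illustrative names, and "html.parser" is swapped in for "lxml" only to avoid the extra dependency.

```python
from bs4 import BeautifulSoup

# Markup copied from the output in the question; in the original script
# this string would be downloaded_html.text.
sample_html = """
<div class="reference-image"><img src="Content/image/all/reference/c101.jpg"/></div>
<div class="reference-image"><img src="Content/image/all/reference/c102.jpg"/></div>
<div class="reference-image"><img src="Content/image/all/reference/c102.jpg"/></div>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# Match only <img> tags that sit inside div.reference-image and carry a src
# attribute, then collect the attribute values into a list.
image_sources = [img["src"] for img in soup.select("div.reference-image img[src]")]
print(image_sources)
```

The img[src] part of the selector skips any img tag without a src attribute, so the img["src"] lookup should not raise KeyError. With the original soup object, the same list comprehension should apply unchanged.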