I'm not sure where you found that page_link to use. Try the following approach to get the content you want to parse.
from bs4 import BeautifulSoup
import requests

urlLink = 'https://www.cfapubs.org/doi/abs/10.2469/faj.v74.n4.2'
# Send a browser-like User-Agent with the request
page_response = requests.get(urlLink, headers={'User-Agent': 'Mozilla/5.0'})
soup = BeautifulSoup(page_response.content, 'html.parser')
# Author name: first link inside the element with class "hlFld-ContribAuthor"
name = soup.find(class_="hlFld-ContribAuthor").find("a").text
# Abstract: first paragraph inside the element with class "abstractSection"
abstract = soup.find(class_="abstractSection").find("p").text
print(f'Name : {name}\nAbstract : {abstract}')
If you would rather use a CSS selector, try:
from bs4 import BeautifulSoup
import requests

urlLink = 'https://www.cfapubs.org/doi/abs/10.2469/faj.v74.n4.2'
# Send a browser-like User-Agent with the request
page_response = requests.get(urlLink, headers={'User-Agent': 'Mozilla/5.0'})
soup = BeautifulSoup(page_response.content, 'html.parser')
# select_one returns the first element matching the CSS selector
name = soup.select_one(".hlFld-ContribAuthor a").text
abstract = soup.select_one(".abstractSection p").text
print(f'Name : {name}\nAbstract : {abstract}')
Output:
Name : Charles D. Ellis, CFA
Abstract : One of the consequences of the shift in corporate retirement plans from defined benefit to defined contribution is widespread retirement insecurity. Although most people in the top one-third of economic affluence will be fine, for the other two-thirds—particularly the bottom one-third—the problem is a serious threat. We can prevent this painful future if we act sensibly and soon by raising the alarm with our corporate and government leaders.
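Both versions above assume the elements are present. find and select_one return None when nothing matches, so calling .text on the result would raise AttributeError if the page layout changes. A minimal sketch of one way to guard against that, assuming a hypothetical helper text_or_default and a made-up static HTML snippet standing in for the downloaded page:

```python
from bs4 import BeautifulSoup

# Made-up markup standing in for the downloaded page (not the real site's HTML).
SAMPLE_HTML = """
<div class="hlFld-ContribAuthor"><a href="#">Jane Doe</a></div>
<div class="abstractSection"><p>Example abstract.</p></div>
"""


def text_or_default(soup, selector, default=""):
    """Hypothetical helper: text of the first match, or `default` if none."""
    node = soup.select_one(selector)
    return node.get_text(strip=True) if node is not None else default


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
name = text_or_default(soup, ".hlFld-ContribAuthor a")
abstract = text_or_default(soup, ".abstractSection p")
# No element has this class, so the default is returned instead of an error.
missing = text_or_default(soup, ".no-such-class", default="N/A")
print(f"Name : {name}\nAbstract : {abstract}\nMissing : {missing}")
```

With the real page you would pass the soup built from page_response.content instead of SAMPLE_HTML.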
Finally, if you don't want line breaks or extra whitespace inside the abstract text, replace the abstract line with:
abstract = ' '.join(soup.find(class_="abstractSection").find("p").text.split())
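To see what the split/join idiom does on its own, here is a small sketch that needs no network access. The function name normalize_whitespace and the sample string are only illustrative, not part of the original code:

```python
def normalize_whitespace(text: str) -> str:
    """Collapse every run of spaces, tabs and newlines into a single space."""
    # str.split() with no arguments splits on any whitespace and drops
    # empty strings, so leading and trailing whitespace disappears too.
    return " ".join(text.split())


# Illustrative input with a line break, repeated spaces and a tab.
raw = "  One of the consequences\n      of the shift   in corporate\tretirement plans "
clean = normalize_whitespace(raw)
print(clean)
```

The same expression works on whatever .text returns, since that is a plain Python string.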