Если вы просматриваете исходный код страницы, записи соседства находятся в пределах div, которые имеют класс «div-col», и ссылка содержит атрибут «title».
Кроме того, замена текста во время добавление не требуется.
Следующий код:
import requests
from bs4 import BeautifulSoup
import pandas as pd
address = 'Los Angeles, United States'
url = "https://en.wikipedia.org/wiki/List_of_districts_and_neighborhoods_of_Los_Angeles"
source = requests.get(url).text
soup = BeautifulSoup(source, 'lxml')
neighborhoodList = []
# -- append the data into the list
links = []
for row in soup.find_all("div", class_="div-col"):
for item in row.select("a"):
if item.has_attr('title'):
neighborhoodList.append(item.text)
df_neighborhood = pd.DataFrame({"Neighborhood": neighborhoodList})
print(f'First 10 Rows:')
print(df_neighborhood.head(n=10))
print(f'\nLast 10 Rows:')
print(df_neighborhood.tail(n=10))
Результаты:
First 10 Rows:
Neighborhood
0 Angelino Heights
1 Arleta
2 Arlington Heights
3 Arts District
4 Atwater Village
5 Baldwin Hills
6 Baldwin Hills/Crenshaw
7 Baldwin Village
8 Baldwin Vista
9 Beachwood Canyon
Last 10 Rows:
Neighborhood
186 Westwood Village
187 Whitley Heights
188 Wholesale District
189 Wilmington
190 Wilshire Center
191 Wilshire Park
192 Windsor Square
193 Winnetka
194 Woodland Hills
195 Yucca Corridor