I am working on web scraping: I read names from a text file line by line, search for each one on Google, and scrape the address from the results. I want to put each result next to the corresponding name. This is my text file a.txt:
0.5BN FINHEALTH PRIVATE LIMITED
01 SYNERGY CO.
1 BY 0 SOLUTIONS
And this is my code:
import requests
from bs4 import BeautifulSoup

USER_AGENT = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0"

out_fl = open('a.txt','r')
for line in out_fl:
    query = line
    query = query.replace(' ', '+')
    print(line)
    URL = f"https://google.com/search?q={query}"
    print(URL)
    headers = {"user-agent": USER_AGENT}
    resp = requests.get(URL, headers=headers)
    if resp.status_code == 200:
        soup = BeautifulSoup(resp.content, "html.parser")
        results = []
        newline = '\n'
        for g in soup.find_all('span', class_="i4J0ge"):
            x = f'{line}:{g.text}{newline}'
            results.append(x)
        print(results)
        with open("results.txt","a") as result:
            result.write(str(results))
I do get a result, but it is not formatted correctly. Please help me. My expected result looks like this:
0.5BN FINHEALTH PRIVATE LIMITED : Address: 2nd Floor, BHIVE Forum, GNS Towers #18, Dairy
Circle Road, Adugodi, Koramangala, Bengaluru, Karnataka 560029Hours: Closed ⋅ Opens 9:30AM
MonSaturdayClosedSundayClosedMonday9:30am–7:30pmTuesday9:30am–7:30pmWednesday9:30am–
7:30pmThursday9:30am–7:30pmFriday9:30am–7:30pmSuggest an editUnable to add this file.
Please check that it is a valid photo
01 SYNERGY CO. : 01 SYNERGY CO.\n:Located in: Punjab Agricultural UniversityAddress: 3rd
Floor Kartar Bhawan, Ferozpur Rd, Ludhiana, Punjab 141001Hours: Closes soon ⋅ 5PM ⋅ Opens
9:30AM MonSaturday10am–5pmSundayClosedMonday9:30am–7:30pmTuesday9:30am–
7:30pmWednesday9:30am–7:30pmThursday9:30am–7:30pmFriday9:30am–7:30pmSuggest an editUnable
to add this file. Please check that it is a valid photo.Phone: 098159 18807
Or in Excel. Thanks.
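
For reference, here is a minimal sketch of the output step only, under a few assumptions. The stray `\n` in the output probably comes from `line` still carrying its trailing newline, and the brackets and quotes come from writing `str(results)`, which is the repr of a list. The helper names `clean`, `join_snippets` and `save_results` and the file name `results.csv` are hypothetical, not part of the original code. The sketch writes one `NAME : details` line per name and also a CSV file, which Excel can open.

```python
import csv
import re


def clean(text):
    """Collapse every run of whitespace (including newlines) into one space."""
    return re.sub(r"\s+", " ", text).strip()


def join_snippets(snippets):
    """Join the scraped text pieces for one name, skipping empty ones."""
    cleaned = [clean(s) for s in snippets]
    return " ".join(s for s in cleaned if s)


def save_results(rows, txt_path="results.txt", csv_path="results.csv"):
    """Write (name, snippets) pairs as 'NAME : details' lines and as a CSV.

    The file names are placeholders. The CSV uses utf-8-sig so that
    characters such as the en dash in opening hours survive in Excel.
    """
    with open(txt_path, "w", encoding="utf-8") as txt, \
         open(csv_path, "w", encoding="utf-8-sig", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["name", "details"])
        for name, snippets in rows:
            name = clean(name)  # removes the trailing newline read from a.txt
            details = join_snippets(snippets)
            txt.write(f"{name} : {details}\n")
            writer.writerow([name, details])
```

In the scraping loop this could be used by collecting `rows.append((line, [g.text for g in soup.find_all('span', class_="i4J0ge")]))` for each name and calling `save_results(rows)` once after the loop. Writing strings directly, instead of `str(results)`, is what avoids the list brackets and the literal `\n` in the file.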