I have the following URL: https://www.bing.com/search?q=site%3Awww.linkedin.com%20Employnet%2C+Inc.%20Monterey%20CA%20NOT%20jobs%20NOT%20pulse%20NOT%20profinder%%20NOT%20dir%20NOT%20company%20intitle%3AEmploynet%2C+Inc.
When I open the URL, the search query becomes: site:www.linkedin.com Employnet, Inc. Monterey CA NOT jobs NOT pulse NOT profinder% NOT dir NOT company intitle:Employnet, Inc.
Here is my code:
import re

url = "https://www.bing.com/search?q=site%3Awww.linkedin.com%20Employnet%2C+Inc.%20Monterey%20CA%20NOT%20jobs%20NOT%20pulse%20NOT%20profinder%%20NOT%20dir%20NOT%20company%20intitle%3AEmploynet%2C+Inc."
# Manually undo a few percent-escapes, then capture everything after "q="
url = url.replace("%3A", ":").replace("%20", " ").replace("%2C+", ", ")
search = re.search('.*?q=(.*)', url).groups()[0]
This feels like a poor approach, since it only handles the escapes I happened to list. Is there a more proper way to decode the URL?
Using Python 3:
>>> import urllib.parse
>>> url="https://www.bing.com/search?q=site%3Awww.linkedin.com%20Employnet%2C+Inc.%20Monterey%20CA%20NOT%20jobs%20NOT%20pulse%20NOT%20profinder%%20NOT%20dir%20NOT%20company%20intitle%3AEmploynet%2C+Inc."
>>> urllib.parse.unquote_plus(url)
'https://www.bing.com/search?q=site:www.linkedin.com Employnet, Inc. Monterey CA NOT jobs NOT pulse NOT profinder% NOT dir NOT company intitle:Employnet, Inc.'
Or extract the query string first and call unquote_plus on it:
>>> urllib.parse.unquote_plus(urllib.parse.urlsplit(url).query[2:])
'site:www.linkedin.com Employnet, Inc. Monterey CA NOT jobs NOT pulse NOT profinder% NOT dir NOT company intitle:Employnet, Inc.'
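Slicing with [2:] assumes that q is the only parameter and comes first. If the URL might carry other parameters, a possibly more robust variant is to let urllib.parse.parse_qs split and decode the query string. The following is a minimal sketch; the helper name extract_query_param and the extra form=QBLH parameter in the sample URL are made up for illustration and are not part of the original question.

```python
from urllib.parse import parse_qs, urlsplit


def extract_query_param(url, name="q"):
    """Return the decoded value of one query parameter, or None if absent.

    Hypothetical helper for illustration only.
    """
    # parse_qs splits on "&", turns "+" into spaces and decodes %XX escapes.
    params = parse_qs(urlsplit(url).query)
    values = params.get(name)
    return values[0] if values else None


# Shortened sample URL; the trailing "form" parameter is an assumption.
url = ("https://www.bing.com/search"
       "?q=site%3Awww.linkedin.com%20Employnet%2C+Inc.&form=QBLH")
print(extract_query_param(url))  # site:www.linkedin.com Employnet, Inc.
```

Because parse_qs returns a list of values per parameter name, the helper takes the first one. A parameter that is missing yields None instead of raising an exception.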