I have the following HTML:
<div class="a-fixed-left-grid-col a-col-left" id="zg-left-col" style="width:200px;margin-left:-200px;float:none;"> <ul id="zg_browseRoot"> <li class="zg_browseUp"> ‹ <a href="https://www.amazon.com/Best-Sellers/zgbs">Any Department</a> </li> <ul> <li class="zg_browseUp"> ‹ <a href="https://www.amazon.com/Best-Sellers/zgbs/amazon-devices">Amazon Devices & Accessories</a> </li> <ul> <li> <span class="zg_selected"> Amazon Devices</span> </li> <ul> <li><a href="https://www.amazon.com/Best-Sellers-Home-Security-Amazon/zgbs/amazon-devices/17386948011">Home Security from Amazon</a></li> <li><a href="https://www.amazon.com/Best-Sellers-Amazon-Echo-Alexa-Devices/zgbs/amazon-devices/9818047011">Amazon Echo & Alexa Devices</a></li> <li><a href="https://www.amazon.com/Best-Sellers-Dash-Buttons/zgbs/amazon-devices/10667898011">Dash Buttons</a></li> <li><a href="https://www.amazon.com/Best-Sellers-Fire-TV/zgbs/amazon-devices/8521791011">Fire TV</a></li> <li><a href="https://www.amazon.com/Best-Sellers-Fire-Tablets/zgbs/amazon-devices/6669703011">Fire Tablets</a></li> <li><a href="https://www.amazon.com/Best-Sellers-Kindle-readers/zgbs/amazon-devices/6669702011">Kindle E-readers</a></li> <li><a href="https://www.amazon.com/Best-Sellers-Amazon-Device-Bundles/zgbs/amazon-devices/16926003011">Device Bundles</a></li> </ul> </ul> </ul> </ul> </div>
I want to extract the links, like this:
https://www.amazon.com/Best-Sellers-Home-Security-Amazon/zgbs/amazon-devices/17386948011 https://www.amazon.com/Best-Sellers-Amazon-Echo-Alexa-Devices/zgbs/amazon-devices/9818047011 https://www.amazon.com/Best-Sellers-Dash-Buttons/zgbs/amazon-devices/10667898011 https://www.amazon.com/Best-Sellers-Fire-TV/zgbs/amazon-devices/8521791011 https://www.amazon.com/Best-Sellers-Fire-Tablets/zgbs/amazon-devices/6669703011 https://www.amazon.com/Best-Sellers-Kindle-readers/zgbs/amazon-devices/6669702011 https://www.amazon.com/Best-Sellers-Amazon-Device-Bundles/zgbs/amazon-devices/16926003011
I tried the code below. It runs, but it doesn't give the result I want:
soup.find('div', class_= 'a-fixed-left-grid-col a-col-left').find_all('ul')[3]
Using .select():
catLinks = soup.select('#zg_browseRoot ul ul ul li a')
for link in catLinks:
    print(link.get('href'))
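For anyone who wants to try the selector in isolation, here is a minimal self-contained sketch. It assumes the built-in "html.parser" backend and a trimmed copy of the markup from the question (only two of the leaf links are kept). The names HTML and cat_links are just for illustration.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the sidebar markup from the question (assumption: two leaf links kept).
HTML = """
<div class="a-fixed-left-grid-col a-col-left" id="zg-left-col">
 <ul id="zg_browseRoot">
  <li class="zg_browseUp"><a href="https://www.amazon.com/Best-Sellers/zgbs">Any Department</a></li>
  <ul>
   <li class="zg_browseUp"><a href="https://www.amazon.com/Best-Sellers/zgbs/amazon-devices">Amazon Devices &amp; Accessories</a></li>
   <ul>
    <li><span class="zg_selected">Amazon Devices</span></li>
    <ul>
     <li><a href="https://www.amazon.com/Best-Sellers-Fire-TV/zgbs/amazon-devices/8521791011">Fire TV</a></li>
     <li><a href="https://www.amazon.com/Best-Sellers-Fire-Tablets/zgbs/amazon-devices/6669703011">Fire Tablets</a></li>
    </ul>
   </ul>
  </ul>
 </ul>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")

# Three nested <ul> levels below #zg_browseRoot, so only the innermost list matches;
# the two "browse up" links sit higher in the tree and are skipped.
cat_links = [a["href"] for a in soup.select("#zg_browseRoot ul ul ul li a")]

# Space-separated, as in the desired output from the question.
print(" ".join(cat_links))
```

The selector depends on the nesting depth, so it would need adjusting on a page where the selected category sits at a different level.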
You need to get the href of every anchor tag inside that list. Try this:
print([a['href'] for a in soup.find('div', class_= 'a-fixed-left-grid-col a-col-left').find_all('ul')[3].find_all('a')])
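The hard-coded index [3] breaks as soon as the nesting depth changes. As a possible alternative (a sketch, not tested against the live page), you could anchor on the zg_selected span and take the list that follows its parent li. The helper name subcategory_links and the trimmed HTML are made up for this example; the parser is assumed to be "html.parser".

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question (assumption: two leaf links kept).
HTML = """
<ul id="zg_browseRoot">
 <li class="zg_browseUp"><a href="https://www.amazon.com/Best-Sellers/zgbs">Any Department</a></li>
 <ul>
  <li><span class="zg_selected">Amazon Devices</span></li>
  <ul>
   <li><a href="https://www.amazon.com/Best-Sellers-Dash-Buttons/zgbs/amazon-devices/10667898011">Dash Buttons</a></li>
   <li><a href="https://www.amazon.com/Best-Sellers-Kindle-readers/zgbs/amazon-devices/6669702011">Kindle E-readers</a></li>
  </ul>
 </ul>
</ul>
"""


def subcategory_links(soup):
    """Return the hrefs listed under the currently selected category (hypothetical helper)."""
    selected = soup.find("span", class_="zg_selected")
    if selected is None:
        return []
    parent_li = selected.find_parent("li")
    if parent_li is None:
        return []
    # In this markup the sub-list is a sibling of the <li> that wraps the span.
    sub_list = parent_li.find_next_sibling("ul")
    if sub_list is None:
        return []
    return [a["href"] for a in sub_list.find_all("a", href=True)]


soup = BeautifulSoup(HTML, "html.parser")
for href in subcategory_links(soup):
    print(href)
```

This returns an empty list instead of raising when the span or the sub-list is missing, which is easier to handle when looping over many category pages.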