Parsing with BeautifulSoup: not what I expected
0 votes
/ April 30, 2020

I'm trying to parse Airbnb calendars, but the result is not what I expect...

Some code:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import time
import requests
from bs4 import BeautifulSoup
from urllib.request import urlopen

Driver = webdriver.Firefox()
Driver.get("https://www.airbnb.co.uk/rooms/17394193?location=Whitby&check_in=2020-05-18&check_out=2020-05-21&source_impression_id=p3_1588098317_3ZR4OmXOPF8LDdm7&guests=1&adults=1")
url = Driver.current_url
PageSourceURL = Driver.page_source

Soup = BeautifulSoup(PageSourceURL, features='html.parser')
PageHTML = Soup.prettify()
print(PageHTML)

Output:

<html class="js-focus-visible" data-is-hyperloop="true" dir="ltr" lang="en-GB" xmlns:fb="http://ogp.me/ns/fb#">
 <head>
  <script async="" src="https://www.google-analytics.com/analytics.js">
  </script>
  <script async="" src="https://www.google-analytics.com/analytics.js">
  </script>
  <script>
   window.sherlock_firstbyte = window.performance && window.performance.timing ? window.performance.timing.responseStart : Number(new Date());
  </script>
  <script>
   !function(){"use strict";var e=730,n="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";var t=/(?:^| )bev=(.*?)(?:;|$)/,o=!1;function a(){return window.bev=window.bev||function(){if(o||"undefined"==typeof document)return null;o=!0;var e=(document.cookie||"").match(t);return e&&2===e.length?decodeURIComponent(e[1]):null}(),window.bev}!function(){try{if(!a()){var t=function(){for(var e=[],t=15;t>=0;t--)e.push(n[Math.floor(Math.random()*n.length)]);var o=Math.floor(Date.now()/1e3);return"".concat(o,"_").concat(e.join(""))}();o=t,r=document.location.hostname,c=".".concat(r.slice(r.indexOf("airbnb."))),(i=new Date).setDate(i.getDate()+e),document.cookie=["bev=".concat(encodeURIComponent(o)),"expires=".concat(i.toUTCString()),"path=/","domain=".concat(c),"secure"].join("; "),window.bev=t,function(e){var n=new XMLHttpRequest;n.open("POST","/tracking/events",!0),n.setRequestHeader("Content-Type","application/json; charset=utf-8");var t={event_name:"bev_created",event_data:{bev:e,page_uri:document.location.pathname,page_referrer:document.referrer}};n.send(JSON.stringify(t))}(t)}}catch(e){window.console&&console.error("Could not set bev cookie:",e)}var o,r,c,i}()}();
  </script>
  <script>
   (function() {
  var pgRequest = new XMLHttpRequest();
  var diffStamp = Date.now().toString() + Math.random().toString().substring(2);
  pgRequest.open('GET', '/pg_pixel?r=' + encodeURIComponent(document.referrer || '') + '&diff=' + diffStamp, true);
  pgRequest.send();
})()
  </script>
  <script>
   // FID init code.
(function(a,b){function c(a){l.push(a),f()}function d(a,b){i||(i=b,j=a,k=new Date,f())}function e(){i&&(i=null,j=null,k=null)}function f(){0<=j&&j<k-n&&(l.forEach(function(a){a(j,i)}),l=[])}function g(c,e){function f(){d(c,e),h()}function g(){h()}function h(){b("pointerup",f,m),b("pointercancel",g,m)}a("pointerup",f,m),a("pointercancel",g,m)}function h(a){if(a.cancelable){var b=1e12<a.timeStamp,c=b?new Date:performance.now(),e=c-a.timeStamp;"pointerdown"===a.type?g(e,a):d(e,a)}}var i,j,k,l=[],m={passive:!0,capture:!0},n=new Date;(function(a){["click","mousedown","keydown","touchstart","pointerdown"].forEach(function(b){a(b,h,m)})})(a),self.perfMetrics=self.perfMetrics||{},self.perfMetrics.onFirstInputDelay=c,self.perfMetrics.clearFirstInputDelay=e})(addEventListener,removeEventListener);
// FCP init code.
(function(a){function b(){return!!document.body&&null!==document.createNodeIterator(document.body,NodeFilter.SHOW_TEXT,function(a){return!!a&&/[^\s]/.test(a.nodeValue)&&"SCRIPT"!==a.parentNode.tagName&&"STYLE"!==a.parentNode.tagName&&0<a.parentNode.offsetHeight},!1).nextNode()}function c(){return null!==document.querySelector("input[placeholder]")}function d(){return b()||c()?void a(function(){var a=performance.now();f?f(a):g=a,performance.measure("TTFCP")}):void a(d)}function e(a){g?a(g):f=a}var f,g;a(d),self.perfMetrics=self.perfMetrics||{},self.perfMetrics.onFirstContentfulPaint=e})(requestAnimationFrame);    
// TTFMP Polyfill code.
(function(a){function b(){var c=document.getElementById("FMP-target");if(h=0,!c)e=a(b);else if(g===c)e=a(b);else if("IMG"===c.tagName&&!c.complete)e=a(b);else{var d=performance.now();g=c,f?f(d):h=d,performance.measure("TTFMP")}}function c(a){h?a(h):f=a}function d(){cancelAnimationFrame(e)}var e,f,g,h;e=a(b),self.perfMetrics=self.perfMetrics||{},self.perfMetrics.onFirstMeaningfulPaint=c,self.perfMetrics.startSearchingForFirstMeaningfulPaint=function(){g=document.getElementById("FMP-target"),b()},self.perfMetrics.stopSearchingForFirstMeaningfulPaint=d})(requestAnimationFrame);
  </script>
  <meta charset="utf-8"/>
  <meta content="en-GB" name="locale"/>
  <meta content="notranslate" name="google"/>
  <meta content="138566025676" property="fb:app_id"/>
  <meta content="Airbnb" property="og:site_name"/>
  <meta content="en_GB" property="og:locale"/>
  <meta content="https://www.airbnb.co.uk/rooms/skeleton" property="og:url"/>
  <meta content="" property="og:title"/>
  <meta content="" property="og:description"/>
  <meta content="website" property="og:type"/>
  <link crossorigin="anonymous" href="https://a0.muscache.com/airbnb/static/packages/common-59f479fe1e596df7f1f7830bd5ea15bb.css" media="all" rel="stylesheet" type="text/css"/>
  <link href="https://a0.muscache.com/airbnb/static/packages/map_search-05b2e8d7a5602d7f9224bf29250fcd41.css" media="all" rel="stylesheet" type="text/css"/>
  <link href="https://a0.muscache.com/airbnb/static/packages/dls/dls-lite_cereal-d9f6fdb2a0dd4a18c37f8ee01de8ec3d.css" media="all" rel="stylesheet" type="text/css"/>
  <link href="https://a0.muscache.com/airbnb/static/packages/dls/dls-lite_o2-leftover-3644a5fa97a2e311cd1cd1dab8abaf5f.css" media="all" rel="stylesheet" type="text/css"/>
  <meta content="138566025676" property="fb:app_id"/>
  <meta content="Airbnb" property="og:site_name"/>
  <meta content="en_GB" property="og:locale"/>
  <meta content="https://www.airbnb.co.uk/rooms/skeleton" property="og:url"/>
  <meta content="" property="og:title"/>
  <meta content="" property="og:description"/>
  <meta content="website" property="og:type"/>
  <meta content="https://a0.muscache.com/airbnb/static/logos/trips-og-1280x630-9de9c338cc3fd9b5663fb80be0cbe8c2.jpg" property="og:image"/>
  <meta content="width=device-width, initial-scale=1" name="viewport"/>
  <link href="https://a0.muscache.com/airbnb/static/logos/trips-og-200x200-a3be4fbbb3b6c5e758804438dea35adc.jpg" rel="image_src"/>
  <meta content="authenticity_token" id="csrf-param-meta-tag" name="csrf-param"/>
  <meta content="null" id="csrf-token-meta-tag" name="csrf-token"/>
  <title>
   Holiday Lets, Homes, Experiences &amp; Places - Airbnb
  </title>
  <meta content="IE=edge,chrome=1" http-equiv="X-UA-Compatible"/>
  <meta content="" id="english-canonical-url"/>
  <meta content="on" name="twitter:widgets:csp"/>
  <link href="https://www.airbnb.co.uk/rooms/skeleton" rel="canonical"/>
  <link href="https://www.airbnb.com/rooms/skeleton" hreflang="en" rel="alternate"/>
  <link href="https://www.airbnb.de/rooms/skeleton" hreflang="de" rel="alternate"/>
  <link href="https://www.airbnb.it/rooms/skeleton" hreflang="it" rel="alternate"/>
  <link href="https://www.airbnb.es/rooms/skeleton" hreflang="es-ES" rel="alternate"/>
  <link href="https://www.airbnb.fr/rooms/skeleton" hreflang="fr" rel="alternate"/>
  <link href="https://www.airbnb.com.br/rooms/skeleton" hreflang="pt" rel="alternate"/>
  <link href="https://www.airbnb.dk/rooms/skeleton" hreflang="da" rel="alternate"/>
  <link href="https://www.airbnb.co.uk/rooms/skeleton" hreflang="en-GB" rel="alternate"/>
  <link href="https://www.airbnb.ru/rooms/skeleton" hreflang="ru" rel="alternate"/>
  <link href="https://www.airbnb.pl/rooms/skeleton" hreflang="pl" rel="alternate"/>
  <link href="https://www.airbnb.co.kr/rooms/skeleton" hreflang="ko" rel="alternate"/>
  <link href="https://www.airbnb.cz/rooms/skeleton" hreflang="cs" rel="alternate"/>
  <link href="https://www.airbnb.hu/rooms/skeleton" hreflang="hu" rel="alternate"/>
  <link href="https://www.airbnb.at/rooms/skeleton" hreflang="de-AT" rel="alternate"/>
  <link href="https://www.airbnb.pt/rooms/skeleton" hreflang="pt-PT" rel="alternate"/>
  <link href="https://www.airbnb.gr/rooms/skeleton" hreflang="el" rel="alternate"/>
  <link href="https://www.airbnb.com.tr/rooms/skeleton" hreflang="tr" rel="alternate"/>
  <link href="https://www.airbnb.nl/rooms/skeleton" hreflang="nl" rel="alternate"/>
  <link href="https://www.airbnb.se/rooms/skeleton" hreflang="sv" rel="alternate"/>
  <link href="https://www.airbnb.com.tw/rooms/skeleton" hreflang="zh-TW" rel="alternate"/>
  <link href="https://www.airbnb.com.sg/rooms/skeleton" hreflang="en-SG" rel="alternate"/>
  <link href="https://www.airbnb.co.id/rooms/skeleton" hreflang="id" rel="alternate"/>
  <link href="https://www.airbnb.com.my/rooms/skeleton" hreflang="ms" rel="alternate"/>
  <link href="https://www.airbnb.com.au/rooms/skeleton" hreflang="en-AU" rel="alternate"/>
  <link href="https://www.airbnb.jp/rooms/skeleton" hreflang="ja" rel="alternate"/>
  <link href="https://www.airbnb.is/rooms/skeleton" hreflang="is" rel="alternate"/>
  <link href="https://www.airbnb.no/rooms/skeleton" hreflang="no" rel="alternate"/>
  <link href="https://www.airbnb.ch/rooms/skeleton" hreflang="de-CH" rel="alternate"/>
  <link href="https://fr.airbnb.ch/rooms/skeleton" hreflang="fr-CH" rel="alternate"/>
  <link href="https://it.airbnb.ch/rooms/sk
None

Now I want to search this output and find things like the elements in the screenshot:

[screenshot not available]

The data parsed by BeautifulSoup contains no div elements, classes, etc.

I'm new to scraping, so I'm not sure whether my expectations are misplaced, whether I've used the parser incorrectly, or whether it's something else...

So any pointers would be much appreciated. Thanks, Rob

UPDATE: Following the recommendation below, I added a wait until the table has loaded. The "soup" after that change:

<html class="js-focus-visible" data-is-hyperloop="true" dir="ltr" lang="en-GB" xmlns:fb="http://ogp.me/ns/fb#">
 <head>
  <script async="" src="https://www.google-analytics.com/analytics.js">
  </script>
  <script async="" src="https://www.google-analytics.com/analytics.js">
  </script>
  <script>
   window.sherlock_firstbyte = window.performance && window.performance.timing ? window.performance.timing.responseStart : Number(new Date());
  </script>
  <script>
   !function(){"use strict";var e=730,n="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";var t=/(?:^| )bev=(.*?)(?:;|$)/,o=!1;function a(){return window.bev=window.bev||function(){if(o||"undefined"==typeof document)return null;o=!0;var e=(document.cookie||"").match(t);return e&&2===e.length?decodeURIComponent(e[1]):null}(),window.bev}!function(){try{if(!a()){var t=function(){for(var e=[],t=15;t>=0;t--)e.push(n[Math.floor(Math.random()*n.length)]);var o=Math.floor(Date.now()/1e3);return"".concat(o,"_").concat(e.join(""))}();o=t,r=document.location.hostname,c=".".concat(r.slice(r.indexOf("airbnb."))),(i=new Date).setDate(i.getDate()+e),document.cookie=["bev=".concat(encodeURIComponent(o)),"expires=".concat(i.toUTCString()),"path=/","domain=".concat(c),"secure"].join("; "),window.bev=t,function(e){var n=new XMLHttpRequest;n.open("POST","/tracking/events",!0),n.setRequestHeader("Content-Type","application/json; charset=utf-8");var t={event_name:"bev_created",event_data:{bev:e,page_uri:document.location.pathname,page_referrer:document.referrer}};n.send(JSON.stringify(t))}(t)}}catch(e){window.console&&console.error("Could not set bev cookie:",e)}var o,r,c,i}()}();
  </script>
  <script>
   (function() {
  var pgRequest = new XMLHttpRequest();
  var diffStamp = Date.now().toString() + Math.random().toString().substring(2);
  pgRequest.open('GET', '/pg_pixel?r=' + encodeURIComponent(document.referrer || '') + '&diff=' + diffStamp, true);
  pgRequest.send();
})()
  </script>
  <script>
   // FID init code.
(function(a,b){function c(a){l.push(a),f()}function d(a,b){i||(i=b,j=a,k=new Date,f())}function e(){i&&(i=null,j=null,k=null)}function f(){0<=j&&j<k-n&&(l.forEach(function(a){a(j,i)}),l=[])}function g(c,e){function f(){d(c,e),h()}function g(){h()}function h(){b("pointerup",f,m),b("pointercancel",g,m)}a("pointerup",f,m),a("pointercancel",g,m)}function h(a){if(a.cancelable){var b=1e12<a.timeStamp,c=b?new Date:performance.now(),e=c-a.timeStamp;"pointerdown"===a.type?g(e,a):d(e,a)}}var i,j,k,l=[],m={passive:!0,capture:!0},n=new Date;(function(a){["click","mousedown","keydown","touchstart","pointerdown"].forEach(function(b){a(b,h,m)})})(a),self.perfMetrics=self.perfMetrics||{},self.perfMetrics.onFirstInputDelay=c,self.perfMetrics.clearFirstInputDelay=e})(addEventListener,removeEventListener);
// FCP init code.
(function(a){function b(){return!!document.body&&null!==document.createNodeIterator(document.body,NodeFilter.SHOW_TEXT,function(a){return!!a&&/[^\s]/.test(a.nodeValue)&&"SCRIPT"!==a.parentNode.tagName&&"STYLE"!==a.parentNode.tagName&&0<a.parentNode.offsetHeight},!1).nextNode()}function c(){return null!==document.querySelector("input[placeholder]")}function d(){return b()||c()?void a(function(){var a=performance.now();f?f(a):g=a,performance.measure("TTFCP")}):void a(d)}function e(a){g?a(g):f=a}var f,g;a(d),self.perfMetrics=self.perfMetrics||{},self.perfMetrics.onFirstContentfulPaint=e})(requestAnimationFrame);    
// TTFMP Polyfill code.
(function(a){function b(){var c=document.getElementById("FMP-target");if(h=0,!c)e=a(b);else if(g===c)e=a(b);else if("IMG"===c.tagName&&!c.complete)e=a(b);else{var d=performance.now();g=c,f?f(d):h=d,performance.measure("TTFMP")}}function c(a){h?a(h):f=a}function d(){cancelAnimationFrame(e)}var e,f,g,h;e=a(b),self.perfMetrics=self.perfMetrics||{},self.perfMetrics.onFirstMeaningfulPaint=c,self.perfMetrics.startSearchingForFirstMeaningfulPaint=function(){g=document.getElementById("FMP-target"),b()},self.perfMetrics.stopSearchingForFirstMeaningfulPaint=d})(requestAnimationFrame);
  </script>
  <meta charset="utf-8"/>
  <meta content="en-GB" name="locale"/>
  <meta content="notranslate" name="google"/>
  <meta content="138566025676" property="fb:app_id"/>
  <meta content="Airbnb" property="og:site_name"/>
  <meta content="en_GB" property="og:locale"/>
  <meta content="https://www.airbnb.co.uk/rooms/17394193" property="og:url"/>
  <meta content="" property="og:title"/>
  <meta content="" property="og:description"/>
  <meta content="website" property="og:type"/>
  <link crossorigin="anonymous" href="https://a0.muscache.com/airbnb/static/packages/common-59f479fe1e596df7f1f7830bd5ea15bb.css" media="all" rel="stylesheet" type="text/css"/>
  <link href="https://a0.muscache.com/airbnb/static/packages/map_search-05b2e8d7a5602d7f9224bf29250fcd41.css" media="all" rel="stylesheet" type="text/css"/>
  <link href="https://a0.muscache.com/airbnb/static/packages/dls/dls-lite_cereal-d9f6fdb2a0dd4a18c37f8ee01de8ec3d.css" media="all" rel="stylesheet" type="text/css"/>
  <link href="https://a0.muscache.com/airbnb/static/packages/dls/dls-lite_o2-leftover-3644a5fa97a2e311cd1cd1dab8abaf5f.css" media="all" rel="stylesheet" type="text/css"/>
  <meta content="138566025676" property="fb:app_id"/>
  <meta content="Airbnb" property="og:site_name"/>
  <meta content="en_GB" property="og:locale"/>
  <meta content="https://www.airbnb.co.uk/rooms/17394193?location=Whitby&amp;check_in=2020-05-18&amp;check_out=2020-05-21&amp;source_impression_id=p3_1588098317_3ZR4OmXOPF8LDdm7&amp;guests=1&amp;adults=1" property="og:url"/>
  <meta content="" property="og:title"/>
  <meta content="" property="og:description"/>
  <meta content="website" property="og:type"/>
  <meta content="noindex, nofollow" name="robots"/>
  <meta content="https://a0.muscache.com/airbnb/static/logos/trips-og-1280x630-9de9c338cc3fd9b5663fb80be0cbe8c2.jpg" property="og:image"/>
  <meta content="width=device-width, initial-scale=1" name="viewport"/>
  <link href="https://a0.muscache.com/airbnb/static/logos/trips-og-200x200-a3be4fbbb3b6c5e758804438dea35adc.jpg" rel="image_src"/>
  <meta content="authenticity_token" id="csrf-param-meta-tag" name="csrf-param"/>
  <meta content="null" id="csrf-token-meta-tag" name="csrf-token"/>
  <title>
   Holiday Lets, Homes, Experiences &amp; Places - Airbnb
  </title>
  <meta content="IE=edge,chrome=1" http-equiv="X-UA-Compatible"/>
  <meta content="" id="english-canonical-url"/>
  <meta content="on" name="twitter:widgets:csp"/>
  <meta content="summary" name="twitter:card"/>
  <meta content="Holiday Lets, Homes, Experiences &amp; Places - Airbnb" name="twitter:title"/>
  <meta content="@airbnb" name="twitter:site"/>
  <meta content="Airbnb" name="twitter:app:name:iphone"/>
  <meta content="Airbnb" name="twitter:app:name:ipad"/>
  <meta content="Airbnb" name="twitter:app:name:googleplay"/>
  <meta content="401626263" name="twitter:app:id:iphone"/>
  <meta content="401626263" name="twitter:app:id:ipad"/>
  <meta content="com.airbnb.android" name="twitter:app:id:googleplay"/>
  <meta content="https://www.airbnb.co.uk/rooms/17394193" name="twitter:url"/>
  <link href="/opensearch.xml" rel="search" title="Airbnb" type="application/opensearchdescription+xml"/>
  <link href="/manifest.json" rel="manifest"/>
  <meta content="yes" name="mobile-web-app-capable"/>
  <meta content="yes" name="apple-mobile-web-app-capable"/>
  <meta content="Airbnb" name="application-name"/>
  <meta content="Airbnb" name="apple-mobile-web-app-title"/>
  <meta content="#ffffff" name="theme-color"/>
  <meta content="#ffffff" name="msapplication-navbutton-color"/>
  <meta content="black-translucent" name="apple-mobile-web-app-status-bar-style"/>
  <meta content="/?utm_source=homescreen" name="msapplication-starturl"/>
  <link href="https://a0.muscache.com/airbnb/static/icons/apple-touch-icon-76x76-3b313d93b1b5823293524b9764352ac9.png" rel="apple-touch-icon"/>
  <link href="https://a0.muscache.com/airbnb/static/icons/apple-touch-icon-76x76-3b313d93b1b5823293524b9764352ac9.png" rel="apple-touch-icon" sizes="76x76"/>
  <link href="https://a0.muscache.com/airbnb/static/icons/apple-touch-icon-120x120-52b1adb4fe3a8f825fc4b143de12ea4b.png" rel="apple-touch-icon" sizes="120x120"/>
  <link href="https://a0.muscache.com/airbnb/static/icons/apple-touch-icon-152x152-7b7c6444b63d8b6ebad9dae7169e5ed6.png" rel="apple-touch-icon" sizes="152x152"/>
  <link href="https://a0.muscache.com/airbnb/static/icons/apple-touch-icon-180x180-bcbe0e3960cd084eb8eaf1353cf3c730.png" rel="apple-touch-icon" sizes="180x180"/>
  <link href="https://a0.muscache.com/airbnb/static/icons/android-icon-192x192-c0465f9f0380893768972a31a614b670.png" rel="icon" sizes="192x192"/>
  <link href="https://a0.muscache.com/airbnb/static/logotype_favicon-21cc8e6c6a2cca43f061d2dcabdf6e58.ico" rel="shortcut icon" sizes="76x76" type="image/x-icon"/>
  <link color="#FF5A5F" href="https://a0.muscache.com/airbnb/static/icons/airbnb-0611901eac33ccfa5e93d793a2e21f09.svg" rel="mask-icon"/>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/runtime-b57b0e08.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/shims_post_modules-13ce83d5.js" type="module">
  </script>
  <script crossorigin="anonymous" defer="" nomodule="nomodule" src="https://a0.muscache.com/airbnb/static/packages/shims_pre_modules-e891e725.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/moment/en-gb-83fb5bb3.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/core-guest-loop/phrases_manifest/en-GB/core-guest-spa/hyperloop/index-de105ecd10.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/moment-e464f4fa.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/react-53943884.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/aphrodite-effb7b96.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/d4ba-c3fe1044.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/d964-e35ee405.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/eece-9b1186c9.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/9026-0056b213.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/4894-c93ff27e.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/initializers-e188352f.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/53d5-e1ee27c8.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/9946-9c87f159.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/fc17-44f0a0fe.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/Corgi-App-5578808d.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/0e11-7806c3a4.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/bingo_pdp_route-prepare-d22facd5.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/71a3-4e5e722d.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/2cba-0b2d32e5.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/5a56-302bda2d.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/248a-079fbf22.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/3f20-43fca534.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/3fe7-7c5e5dbd.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/db73-649d484e.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/8a31-6d28d6eb.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/bingo_pdp_route-093e4aa4.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/2ce3-462fca54.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/13fb-41bb616c.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/f404-bcb05627.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/7c5a-023179b8.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/cb2d-db65c0cf.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/a295-ed403836.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/a8c2-81cd55c6.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/b7a5-dddbeaf4.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/f297-33c6a6e5.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/7c67-f40b955a.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/f8fa-fe0c81e4.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/fcb4-5d183261.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/698c-e577587b.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/e176-de3bd41d.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/fba3-29d6cafb.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/e99d-0e6236ae.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/bf4b-0f2b80d5.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/459b-ee73efd5.js">
  </script>
  <script crossorigin="anonymous" defer="" src="https://a0.muscache.com/airbnb/static/packages/95b9-a39f5abd.js">
  </scr

Answers [ 2 ]

0 votes
/ April 30, 2020

You need to use WebDriverWait() with presence_of_element_located() to wait until the element has loaded before you read the page source.

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from bs4 import BeautifulSoup


Driver = webdriver.Firefox()
Driver.get("https://www.airbnb.co.uk/rooms/17394193?location=Whitby&check_in=2020-05-18&check_out=2020-05-21&source_impression_id=p3_1588098317_3ZR4OmXOPF8LDdm7&guests=1&adults=1")
url = Driver.current_url
# Wait up to 20 seconds for the calendar table to appear in the DOM.
WebDriverWait(Driver,20).until(EC.presence_of_element_located((By.CSS_SELECTOR,"table[role='presentation']")))
PageSourceURL = Driver.page_source
Soup = BeautifulSoup(PageSourceURL, features='html.parser')
PageHTML = Soup.prettify()
print(PageHTML)
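
Once the wait has succeeded and the page source actually contains the table, the search the question asks about could look roughly like the sketch below. The function name calendar_cells and the sample HTML are made up for illustration; the real Airbnb markup will differ, and only the table[role='presentation'] selector comes from the code above.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for Driver.page_source after the wait has succeeded.
# This markup is an assumption, not Airbnb's real calendar HTML.
SAMPLE_PAGE_SOURCE = """
<html><body>
  <table role="presentation">
    <tr><td>18</td><td>19</td></tr>
    <tr><td>20</td><td></td></tr>
  </table>
</body></html>
"""


def calendar_cells(page_source):
    """Return the non-empty cell texts of every <table role="presentation">."""
    soup = BeautifulSoup(page_source, features="html.parser")
    cells = []
    for table in soup.find_all("table", attrs={"role": "presentation"}):
        for td in table.find_all("td"):
            text = td.get_text(strip=True)
            if text:  # skip padding cells that carry no text
                cells.append(text)
    return cells


print(calendar_cells(SAMPLE_PAGE_SOURCE))
```

If the function returns an empty list for the real page source, the table was probably not in the DOM yet when the source was read.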

0 votes
/ April 30, 2020

Judging by this HTML fragment, you are not on the page you want to be on:

<title>
    Holiday Lets, Homes, Experiences &amp; Places - Airbnb
</title>

The title of the page at the URL you want actually looks like this:

<title>
    Bumble Bee Studio Apartment - Flats for Rent in Whitby, England, United Kingdom
</title>

My guess is that you are being redirected to the Airbnb home page.
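
One way to check this from the script is to compare the parsed title with the generic one before searching any further. This is only a sketch: the helper names page_title and is_generic_page are hypothetical, and the generic title string is copied from the output in the question, so it may change over time.

```python
from bs4 import BeautifulSoup

# Generic title taken from the output in the question (with &amp; decoded).
GENERIC_TITLE = "Holiday Lets, Homes, Experiences & Places - Airbnb"


def page_title(page_source):
    """Return the page title with whitespace normalised, or "" if absent."""
    soup = BeautifulSoup(page_source, features="html.parser")
    if soup.title is None:
        return ""
    # prettify()-style output puts the title on its own indented line,
    # so collapse all runs of whitespace into single spaces.
    return " ".join(soup.title.get_text().split())


def is_generic_page(page_source):
    """True if the source still carries the generic Airbnb title."""
    return page_title(page_source) == GENERIC_TITLE
```

Calling is_generic_page(Driver.page_source) right after Driver.get(...) would show whether the listing has rendered yet or the browser is still on a generic page.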
