Python Selenium: scrape the entire table - PullRequest
0 votes
/ September 21, 2018

The goal of this code is to scrape a data table from several links and then turn it into a Pandas DataFrame.

The problem is that this code only scrapes the first 7 rows, which are on the first page of the table, and I want to capture the whole table. When I tried to loop over the table's pages, I got an error.

Here is the code:

from selenium import webdriver

urls = open(r"C:\Users\Sayed\Desktop\script\sample.txt").readlines()
for url in urls:
    driver = webdriver.Chrome(r"D:\Projects\Tutorial\Driver\chromedriver.exe")
    driver.get(url)
    for item in driver.find_element_by_xpath('//*[contains(@id,"showMoreHistory")]/a'):
        driver.execute_script("arguments[0].click();", item)

    for table in driver.find_elements_by_xpath('//*[contains(@id,"eventHistoryTable")]//tr'):
        data = [item.text for item in table.find_elements_by_xpath(".//*[self::td or self::th]")]
        print(data)

Here is the error:

Traceback (most recent call last):
  File "D:/Projects/Tutorial/ff.py", line 8, in <module>
    for item in driver.find_element_by_xpath('//*[contains(@id,"showMoreHistory")]/a'):
TypeError: 'WebElement' object is not iterable

Answers [ 2 ]

0 votes
/ September 21, 2018

Based on your question and the URL https://www.investing.com/economic-calendar/investing.com-eur-usd-index-1155, you can use the following solution to scrape the entire table:

  • Code block:

    # -*- coding: UTF-8 -*-
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.common.exceptions import TimeoutException
    
    table_rows = []
    options = webdriver.ChromeOptions() 
    options.add_argument("start-maximized")
    options.add_argument('disable-infobars')
    driver=webdriver.Chrome(chrome_options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
    driver.get("https://www.investing.com/economic-calendar/investing.com-eur-usd-index-1155")
    # Wait for a header cell of the history table, then scroll the table into view
    table_header = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "table.genTbl.openTbl.ecHistoryTbl#eventHistoryTable1155 tr>th.left.symbol")))
    driver.execute_script("arguments[0].scrollIntoView(true);", table_header)
    myLength = len(WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "table.genTbl.openTbl.ecHistoryTbl#eventHistoryTable1155 tr[event_attr_id='1155']"))))
    while True:
        try:
            WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div#showMoreHistory1155>a"))).click()
            WebDriverWait(driver, 20).until(lambda driver: len(driver.find_elements_by_css_selector("table.genTbl.openTbl.ecHistoryTbl#eventHistoryTable1155 tr[event_attr_id='1155']")) > myLength)
            table_rows = driver.find_elements_by_css_selector("table.genTbl.openTbl.ecHistoryTbl#eventHistoryTable1155 tr[event_attr_id='1155']")
            myLength = len(table_rows)
        except TimeoutException:
            break
    for row in table_rows:
        print(row.text)
    driver.quit()
    
  • Console output:

    Sep 24, 2018 01:30
    Sep 17, 2018 01:30 53.1%   55.3%
    Sep 10, 2018 01:30 55.3%   49.0%
    Sep 03, 2018 01:30 49.0%   43.3%
    Aug 27, 2018 01:30 43.3%   49.7%
    Aug 20, 2018 01:30 49.7%   52.5%
    Aug 13, 2018 01:30 52.5%   59.9%
    Aug 06, 2018 01:30 59.9%   62.6%
    Jul 30, 2018 01:30 62.6%   52.8%
    Jul 23, 2018 01:30 52.8%   52.7%
    Jul 16, 2018 01:30 52.7%   46.2%
    Jul 10, 2018 01:30 46.2%   55.3%
    Jul 02, 2018 01:30 55.3%   53.1%
    Jun 25, 2018 01:30 53.1%   66.2%
    Jun 18, 2018 01:30 66.2%   65.2%
    Jun 11, 2018 01:30 65.2%   61.2%
    Jun 04, 2018 01:30 61.2%   63.9%
    May 28, 2018 01:30 63.9%   67.0%
    May 21, 2018 01:30 67.0%   63.2%
    May 14, 2018 01:30 63.2%   61.3%
    May 07, 2018 01:30 61.3%   57.6%
    Apr 30, 2018 01:30 57.6%   64.8%
    Apr 23, 2018 01:30 64.8%   65.2%
    Apr 16, 2018 01:30 65.2%   60.4%
    Apr 09, 2018 01:30 60.4%   63.3%
    Apr 02, 2018 01:30 63.3%   62.1%
    Mar 26, 2018 01:30 62.1%   65.7%
    Mar 19, 2018 02:30 65.7%   56.0%
    Mar 12, 2018 02:30 56.0%   62.3%
    Mar 05, 2018 02:30 62.3%   59.1%
    Feb 26, 2018 02:30 59.1%   52.8%
    Feb 19, 2018 02:30 52.8%   55.8%
    Feb 12, 2018 02:30 55.8%   51.7%
    Feb 05, 2018 02:30 51.7%   56.8%
    Jan 29, 2018 02:30 56.8%   52.2%
    Jan 22, 2018 02:30 52.2%   56.1%
    Jan 15, 2018 02:30 56.1%   60.2%
    Jan 08, 2018 02:30 60.2%   54.6%
    Jan 01, 2018 02:30 54.6%   48.4%
    Dec 25, 2017 02:30 48.4%   66.4%
    Dec 18, 2017 02:30 66.4%   58.9%
    Dec 11, 2017 02:30 58.9%   53.8%
    Dec 04, 2017 02:30 53.8%   55.9%
    Nov 28, 2017 02:30 55.9%   53.7%
    Nov 20, 2017 02:30 53.7%   58.6%
    Nov 14, 2017 02:30 58.6%   52.8%
    Nov 06, 2017 02:30 52.8%   57.6%
    Oct 30, 2017 01:30 57.6%   54.7%
    Oct 23, 2017 01:30 54.7%   58.9%
    Oct 16, 2017 01:30 58.9%   57.3%
    Oct 09, 2017 01:30 57.3%   64.0%
    Oct 02, 2017 01:30 64.0%   47.5%
    Sep 25, 2017 01:30 47.5%   52.2%
    Sep 18, 2017 01:30 52.2%   55.5%
    Sep 11, 2017 01:30 55.5%   54.3%
    Sep 04, 2017 01:30 54.3%   54.2%
    Aug 28, 2017 01:30 54.2%   51.4%
    Aug 21, 2017 01:30 51.4%   57.4%
    Aug 14, 2017 01:30 57.4%   51.2%
    Aug 07, 2017 01:30 51.2%   51.3%
    Jul 31, 2017 01:30 51.3%   52.8%
    Jul 24, 2017 01:30 52.8%   53.3%
    Jul 17, 2017 01:30 53.3%   54.1%
    Jul 10, 2017 01:30 54.1%   51.9%
    Jul 03, 2017 01:30 51.9%   40.6%
    Jun 26, 2017 01:30 40.6%   52.6%
    Jun 19, 2017 01:30 52.6%   51.0%
    Jun 12, 2017 01:30 51.0%   52.1%
    Jun 05, 2017 01:30 52.1%   59.1%
    May 29, 2017 01:30 59.1%   46.9%
    May 22, 2017 01:30 46.9%   53.0%
    May 15, 2017 01:30 53.0%   44.9%
    May 08, 2017 01:30 44.9%   37.0%
    May 01, 2017 01:30 37.0%   43.0%
    Apr 24, 2017 01:30 43.0%   52.4%
    Apr 10, 2017 01:30 52.4%   55.1%
    Apr 03, 2017 01:30 55.1%   43.5%
    Mar 27, 2017 02:30 43.5%   36.0%
    Mar 20, 2017 02:30 36.0%   32.3%
    Mar 13, 2017 02:30 32.3%   42.8%
    Mar 06, 2017 02:30 42.8%   39.1%
    Feb 27, 2017 02:30 39.1%   41.7%
    Feb 20, 2017 02:30 41.7%   43.2%
    Feb 13, 2017 02:30 43.2%   36.6%
    Feb 06, 2017 02:30 36.6%   39.7%
    Jan 30, 2017 02:30 39.7%   33.5%
    Jan 23, 2017 02:30 33.5%   36.8%
    Jan 16, 2017 03:30 36.8%   37.0%
    Jan 09, 2017 02:30 37.0%   41.6%
    Jan 02, 2017 02:30 41.6%   35.8%
    Dec 26, 2016 02:30 35.8%   42.3%
    Dec 19, 2016 02:30 42.3%   39.7%
    Dec 12, 2016 04:15 39.7%   33.8%
    Dec 05, 2016 02:30 33.8%   37.1%
    Nov 29, 2016 02:30 37.1%   41.9%
    Nov 21, 2016 02:30 41.9%   39.1%
    Nov 15, 2016 02:00 39.1%   20.5%
    Nov 07, 2016 02:30 20.5%   27.4%
    Oct 31, 2016 02:30 27.4%   33.4%
    Oct 25, 2016 02:30 33.4%   30.8%
    Oct 18, 2016 02:30 30.8%   26.6%
    Oct 10, 2016 02:30 26.6%   28.6%
    Oct 05, 2016 02:00 28.6%   26.2%
    Sep 26, 2016 02:30 26.2%   34.8%
    Sep 19, 2016 02:30 34.8%   21.2%
    Sep 13, 2016 02:30 21.2%   27.0%
    Sep 05, 2016 02:30 27.0%   32.7%
    Aug 29, 2016 02:30 32.7%   23.9%
    Aug 22, 2016 02:30 23.9%   28.8%
    Aug 15, 2016 02:30 28.8%   30.8%
    Aug 08, 2016 02:30 30.8%   20.3%
    Aug 01, 2016 02:30 20.3%   30.2%
    Jul 25, 2016 02:30 30.2%   29.5%
    Jul 18, 2016 02:30 29.5%   26.2%
    Jul 11, 2016 02:30 26.2%   27.5%
    Jul 04, 2016 02:30 27.5%   26.8%
    Jun 27, 2016 02:30 26.8%   35.1%
    Jun 20, 2016 02:30 35.1%   22.8%
    Jun 13, 2016 02:30 22.8%   32.5%
    Jun 06, 2016 02:30 32.5%   35.6%
    May 30, 2016 02:30 35.6%   39.5%
    May 23, 2016 02:30 39.5%   37.8%
    May 16, 2016 03:30 37.8%   39.5%
    May 09, 2016 02:30 39.5%   30.3%
    May 02, 2016 02:30 30.3%   32.9%
    Apr 25, 2016 02:30 32.9%   29.6%
    Apr 18, 2016 06:00 29.6%   30.5%
    Apr 11, 2016 02:30 30.5%   22.7%
    Apr 04, 2016 03:30 22.7%   32.1%
    Mar 28, 2016 03:30 32.1%   23.2%
    Mar 21, 2016 03:30 23.2%   26.7%
    Mar 14, 2016 03:30 26.7%   22.6%
    Mar 07, 2016 03:30 22.6%   33.7%
    Feb 29, 2016 03:30 33.7%   34.8%
    Feb 22, 2016 03:30 34.8%   33.3%
    Feb 15, 2016 03:30 33.3%   33.3%
    Feb 08, 2016 03:30 33.3%   34.3%
    Feb 01, 2016 03:30 34.3%   33.2%
    Jan 25, 2016 03:30 33.2%   27.0%
    Jan 18, 2016 03:30 27.0%   27.2%
    Jan 11, 2016 03:30 27.2%   30.0%
    Jan 05, 2016 03:30 30.0%   24.0%
    Dec 29, 2015 03:30 24.0%   33.3%
    Dec 21, 2015 03:30 33.3%   31.2%
    Dec 14, 2015 04:30 31.2%   27.1%
    Dec 07, 2015 03:00 27.1%   29.8%
    Dec 01, 2015 03:00 29.8%   27.5%
    Nov 23, 2015 03:00 27.5%   33.1%
    Nov 17, 2015 04:00 33.1%   26.8%
    Nov 09, 2015 02:30 26.8%   24.3%
    Nov 02, 2015 01:30 24.3%   36.4%
    Oct 26, 2015 01:30 36.4%   28.6%
    Oct 19, 2015 01:30 28.6%   25.5%
    Oct 11, 2015 04:30 25.5%   29.6%
    Oct 06, 2015 01:00 29.6%   28.5%
    Sep 28, 2015 01:30 28.5%   29.1%
    Sep 21, 2015 01:30 29.1%   21.2%
    Sep 14, 2015 01:30 21.2%   29.8%
    Sep 07, 2015 01:30 29.8%   36.3%
    Aug 31, 2015 01:30 36.3%   35.6%
    Aug 24, 2015 01:30 35.6%   26.4%
    Aug 17, 2015 01:30 26.4%   24.8%
    Aug 10, 2015 01:30 24.8%   29.7%
    Aug 03, 2015 01:30 29.7%   24.8%
    Jul 27, 2015 01:30 24.8%   30.7%
    Jul 20, 2015 01:30 30.7%   27.9%
    Jul 13, 2015 01:30 27.9%   27.4%
    Jul 07, 2015 01:30 27.4%   26.8%
    Jun 29, 2015 01:30 26.8%   33.1%
    Jun 22, 2015 01:30 33.1%   33.6%
    Jun 15, 2015 03:30 33.6%   28.9%
    Jun 08, 2015 01:30 28.9%   23.0%
    Jun 01, 2015 01:30 23.0%   34.0%
    May 25, 2015 04:00 34.0%   28.9%
    May 18, 2015 01:30 28.9%   28.8%
    May 11, 2015 01:30 28.8%   28.3%
    May 04, 2015 02:00 28.3%   23.7%
    Apr 27, 2015 01:30 23.7%   27.2%
    Apr 20, 2015 01:30 27.2%   33.7%
    Apr 13, 2015 02:00 33.7%   23.2%
    Apr 06, 2015 02:00 23.2%   19.8%
    Mar 30, 2015 02:30 19.8%   24.1%
    Mar 23, 2015 02:30 24.1%   27.2%
    Mar 16, 2015 03:00 27.2%   35.6%
    Mar 09, 2015 02:30 35.6%   34.4%
    Mar 02, 2015 02:30 34.4%   30.2%
    Feb 23, 2015 02:30 30.2%   26.6%
    Feb 16, 2015 03:30 26.6%   23.8%
    Feb 09, 2015 02:30 23.8%   26.4%
    Feb 02, 2015 02:30 26.4%   23.9%
    Jan 26, 2015 02:30 23.9%   28.9%
    Jan 19, 2015 02:30 28.9%   35.5%
    Jan 12, 2015 02:30 35.5%   38.1%
    Jan 06, 2015 03:30 38.1%   40.6%
    Jan 01, 2015 02:30 40.6%   45.2%
    Dec 22, 2014 02:00 45.2%   39.8%
    Dec 15, 2014 02:00 39.8%   41.7%
    Dec 07, 2014 21:00 41.7%   33.8%
    Dec 02, 2014 03:00 33.8%   38.6%
    Nov 24, 2014 01:30 38.6%   39.2%
    Nov 17, 2014 01:00 39.2%   33.1%
    Nov 10, 2014 01:00 33.1%   35.4%
    Nov 04, 2014 03:00 35.4%   37.3%
    Oct 27, 2014 02:00 37.3%   33.7%
    Oct 19, 2014 22:00 33.7%   36.2%
    Oct 13, 2014 01:00 36.2%   44.5%
    Oct 06, 2014 01:00 44.5%   41.3%
    Sep 29, 2014 01:00 41.3%   50.3%
    Sep 21, 2014 22:35 50.3%   39.5%
    Sep 15, 2014 00:45 39.5%   39.9%
    Sep 08, 2014 01:00 39.9%   42.8%
    Sep 01, 2014 02:35 42.8%   41.9%
    Aug 25, 2014 01:00 41.9%   38.9%
    Aug 18, 2014 01:00 38.9%   34.0%
    Aug 11, 2014 01:00 34.0%   38.2%
    Aug 04, 2014 01:00 38.2%   38.4%
    Jul 28, 2014 01:00 38.4%   42.3%
    Jul 21, 2014 01:00 42.3%   37.2%
    Jul 14, 2014 01:00 37.2%   39.6%
    Jul 07, 2014 01:00 39.6%   39.8%
    Jun 30, 2014 01:00 39.8%   36.1%
    Jun 23, 2014 00:30 36.1%   37.6%
    Jun 16, 2014 00:30 37.6%   36.5%
    Jun 09, 2014 00:30 36.5%   44.1%
    Jun 01, 2014 22:00 44.1%   49.4%
    May 26, 2014 00:30 49.4%   41.0%
    May 19, 2014 00:00 41.0%   55.0%
    May 12, 2014 00:00 55.0%   41.1%
    May 04, 2014 06:00 41.1%   43.5%
    Apr 27, 2014 06:00 43.5%   40.3%
    Apr 06, 2014 06:00 40.3%
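
Since the question's goal is a Pandas DataFrame, here is a minimal sketch of how the printed lines above could be split into records. It is not part of the original answer. The column names `actual` and `previous` are my own guess (each row's second value matches the next row's first value), and the pattern assumes every line looks like the sample output.

```python
import re

# A few lines copied from the console output above.
SAMPLE_ROWS = [
    "Sep 24, 2018 01:30",
    "Sep 17, 2018 01:30 53.1%   55.3%",
    "Apr 06, 2014 06:00 40.3%",
]

# Release date, release time, then whatever percentage values follow.
ROW_PATTERN = re.compile(r"^(\w{3} \d{2}, \d{4}) (\d{2}:\d{2})\s*(.*)$")


def parse_row(text):
    """Split one printed row into a dict, or return None if it does not match.

    The keys "actual" and "previous" are hypothetical column names.
    """
    match = ROW_PATTERN.match(text.strip())
    if match is None:
        return None
    date_text, time_text, rest = match.groups()
    values = [float(value.rstrip("%")) for value in rest.split()]
    values += [None] * (2 - len(values))  # pad rows that have missing values
    return {
        "date": date_text,
        "time": time_text,
        "actual": values[0],
        "previous": values[1],
    }


parsed = (parse_row(line) for line in SAMPLE_ROWS)
records = [record for record in parsed if record is not None]
for record in records:
    print(record)
```

With pandas installed, passing `records` to `pandas.DataFrame` should give one row per release; in the real script you would feed `row.text` for each element of `table_rows` to `parse_row` instead of the sample strings.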
    
0 votes
/ September 21, 2018

Try the script below to get the entire table from that web page. I used a hardcoded delay in my script, which is not good practice. However, you can always define an Explicit Wait to make the code more robust:

import time
from selenium import webdriver

url = 'https://www.investing.com/economic-calendar/investing.com-eur-usd-index-1155'

driver = webdriver.Chrome()
driver.get(url)
item = driver.find_element_by_xpath('//*[contains(@id,"showMoreHistory")]/a')
driver.execute_script("arguments[0].click();", item)
time.sleep(2)
for table in driver.find_elements_by_xpath('//*[contains(@id,"eventHistoryTable")]//tr'):
    data = [item.text for item in table.find_elements_by_xpath(".//*[self::td or self::th]")]
    print(data)

driver.quit()
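
To illustrate why an explicit wait is more robust than the fixed `time.sleep(2)` above, here is a browser-free sketch of the idea: poll a condition until it returns something truthy or a timeout expires. `wait_until` and `FakePage` are hypothetical names made up for this illustration; they are not Selenium's `WebDriverWait`, only a rough model of what it does.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


class FakePage:
    """Stand-in for a page whose rows only show up after a few polls."""

    def __init__(self, ready_after):
        self.polls = 0
        self.ready_after = ready_after

    def rows(self):
        self.polls += 1
        return ["row"] * 7 if self.polls >= self.ready_after else []


page = FakePage(ready_after=3)
rows = wait_until(page.rows, timeout=2.0, poll_interval=0.01)
print(len(rows), "rows after", page.polls, "polls")
```

The wait returns as soon as the rows exist, so it neither wastes time when the page is fast nor fails when the page needs longer than two seconds.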

To get all the data by exhausting the show more button, with an Explicit Wait defined, you can try the following script:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = 'https://www.investing.com/economic-calendar/investing.com-eur-usd-index-1155'

driver = webdriver.Chrome()
driver.get(url)
wait = WebDriverWait(driver,10)

while True:
    try:
        item = wait.until(EC.visibility_of_element_located((By.XPATH,'//*[contains(@id,"showMoreHistory")]/a')))
        driver.execute_script("arguments[0].click();", item)
    except Exception:
        # Typically a timeout once the link is no longer visible: stop clicking
        break

for table in wait.until(EC.visibility_of_all_elements_located((By.XPATH,'//*[contains(@id,"eventHistoryTable")]//tr'))):
    data = [item.text for item in table.find_elements_by_xpath(".//*[self::td or self::th]")]
    print(data)

driver.quit()
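
The script above prints each row as a list of cell texts. As a rough sketch of the last step the question asks for, the lists could be collected and mapped onto the header row. The sample data and its column names below are assumptions for illustration, not output captured from the page, and `rows_to_records` is a hypothetical helper.

```python
# Hypothetical sample in the shape printed by the script: one list per <tr>.
SAMPLE_TABLE = [
    ["Release Date", "Time", "Actual", "Forecast", "Previous"],
    ["Sep 17, 2018", "01:30", "53.1%", "", "55.3%"],
    ["Sep 10, 2018", "01:30", "55.3%", "", "49.0%"],
    [],  # rows without any cell text are skipped
]


def rows_to_records(rows):
    """Treat the first non-empty row as the header and map later rows onto it."""
    rows = [row for row in rows if any(cell.strip() for cell in row)]
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


records = rows_to_records(SAMPLE_TABLE)
for record in records:
    print(record)
```

In the real script you would append `data` to a list inside the loop instead of printing it; the resulting list of dicts is a shape that `pandas.DataFrame` accepts directly.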
...