This is a site that records air pollution data. The data is hourly and spans several years. There are several drop-down menus, such as location (for example, Taipei (臺北市)), year, month, and day. I have written as much code as I know how to so far and attached it below. I am wondering how to select a location so that I can collect all of the data from the site. I have also attached the page content below; my goal is to collect the data in the last 20 lines of that content.
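import requests
from bs4 import BeautifulSoup

# POST to the query page (no form fields are sent yet)
res = requests.post('https://erdb.epa.gov.tw/DataRepository/Air/Flue_CEMS_DATA.aspx')
soup = BeautifulSoup(res.text, 'lxml')
# Find the first cell containing '台北市' and walk up: td -> tr -> table
table = soup.find_all(text='台北市')[0].parent.parent.parent
for x in range(1, 51):      # rows 1-50 (row 0 is the header row)
    for y in range(0, 13):  # 13 columns per row
        data1 = table.select('tr')[x].select('td')[y].text
        print(data1)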
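The request above sends no form fields, so the server most likely answers with its default selection. A minimal sketch of how the form data for a chosen location and date might be built is shown here. The drop-down names and the search button name are taken from the page content; the helper names `HiddenInputCollector` and `build_search_payload` are hypothetical. It assumes the page carries the usual ASP.NET hidden inputs (such as `__VIEWSTATE`), which do not appear in the excerpt, and that the image button is submitted as `name.x` / `name.y` the way a browser would. It uses only the standard library so it can be tried without a network connection.

```python
from html.parser import HTMLParser

# Field names copied from the <select> and <input type="image"> tags on the page.
PREFIX = "ctl00$ContentPlaceHolder1$ucSearchCondition$"
SEARCH_BUTTON = "ctl00$ContentPlaceHolder1$imgSearch"


class HiddenInputCollector(HTMLParser):
    """Collect the name/value pair of every <input type="hidden"> on a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        is_hidden = (attrs.get("type") or "").lower() == "hidden"
        if tag == "input" and is_hidden and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


def build_search_payload(page_html, city, year, month, day):
    """Return the form fields for one search (hypothetical helper)."""
    collector = HiddenInputCollector()
    collector.feed(page_html)
    payload = dict(collector.fields)  # assumed: __VIEWSTATE and friends
    payload.update({
        PREFIX + "ddlEPB": city,
        PREFIX + "ddlYearS": year,
        PREFIX + "ddlMonthS": month,
        PREFIX + "ddlDayS": day,
        # An image button is submitted as click coordinates.
        SEARCH_BUTTON + ".x": "1",
        SEARCH_BUTTON + ".y": "1",
    })
    return payload


# Possible use with requests (not executed here; the site's behaviour is unverified):
#   session = requests.Session()
#   page = session.get(url)
#   res = session.post(url, data=build_search_payload(page.text, "新北市", "2018", "03", "05"))
```

Looping over the `<option>` values of each drop-down and calling the helper once per combination would then cover every location and date, provided the server accepts the request in this form.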
I expect to be able to scrape all of the data for the selected location. The content of the site is shown below. I only need the information in the last 20 lines.
<select name="ctl00$ContentPlaceHolder1$ucSearchCondition$ddlEPB" id="ctl00_ContentPlaceHolder1_ucSearchCondition_ddlEPB">
<option selected="selected" value="臺北市">臺北市</option>
<option value="新北市">新北市</option>
<option value="基隆市">基隆市</option>
<option value="桃園市">桃園市</option>
<option value="新竹市">新竹市</option>
<option value="新竹縣">新竹縣</option>
<option value="苗栗縣">苗栗縣</option>
<option value="臺中市">臺中市</option>
<option value="彰化縣">彰化縣</option>
<option value="雲林縣">雲林縣</option>
<option value="南投縣">南投縣</option>
<option value="嘉義市">嘉義市</option>
<option value="嘉義縣">嘉義縣</option>
<option value="臺南市">臺南市</option>
<option value="高雄市">高雄市</option>
<option value="屏東縣">屏東縣</option>
<option value="宜蘭縣">宜蘭縣</option>
<option value="花蓮縣">花蓮縣</option>
<option value="臺東縣">臺東縣</option>
<option value="澎湖縣">澎湖縣</option>
<option value="連江縣">連江縣</option>
<option value="金門縣">金門縣</option>
</select>
</td>
</tr>
<tr>
<td>日期:</td>
<td>
<select name="ctl00$ContentPlaceHolder1$ucSearchCondition$ddlYearS" id="ctl00_ContentPlaceHolder1_ucSearchCondition_ddlYearS">
<option selected="selected" value="2019">2019</option>
<option value="2018">2018</option>
<option value="2017">2017</option>
<option value="2016">2016</option>
<option value="2015">2015</option>
<option value="2014">2014</option>
<option value="2013">2013</option>
<option value="2012">2012</option>
<option value="2011">2011</option>
<option value="2010">2010</option>
<option value="2009">2009</option>
<option value="2008">2008</option>
<option value="2007">2007</option>
<option value="2006">2006</option>
<option value="2005">2005</option>
<option value="2004">2004</option>
</select>年
<select name="ctl00$ContentPlaceHolder1$ucSearchCondition$ddlMonthS" id="ctl00_ContentPlaceHolder1_ucSearchCondition_ddlMonthS">
<option value="01">01</option>
<option value="02">02</option>
<option value="03">03</option>
<option selected="selected" value="04">04</option>
<option value="05">05</option>
<option value="06">06</option>
<option value="07">07</option>
<option value="08">08</option>
<option value="09">09</option>
<option value="10">10</option>
<option value="11">11</option>
<option value="12">12</option>
</select>月
<select name="ctl00$ContentPlaceHolder1$ucSearchCondition$ddlDayS" id="ctl00_ContentPlaceHolder1_ucSearchCondition_ddlDayS">
<option value="01">01</option>
<option value="02">02</option>
<option value="03">03</option>
<option value="04">04</option>
<option value="05">05</option>
<option value="06">06</option>
<option value="07">07</option>
<option value="08">08</option>
<option value="09">09</option>
<option value="10">10</option>
<option value="11">11</option>
<option value="12">12</option>
<option value="13">13</option>
<option value="14">14</option>
<option value="15">15</option>
<option value="16">16</option>
<option selected="selected" value="17">17</option>
<option value="18">18</option>
<option value="19">19</option>
<option value="20">20</option>
<option value="21">21</option>
<option value="22">22</option>
<option value="23">23</option>
<option value="24">24</option>
<option value="25">25</option>
<option value="26">26</option>
<option value="27">27</option>
<option value="28">28</option>
<option value="29">29</option>
<option value="30">30</option>
</select>日
</td>
</tr>
<tr style="display:none;">
<td>日期區間(迄):</td>
<td>
<span id="ctl00_ContentPlaceHolder1_ucSearchCondition_lbYearE">2019</span>年
<span id="ctl00_ContentPlaceHolder1_ucSearchCondition_lbMonthE">04</span>月
<span id="ctl00_ContentPlaceHolder1_ucSearchCondition_lbDayE">17</span>日
</td>
</tr>
</table>
</div>
</div>
</div>
<input type="image" name="ctl00$ContentPlaceHolder1$imgSearch" id="ctl00_ContentPlaceHolder1_imgSearch" src="../../Resource/images/search.png" onclick="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("ctl00$ContentPlaceHolder1$imgSearch", "", true, "", "", false, false))" border="0" />
</div>
<div id="description" style="float: right; width: 57%;">
<link href="/Resource/css/MetaStyle.css" rel="stylesheet" type="text/css" />
<div>
<table class="ExplainBox" id="tbMetaData" border="0">
<tr>
<td style="width:200px" class="title-r">資料集名稱</td>
<td>
<span id="ctl00_ContentPlaceHolder1_ucMetaData_lblDataSetName">固定污染源CEMS監測數據紀錄值資料集</span>
</td>
</tr>
<tr>
<td class="title-r">資料集描述</td>
<td>
<span id="ctl00_ContentPlaceHolder1_ucMetaData_lblMetaDesc">本資料集收錄CEMS監測數據紀錄值資料,因資料整備特性,提供七日前資料。</span>
</td>
</tr>
<tbody id="ctl00_ContentPlaceHolder1_ucMetaData_tbodyMIS" align="left">
<tr>
<td class="auto-style1 title-r">主要欄位說明</td>
<td class="auto-style1">
<span id="ctl00_ContentPlaceHolder1_ucMetaData_lblFieldDesc">所屬環保局(Epb)、管制編號(CNO)、公司簡稱(Abbr)、煙囪序號(PolNo)、監測項目名稱(ItemDesc)、監測項目編號(Item)、監測時間(M_Time)、監測數值(M_Val)、排放標準值(Std)、單位(Unit)、資料辨識碼(Code2)、排放標準依據(Std_s)。</span>
</td>
</tr>
<tr>
<td class="title-r">收錄期間</td>
<td>
<span id="ctl00_ContentPlaceHolder1_ucMetaData_lblDataPeriod">2004/01/01至2019/04/17</span>
</td>
</tr>
</tbody>
<tr>
<td class="title-r">更新頻率</td>
<td>
<span id="ctl00_ContentPlaceHolder1_ucMetaData_lblUpdateFrequencyId">每天</span>
</td>
</tr>
<tr>
<td class="title-r">資料集內容最後更新日期</td>
<td>
<span id="ctl00_ContentPlaceHolder1_ucMetaData_lblUpdateTime">2019/04/17</span>
</td>
</tr>
<tr>
<td class="title-r">提供機關</td>
<td>
<span id="ctl00_ContentPlaceHolder1_ucMetaData_lblDatasetAgencyId">行政院環境保護署</span>
</td>
</tr>
</table>
</div>
</div>
<div class="clr"></div>
</div>
<div class="title" style="float: left;">
<ul>
<li class="active"><a href="#">固定污染源CEMS監測數據紀錄值資料集 </a></li>
</ul>
</div>
<div style="float: right;">
<script type="text/javascript" src="/Resource/js/gvColspan.js" charset="UTF-8"></script>
<script>
$(function () {
//註解
if ($.trim($('#ctl00_ContentPlaceHolder1_ShareAndExport_Label2').html()) == "") {
$('.tbResult').each(function () {
var comment1 = $(this).parents('tr:first').next().children(':first').html()
var comment2 = $(this).parents('tr:first').next().next().children(':first').html()
if (comment1 != null && $.trim(comment1) != "") {
if (comment2 != null) {
$('#ctl00_ContentPlaceHolder1_ShareAndExport_Label1').html(comment1)
$('#ctl00_ContentPlaceHolder1_ShareAndExport_comment1').val(comment1.replace(/<br>/ig, "$"))
$('#ctl00_ContentPlaceHolder1_ShareAndExport_Label2').html(comment2)
$('#ctl00_ContentPlaceHolder1_ShareAndExport_comment2').val(comment2.replace(/<br>/ig, "$"))
} else {
$('#ctl00_ContentPlaceHolder1_ShareAndExport_Label2').html(comment1);
$('#ctl00_ContentPlaceHolder1_ShareAndExport_comment2').val(comment1.replace(/<br>/ig, "$"))
}
}
})
}
})
function myPrint() {
var newWindow = window.open("../../Resource/viewPrint.aspx", "_blank");
return false;
}
function printScreen(printlist) {
var value = printlist.innerHTML;
var printPage = window.open("", "Printing...", "");
printPage.document.open();
printPage.document.write("<HTML><head>");
printPage.document.write("<link rel='stylesheet' href='../../../Resource/css/PageStyle.css' />");
printPage.document.write("</head><BODY><input type='button' value='列印報表' onclick='window.print();window.close();'></input></br></br>");
printPage.document.write(value);
printPage.document.close("</BODY></HTML>");
}
function newwindow() {
var tagname = $('#sitemap').text().split('/')[4].trim();
var description = $('#ctl00_ContentPlaceHolder1_ucMetaData_lblMetaDesc').text().trim();
var picture = $('.logo img').attr('src');
var caption = $('#ctl00_ContentPlaceHolder1_ucMetaData_lblDatasetAgencyId').text().trim();;
//window.open("https://www.facebook.com/dialog/feed?app_id=1470773149913325&redirect_uri=https://www.facebook.com&display=popup&caption=" + encodeURIComponent(tagname) + "&name=" + encodeURIComponent(tagname) + "&description" + encodeURIComponent(tagname) + "&link=" + encodeURIComponent(location.href));
window.open("https://www.facebook.com/dialog/feed?app_id=1470773149913325&redirect_uri=https://www.facebook.com&display=popup&caption=" + encodeURIComponent(caption) + "&picture" + encodeURIComponent(picture) + "&name=" + encodeURIComponent(tagname) + "&description=" + encodeURIComponent(description) + "&link=" + encodeURIComponent(location.href));
}
</script>
<style type="text/css">
#ctl00_ContentPlaceHolder1_ShareAndExport_gvPrint th {
background-color: #DEDEDE;
}
</style>
<div>
<a href="javascript:newwindow()">
<img src="../../Resource/images/btnFB.jpg" /></a>
<a href="javascript: void(window.open('https://plus.google.com/share?url='.concat(encodeURIComponent(location.href)),'gplusshare'))">
<img src="../../Resource/images/btnGooglePlus.jpg" /></a>
<input type="image" name="ctl00$ContentPlaceHolder1$ShareAndExport$ibtnExcel" id="ctl00_ContentPlaceHolder1_ShareAndExport_ibtnExcel" src="../../Resource/images/btnCSV.png" onclick="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("ctl00$ContentPlaceHolder1$ShareAndExport$ibtnExcel", "", true, "", "", false, false))" border="0" />
<input type="image" name="ctl00$ContentPlaceHolder1$ShareAndExport$ImageButton1" id="ctl00_ContentPlaceHolder1_ShareAndExport_ImageButton1" src="../../Resource/images/btnPrint.jpg" alt="列印" onclick="printScreen(ContentPlaceHolder1_myHead_printGV); return false;WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("ctl00$ContentPlaceHolder1$ShareAndExport$ImageButton1", "", true, "", "", false, false))" border="0" />
</div>
<div id="ctl00_ContentPlaceHolder1_ShareAndExport_printdiv" style="display: none;">
<div>
<table class="gvColspan" cellspacing="0" rules="all" border="1" id="ctl00_ContentPlaceHolder1_ShareAndExport_gvPrint" width="90%">
<tr>
<th class="RowSpan" scope="col">所屬環保局</th><th class="RowSpan" scope="col">管制編號</th><th scope="col">公司簡稱</th><th scope="col">煙囪序號</th><th scope="col">監測項目</th><th scope="col">監測項目編號</th><th scope="col">監測時間</th><th scope="col">監測數值</th><th scope="col">排放標準</th><th scope="col">單位</th><th scope="col">資料識別碼</th><th scope="col">資料識別碼</th><th scope="col">排放標準依據</th>
</tr><tr>
<td align="center" width="60">台北市</td><td align="center" width="80">A4000283</td><td align="center" width="80">臺北市政府環境保護局木柵垃圾焚化廠</td><td align="center" width="80">P002</td><td align="center" width="80">氮氧化物監測設施十五分鐘數據紀錄值</td><td align="center" width="60">923 </td><td align="center" width="30">2019-04-17 00:00:00</td><td align="center" width="30">0.00</td><td align="center" width="50">220</td><td align="center" width="50">ppm </td><td align="center" width="50">00</td><td align="center" width="50">固定污染源暫停運轉時監測設施之量測值</td><td align="center" width="50">廢棄物焚化爐空氣污染物排放標準</td>
</tr><tr>
<td align="center" width="60">台北市</td><td align="center" width="80">A4000283</td><td align="center" width="80">臺北市政府環境保護局木柵垃圾焚化廠</td><td align="center" width="80">P002</td><td align="center" width="80">氮氧化物監測設施一小時數據平均值</td><td align="center" width="60">223 </td><td align="center" width="30">2019-04-17 00:00:00</td><td align="center" width="30">0.00</td><td align="center" width="50">220</td><td align="center" width="50">ppm </td><td align="center" width="50">00</td><td align="center" width="50">固定污染源暫停運轉時監測設施之量測值</td><td align="center" width="50">廢棄物焚化爐空氣污染物排放標準</td>
</tr><tr>
<td align="center" width="60">台北市</td><td align="center" width="80">A4000283</td><td align="center" width="80">臺北市政府環境保護局木柵垃圾焚化廠</td><td align="center" width="80">P002</td><td align="center" width="80">氯化氫監測設施十五分鐘數據紀錄值</td><td align="center" width="60">926 </td><td align="center" width="30">2019-04-17 00:00:00</td><td align="center" width="30">0.00</td><td align="center" width="50">60</td><td align="center" width="50">ppm </td><td align="center" width="50">00</td><td align="center" width="50">固定污染源暫停運轉時監測設施之量測值</td><td align="center" width="50">廢棄物焚化爐空氣污染物排放標準</td>
</tr><tr>
<td align="center" width="60">台北市</td><td align="center" width="80">A4000283</td><td align="center" width="80">臺北市政府環境保護局木柵垃圾焚化廠</td><td align="center" width="80">P004</td><td align="center" width="80">不透光率六分鐘數據紀錄值</td><td align="center" width="60">911 </td><td align="center" width="30">2019-04-17 00:00:00</td><td align="center" width="30">0.00</td><td align="center" width="50">20</td><td align="center" width="50">% </td><td align="center" width="50">00</td><td align="center" width="50">固定污染源暫停運轉時監測設施之量測值</td><td align="center" width="50">廢棄物焚化爐空氣污染物排放標準</td>
</tr><tr>
<td align="center" width="60">台北市</td><td align="center" width="80">A4000283</td><td align="center" width="80">臺北市政府環境保護局木柵垃圾焚化廠</td><td align="center" width="80">P004</td><td align="center" width="80">氮氧化物監測設施一小時數據平均值</td><td align="center" width="60">223 </td><td align="center" width="30">2019-04-17 00:00:00</td><td align="center" width="30">0.00</td><td align="center" width="50">220</td><td align="center" width="50">ppm </td><td align="center" width="50">00</td><td align="center" width="50">固定污染源暫停運轉時監測設施之量測值</td><td align="center" width="50">廢棄物焚化爐空氣污染物排放標準</td>
</tr><tr>
<td align="center" width="60">台北市</td><td align="center" width="80">A4000283</td><td align="center" width="80">臺北市政府環境保護局木柵垃圾焚化廠</td><td align="center" width="80">P004</td><td align="center" width="80">氧氣監測設施一小時數據平均值</td><td align="center" width="60">236 </td><td align="center" width="30">2019-04-17 00:00:00</td><td align="center" width="30">0.00</td><td align="center" width="50">無排放標準</td><td align="center" width="50">% </td><td align="center" width="50">00</td><td align="center" width="50">固定污染源暫停運轉時監測設施之量測值</td><td align="center" width="50">無</td>
</tr><tr>
<td align="center" width="60">台北市</td><td align="center" width="80">A4000283</td><td align="center" width="80">臺北市政府環境保護局木柵垃圾焚化廠</td><td align="center" width="80">P004</td><td align="center" width="80">氧氣監測設施十五分鐘數據紀錄值</td><td align="center" width="60">936 </td><td align="center" width="30">2019-04-17 00:00:00</td><td align="center" width="30">0.00</td><td align="center" width="50">無排放標準</td><td align="center" width="50">% </td><td align="center" width="50">00</td><td align="center" width="50">固定污染源暫停運轉時監測設施之量測值</td><td align="center" width="50">無</td>
</tr><tr>
<td align="center" width="60">台北市</td><td align="center" width="80">A4000283</td><td align="center" width="80">臺北市政府環境保護局木柵垃圾焚化廠</td><td align="center" width="80">P004</td><td align="center" width="80">排放流率監測設施一小時數據平均值</td><td align="center" width="60">248 </td><td align="center" width="30">2019-04-17 00:00:00</td><td align="center" width="30">0.00</td><td align="center" width="50">無排放標準</td><td align="center" width="50">CMH </td><td align="center" width="50">00</td><td align="center" width="50">固定污染源暫停運轉時監測設施之量測值</td><td align="center" width="50">無</td>
</tr><tr>
....
....
....
....
....
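The rows I am after sit in the table whose id is `ctl00_ContentPlaceHolder1_ShareAndExport_gvPrint`, so one way to read them without hard-coding row and column counts might look like the sketch below. The names `ResultTableParser` and `parse_result_table` are hypothetical, and the code assumes the result table contains no nested tables. It uses only the standard library; the same idea could be written with BeautifulSoup by selecting the table by its id.

```python
from html.parser import HTMLParser

# Table id copied from the page content above.
RESULT_TABLE_ID = "ctl00_ContentPlaceHolder1_ShareAndExport_gvPrint"


class ResultTableParser(HTMLParser):
    """Collect the cell text of every row in the table with the given id."""

    def __init__(self, table_id):
        super().__init__()
        self.table_id = table_id
        self.in_table = False
        self.in_cell = False
        self.rows = []
        self._row = []
        self._cell = []

    def handle_starttag(self, tag, attrs):
        if tag == "table" and dict(attrs).get("id") == self.table_id:
            self.in_table = True
        elif self.in_table and tag == "tr":
            self._row = []
        elif self.in_table and tag in ("td", "th"):
            self.in_cell = True
            self._cell = []

    def handle_endtag(self, tag):
        if not self.in_table:
            return
        if tag in ("td", "th") and self.in_cell:
            self.in_cell = False
            self._row.append("".join(self._cell).strip())
        elif tag == "tr":
            if self._row:  # skip rows that produced no cells
                self.rows.append(self._row)
            self._row = []
        elif tag == "table":
            self.in_table = False

    def handle_data(self, data):
        if self.in_table and self.in_cell:
            self._cell.append(data)


def parse_result_table(page_html, table_id=RESULT_TABLE_ID):
    """Return a list of rows, each a list of stripped cell strings."""
    parser = ResultTableParser(table_id)
    parser.feed(page_html)
    return parser.rows
```

The first returned row is the header row. Rows are kept as lists rather than dictionaries because the header contains 資料識別碼 twice, so keying cells by header text would silently drop one column.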