Я скачал набор данных, который дает вам файлы .XLS, но когда я открываю его с помощью блокнота, я вижу HTML теги / код.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title>Monthly Volume Report NRA 000000001508 2019-08-01 to 2019-08-31</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<style type="text/css">
/*<![CDATA[*/
.key td { white-space: nowrap; }
.key td.A,.key td.W,.key td.S,.key td.C,.key td.H { white-space: pre; }
td.A { background-color: #DD9CB3; } /* Salmon */
td.W { background-color: #80F7B5; } /* SeaGreen */
td.S { background-color: #FFFF80; } /* Yellow */
td.C { background-color: #A6CAF0; } /* Blue */
td.H { background-color: #CCCCCC; } /* Weekend grey */
td.WeA { background-color: #99586F; }
td.WeW { background-color: #3CB371; }
td.WeS { background-color: #CCCC40; }
td.WeC { background-color: #6286AC; }
td.WeH { background-color: #CCCCCC; }
td.We,th.We { background-color: #CCCCCC; }
td.P { border: 1px solid #808080; } /* Patched - border */
td.TLI { color:#A0A0A0; } /* Too Little Information - grey text */
td.NOI { color:red; } /* Too Little Information - red text */
.grid td { border:none; } /* No borders by default */
.grid .TBL td, .grid .TBL { border-left:2px solid #000000; } /* Thick left border for intermediate totals */
.grid .TBR td, .grid .TBR { border-right:2px solid #000000; } /* Thick right border for intermediate totals */
.grid .TBT td, .grid .TBT { border-top:2px solid #000000; } /* Thick top border for intermediate totals */
.grid .TBB td, .grid .TBB { border-bottom:2px solid #000000; } /* Thick bottom border for intermediate totals */
.grid .BL td, .grid .BL { border-left:1px solid #000000; } /* Thin left border for intermediate totals */
.grid .BR td, .grid .BR { border-right:1px solid #000000; } /* Thin right border for intermediate totals */
.grid .BT td, .grid .BT { border-top:1px solid #000000; } /* Thin top border for intermediate totals */
.grid .BB td, .grid .BB { border-bottom:1px solid #000000; } /* Thin bottom border for intermediate totals */
.grid .TBP th, .grid .TBP { padding-top: 1.5ex; padding-bottom: 1.5ex; mso-ignore: padding; } /* Top and bottom padding */
th { mso-number-format: \@; } /* So dates in headers are interpreted as strings */
th.XlDM { mso-number-format:"d mmm"; } /* Force short dates */
th.WeXlDM { background-color: #CCCCCC; mso-number-format:"d mmm"; } /* Force short dates */
/*]]>*/
</style>
</head>
<body class="fixed">
<div id="wrapper">
<b>Monthly Volume Report NRA 000000001508 2019-08-01 to 2019-08-31</b> <div id="container">
<div id="content">
<div class="panel panel-default"> <div class="panel-heading"> <table>
<tr><td>Site Name</td><td style='mso-number-format:"\@";' colspan=3>TMU M50 015.0 S </td></tr>
<tr><td>Site ID</td><td style='mso-number-format:"\@";' colspan=3>000000001508</td></tr>
<tr><td>Grid</td><td style='mso-number-format:"\@";' colspan=3>308164236836</td></tr>
<tr><td>Description</td><td colspan=3>M50 Between Jn06 N03/M50 and Jn07 N04/M50, Castleknock, Co. Dublin</td></tr>
</table>
</div> <div class="panel-body"> <div id="selector">
<ul id="dropdownmenu">
<table><tr><td>Setup</td><td colspan=3>M50 1508</td></tr></table>
<table><tr><td>Channel</td><td colspan=3>Each Direction
</td></tr></table>
<table><tr><td>Time Period</td><td colspan=3>1 hour
</td></tr></table>
<table><tr><td>Class</td><td colspan=3>Any</td></tr></table>
<table><tr><td>Exclude data:</td><td colspan=3>None</td></tr></table>
</ul>
</div> <!--selector-->
<table class="grid" id="gridTable" BORDER=1>
<tr><th style="text-align:left; font-size:110%;" colspan=35>All directions</th></tr>
<tr>
<th></th>
<th>Thu</th>
<th>Fri</th>
<th class="We">Sat</th>
<th class="We">Sun</th>
<th class="We">Mon</th>
<th>Tue</th>
<th>Wed</th>
<th>Thu</th>
<th>Fri</th>
<th class="We">Sat</th>
<th class="We">Sun</th>
<th>Mon</th>
<th>Tue</th>
<th>Wed</th>
<th>Thu</th>
<th>Fri</th>
<th class="We">Sat</th>
<th class="We">Sun</th>
<th>Mon</th>
<th>Tue</th>
<th>Wed</th>
<th>Thu</th>
<th>Fri</th>
<th class="We">Sat</th>
<th class="We">Sun</th>
<th>Mon</th>
<th>Tue</th>
<th>Wed</th>
<th>Thu</th>
<th>Fri</th>
<th class="We">Sat</th>
<th colspan=2>Average</th>
<th>Total</th>
</tr>
<tr class="BP">
<th></th>
<th>2019-08-01</th>
<th>2019-08-02</th>
<th class="We">2019-08-03</th>
<th class="We">2019-08-04</th>
<th class="We">2019-08-05</th>
<th>2019-08-06</th>
<th>2019-08-07</th>
<th>2019-08-08</th>
<th>2019-08-09</th>
<th class="We">2019-08-10</th>
<th class="We">2019-08-11</th>
<th>2019-08-12</th>
<th>2019-08-13</th>
<th>2019-08-14</th>
<th>2019-08-15</th>
<th>2019-08-16</th>
<th class="We">2019-08-17</th>
<th class="We">2019-08-18</th>
<th>2019-08-19</th>
<th>2019-08-20</th>
<th>2019-08-21</th>
<th>2019-08-22</th>
<th>2019-08-23</th>
<th class="We">2019-08-24</th>
<th class="We">2019-08-25</th>
<th>2019-08-26</th>
<th>2019-08-27</th>
<th>2019-08-28</th>
<th>2019-08-29</th>
<th>2019-08-30</th>
<th class="We">2019-08-31</th>
<th>Workday</th>
<th>7 Day</th>
<th>Count</th>
</tr>
<tr class="T">
<td class="RHR" style='mso-number-format:"\@";'>00:00:00</td>
<td class="S">1535</td>
<td class="S">1549</td>
<td class="WeS">1673</td>
<td class="WeS">1536</td>
<td class="WeS">1437</td>
<td class="S">1410</td>
<td>1326</td>
<td>1492</td>
<td>1612</td>
<td class="We">1904</td>
<td class="We">1705</td>
<td>1389</td>
<td>1410</td>
<td>1386</td>
<td>1600</td>
<td>1476</td>
<td class="We">1957</td>
<td class="We">1792</td>
<td>1346</td>
<td>1437</td>
<td>1465</td>
<td>1594</td>
<td>1625</td>
<td class="We">1871</td>
<td class="We">1737</td>
<td>1449</td>
<td>1349</td>
<td>1462</td>
<td>1358</td>
<td>1489</td>
<td class="We">1749</td>
<td>1463</td>
<td>1551</td>
<td>48120</td>
</tr>
<tr>
<td class="RHR" style='mso-number-format:"\@";'>01:00:00</td>
<td class="S">1096</td>
<td class="S">1076</td>
<td class="WeS">1262</td>
<td class="WeS">1194</td>
<td class="WeS">1042</td>
<td class="S">846</td>
<td>926</td>
<td>948</td>
<td>1102</td>
<td class="We">1282</td>
<td class="We">1190</td>
<td>891</td>
<td>984</td>
<td>869</td>
<td>1002</td>
<td>1095</td>
<td class="We">1080</td>
<td class="We">1344</td>
<td>948</td>
<td>952</td>
<td>904</td>
<td>983</td>
<td>1121</td>
<td class="We">1083</td>
<td class="We">1333</td>
<td>757</td>
<td>864</td>
<td>868</td>
<td>889</td>
<td>1005</td>
<td class="We">1103</td>
<td>962</td>
<td>1032</td>
<td>32039</td>
</tr>
<tr>
<td class="RHR" style='mso-number-format:"\@";'>02:00:00</td>
<td class="S">591</td>
<td class="S">759</td>
<td class="WeS">776</td>
<td class="WeS">851</td>
<td class="WeS">706</td>
<td class="S">567</td>
<td>652</td>
<td>674</td>
<td>758</td>
<td class="We">1007</td>
<td class="We">836</td>
<td>507</td>
<td>801</td>
<td>623</td>
<td>638</td>
<td>726</td>
<td class="We">844</td>
<td class="We">812</td>
<td>538</td>
<td>677</td>
<td>620</td>
<td>688</td>
<td>720</td>
<td class="We">733</td>
<td class="We">843</td>
<td>518</td>
<td>593</td>
<td>666</td>
<td>664</td>
<td>675</td>
<td class="We">809</td>
<td>653</td>
<td>705</td>
<td>21872</td>
</tr>
<tr class="T">
<td class="RHR" style='mso-number-format:"\@";'>03:00:00</td>
<td class="S">787</td>
<td class="S">979</td>
<td class="WeS">831</td>
<td class="WeS">844</td>
<td class="WeS">788</td>
<td class="S">781</td>
<td>831</td>
<td>817</td>
<td>974</td>
<td class="We">947</td>
<td class="We">886</td>
<td>868</td>
<td>763</td>
<td>802</td>
<td>762</td>
<td>904</td>
<td class="We">882</td>
<td class="We">863</td>
<td>828</td>
<td>772</td>
<td>745</td>
<td>798</td>
<td>905</td>
<td class="We">854</td>
<td class="We">796</td>
<td>837</td>
<td>740</td>
<td>740</td>
<td>727</td>
<td>881</td>
<td class="We">896</td>
<td>819</td>
<td>833</td>
<td>25828</td>
</tr>
<tr>
<td class="RHR" style='mso-number-format:"\@";'>04:00:00</td>
<td class="S">1356</td>
<td class="S">1449</td>
<td class="WeS">1318</td>
<td class="WeS">1133</td>
<td class="WeS">1200</td>
<td class="S">1432</td>
<td>1351</td>
<td>1321</td>
<td>1637</td>
<td class="We">1287</td>
<td class="We">1117</td>
<td>1448</td>
<td>1358</td>
<td>1342</td>
<td>1337</td>
<td>1389</td>
<td class="We">1348</td>
<td class="We">1064</td>
<td>1367</td>
<td>1408</td>
<td>1447</td>
<td>1346</td>
<td>1538</td>
<td class="We">1275</td>
<td class="We">1048</td>
<td>1317</td>
<td>1391</td>
<td>1196</td>
<td>1394</td>
<td>1498</td>
<td class="We">1375</td>
<td>1387</td>
<td>1339</td>
<td>41487</td>
</tr>
<tr>
<td class="RHR" style='mso-number-format:"\@";'>05:00:00</td>
<td class="S">2516</td>
<td class="S">2444</td>
<td class="WeS">1659</td>
<td class="WeS">1136</td>
<td class="WeS">1512</td>
<td class="S">2609</td>
<td>2518</td>
<td>2490</td>
<td>2466</td>
<td class="We">1608</td>
<td class="We">1203</td>
<td>2651</td>
<td>2554</td>
<td>2441</td>
<td>2441</td>
<td>2419</td>
<td class="We">1628</td>
<td class="We">1254</td>
<td>2676</td>
<td>2528</td>
<td>2496</td>
<td>2443</td>
<td>2452</td>
Вот как это выглядит при открытии в Excel
Поэтому у меня вопрос, как мне прочитать эту таблицу, которая отображается в Excel через python. Когда я попытался разобрать его с помощью функции чтения XLS в pandas, он выдал мне ошибки
edit: все содержимое файла XLS можно найти здесь на pastebin
https://pastebin.com/2rpyAZhE