How do I extract tags from multiple table rows with BeautifulSoup? - PullRequest
0 votes
/ May 12, 2019

I am trying to extract data from cvedetails.com for the product Windows 10, and the page source contains a table. For each vulnerability there is one tr with the vulnerability details and one tr with the vulnerability description. I want to be able to extract both tr elements, since they are correlated.

#!/usr/bin/python
import requests
r = requests.get('https://www.cvedetails.com/vulnerability-list.php? vendor_id=26&product_id=32238&version_id=&page=1&hasexp=0&opdos=0&opec=0&opov= 0&opcsrf=0&opgpriv=0&opsqli=0&opxss=0&opdirt=0&opmemc=0&ophttprs=0&opbyp=0&opfileinc=0&opginf=0&cvssscoremin=0&cvssscoremax=0&year=0&month=0&cweid=0&order=1&trc=845&sha=41e451b72c2e412c0a1cb8cb1dcfee3d16d51c44')

#print(r.text[0:500])
from  bs4 import BeautifulSoup
soup = BeautifulSoup(r.text,'html.parser')

#results = soup.find_all('tr',attrs={'class':'srrowns'})
#resultdesc = soup.find_all('td',attrs={'class':'cvesummarylong'})
#print(results[0:3])
#print(resultdesc[0:3])

results = soup.find_all(('tr',attrs={'class':'srrowns'}),('td',attrs= 
{'class':'cvesummarylong'}))
print(results[0:3])

The commented-out lines above are the ones that ran successfully, but they return the rows and the descriptions as separate results.

</tr>
                    <tr class="srrowns">
                <td class="num">
                                        <a name="y2019"> </a>
                                        1                   </td>
                                    <td nowrap><a href="/cve/CVE-2019-0879/"  title="CVE-2019-0879 security vulnerability details">CVE-2019-0879</a></td>
                <td><a href="//www.cvedetails.com/cwe-details/119/cwe.html" title="CWE-119 - CWE definition">119</a></td>
                <td class="num">
                    <b style="color:red">
                                            </b>
                </td>
                <td>
                    Exec Code Overflow                  </td>
                                    <td>2019-04-09</td>
                <td>2019-05-08</td>
                <td><div class="cvssbox" style="background-color:#ff9c20">7.2</div></td>
                <td align="center">None</td>
                <td align="center">Local</td>
                <td align="center">Low</td>
                <td align="center">Not required</td>
                <td align="center">Complete</td>
                <td align="center">Complete</td>
                <td align="center">Complete</td>
            </tr>
                            <tr>
                <td class="cvesummarylong" colspan="20">
                    A remote code execution vulnerability exists when the Windows Jet Database Engine improperly handles objects in memory, aka &#039;Jet Database Engine Remote Code Execution Vulnerability&#039;. This CVE ID is unique from CVE-2019-0846, CVE-2019-0847, CVE-2019-0851, CVE-2019-0877.                 </td>
            </tr>

                        <tr class="srrowns">
                <td class="num">
                                        <a name="y2019"> </a>
                                        2                   </td>
                                    <td nowrap><a href="/cve/CVE-2019-0877/"  title="CVE-2019-0877 security vulnerability details">CVE-2019-0877</a></td>
                <td><a href="//www.cvedetails.com/cwe-details/119/cwe.html" title="CWE-119 - CWE definition">119</a></td>
                <td class="num">
                    <b style="color:red">
                                            </b>
                </td>
                <td>
                    Exec Code Overflow                  </td>
                                    <td>2019-04-09</td>
                <td>2019-05-08</td>
                <td><div class="cvssbox" style="background-color:#ff9c20">7.2</div></td>
                <td align="center">None</td>
                <td align="center">Local</td>
                <td align="center">Low</td>
                <td align="center">Not required</td>
                <td align="center">Complete</td>
                <td align="center">Complete</td>
                <td align="center">Complete</td>
            </tr>
                            <tr>
                <td class="cvesummarylong" colspan="20">
                    A remote code execution vulnerability exists when the Windows Jet Database Engine improperly handles objects in memory, aka &#039;Jet Database Engine Remote Code Execution Vulnerability&#039;. This CVE ID is unique from CVE-2019-0846, CVE-2019-0847, CVE-2019-0851, CVE-2019-0879.                 </td>
            </tr>

I want each result extracted as a single row containing the CVE number, the severity and so on, together with the description. However, the only approaches I have tried extract the two separately.

End goal: I need the details from the table plus the description, so that I can write them out to a CSV file.

Answers [ 3 ]

0 votes
/ May 12, 2019

You can iterate over the whole table and pick out the values you need:

from bs4 import BeautifulSoup as soup
import requests, re

d = soup(requests.get('https://www.cvedetails.com/vulnerability-list.php?%20vendor_id=26&product_id=32238&version_id=&page=1&hasexp=0&opdos=0&opec=0&opov=%200&opcsrf=0&opgpriv=0&opsqli=0&opxss=0&opdirt=0&opmemc=0&ophttprs=0&opbyp=0&opfileinc=0&opginf=0&cvssscoremin=0&cvssscoremax=0&year=0&month=0&cweid=0&order=1&trc=845&sha=41e451b72c2e412c0a1cb8cb1dcfee3d16d51c44').text, 'html.parser')
_t = d.find('table', {'id':'vulnslisttable'})
# Column names come from the <th> cells (raw strings avoid invalid-escape warnings)
headers = [re.sub(r'^[\t\n]+|[\t\n]+$', '', i.text) for i in _t.find_all('th')]
# One list of cell texts per <tr>; the first <tr> is the header row and has no <td> cells, so it is skipped
_, *data = [[re.sub(r'^[\s\t\n]+|[\t\n]+$', '', i.text) for i in b.find_all('td')] for b in _t.find_all('tr')]
# Rows alternate: a detail row followed by its description row
result = [{**dict(zip(headers, data[i])), 'summary':data[i+1][0]} for i in range(0, len(data), 2)]

Output (truncated because of SO's character limit):

[{'#': '1', 'CVE ID': 'CVE-2019-0879', 'CWE ID': '119', '# of Exploits': '', 'Vulnerability Type(s)': 'Exec Code Overflow ', 'Publish Date': '2019-04-09', 'Update Date': '2019-05-08', 'Score': '7.2', 'Gained Access Level': 'None', 'Access': 'Local', 'Complexity': 'Low', 'Authentication': 'Not required', 'Conf.': 'Complete', 'Integ.': 'Complete', 'Avail.': 'Complete', 'summary': "A remote code execution vulnerability exists when the Windows Jet Database Engine improperly handles objects in memory, aka 'Jet Database Engine Remote Code Execution Vulnerability'. This CVE ID is unique from CVE-2019-0846, CVE-2019-0847, CVE-2019-0851, CVE-2019-0877."}, {'#': '2', 'CVE ID': 'CVE-2019-0877', 'CWE ID': '119', '# of Exploits': '', 'Vulnerability Type(s)': 'Exec Code Overflow ', 'Publish Date': '2019-04-09', 'Update Date': '2019-05-08', 'Score': '7.2', 'Gained Access Level': 'None', 'Access': 'Local', 'Complexity': 'Low', 'Authentication': 'Not required', 'Conf.': 'Complete', 'Integ.': 'Complete', 'Avail.': 'Complete', 'summary': "A remote code execution vulnerability exists when the Windows Jet Database Engine improperly handles objects in memory, aka 'Jet Database Engine Remote Code Execution Vulnerability'. This CVE ID is unique from CVE-2019-0846, CVE-2019-0847, CVE-2019-0851, CVE-2019-0879."}, {'#': '3', 'CVE ID': 'CVE-2019-0859', 'CWE ID': '264', '# of Exploits': '', 'Vulnerability Type(s)': '', 'Publish Date': '2019-04-09', 'Update Date': '2019-04-10', 'Score': '7.2', 'Gained Access Level': 'None', 'Access': 'Local', 'Complexity': 'Low', 'Authentication': 'Not required', 'Conf.': 'Complete', 'Integ.': 'Complete', 'Avail.': 'Complete', 'summary': "An elevation of privilege vulnerability exists in Windows when the Win32k component fails to properly handle objects in memory, aka 'Win32k Elevation of Privilege Vulnerability'. 
This CVE ID is unique from CVE-2019-0685, CVE-2019-0803."}, {'#': '4', 'CVE ID': 'CVE-2019-0856', 'CWE ID': '119', '# of Exploits': '', 'Vulnerability Type(s)': 'Exec Code Overflow ', 'Publish Date': '2019-04-09', 'Update Date': '2019-04-10', 'Score': '9.0', 'Gained Access Level': 'None', 'Access': 'Remote', 'Complexity': 'Low', 'Authentication': 'Single system', 'Conf.': 'Complete', 'Integ.': 'Complete', 'Avail.': 'Complete', 'summary': "A remote code execution vulnerability exists when Windows improperly handles objects in memory, aka 'Windows Remote Code Execution Vulnerability'."}, {'#': '5', 'CVE ID': 'CVE-2019-0853', 'CWE ID': '119', '# of Exploits': '', 'Vulnerability Type(s)': 'Exec Code Overflow ', 'Publish Date': '2019-04-09', 'Update Date': '2019-04-15', 'Score': '9.3', 'Gained Access Level': 'None', 'Access': 'Remote', 'Complexity': 'Medium', 'Authentication': 'Not required', 'Conf.': 'Complete', 'Integ.': 'Complete', 'Avail.': 'Complete', 'summary': "A remote code execution vulnerability exists in the way that the Windows Graphics Device Interface (GDI) handles objects in the memory, aka 'GDI+ Remote Code Execution Vulnerability'."}, {'#': '6', 'CVE ID': 'CVE-2019-0851', 'CWE ID': '119', '# of Exploits': '', 'Vulnerability Type(s)': 'Exec Code Overflow ', 'Publish Date': '2019-04-09', 'Update Date': '2019-04-10', 'Score': '9.3', 'Gained Access Level': 'None', 'Access': 'Remote', 'Complexity': 'Medium', 'Authentication': 'Not required', 'Conf.': 'Complete', 'Integ.': 'Complete', 'Avail.': 'Complete', 'summary': "A remote code execution vulnerability exists when the Windows Jet Database Engine improperly handles objects in memory, aka 'Jet Database Engine Remote Code Execution Vulnerability'. 
This CVE ID is unique from CVE-2019-0846, CVE-2019-0847, CVE-2019-0877, CVE-2019-0879."}, {'#': '7', 'CVE ID': 'CVE-2019-0849', 'CWE ID': '200', '# of Exploits': '', 'Vulnerability Type(s)': '+Info ', 'Publish Date': '2019-04-09', 'Update Date': '2019-04-10', 'Score': '4.3', 'Gained Access Level': 'None', 'Access': 'Remote', 'Complexity': 'Medium', 'Authentication': 'Not required', 'Conf.': 'Partial', 'Integ.': 'None', 'Avail.': 'None', 'summary': "An information disclosure vulnerability exists when the Windows GDI component improperly discloses the contents of its memory, aka 'Windows GDI Information Disclosure Vulnerability'. This CVE ID is unique from CVE-2019-0802."}, {'#': '8', 'CVE ID': 'CVE-2019-0848', 'CWE ID': '200', '# of Exploits': '', 'Vulnerability Type(s)': '+Info ', 'Publish Date': '2019-04-09', 'Update Date': '2019-04-10', 'Score': '2.1', 'Gained Access Level': 'None', 'Access': 'Local', 'Complexity': 'Low', 'Authentication': 'Not required', 'Conf.': 'Partial', 'Integ.': 'None', 'Avail.': 'None', 'summary': "An information disclosure vulnerability exists when the win32k component improperly provides kernel information, aka 'Win32k Information Disclosure Vulnerability'. This CVE ID is unique from CVE-2019-0814."}, {'#': '9', 'CVE ID': 'CVE-2019-0847', 'CWE ID': '119', '# of Exploits': '', 'Vulnerability Type(s)': 'Exec Code Overflow ', 'Publish Date': '2019-04-09', 'Update Date': '2019-04-10', 'Score': '9.3', 'Gained Access Level': 'None', 'Access': 'Remote', 'Complexity': 'Medium', 'Authentication': 'Not required', 'Conf.': 'Complete', 'Integ.': 'Complete', 'Avail.': 'Complete', 'summary': "A remote code execution vulnerability exists when the Windows Jet Database Engine improperly handles objects in memory, aka 'Jet Database Engine Remote Code Execution Vulnerability'. 
This CVE ID is unique from CVE-2019-0846, CVE-2019-0851, CVE-2019-0877, CVE-2019-0879."}, {'#': '10', 'CVE ID': 'CVE-2019-0846', 'CWE ID': '119', '# of Exploits': '', 'Vulnerability Type(s)': 'Exec Code Overflow ', 'Publish Date': '2019-04-09', 'Update Date': '2019-04-10', 'Score': '9.3', 'Gained Access Level': 'None', 'Access': 'Remote', 'Complexity': 'Medium', 'Authentication': 'Not required', 'Conf.': 'Complete', 'Integ.': 'Complete', 'Avail.': 'Complete', 'summary': "A remote code execution vulnerability exists when the Windows Jet Database Engine improperly handles objects in memory, aka 'Jet Database Engine Remote Code Execution Vulnerability'. This CVE ID is unique from CVE-2019-0847, CVE-2019-0851, CVE-2019-0877, CVE-2019-0879."}, {'#': '11', 'CVE ID': 'CVE-2019-0845', 'CWE ID': '20', '# of Exploits': '', 'Vulnerability Type(s)': 'Exec Code ', 'Publish Date': '2019-04-09', 'Update Date': '2019-05-08', 'Score': '9.3', 'Gained Access Level': 'None', 'Access': 'Remote', 'Complexity': 'Medium', 'Authentication': 'Not required', 'Conf.': 'Complete', 'Integ.': 'Complete', 'Avail.': 'Complete', 'summary': "A remote code execution vulnerability exists when the IOleCvt interface renders ASP webpage content, aka 'Windows IOleCvt Interface Remote Code Execution Vulnerability'."}, {'#': '12', 'CVE ID': 'CVE-2019-0844', 'CWE ID': '200', '# of Exploits': '', 'Vulnerability Type(s)': '+Info ', 'Publish Date': '2019-04-09', 'Update Date': '2019-04-10', 'Score': '2.1', 'Gained Access Level': 'None', 'Access': 'Local', 'Complexity': 'Low', 'Authentication': 'Not required', 'Conf.': 'Partial', 'Integ.': 'None', 'Avail.': 'None', 'summary': "An information disclosure vulnerability exists when the Windows kernel improperly handles objects in memory, aka 'Windows Kernel Information Disclosure Vulnerability'. 
This CVE ID is unique from CVE-2019-0840."}, {'#': '13', 'CVE ID': 'CVE-2019-0842', 'CWE ID': '119', '# of Exploits': '', 'Vulnerability Type(s)': 'Exec Code Overflow ', 'Publish Date': '2019-04-09', 'Update Date': '2019-04-10', 'Score': '9.3', 'Gained Access Level': 'None', 'Access': 'Remote', 'Complexity': 'Medium', 'Authentication': 'Not required', 'Conf.': 'Complete', 'Integ.': 'Complete', 'Avail.': 'Complete', 'summary': "A remote code execution vulnerability exists in the way that the VBScript engine handles objects in memory, aka 'Windows VBScript Engine Remote Code Execution Vulnerability'."}, {'#': '14', 'CVE ID': 'CVE-2019-0841', 'CWE ID': '264', '# of Exploits': '', 'Vulnerability Type(s)': '', 'Publish Date': '2019-04-09', 'Update Date': '2019-04-15', 'Score': '7.2', 'Gained Access Level': 'None', 'Access': 'Local', 'Complexity': 'Low', 'Authentication': 'Not required', 'Conf.': 'Complete', 'Integ.': 'Complete', 'Avail.': 'Complete', 'summary': "An elevation of privilege vulnerability exists when Windows AppX Deployment Service (AppXSVC) improperly handles hard links, aka 'Windows Elevation of Privilege Vulnerability'. This CVE ID is unique from CVE-2019-0730, CVE-2019-0731, CVE-2019-0796, CVE-2019-0805, CVE-2019-0836."}, {'#': '15', 'CVE ID': 'CVE-2019-0840', 'CWE ID': '200', '# of Exploits': '', 'Vulnerability Type(s)': '+Info ', 'Publish Date': '2019-04-09', 'Update Date': '2019-04-10', 'Score': '2.1', 'Gained Access Level': 'None', 'Access': 'Local', 'Complexity': 'Low', 'Authentication': 'Not required', 'Conf.': 'Partial', 'Integ.': 'None', 'Avail.': 'None', 'summary': "An information disclosure vulnerability exists when the Windows kernel improperly handles objects in memory, aka 'Windows Kernel Information Disclosure Vulnerability'. 
This CVE ID is unique from CVE-2019-0844."}, {'#': '16', 'CVE ID': 'CVE-2019-0839', 'CWE ID': '200', '# of Exploits': '', 'Vulnerability Type(s)': '+Info ', 'Publish Date': '2019-04-09', 'Update Date': '2019-05-08', 'Score': '2.1', 'Gained Access Level': 'None', 'Access': 'Local', 'Complexity': 'Low', 'Authentication': 'Not required', 'Conf.': 'Partial', 'Integ.': 'None', 'Avail.': 'None', 'summary': "An information disclosure vulnerability exists when the Terminal Services component improperly discloses the contents of its memory, aka 'Windows Information Disclosure Vulnerability'. This CVE ID is unique from CVE-2019-0838."}, {'#': '17', 'CVE ID': 'CVE-2019-0838', 'CWE ID': '200', '# of Exploits': '', 'Vulnerability Type(s)': '+Info ', 'Publish Date': '2019-04-09', 'Update Date': '2019-05-08', 'Score': '2.1', 'Gained Access Level': 'None', 'Access': 'Local', 'Complexity': 'Low', 'Authentication': 'Not required', 'Conf.': 'Partial', 'Integ.': 'None', 'Avail.': 'None', 'summary': "An information disclosure vulnerability exists when Windows Task Scheduler improperly discloses credentials to Windows Credential Manager, aka 'Windows Information Disclosure Vulnerability'. 
This CVE ID is unique from CVE-2019-0839."}, {'#': '18', 'CVE ID': 'CVE-2019-0837', 'CWE ID': '200', '# of Exploits': '', 'Vulnerability Type(s)': '+Info ', 'Publish Date': '2019-04-09', 'Update Date': '2019-04-10', 'Score': '2.1', 'Gained Access Level': 'None', 'Access': 'Local', 'Complexity': 'Low', 'Authentication': 'Not required', 'Conf.': 'Partial', 'Integ.': 'None', 'Avail.': 'None', 'summary': "An information disclosure vulnerability exists when DirectX improperly handles objects in memory, aka 'DirectX Information Disclosure Vulnerability'."}, {'#': '19', 'CVE ID': 'CVE-2019-0836', 'CWE ID': '264', '# of Exploits': '', 'Vulnerability Type(s)': '', 'Publish Date': '2019-04-09', 'Update Date': '2019-05-08', 'Score': '4.6', 'Gained Access Level': 'None', 'Access': 'Local', 'Complexity': 'Low', 'Authentication': 'Not required', 'Conf.': 'Partial', 'Integ.': 'Partial', 'Avail.': 'Partial', 'summary': "An elevation of privilege vulnerability exists when Windows improperly handles calls to the LUAFV driver (luafv.sys), aka 'Windows Elevation of Privilege Vulnerability'. This CVE ID is unique from CVE-2019-0730, CVE-2019-0731, CVE-2019-0796, CVE-2019-0805, CVE-2019-0841."}, {'#': '20', 'CVE ID': 'CVE-2019-0821', 'CWE ID': '200', '# of Exploits': '', 'Vulnerability Type(s)': '+Info ', 'Publish Date': '2019-04-08', 'Update Date': '2019-04-09', 'Score': '4.0', 'Gained Access Level': 'None', 'Access': 'Remote', 'Complexity': 'Low', 'Authentication': 'Single system', 'Conf.': 'Partial', 'Integ.': 'None', 'Avail.': 'None', 'summary': "An information disclosure vulnerability exists in the way that the Windows SMB Server handles certain requests, aka 'Windows SMB Information Disclosure Vulnerability'. 
This CVE ID is unique from CVE-2019-0703, CVE-2019-0704."}, {'#': '21', 'CVE ID': 'CVE-2019-0814', 'CWE ID': '200', '# of Exploits': '', 'Vulnerability Type(s)': '+Info ', 'Publish Date': '2019-04-09', 'Update Date': '2019-04-11', 'Score': '2.1', 'Gained Access Level': 'None', 'Access': 'Local', 'Complexity': 'Low', 'Authentication': 'Not required', 'Conf.': 'Partial', 'Integ.': 'None', 'Avail.': 'None', 'summary': "An information disclosure vulnerability exists when the win32k component improperly provides kernel information, aka 'Win32k Information Disclosure Vulnerability'. This CVE ID is unique from CVE-2019-0848."}, {'#': '22', 'CVE ID': 'CVE-2019-0805', 'CWE ID': '264', '# of Exploits': '', 'Vulnerability Type(s)': '', 'Publish Date': '2019-04-09', 'Update Date': '2019-05-08', 'Score': '4.6', 'Gained Access Level': 'None', 'Access': 'Local', 'Complexity': 'Low', 'Authentication': 'Not required', 'Conf.': 'Partial', 'Integ.': 'Partial', 'Avail.': 'Partial', 'summary': "An elevation of privilege vulnerability exists when Windows improperly handles calls to the LUAFV driver (luafv.sys), aka 'Windows Elevation of Privilege Vulnerability'. This CVE ID is unique from CVE-2019-0730, CVE-2019-0731, CVE-2019-0796, CVE-2019-0836, CVE-2019-0841."}, {'#': '23', 'CVE ID': 'CVE-2019-0803', 'CWE ID': '264', '# of Exploits': '', 'Vulnerability Type(s)': '', 'Publish Date': '2019-04-09', 'Update Date': '2019-04-10', 'Score': '7.2', 'Gained Access Level': 'None', 'Access': 'Local', 'Complexity': 'Low', 'Authentication': 'Not required', 'Conf.': 'Complete', 'Integ.': 'Complete', 'Avail.': 'Complete', 'summary': "An elevation of privilege vulnerability exists in Windows when the Win32k component fails to properly handle objects in memory, aka 'Win32k Elevation of Privilege Vulnerability'. 
This CVE ID is unique from CVE-2019-0685, CVE-2019-0859."}, {'#': '24', 'CVE ID': 'CVE-2019-0802', 'CWE ID': '200', '# of Exploits': '', 'Vulnerability Type(s)': '+Info ', 'Publish Date': '2019-04-09', 'Update Date': '2019-04-10', 'Score': '4.3', 'Gained Access Level': 'None', 'Access': 'Remote', 'Complexity': 'Medium', 'Authentication': 'Not required', 'Conf.': 'Partial', 'Integ.': 'None', 'Avail.': 'None', 'summary': "An information disclosure vulnerability exists when the Windows GDI component improperly discloses the contents of its memory, aka 'Windows GDI Information Disclosure Vulnerability'. This CVE ID is unique from CVE-2019-0849."}, {'#': '25', 'CVE ID': 'CVE-2019-0797', 'CWE ID': '264', '# of Exploits': '', 'Vulnerability Type(s)': '', 'Publish Date': '2019-04-08', 'Update Date': '2019-05-08', 'Score': '7.2', 'Gained Access Level': 'None', 'Access': 'Local', 'Complexity': 'Low', 'Authentication': 'Not required', 'Conf.': 'Complete', 'Integ.': 'Complete', 'Avail.': 'Complete', 'summary': "An elevation of privilege vulnerability exists in Windows when the Win32k component fails to properly handle objects in memory, aka 'Win32k Elevation of Privilege Vulnerability'. This CVE ID is unique from CVE-2019-0808."}, {'#': '26', 'CVE ID': 'CVE-2019-0796', 'CWE ID': '264', '# of Exploits': '', 'Vulnerability Type(s)': '', 'Publish Date': '2019-04-09', 'Update Date': '2019-05-08', 'Score': '2.1', 'Gained Access Level': 'None', 'Access': 'Local', 'Complexity': 'Low', 'Authentication': 'Not required', 'Conf.': 'None', 'Integ.': 'Partial', 'Avail.': 'None', 'summary': "An elevation of privilege vulnerability exists when Windows improperly handles calls to the LUAFV driver (luafv.sys), aka 'Windows Elevation of Privilege Vulnerability'. 
This CVE ID is unique from CVE-2019-0730, CVE-2019-0731, CVE-2019-0805, CVE-2019-0836, CVE-2019-0841."}, {'#': '27', 'CVE ID': 'CVE-2019-0795', 'CWE ID': '611', '# of Exploits': '', 'Vulnerability Type(s)': 'Exec Code ', 'Publish Date': '2019-04-09', 'Update Date': '2019-04-11', 'Score': '9.3', 'Gained Access Level': 'None', 'Access': 'Remote', 'Complexity': 'Medium', 'Authentication': 'Not required', 'Conf.': 'Complete', 'Integ.': 'Complete', 'Avail.': 'Complete', 'summary': "A remote code execution vulnerability exists when the Microsoft XML Core Services MSXML parser processes user input, aka 'MS XML Remote Code Execution Vulnerability'. This CVE ID is unique from CVE-2019-0790, CVE-2019-0791, CVE-2019-0792, CVE-2019-0793."}, {'#': '28', 'CVE ID': 'CVE-2019-0794', 'CWE ID': '119', '# of Exploits': '', 'Vulnerability Type(s)': 'Exec Code Overflow ', 'Publish Date': '2019-04-09', 'Update Date': '2019-04-11', 'Score': '9.3', 'Gained Access Level': 'None', 'Access': 'Remote', 'Complexity': 'Medium', 'Authentication': 'Not required', 'Conf.': 'Complete', 'Integ.': 'Complete', 'Avail.': 'Complete', 'summary': "A remote code execution vulnerability exists when OLE automation improperly handles objects in memory, aka 'OLE Automation Remote Code Execution Vulnerability'."}, {'#': '29', 'CVE ID': 'CVE-2019-0793', 'CWE ID': '611', '# of Exploits': '', 'Vulnerability Type(s)': 'Exec Code ', 'Publish Date': '2019-04-09', 'Update Date': '2019-04-10', 'Score': '9.3', 'Gained Access Level': 'None', 'Access': 'Remote', 'Complexity': 'Medium', 'Authentication': 'Not required', 'Conf.': 'Complete', 'Integ.': 'Complete', 'Avail.': 'Complete', 'summary': "A remote code execution vulnerability exists when the Microsoft XML Core Services MSXML parser processes user input, aka 'MS XML Remote Code Execution Vulnerability'. 
This CVE ID is unique from CVE-2019-0790, CVE-2019-0791, CVE-2019-0792, CVE-2019-0795."}, {'#': '30', 'CVE ID': 'CVE-2019-0792', 'CWE ID': '611', '# of Exploits': '', 'Vulnerability Type(s)': 'Exec Code ', 'Publish Date': '2019-04-09', 'Update Date': '2019-04-10', 'Score': '9.3', 'Gained Access Level': 'None', 'Access': 'Remote', 'Complexity': 'Medium', 'Authentication': 'Not required', 'Conf.': 'Complete', 'Integ.': 'Complete', 'Avail.': 'Complete', 'summary': "A remote code execution vulnerability exists when the Microsoft XML Core Services MSXML parser processes user input, aka 'MS XML Remote Code Execution Vulnerability'. This CVE ID is unique from CVE-2019-0790, CVE-2019-0791, CVE-2019-0793, CVE-2019-0795."}, {'#': '31', 'CVE ID': 'CVE-2019-0791', 'CWE ID': '611', '# of Exploits': '', 'Vulnerability Type(s)': 'Exec Code ', 'Publish Date': '2019-04-09', 'Update Date': '2019-04-10', 'Score': '9.3', 'Gained Access Level': 'None', 'Access': 'Remote', 'Complexity': 'Medium', 'Authentication': 'Not required', 'Conf.': 'Complete', 'Integ.': 'Complete', 'Avail.': 'Complete', 'summary': "A remote code execution vulnerability exists when the Microsoft XML Core Services MSXML parser processes user input, aka 'MS XML Remote Code Execution Vulnerability'. This CVE ID is unique from CVE-2019-0790, CVE-2019-0792, CVE-2019-0793, CVE-2019-0795."}, {'#': '32', 'CVE ID': 'CVE-2019-0790', 'CWE ID': '611', '# of Exploits': '', 'Vulnerability Type(s)': 'Exec Code ', 'Publish Date': '2019-04-09', 'Update Date': '2019-04-10', 'Score': '9.3', 'Gained Access Level': 'None', 'Access': 'Remote', 'Complexity': 'Medium', 'Authentication': 'Not required', 'Conf.': 'Complete', 'Integ.': 'Complete', 'Avail.': 'Complete', 'summary': "A remote code execution vulnerability exists when the Microsoft XML Core Services MSXML parser processes user input, aka 'MS XML Remote Code Execution Vulnerability'. 
This CVE ID is unique from CVE-2019-0791, CVE-2019-0792, CVE-2019-0793, CVE-2019-0795."}, {'#': '33', 'CVE ID': 'CVE-2019-0786', 'CWE ID': '264', '# of Exploits': '', 'Vulnerability Type(s)': '', 'Publish Date': '2019-04-09', 'Update Date': '2019-04-11', 'Score': '7.5', 'Gained Access Level': 'None', 'Access': 'Remote', 'Complexity': 'Low', 'Authentication': 'Not required', 'Conf.': 'Partial', 'Integ.': 'Partial', 'Avail.': 'Partial', 'summary': "An elevation of privilege vulnerability exists in the Microsoft Server Message Block (SMB) Server when an attacker with valid credentials attempts to open a specially crafted file over the SMB protocol on the same machine, aka 'SMB Server Elevation of Privilege Vulnerability'."}, {'#': '34', 'CVE ID': 'CVE-2019-0784', 'CWE ID': '119', '# of Exploits': '', 'Vulnerability Type(s)': 'Exec Code Overflow ', 'Publish Date': '2019-04-08', 'Update Date': '2019-04-10', 'Score': '7.6', 'Gained Access Level': 'None', 'Access': 'Remote', 'Complexity': 'High', 'Authentication': 'Not required', 'Conf.': 'Complete', 'Integ.': 'Complete', 'Avail.': 'Complete', 'summary': "A remote code execution vulnerability exists in the way that the ActiveX Data objects (ADO) handles objects in memory, aka 'Windows ActiveX Remote Code Execution Vulnerability'."}, {'#': '35', 'CVE ID': 'CVE-2019-0782', 'CWE ID': '200', '# of Exploits': '', 'Vulnerability Type(s)': '+Info ', 'Publish Date': '2019-04-08', 'Update Date': '2019-04-09', 'Score': '2.1', 'Gained Access Level': 'None', 'Access': 'Local', 'Complexity': 'Low', 'Authentication': 'Not required', 'Conf.': 'Partial', 'Integ.': 'None', 'Avail.': 'None', 'summary': "An information disclosure vulnerability exists when the Windows kernel fails to properly initialize a memory address, aka 'Windows Kernel Information Disclosure Vulnerability'. 
This CVE ID is unique from CVE-2019-0702, CVE-2019-0755, CVE-2019-0767, CVE-2019-0775."}, {'#': '36', 'CVE ID': 'CVE-2019-0776', 'CWE ID': '200', '# of Exploits': '', 'Vulnerability Type(s)': '+Info ', 'Publish Date': '2019-04-08', 'Update Date': '2019-04-09', 'Score': '2.1', 'Gained Access Level': 'None', 'Access': 'Local', 'Complexity': 'Low', 'Authentication': 'Not required', 'Conf.': 'Partial', 'Integ.': 'None', 'Avail.': 'None', 'summary': "An information disclosure vulnerability exists when the win32k component improperly provides kernel information, aka 'Win32k Information Disclosure Vulnerability'."}, {'#': '37', 'CVE ID': 'CVE-2019-0775', 'CWE ID': '200', '# of Exploits': '', 'Vulnerability Type(s)': '+Info ', 'Publish Date': '2019-04-08', 'Update Date': '2019-04-09', 'Score': '1.9', 'Gained Access Level': 'None', 'Access': 'Local', 'Complexity': 'Medium', 'Authentication': 'Not required', 'Conf.': 'Partial', 'Integ.': 'None', 'Avail.': 'None', 'summary': "An information disclosure vulnerability exists when the Windows kernel improperly handles objects in memory, aka 'Windows Kernel Information Disclosure Vulnerability'. This CVE ID is unique from CVE-2019-0702, CVE-2019-0755, CVE-2019-0767, CVE-2019-0782."}, {'#': '38', 'CVE ID': 'CVE-2019-0774', 'CWE ID': '200', '# of Exploits': '', 'Vulnerability Type(s)': '+Info ', 'Publish Date': '2019-04-08', 'Update Date': '2019-04-09', 'Score': '4.3', 'Gained Access Level': 'None', 'Access': 'Remote', 'Complexity': 'Medium', 'Authentication': 'Not required', 'Conf.': 'Partial', 'Integ.': 'None', 'Avail.': 'None', 'summary': "An information disclosure vulnerability exists when the Windows GDI component improperly discloses the contents of its memory, aka 'Windows GDI Information Disclosure Vulnerability'. 
This CVE ID is unique from CVE-2019-0614."}, {'#': '39', 'CVE ID': 'CVE-2019-0772', 'CWE ID': '119', '# of Exploits': '', 'Vulnerability Type(s)': 'Exec Code Overflow ', 'Publish Date': '2019-04-08', 'Update Date': '2019-04-09', 'Score': '9.3', 'Gained Access Level': 'None', 'Access': 'Remote', 'Complexity': 'Medium', 'Authentication': 'Not required', 'Conf.': 'Complete', 'Integ.': 'Complete', 'Avail.': 'Complete', 'summary': "A remote code execution vulnerability exists in the way that the VBScript engine handles objects in memory, aka 'Windows VBScript Engine Remote Code Execution Vulnerability'. This CVE ID is unique from CVE-2019-0665, CVE-2019-0666, CVE-2019-0667."}, {'#': '40', 'CVE ID': 'CVE-2019-0767', 'CWE ID': '200', '# of Exploits': '', 'Vulnerability Type(s)': '+Info ', 'Publish Date': '2019-04-08', 'Update Date': '2019-04-10', 'Score': '2.1', 'Gained Access Level': 'None', 'Access': 'Local', 'Complexity': 'Low', 'Authentication': 'Not required', 'Conf.': 'Partial', 'Integ.': 'None', 'Avail.': 'None', 'summary': "An information disclosure vulnerability exists when the Windows kernel improperly initializes objects in memory.To exploit this vulnerability, an authenticated attacker could run a specially crafted application, aka 'Windows Kernel Information Disclosure Vulnerability'. This CVE ID is unique from CVE-2019-0702, CVE-2019-0755, CVE-2019-0775, CVE-2019-0782."}, {'#': '41', 'CVE ID': 'CVE-2019-0766', 'CWE ID': '264', '# of Exploits': '', 'Vulnerability Type(s)': '', 'Publish Date': '2019-04-08', 'Update Date': '2019-04-09', 'Score': '7.2', 'Gained Access Level': 'None', 'Access': 'Local', 'Complexity': 'Low', 'Authentication': 'Not required', 'Conf.': 'Complete', 'Integ.': 'Complete', 'Avail.': 'Complete', 'summary': "An elevation of privilege vulnerability exists in Windows AppX Deployment Server that allows file creation in arbitrary locations. 
To exploit the vulnerability, an attacker would first have to log on to the system, aka 'Microsoft Windows Elevation of Privilege Vulnerability'."}, {'#': '42', 'CVE ID': 'CVE-2019-0765', 'CWE ID': '119', '# of Exploits': '', 'Vulnerability Type(s)': 'Exec Code Overflow ', 'Publish Date': '2019-04-08', 'Update Date': '2019-04-10', 'Score': '9.3', 'Gained Access Level': 'None', 'Access': 'Remote', 'Complexity': 'Medium', 'Authentication': 'Not required', 'Conf.': 'Complete', 'Integ.': 'Complete', 'Avail.': 'Complete', 'summary': "A remote code execution vulnerability exists in the way that comctl32.dll handles objects in memory, aka 'Comctl32 Remote Code Execution Vulnerability'."}, ...]
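
Since the end goal in the question is a CSV file, a list of dictionaries like `result` can be written out with the standard `csv` module. The following is only a minimal sketch: the helper name `write_rows_to_csv` and the two sample rows are made up for illustration (the rows reuse a few values from the output above, with a placeholder in place of the full summary text).

```python
import csv
import io


def write_rows_to_csv(rows, fileobj):
    """Write a list of dicts (shaped like `result`) as CSV with a header row."""
    if not rows:
        return
    # Assumption: every dict has the same keys as the first one
    fieldnames = list(rows[0].keys())
    writer = csv.DictWriter(fileobj, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Hand-made rows, trimmed to three keys; the summaries are placeholders
sample = [
    {'CVE ID': 'CVE-2019-0879', 'Score': '7.2', 'summary': 'Placeholder summary, with a comma'},
    {'CVE ID': 'CVE-2019-0877', 'Score': '7.2', 'summary': 'Another placeholder summary'},
]

buffer = io.StringIO()
write_rows_to_csv(sample, buffer)
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, the same helper could be given a file opened with `open('cves.csv', 'w', newline='', encoding='utf-8')`, where `cves.csv` is just an example file name. `csv.DictWriter` quotes fields that contain commas, which matters here because the summaries usually do.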

0 votes
/ May 13, 2019

It is not clear exactly which information you are looking for, but this page contains several tables, and you can extract them without regular expressions. For example:

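import requests
from io import StringIO
from bs4 import BeautifulSoup as bs
import pandas as pd

r = requests.get('your url')
soup = bs(r.content, 'lxml')

tables = soup.find_all('table')
# Wrap the markup in StringIO: recent pandas versions deprecate passing a literal HTML string
my_table = pd.read_html(StringIO(str(tables[4])))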

To get the first row of that particular table:

print(my_table[0].iloc[0,:].dropna(axis=0,how='all'))

Output:

#                                         1
CVE ID                        CVE-2019-0879
CWE ID                                  119
Vulnerability Type(s)    Exec Code Overflow
Publish Date                     2019-04-09
Update Date                      2019-05-08
Score                                   7.2
Gained Access Level                    None
Access                                Local
Complexity                              Low
Authentication                 Not required
Conf.                              Complete
Integ.                             Complete
Avail.                             Complete

You can experiment with the table indices and see what else you can find ...
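
Rather than probing indices, the table can also be looked up by its `id`; the other answers on this page use `vulnslisttable` for that. A small sketch, assuming a hand-written HTML fragment instead of the live page and a hypothetical helper name `find_table_by_id`:

```python
from bs4 import BeautifulSoup

# Hand-written page with two tables; only the second one carries the id
SAMPLE_HTML = """
<table><tr><td>navigation</td></tr></table>
<table id="vulnslisttable">
  <tr><th>CVE ID</th><th>Score</th></tr>
  <tr><td>CVE-2019-0879</td><td>7.2</td></tr>
</table>
"""


def find_table_by_id(html, table_id):
    """Return the <table> tag with the given id, or None if there is no such table."""
    soup = BeautifulSoup(html, 'html.parser')
    return soup.find('table', id=table_id)


table = find_table_by_id(SAMPLE_HTML, 'vulnslisttable')
header_cells = [th.get_text(strip=True) for th in table.find_all('th')]
print(header_cells)
```

The tag found this way can be handed to `pd.read_html` in place of `tables[4]`, which should keep working if the site adds or removes other tables.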

0 votes
/ May 12, 2019

Something like the following? With bs4 4.7.1 you can use :nth-child(odd) and :nth-child(even) to separate the detail rows from the description rows and append each description to its corresponding row.

import requests
from bs4 import BeautifulSoup as bs
import re
import pandas as pd

def clean(text):
    # Replace each run of tabs or newlines with a single space.
    # (The earlier pattern r'\t+|(\n+)?' also matched the empty string,
    # so it inserted a space between every pair of characters.)
    return re.sub(r'\t+|\n+', ' ', text.strip())

r = requests.get('https://www.cvedetails.com/vulnerability-list.php?%20vendor_id=26&product_id=32238&version_id=&page=1&hasexp=0&opdos=0&opec=0&opov=%200&opcsrf=0&opgpriv=0&opsqli=0&opxss=0&opdirt=0&opmemc=0&ophttprs=0&opbyp=0&opfileinc=0&opginf=0&cvssscoremin=0&cvssscoremax=0&year=0&month=0&cweid=0&order=1&trc=845&sha=41e451b72c2e412c0a1cb8cb1dcfee3d16d51c44')
soup = bs(r.content, 'lxml')
# Row 1 is the header; after it, even rows hold the details and odd rows the descriptions
descs = [clean(item.text) for item in soup.select('#vulnslisttable tr:nth-child(odd)')[1:]]
items = soup.select('#vulnslisttable tr:nth-child(even)')
results = []

for item, desc in zip(items, descs):
    row = [clean(td.text) for td in item.select('td')]
    row.append(desc)
    results.append(row)

df = pd.DataFrame(results)
headers = [clean(item.text) for item in soup.select('#vulnslisttable th')]
headers.append('description')
df.columns = headers
print(df)
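
If relying on odd and even row positions feels fragile, an alternative is to start from each `tr.srrowns` row and take the `tr` that directly follows it. This is only a sketch of that idea, not part of the code above: it runs on a hand-written fragment that mimics the markup in the question (with placeholder descriptions), and `pair_rows` is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Hand-written fragment modelled on the question's markup (not fetched from the site)
SAMPLE_HTML = """
<table id="vulnslisttable">
  <tr><th>#</th><th>CVE ID</th><th>Score</th></tr>
  <tr class="srrowns"><td class="num">1</td><td>CVE-2019-0879</td><td>7.2</td></tr>
  <tr><td class="cvesummarylong" colspan="20">First description.</td></tr>
  <tr class="srrowns"><td class="num">2</td><td>CVE-2019-0877</td><td>7.2</td></tr>
  <tr><td class="cvesummarylong" colspan="20">Second description.</td></tr>
</table>
"""


def pair_rows(html):
    """Return one list per vulnerability: the detail cells followed by the description."""
    soup = BeautifulSoup(html, 'html.parser')
    paired = []
    for detail_row in soup.find_all('tr', class_='srrowns'):
        cells = [td.get_text(strip=True) for td in detail_row.find_all('td')]
        # The description sits in the <tr> that directly follows the detail row
        next_row = detail_row.find_next_sibling('tr')
        summary = next_row.find('td', class_='cvesummarylong') if next_row else None
        # Fall back to an empty string when a detail row has no description row
        cells.append(summary.get_text(strip=True) if summary else '')
        paired.append(cells)
    return paired


for row in pair_rows(SAMPLE_HTML):
    print(row)
```

Each returned list has the same shape as a row in `results`, so it could be fed to `pd.DataFrame` in the same way.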

