I'm trying to create a JSON file for the index of this page, which is an unordered list. The JSON needs to preserve the hierarchy. This is the code I have so far:
    def parse_list(self, tag):
        lis = tag.find_all('li', Recursive = False)
        return list(map(self.parse_list_items,lis))

    def parse_list_items(self, tag):
        if tag.a['href'] in cache:
            return
        else:
            aS = tag.find_all('a', Recursive = False)
            text = ''
            for a in aS:
                if a.parent == tag:
                    text += a.text.strip()
            cache[tag.a['href']] = text
            inner = tag.find('ul', Recursive = False)
            if inner is not None:
                return {text: self.parse_list(inner)}
            else:
                return text
However, when I run it, I get the following output:
[{'Account Network Topologies': ['fishersci.com Dev dtd-fs-dev-tfs (763838357644)']}, None, 'fishersci.com Prod dtd-fs-prod-tfs (821055950882)' . . .
It should actually start like this:
[{'Account Network Topologies': ['fishersci.com Dev dtd-fs-dev-tfs (763838357644)', None, 'fishersci.com Prod dtd-fs-prod-tfs (821055950882)']}, . . .
Sample HTML here:
<!DOCTYPE html>
<html>
<head>
<title>DEDO (Digital Engineering DevOps)</title>
<link rel="stylesheet" href="styles/site.css" type="text/css" />
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body class="theme-default aui-theme-default">
<div id="page">
<div id="main" class="aui-page-panel">
<div id="main-header">
<h1 id="title-heading" class="pagetitle">
<span id="title-text">Space Details:</span>
</h1>
</div>
<div id="content">
<div id="main-content" class="pageSection">
<table class="confluenceTable">
<tr>
<th class="confluenceTh">Key</th>
<td class="confluenceTd">DEDO</td>
</tr>
<tr>
<th class="confluenceTh">Name</th>
<td class="confluenceTd">Digital Engineering DevOps</td>
</tr>
<tr>
<th class="confluenceTh">Description</th>
<td class="confluenceTd"></td>
</tr>
<tr>
<th class="confluenceTh">Created by</th>
<td class="confluenceTd">dave.prigg@thermofisher.com (May 02, 2018)</td>
</tr>
</table>
</div>
<br/>
<br/>
<div class="pageSection">
<div class="pageSectionHeader">
<h2 class="pageSectionTitle">Available Pages:</h2>
</div>
<ul>
<li>
<a href="Digital-Engineering-DevOps_127352316.html">Digital Engineering DevOps</a>
<img src="images/icons/contenttypes/home_page_16.png" height="16" width="16" border="0" align="absmiddle"/> <ul>
<li>
<a href="Account-Network-Topologies_138968150.html">Account Network Topologies</a>
<ul>
<li>
<a href="138968183.html">fishersci.com Dev dtd-fs-dev-tfs (763838357644)</a>
</li>
</ul>
<ul>
<li>
<a href="138968198.html">fishersci.com Prod dtd-fs-prod-tfs (821055950882)</a>
</li>
</ul>
<ul>
<li>
<a href="138968190.html">fishersci.com QA dtd-fs-qa-tfs (311631232506)</a>
</li>
</ul>
<ul>
<li>
<a href="142119108.html">TFC All Accounts (Current)</a>
</li>
</ul>
<ul>
<li>
<a href="142118420.html">TFC All Accounts (FUTURE)</a>
</li>
</ul>
<ul>
<li>
<a href="138968157.html">thermofisher.com Dev dtd-dev (066574023230)</a>
</li>
</ul>
<ul>
<li>
<a href="138968171.html">thermofisher.com Production dtd-prod (956741099536)</a>
</li>
</ul>
<ul>
<li>
<a href="138968167.html">thermofisher.com QA dtd-qa (926796168120)</a>
</li>
</ul>
</li>
</ul>
<ul>
<li>
<a href="138966923.html">Compute Platform (DECP)</a>
<ul>
<li>
<a href="Access-to-DECP-dashboard-in-TFC-via-tunneling_143305312.html">Access to DECP dashboard in TFC via tunneling</a>
</li>
</ul>
<ul>
<li>
<a href="138968264.html">App (microservice) Deployment</a>
</li>
</ul>
<ul>
<li>
<a href="138968309.html">App (microservice) Deployment Troubleshooting</a>
</li>
</ul>
<ul>
<li>
<a href="138968257.html">App (microservice) Monitoring and Debugging</a>
</li>
</ul>
<ul>
<li>
<a href="143305322.html">App (microservice) Push Docker Image(s)</a>
</li>
</ul>
<ul>
<li>
<a href="Contact-Information_138968281.html">Contact Information</a>
</li>
</ul>
<ul>
<li>
<a href="Deployment-Descriptor-Spec_138968267.html">Deployment Descriptor Spec</a>
</li>
</ul>
<ul>
<li>
<a href="Glossary_138968285.html">Glossary</a>
</li>
</ul>
<ul>
<li>
<a href="Platforms_138968291.html">Platforms</a>
</li>
</ul>
<ul>
<li>
<a href="Pre-requisites_138968294.html">Pre-requisites</a>
</li>
</ul>
<ul>
<li>
<a href="Proxy-Configuration_138968297.html">Proxy Configuration</a>
<ul>
<li>
<a href="Java-JAX-RS-Proxy-Configuration_143300654.html">Java JAX-RS Proxy Configuration</a>
</li>
</ul>
<ul>
<li>
<a href="Java-Spring-Proxy-Configuration_138968301.html">Java Spring Proxy Configuration</a>
</li>
</ul>
<ul>
<li>
<a href="NodeJS-Express-Proxy-Configuration_138968306.html">NodeJS Express Proxy Configuration</a>
</li>
</ul>
</li>
</ul>
</li>
</ul>
<ul>
<li>
<a href="Environment-Monitoring_138967838.html">Environment Monitoring</a>
</li>
</ul>
<ul>
<li>
<a href="Environments_138966928.html">Environments</a>
</li>
</ul>
<ul>
<li>
<a href="Request-Details_138966918.html">Request Details</a>
<ul>
<li>
<a href="138966921.html">Atlassian JIRA, Confluence, and Stash Request</a>
</li>
</ul>
<ul>
<li>
<a href="Commerce-AWS-Access-Keys-for-Local-Development_142118405.html">Commerce AWS Access Keys for Local Development</a>
</li>
</ul>
<ul>
<li>
<a href="Commerce-Jenkins-Access-Request_138966965.html">Commerce Jenkins Access Request</a>
</li>
</ul>
<ul>
<li>
<a href="Commerce-Product-Team-AWS-Console-Access-Request_138966981.html">Commerce Product Team AWS Console Access Request</a>
</li>
</ul>
<ul>
<li>
<a href="Commerce-Team-Cloud-Splunk-Access_138967826.html">Commerce Team Cloud Splunk Access</a>
</li>
</ul>
<ul>
<li>
<a href="Commerce-Team-Datadog-Access_138967828.html">Commerce Team Datadog Access</a>
</li>
</ul>
<ul>
<li>
<a href="TFC-New-Jenkins-Server-Request_138966959.html">TFC New Jenkins Server Request</a>
</li>
</ul>
<ul>
<li>
<a href="TFC-Team-Cloud-Splunk-Access_138967858.html">TFC Team Cloud Splunk Access</a>
</li>
</ul>
<ul>
<li>
<a href="138968006.html">VPN Access to AWS Resources (i.e. disable VPN split-tunneling)</a>
</li>
</ul>
</li>
</ul>
<ul>
<li>
<a href="Self-Service-Details_138967721.html">Self Service Details</a>
<ul>
<li>
<a href="AWS-Account-Error-Message-Decryption_138967802.html">AWS Account Error Message Decryption</a>
</li>
</ul>
<ul>
<li>
<a href="AWS-Account-Security-Policy-Overview_138967791.html">AWS Account Security Policy Overview</a>
</li>
</ul>
<ul>
<li>
<a href="Chef-Cookbook-Development-and-Best-Practices_138968025.html">Chef Cookbook Development and Best Practices</a>
</li>
</ul>
<ul>
<li>
<a href="Cloud-Splunk-Sample-Queries_138967832.html">Cloud Splunk Sample Queries</a>
</li>
</ul>
<ul>
<li>
<a href="Commerce-Product-AWS-Security-Policy-Creation_138966983.html">Commerce Product AWS Security Policy Creation</a>
</li>
</ul>
<ul>
<li>
<a href="138967728.html">Commerce Sub-Prod Account (AMERTEST) Password Change</a>
</li>
</ul>
<ul>
<li>
<a href="138967734.html">Commerce Sub-Prod Account (AMERTEST) Password Reset</a>
</li>
</ul>
<ul>
<li>
<a href="138967971.html">Commerce Team (Product) Names and Resource Prefixes</a>
</li>
</ul>
<ul>
<li>
<a href="CPM-Backup-for-EC2-EBS-Volumes_138968116.html">CPM Backup for EC2 EBS Volumes</a>
</li>
</ul>
<ul>
<li>
<a href="EC2-Instance-Creation_138967762.html">EC2 Instance Creation</a>
</li>
</ul>
<ul>
<li>
<a href="Lambda-Function-Creation_138967772.html">Lambda Function Creation</a>
</li>
</ul>
<ul>
<li>
<a href="OpsWorks-Stack-Creation_138968023.html">OpsWorks Stack Creation</a>
</li>
</ul>
<ul>
<li>
<a href="138967736.html">Production Account (AMER, EMEA, APAC) Password Change</a>
</li>
</ul>
<ul>
<li>
<a href="138967738.html">Production Account (AMER, EMEA, APAC) Password Reset</a>
</li>
</ul>
<ul>
<li>
<a href="RDS-Instance-Creation_138967781.html">RDS Instance Creation</a>
</li>
</ul>
<ul>
<li>
<a href="ssh-to-Commerce-EC2-Instances_138967811.html">ssh to Commerce EC2 Instances</a>
</li>
</ul>
<ul>
<li>
<a href="ssh-to-TFC-EC2-Instances_138967848.html">ssh to TFC EC2 Instances</a>
</li>
</ul>
</li>
</ul>
</li>
</ul>
</div>
</div> </div>
<div id="footer" role="contentinfo">
<section class="footer-body">
<p>Document generated by Confluence on Sep 26, 2018 17:33</p>
<div id="footer-logo"><a href="http://www.atlassian.com/">Atlassian</a></div>
</section>
</div>
</div> </body>
</html>
Thanks!
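
For reference, here is a minimal sketch of the kind of nesting I'm after. Two things in my code look relevant: BeautifulSoup's keyword is spelled `recursive` in lowercase (other keyword arguments are treated as attribute filters), and in the sample HTML every child page sits in its own `<ul>`, so a single `find('ul')` would only reach the first one. To keep the sketch dependency-free it uses `html.parser` from the standard library instead of BeautifulSoup; the names `ListIndexParser`, `to_index`, `build_index` and the small `sample` string are made up for illustration and are not part of my real code.

```python
import json
from html.parser import HTMLParser


class ListIndexParser(HTMLParser):
    """Builds a tree of <li> items; sibling <ul> blocks under one <li> are merged."""

    def __init__(self):
        super().__init__()
        self.root = {"text": "", "children": []}
        self.open_items = [self.root]  # stack of enclosing <li> nodes
        self.in_link = False

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            node = {"text": "", "children": []}
            # Attach to the nearest enclosing <li>, whichever <ul> we are in.
            self.open_items[-1]["children"].append(node)
            self.open_items.append(node)
        elif tag == "a" and len(self.open_items) > 1:
            # Only links inside a list item count (this skips the footer link).
            self.in_link = True

    def handle_endtag(self, tag):
        if tag == "li" and len(self.open_items) > 1:
            self.open_items.pop()
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.in_link:
            self.open_items[-1]["text"] += data


def to_index(node):
    """Leaf items become strings; items with children become {title: [children]}."""
    text = node["text"].strip()
    if node["children"]:
        return {text: [to_index(child) for child in node["children"]]}
    return text


def build_index(html):
    parser = ListIndexParser()
    parser.feed(html)
    parser.close()
    return [to_index(child) for child in parser.root["children"]]


# Hypothetical miniature of the page structure: one <ul> per child page.
sample = """
<ul><li><a href="a.html">Root</a>
  <ul><li><a href="b.html">Topologies</a>
    <ul><li><a href="c.html">Dev</a></li></ul>
    <ul><li><a href="d.html">Prod</a></li></ul>
  </li></ul>
  <ul><li><a href="e.html">Environments</a></li></ul>
</li></ul>
"""

index = build_index(sample)
print(json.dumps(index, indent=2))
```

Because each item is attached to the nearest enclosing `<li>` rather than to a particular `<ul>`, the separate one-item lists end up under the same parent, which is the grouping I expect in the output above.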