I have the following string:
{"name":"INPROCEEDINGS","__typename":"PublicationConferencePaper"},"hasPermiss
ionToLike":true,"hasPermissionToFollow":true,"publicationCategory":"researchSu
mmary","hasPublicFulltexts":false,"canClaim":false,"publicationType":"inProcee
dings","fulltextRequesterCount":0,"requests":{"__pagination__":
[{"offset":0,"limit":1,"list":[]}]},"activeFiguresCount":0,"activeFigures":
{"__pagination__":[{"offset":0,"limit":100,"list":
[]}]},"abstract":"Heterogeneous Multiprocessor System-on-Chip (MPSoC) are
progressively becoming predominant in most modern mobile devices. These
devices are required to perform processing of applications within thermal,
energy and performance constraints. However, most stock power and thermal
management mechanisms either neglect some of these constraints or rely on
frequency scaling to achieve energy-efficiency and temperature reduction on
the device. Although this inefficient technique can reduce temporal thermal
gradient, but at the same time hurts the performance of the executing task.
In this paper, we propose a thermal and energy management mechanism which
achieves reduction in thermal gradient as well as energy-efficiency through
resource mapping and thread-partitioning of applications with online
optimization in heterogeneous MPSoCs. The efficacy of the proposed approach is
experimentally appraised using different applications from Polybench benchmark
suite on Odroid-XU4 developmental platform. Results show 28% performance
improvement, 28.32% energy saving and reduced thermal variance of over 76%
when compared to the existing approaches. Additionally, the method is able to
free more than 90% in memory storage on the MPSoC, which would have been
previously utilized to store several task-to-thread mapping
configurations.","hasRequestedAbstract":false,"lockedFields"
I am trying to extract the substring between "abstract":" and ","hasRequestedAbstract". To do this I use the following code:
import re
import requests
# some more code here ...
to_visit_url = 'https://www.researchgate.net/publication/328749434_TEEM_Online_Thermal-_and_Energy-Efficiency_Management_on_CPU-GPU_MPSoCs'
page = requests.get(to_visit_url)
content = str(page.content, encoding="utf-8")
abstract = re.search(r'"abstract":"(.*)","hasRequestedAbstract"', content)
print('Abstract:\n' + str(abstract))
But the variable abstract contains None. What could be the problem? How can I get the substring described above?
Note: Although it looks like I could read this as a JSON object, that is not an option, because the sample text above is only a small part of the full HTML content, from which it is very hard to extract the JSON object.
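One possible way around that difficulty is to decode only the single string value that follows the key, rather than the whole embedded object. The sketch below is an illustration under assumptions, not a tested solution for the live page: `fragment` is a hypothetical stand-in for the page text, `decode_abstract` is a hypothetical helper, and it assumes the key is written exactly as `"abstract":` with no space before the value.

```python
import json

# Hypothetical fragment of page text: not valid JSON as a whole, because it
# starts and ends in the middle of a larger object.
fragment = (
    '"list":[]}]},"abstract":"A short \\"quoted\\" example abstract.",'
    '"hasRequestedAbstract":false,"lockedFields"'
)

KEY = '"abstract":'


def decode_abstract(text):
    """Decode the JSON string that follows KEY, or return None if KEY is missing."""
    start = text.find(KEY)
    if start == -1:
        return None
    # raw_decode parses one JSON value beginning at the given index and
    # ignores whatever follows it; strict=False tolerates raw line breaks
    # inside the string.
    decoder = json.JSONDecoder(strict=False)
    value, _end = decoder.raw_decode(text, start + len(KEY))
    return value


print(decode_abstract(fragment))
```

Unlike a plain regular expression, this also unescapes sequences such as `\"` inside the value, so an abstract containing quotation marks would not be cut short.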
P.S. The full content of the page, i.e. page.content, can be downloaded here: https://docs.google.com/document/d/1awprvKsLPNoV6NZRmCkktYwMwWJo5aujGyNwGhDf7cA/edit?usp=sharing
Alternatively, the source can be downloaded directly from the URL: https://www.researchgate.net/publication/328749434_TEEM_Online_Thermal-_and_Energy-Efficiency_Management_on_CPU-GPU_MPSoCs
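For reference, here is a minimal sketch of the extraction I am after, run on a shortened, hypothetical sample instead of the live page. It assumes two things that may or may not apply to the real response: that the abstract can span several lines (so `re.DOTALL` is needed for `.` to match newlines) and that the delimiters appear exactly as written. The `sample` string and the `extract_abstract` helper are made up for illustration, and the sketch does not check whether the response returned by `requests.get` actually contains the "abstract" key.

```python
import re

# Hypothetical, shortened stand-in for the page source; the line break inside
# the abstract mimics the wrapped text in the sample string.
sample = (
    '"activeFigures":{"__pagination__":[{"offset":0,"limit":100,"list":[]}]},'
    '"abstract":"Heterogeneous MPSoCs are\nbecoming predominant.",'
    '"hasRequestedAbstract":false,"lockedFields"'
)

# Non-greedy group so the match stops at the first closing delimiter;
# re.DOTALL lets "." also match newline characters.
ABSTRACT_RE = re.compile(r'"abstract":"(.*?)","hasRequestedAbstract"', re.DOTALL)


def extract_abstract(text):
    """Return the text between the two delimiters, or None if absent."""
    match = ABSTRACT_RE.search(text)
    return match.group(1) if match else None


abstract = extract_abstract(sample)
print("Abstract:\n" + str(abstract))
```

Note that `re.search` returns a match object (or None), so the captured text has to be read with `.group(1)` rather than by converting the match object to a string.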