Inserting dictionary keys and values from a for loop
2 votes
/ January 14, 2020

I am trying to scrape data from a site served on localhost:

#scrapy shell localhost.aspx

for i in response.xpath('//*[text()="Core Units"]/parent::*/parent::*/parent::*/div'):
    i.xpath('.//text()').extract()

This is the output:

['Core Units']
['AB43342', 'Identify learning objectives']
['Elective Units']
['AB43343', 'Engage with texts for personal purposes']
['AB43344', 'Engage with texts for learning purposes']
['AB43345', 'Engage with texts for employment purposes']
['AB43346', 'Engage with texts to participate in the community']
['Extra Units']
['AB43348', 'Create  texts for personal purposes']
['AB43349', 'Create  texts for learning purposes']
['AB43350', 'Create  texts for employment purposes']

I want to build a single dictionary like this:

di = {'Core Units': ['Code: AB43342 desc: Identify learning objectives'],
      'Elective Units': ['Code: AB43343 desc: Engage with texts for personal purposes',
                         'Code: AB43344 desc: Engage with texts for learning purposes',
                         ...],
      'Extra Units': ['Code: AB43348 desc: Create  texts for personal purposes',
                      ...]}

I don't know in advance which keys will appear, so I can't create a dictionary with predefined keys and then fill it in. The keys have to come from the for loop itself.

Answers [ 2 ]

0 votes
/ January 14, 2020

Disclaimer: this uses f-string formatting, which requires Python 3.6 or later.

Here is something that should work, given the information in the question.

inp = [['Core Units'],
       ['AB43342', 'Identify learning objectives'],
       ['Elective Units'],
       ['AB43343', 'Engage with texts for personal purposes'],
       ['AB43344', 'Engage with texts for learning purposes'],
       ['AB43345', 'Engage with texts for employment purposes'],
       ['AB43346', 'Engage with texts to participate in the community'],
       ['Extra Units'],
       ['AB43348', 'Create  texts for personal purposes'],
       ['AB43349', 'Create  texts for learning purposes'],
       ['AB43350', 'Create  texts for employment purposes']]

from collections import defaultdict
di = defaultdict(list)    # Helpful to just append value to new key in dict
unit = ''
for line in inp:
    if len(line) == 1:
        unit = line[0]    # Sets the current unit (dict key) for upcoming lines
    else:
        di[unit].append(f"Code:{line[0]} desc: {line[1]}")  # Adds line to unit
print(di)

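Output (reformatted for readability; print(di) itself prints everything on one line, wrapped in defaultdict(<class 'list'>, {...})):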

{'Core Units':     ['Code:AB43342 desc: Identify learning objectives'],  
 'Elective Units': ['Code:AB43343 desc: Engage with texts for personal purposes',  
                    'Code:AB43344 desc: Engage with texts for learning purposes',  
                    'Code:AB43345 desc: Engage with texts for employment purposes',  
                    'Code:AB43346 desc: Engage with texts to participate in the community'],  
 'Extra Units': ['Code:AB43348 desc: Create  texts for personal purposes',  
                 'Code:AB43349 desc: Create  texts for learning purposes',  
                 'Code:AB43350 desc: Create  texts for employment purposes']}

0 votes
/ January 14, 2020

Try this:

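result = {}
for i in response.xpath('//*[text()="Core Units"]/parent::*/parent::*/parent::*/div'):
    line = i.xpath('.//text()').extract()
    if len(line) == 1:
        # A single text node is a section heading: start a new key
        last_key = line[0]
        result[last_key] = []
    else:
        # Otherwise the row is a [code, description] pair under the current heading
        result[last_key].append("Code:" + line[0] + " desc: " + line[1])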
...
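Both answers assume that every div yields either one text node or exactly a code and a description, and that a heading always comes first. On a real page, .//text() often returns whitespace-only nodes as well, and a data row that appears before any heading would hit an undefined last_key. Below is a minimal, more defensive sketch under those assumptions. The helper name group_rows is hypothetical, and silently skipping rows that have no heading is a design choice, not something the question specifies.

```python
def group_rows(rows):
    """Group [code, description] rows under the most recent heading row.

    rows is any iterable of lists of strings, for example the lists
    returned by i.xpath('.//text()').extract() for each div.
    """
    result = {}
    current = None  # no heading seen yet
    for row in rows:
        # Drop whitespace-only text nodes and trim the remaining ones
        cells = [c.strip() for c in row if c.strip()]
        if not cells:
            continue  # row contained nothing but whitespace
        if len(cells) == 1:
            current = cells[0]
            result.setdefault(current, [])  # keep existing items if a heading repeats
        elif current is not None:
            result[current].append(f"Code:{cells[0]} desc: {cells[1]}")
        # rows that appear before any heading are skipped (assumption)
    return result


sample = [
    ['X1', 'orphan row before any heading'],
    ['Core Units\n'],
    ['AB43342', ' ', 'Identify learning objectives'],
    [' '],
    ['Extra Units'],
]
print(group_rows(sample))
```

In the scrapy shell, the same function could be fed a generator that extracts the text of each div, so the parsing logic stays testable without a live response.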