Iterating over XML and extracting only selected items
0 votes
/ March 27, 2020

I am querying a system that returns XML output with a large number of elements, and I am having trouble extracting only specific elements.

When I simply search by the host_sn tag, it adds only the snapshot named host_sn with generation 0, even though there are many generations. The XML output is below, with generations 0 and 1 as an example.

How can I iterate over the sub-elements and collect every generation and the snapshot name into a dictionary? Below is an example of the output I would like to get.

The output I would like to get:


    # There should be exactly one entry per generation (0, 1, 2, 3, ...); never two entries for the same generation
    [{'generation': '0', 'timestamp': 'Thu Mar 26 22:10:55 2020', 'snapshot_link': 'No', 'snapshot_name': 'host_sn'}]
    [{'generation': '1', 'timestamp': 'Thu Mar 26 22:20:55 2020', 'snapshot_link': 'No', 'snapshot_name': 'host_sn'}]
    [{'generation': '2', 'timestamp': 'Thu Mar 26 22:30:55 2020', 'snapshot_link': 'No', 'snapshot_name': 'host_sn'}]

Here is my code, which adds only generation 0 to the dictionary (I played around with it and can get every element added, but not the unique entries shown in the desired output above):

    snapshot = []

    # Using Python ElementTree; run_to_xml just appends -output xml_element to the command
    # Example cmd: snapcmd -sg host_sg list -detail -output xml_element -sid 6161
    snap_xml = self.run_to_xml("snapcmd -sg " + source_sg + " list -detail", True)

    if snap_xml is not None:
        for sn_item in snap_xml.findall('SG/Snapvx/Snapshot'):
            sn_name = sn_item.find('snapshot_name').text
            sn_timestamp = sn_item.find('timestamp').text
            sn_generation = sn_item.find('generation').text
            sn_link = sn_item.find('link').text

            sn_list = {}
            if sn_name.endswith(SNAPSHOTVX_NAME_POSTFIX):  # Postfix is _sn
                if sn_name not in [sn_list['snapshot_name'] for sn_list in snapshot]:
                    sn_list['generation'] = sn_generation
                    sn_list['snapshot_name'] = sn_name
                    sn_list['timestamp'] = sn_timestamp
                    sn_list['snapshot_link'] = sn_link
                    snapshot.append(sn_list)

Sample XML output:


    <?xml version="1.0" standalone="yes" ?>
    <SymCLI_ML>
      <SG>
        <SG_Info>
          <name>host_sn</name>
          <symid>0001##00####</symid>
          <microcode_version>6161</microcode_version>
        </SG_Info>
        <Snapvx>
          <Snapshot>
            <source>000F9</source>
            <snapshot_name>host_sn</snapshot_name>
            <timestamp>Thu Mar 26 16:05:37 2020</timestamp>
            <generation>0</generation>
            <link>No</link>
            <restore>No</restore>
            <failed>No</failed>
            <GCM>False</GCM>
            <zDP>False</zDP>
            <total_deltas_mb>34</total_deltas_mb>
            <total_deltas_gb>0.0</total_deltas_gb>
            <total_deltas_tb>0.00</total_deltas_tb>
            <total_deltas_tracks>268</total_deltas_tracks>
            <non_shared_mb>10</non_shared_mb>
            <non_shared_gb>0.0</non_shared_gb>
            <non_shared_tb>0.00</non_shared_tb>
            <non_shared_tracks>76</non_shared_tracks>
            <expiration_date>Fri Mar 27 16:05:37 2020</expiration_date>
          </Snapshot>
          <Snapshot>
            <source>000F9</source>
            <snapshot_name>host_sn</snapshot_name>
            <timestamp>Thu Mar 26 15:53:39 2020</timestamp>
            <generation>1</generation>
            <link>No</link>
            <restore>No</restore>
            <failed>No</failed>
            <GCM>False</GCM>
            <zDP>False</zDP>
            <total_deltas_mb>45</total_deltas_mb>
            <total_deltas_gb>0.0</total_deltas_gb>
            <total_deltas_tb>0.00</total_deltas_tb>
            <total_deltas_tracks>361</total_deltas_tracks>
            <non_shared_mb>21</non_shared_mb>
            <non_shared_gb>0.0</non_shared_gb>
            <non_shared_tb>0.00</non_shared_tb>
            <non_shared_tracks>169</non_shared_tracks>
            <expiration_date>Fri Mar 27 15:53:39 2020</expiration_date>
          </Snapshot>
          <Snapshot>
            <source>000FA</source>
            <snapshot_name>host_sn</snapshot_name>
            <timestamp>Thu Mar 26 16:05:37 2020</timestamp>
            <generation>0</generation>
            <link>No</link>
            <restore>No</restore>
            <failed>No</failed>
            <GCM>False</GCM>
            <zDP>False</zDP>
            <total_deltas_mb>7</total_deltas_mb>
            <total_deltas_gb>0.0</total_deltas_gb>
            <total_deltas_tb>0.00</total_deltas_tb>
            <total_deltas_tracks>53</total_deltas_tracks>
            <non_shared_mb>3</non_shared_mb>
            <non_shared_gb>0.0</non_shared_gb>
            <non_shared_tb>0.00</non_shared_tb>
            <non_shared_tracks>21</non_shared_tracks>
            <expiration_date>Fri Mar 27 16:05:37 2020</expiration_date>
          </Snapshot>
          <Snapshot>
            <source>000FA</source>
            <snapshot_name>host_sn</snapshot_name>
            <timestamp>Thu Mar 26 15:53:39 2020</timestamp>
            <generation>1</generation>
            <link>No</link>
            <restore>No</restore>
            <failed>No</failed>
            <GCM>False</GCM>
            <zDP>False</zDP>
            <total_deltas_mb>8</total_deltas_mb>
            <total_deltas_gb>0.0</total_deltas_gb>
            <total_deltas_tb>0.00</total_deltas_tb>
            <total_deltas_tracks>61</total_deltas_tracks>
            <non_shared_mb>4</non_shared_mb>
            <non_shared_gb>0.0</non_shared_gb>
            <non_shared_tb>0.00</non_shared_tb>
            <non_shared_tracks>29</non_shared_tracks>
            <expiration_date>Fri Mar 27 15:53:39 2020</expiration_date>
          </Snapshot>
          <Snapshot>
            <source>000FB</source>
            <snapshot_name>host_sn</snapshot_name>
            <timestamp>Thu Mar 26 16:05:37 2020</timestamp>
            <generation>0</generation>
            <link>No</link>
            <restore>No</restore>
            <failed>No</failed>
            <GCM>False</GCM>
            <zDP>False</zDP>
            <total_deltas_mb>0</total_deltas_mb>
            <total_deltas_gb>0.0</total_deltas_gb>
            <total_deltas_tb>0.00</total_deltas_tb>
            <total_deltas_tracks>3</total_deltas_tracks>
            <non_shared_mb>0</non_shared_mb>
            <non_shared_gb>0.0</non_shared_gb>
            <non_shared_tb>0.00</non_shared_tb>
            <non_shared_tracks>1</non_shared_tracks>
            <expiration_date>Fri Mar 27 16:05:37 2020</expiration_date>
          </Snapshot>
          <Snapshot>
            <source>000FB</source>
            <snapshot_name>host_sn</snapshot_name>
            <timestamp>Thu Mar 26 15:53:39 2020</timestamp>
            <generation>1</generation>
            <link>No</link>
            <restore>No</restore>
            <failed>No</failed>
            <GCM>False</GCM>
            <zDP>False</zDP>
            <total_deltas_mb>0</total_deltas_mb>
            <total_deltas_gb>0.0</total_deltas_gb>
            <total_deltas_tb>0.00</total_deltas_tb>
            <total_deltas_tracks>3</total_deltas_tracks>
            <non_shared_mb>0</non_shared_mb>
            <non_shared_gb>0.0</non_shared_gb>
            <non_shared_tb>0.00</non_shared_tb>
            <non_shared_tracks>1</non_shared_tracks>
            <expiration_date>Fri Mar 27 15:53:39 2020</expiration_date>
          </Snapshot>
          <Snapshot>
            <source>000FC</source>
            <snapshot_name>host_sn</snapshot_name>
            <timestamp>Thu Mar 26 16:05:37 2020</timestamp>
            <generation>0</generation>
            <link>No</link>
            <restore>No</restore>
            <failed>No</failed>
            <GCM>False</GCM>
            <zDP>False</zDP>
            <total_deltas_mb>20</total_deltas_mb>
            <total_deltas_gb>0.0</total_deltas_gb>
            <total_deltas_tb>0.00</total_deltas_tb>
            <total_deltas_tracks>163</total_deltas_tracks>
            <non_shared_mb>10</non_shared_mb>
            <non_shared_gb>0.0</non_shared_gb>
            <non_shared_tb>0.00</non_shared_tb>
            <non_shared_tracks>78</non_shared_tracks>
            <expiration_date>Fri Mar 27 16:05:37 2020</expiration_date>
          </Snapshot>
          <Snapshot>
            <source>000FC</source>
            <snapshot_name>host_sn</snapshot_name>
            <timestamp>Thu Mar 26 15:53:39 2020</timestamp>
            <generation>1</generation>
            <link>No</link>
            <restore>No</restore>
            <failed>No</failed>
            <GCM>False</GCM>
            <zDP>False</zDP>
            <total_deltas_mb>25</total_deltas_mb>
            <total_deltas_gb>0.0</total_deltas_gb>
            <total_deltas_tb>0.00</total_deltas_tb>
            <total_deltas_tracks>198</total_deltas_tracks>
            <non_shared_mb>14</non_shared_mb>
            <non_shared_gb>0.0</non_shared_gb>
            <non_shared_tb>0.00</non_shared_tb>
            <non_shared_tracks>113</non_shared_tracks>
            <expiration_date>Fri Mar 27 15:53:39 2020</expiration_date>
          </Snapshot>
        </Snapvx>
        <Snapvx_Totals>
          <total_deltas_mb>145698</total_deltas_mb>
          <total_deltas_gb>142.3</total_deltas_gb>
          <total_deltas_tb>0.14</total_deltas_tb>
          <total_deltas_tracks>1165587</total_deltas_tracks>
          <non_shared_mb>362</non_shared_mb>
          <non_shared_gb>0.4</non_shared_gb>
          <non_shared_tb>0.00</non_shared_tb>
          <non_shared_tracks>2893</non_shared_tracks>
        </Snapvx_Totals>
      </SG>
    </SymCLI_ML>

1 Answer

0 votes
/ March 27, 2020

You can get there using lxml. Note that your XML is still invalid (the closing <Snapvx> tag is missing) and that it contains no snapshot_link. But in general:

from lxml import etree

# Paste the (corrected) XML from the question here
generations = """[your xml above, fixed]"""

doc = etree.fromstring(generations)
rows = []
for target in doc.xpath('//Snapshot'):
    # Pull out the three child elements of interest
    gen = target.xpath('generation')[0]
    ts = target.xpath('timestamp')[0]
    sn = target.xpath('snapshot_name')[0]
    # Use each element's tag as the key and its text as the value
    items = {gen.tag: gen.text, ts.tag: ts.text, sn.tag: sn.text}
    # Skip exact duplicates (same generation, timestamp and name)
    if items not in rows:
        rows.append(items)

for row in rows:
    print(row)

Output:

{'generation': '0', 'timestamp': 'Thu Mar 26 16:05:37 2020', 'snapshot_name': 'host_sn'}
{'generation': '1', 'timestamp': 'Thu Mar 26 15:53:39 2020', 'snapshot_name': 'host_sn'}
{'generation': '2', 'timestamp': 'Thu Mar 26 15:53:39 2020', 'snapshot_name': 'host_sn'}
{'generation': '2', 'timestamp': 'Thu Mar 26 16:05:37 2020', 'snapshot_name': 'host_sn'}
{'generation': '3', 'timestamp': 'Thu Mar 26 16:05:37 2020', 'snapshot_name': 'host_sn'}
...
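
If installing lxml is not an option, roughly the same result can be had with the standard-library ElementTree module that the question already uses. The loop in the question skips every snapshot whose name is already in the list, which is why only generation 0 survives. The sketch below deduplicates on the (snapshot_name, generation) pair instead, and fills snapshot_link from the <link> element. The function name collect_snapshots and the trimmed sample XML are made up for illustration, and keeping the first entry seen for each generation is an assumption about what is wanted.

```python
import xml.etree.ElementTree as ET

SNAPSHOTVX_NAME_POSTFIX = "_sn"  # suffix used in the question


def collect_snapshots(root, postfix=SNAPSHOTVX_NAME_POSTFIX):
    """Return one dict per unique (snapshot_name, generation) pair.

    Hypothetical helper: the first <Snapshot> seen for a pair wins.
    """
    seen = set()
    rows = []
    for item in root.findall("SG/Snapvx/Snapshot"):
        name = item.findtext("snapshot_name", default="")
        generation = item.findtext("generation")
        if not name.endswith(postfix):
            continue
        key = (name, generation)
        if key in seen:
            # Same generation reported again for another source device
            continue
        seen.add(key)
        rows.append({
            "generation": generation,
            "timestamp": item.findtext("timestamp"),
            "snapshot_link": item.findtext("link"),  # <link> in the XML
            "snapshot_name": name,
        })
    return rows


# Trimmed sample in the shape of the XML from the question (illustrative only)
sample = """<SymCLI_ML><SG><Snapvx>
  <Snapshot><source>000F9</source><snapshot_name>host_sn</snapshot_name>
    <timestamp>Thu Mar 26 16:05:37 2020</timestamp>
    <generation>0</generation><link>No</link></Snapshot>
  <Snapshot><source>000F9</source><snapshot_name>host_sn</snapshot_name>
    <timestamp>Thu Mar 26 15:53:39 2020</timestamp>
    <generation>1</generation><link>No</link></Snapshot>
  <Snapshot><source>000FA</source><snapshot_name>host_sn</snapshot_name>
    <timestamp>Thu Mar 26 16:05:37 2020</timestamp>
    <generation>0</generation><link>No</link></Snapshot>
  <Snapshot><source>000FA</source><snapshot_name>host_sn</snapshot_name>
    <timestamp>Thu Mar 26 15:53:39 2020</timestamp>
    <generation>1</generation><link>No</link></Snapshot>
</Snapvx></SG></SymCLI_ML>"""

for row in collect_snapshots(ET.fromstring(sample)):
    print(row)
```

With this sample the four <Snapshot> elements collapse to two rows, one for generation 0 and one for generation 1. If different source devices can report different timestamps for the same generation, only the first one encountered is kept.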