I am trying to parse an XML string with ElementTree. The string is built by joining together many values from a dict. There is no root node, but it worked fine the first time.
The first time I did this, it worked:
import xml.etree.ElementTree as ET

# Each dict value is joined and parsed on its own
for value in data.values():
    myxml = ' '.join(value)
    tree = ET.fromstring(myxml)
But the same approach with a different dictionary does not work. My code for this is simply:
values = [x for x in dict_fasi.values()]
myxml_fasi = ' '.join(values)
tree2 = ET.fromstring(myxml_fasi)
I also tried with a loop, as before, and it did not work either. The error says: xml.etree.ElementTree.ParseError: junk after document element: line 8, column 20.
Line 8 should be:
</new_line> <new_line>
And the XML string is:
<new_line>
<text font="NUMPTY+ImprintMTnum" bbox="297.284,540.828,300.188,553.310" colourspace="DeviceGray" ncolour="0" size="12.482">della quale non conosce che una parte;] </text>
<text font="PYNIYO+ImprintMTnum-Italic" bbox="322.455,540.839,328.251,553.566" colourspace="DeviceGray" ncolour="0" size="12.727">prima</text>
<text font="NUMPTY+ImprintMTnum" bbox="331.206,545.345,334.683,552.834" colourspace="DeviceGray" ncolour="0" size="7.489">1</text>
<text font="NUMPTY+ImprintMTnum" bbox="177.602,528.028,180.850,540.510" colourspace="DeviceGray" ncolour="0" size="12.482">che nonconosce ancora appieno;</text>
<text font="NUMPTY+ImprintMTnum" bbox="189.430,532.545,192.908,540.034" colourspace="DeviceGray" ncolour="0" size="7.489">2</text>
<text font="NUMPTY+ImprintMTnum" bbox="203.879,528.028,208.975,540.510" colourspace="DeviceGray" ncolour="0" size="12.482">che</text>
</new_line> <new_line>
<text font="QKWQNQ+ImprintMTnum-Bold" bbox="315.109,462.272,319.863,472.957" colourspace="DeviceGray" ncolour="0" size="10.685">5</text>
<text font="NUMPTY+ImprintMTnum" bbox="368.916,461.828,372.743,474.310" colourspace="DeviceGray" ncolour="0" size="12.482">avederci]</text>
<text font="PYNIYO+ImprintMTnum-Italic" bbox="86.577,449.039,92.373,461.766" colourspace="DeviceGray" ncolour="0" size="12.727">sps.a</text>
<text font="NUMPTY+ImprintMTnum" bbox="167.611,449.028,172.707,461.510" colourspace="DeviceGray" ncolour="0" size="12.482">dove io andava a</text>
<text font="QKWQNQ+ImprintMTnum-Bold" bbox="68.031,421.672,72.786,432.357" colourspace="DeviceGray" ncolour="0" size="10.685">5</text>
<text font="NUMPTY+ImprintMTnum" bbox="137.296,421.228,140.200,433.710" colourspace="DeviceGray" ncolour="0" size="12.482">tante libertà] </text>
<text font="PYNIYO+ImprintMTnum-Italic" bbox="161.868,421.239,167.664,433.966" colourspace="DeviceGray" ncolour="0" size="12.727">prima</text>
<text font="NUMPTY+ImprintMTnum" bbox="170.784,425.745,174.262,433.234" colourspace="DeviceGray" ncolour="0" size="7.489">1</text>
<text font="NUMPTY+ImprintMTnum" bbox="174.297,421.228,183.920,433.710" colourspace="DeviceGray" ncolour="0" size="12.482">m</text>
<text font="MUVAOR+Symbol" bbox="194.367,421.612,199.376,431.672" colourspace="DeviceGray" ncolour="0" size="10.060"><></text>
<text font="NUMPTY+ImprintMTnum" bbox="208.349,425.745,211.827,433.234" colourspace="DeviceGray" ncolour="0" size="7.489">2</text>
<text font="NUMPTY+ImprintMTnum" bbox="244.601,421.228,250.976,433.710" colourspace="DeviceGray" ncolour="0" size="12.482">certe lib</text>
<text font="MUVAOR+Symbol" bbox="250.901,421.612,255.910,431.672" colourspace="DeviceGray" ncolour="0" size="10.060"><</text>
<text font="NUMPTY+ImprintMTnum" bbox="269.331,421.228,274.426,433.710" colourspace="DeviceGray" ncolour="0" size="12.482">ertà</text>
<text font="MUVAOR+Symbol" bbox="274.363,421.612,279.373,431.672" colourspace="DeviceGray" ncolour="0" size="10.060">></text>
</new_line> <new_line>
The first XML string, the one that works, looks like this instead:
<new_line>
<text font="QKWQNQ+ImprintMTnum-Bold" bbox="234.782,118.872,239.536,129.558" colourspace="DeviceGray" ncolour="0" size="10.685">80</text>
<text font="NUMPTY+ImprintMTnum" bbox="360.280,118.428,363.184,130.911" colourspace="DeviceGray" ncolour="0" size="12.482">pazienza, e la prudenza.] </text>
<text font="PYNIYO+ImprintMTnum-Italic" bbox="369.339,118.440,375.135,131.167" colourspace="DeviceGray" ncolour="0" size="12.727">da</text>
<text font="NUMPTY+ImprintMTnum" bbox="113.588,105.629,118.684,118.111" colourspace="DeviceGray" ncolour="0" size="12.482">pa-zienza</text>
<text font="MUVAOR+Symbol" bbox="120.415,105.707,124.422,117.543" colourspace="DeviceGray" ncolour="0" size="11.835">=</text>
</new_line>
<new_line>
<text font="NUMPTY+ImprintMTnum" bbox="194.095,105.629,196.999,118.111" colourspace="DeviceGray" ncolour="0" size="12.482">Cristoforo] </text>
<text font="PYNIYO+ImprintMTnum-Italic" bbox="214.031,105.640,219.827,118.367" colourspace="DeviceGray" ncolour="0" size="12.727">sts.a</text>
<text font="NUMPTY+ImprintMTnum" bbox="241.600,81.508,247.396,93.991" colourspace="DeviceGray" ncolour="0" size="12.482">Galdino 72</text>
<text font="SZWUPJ+ImprintExpertMT" bbox="272.785,614.422,276.490,625.380" colourspace="DeviceGray" ncolour="0" size="10.958"> </text>
<text font="NUMPTY+ImprintMTnum" bbox="53.923,592.408,58.102,602.646" colourspace="DeviceGray" ncolour="0" size="10.238">34c</text>
<text font="QKWQNQ+ImprintMTnum-Bold" bbox="72.640,592.472,77.394,603.157" colourspace="DeviceGray" ncolour="0" size="10.685">80</text>
<text font="NUMPTY+ImprintMTnum" bbox="187.701,592.028,190.605,604.510" colourspace="DeviceGray" ncolour="0" size="12.482">troverà … immaginare] </text>
<text font="PYNIYO+ImprintMTnum-Italic" bbox="201.265,592.039,204.169,604.766" colourspace="DeviceGray" ncolour="0" size="12.727">da </text>
<text font="NUMPTY+ImprintMTnum" bbox="305.701,592.028,310.796,604.510" colourspace="DeviceGray" ncolour="0" size="12.482">qualche rimedio inaspe</text>
<text font="MUVAOR+Symbol" bbox="310.691,592.412,315.701,602.472" colourspace="DeviceGray" ncolour="0" size="10.060"><</text>
<text font="NUMPTY+ImprintMTnum" bbox="331.518,592.028,337.314,604.510" colourspace="DeviceGray" ncolour="0" size="12.482">ttato</text>
<text font="MUVAOR+Symbol" bbox="337.154,592.412,342.163,602.472" colourspace="DeviceGray" ncolour="0" size="10.060">></text>
</new_line>
Maybe it is a problem with the opening and closing of the new_line tag, but I don't know how to solve it.
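For reference, here is a minimal sketch of what I suspect is happening, assuming the error comes from having several top-level new_line elements in one string. The shortened fragment, the helper name parse_fragment and the wrapper element name root are made up for illustration and are not part of my real data. I have not confirmed that this is the actual cause.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for my joined string: two sibling <new_line>
# elements and no enclosing root (the attributes are trimmed down).
fragment = (
    '<new_line><text size="12.482">che</text></new_line> '
    '<new_line><text size="10.685">5</text></new_line>'
)


def parse_fragment(xml_fragment):
    """Wrap the fragment in a hypothetical <root> element before parsing,
    so that the parser sees a single top-level element."""
    return ET.fromstring('<root>' + xml_fragment + '</root>')


# Parsing the bare fragment fails once the first <new_line> is closed.
try:
    ET.fromstring(fragment)
    error_message = None
except ET.ParseError as exc:
    error_message = str(exc)

print(error_message)

# With the wrapper, both <new_line> elements become children of <root>.
wrapped = parse_fragment(fragment)
print([child.tag for child in wrapped])
```

If this guess is right, the wrapper would only be a workaround, and I would still iterate over the new_line children of the wrapper element rather than over the wrapper itself.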