Question

I am trying to transform an XML document organized into sections and paragraphs, with page breaks and line breaks as milestones, into an XML document that wraps pages and lines in page and line elements.

To do this, I am trying to use util:get-fragment-between.

The plan is to first get all the lines on a page into a fragment, and then turn each line into its own fragment.

The first step works, but in the second step I get the following error, which I don't understand: org.exist.dom.memtree.ElementImpl cannot be cast to org.exist.dom.persistent.StoredNode

Below is the XQuery file, followed by an excerpt from the XML file I am trying to transform.

xquery version "3.1";

let $doc := doc($docpath)

(: Build the first fragment, containing only the lines on the page :)
let $begp-node := $doc//tei:pb[@n="15-v"]
let $endp-node := $doc//tei:pb[@n="16-r"]
let $p-fragment := util:get-fragment-between($begp-node, $endp-node, $make-fragment, $display-root-namespace)
let $p-node := util:parse($p-fragment)

(: so far so good, print out of p-node gives me an xml document with just the text on page 15-v :)

(: Next step: here I try to build a fragment for each line in the newly created page fragment :)

let $lines := $p-node//tei:lb

        for $line at $pos in $lines
            let $make-fragment1 := true()
            let $display-root-namespace1 := true()
            let $beginning-node := $line
            let $ending-node := $line/following::tei:lb[1]
            let $fragment := util:get-fragment-between($beginning-node, $ending-node, $make-fragment1, $display-root-namespace1)

            let $node := util:parse($fragment)
            return $node

I expect $node to be a new XML document that contains just the fragment for that line. Instead, I get this error:

org.exist.dom.memtree.ElementImpl cannot be cast to org.exist.dom.persistent.StoredNode

Here is an excerpt from the original document:

<p>
      <lb ed="#L"/>dilectio <choice>
      <orig>dependant</orig>
      <reg>dependant</reg>
    </choice> causaliter a cognitione tamen quaelibet obiecti apprehensio vel cognitio
    <lb ed="#L"/>cum voluntatis libertate sufficit dilectionem causare <g ref="#slash"/> prima
    probatur quia si non sequitur quod dilec
    <lb ed="#L"/>tio
    <lb ed="#L"/>posset poni seu elici naturaliter a voluntate seclusa omni cognitione consequens
    est falsum
    <pb ed="#L" n="15-v"/>
    <lb ed="#L" n="1"/> quia tunc voluntas posset diligere in infinitum contra <ref>
      <name ref="#Augustine">augustinum</name> in libro 8 2 10 <title ref="#deTrinitate">de
        trinitate</title>
    </ref> patet consequentia quia positis omnibus causis ad productionem <sic>ad productionem</sic>
    alicuius effectus re
    <lb ed="#L" n="2"/>quisitis
    <lb ed="#L" n="3"/>omni alio secluso talis effectus posset naturaliter poni in esse <g
      ref="#slash"/>2a pars probatur quia
    <lb ed="#L" n="4"/>quia si sola obiecti cognitio etc sequitur quod stante iudicio vel
    apprehensione alicuius
    <lb ed="#L" n="5"/>obiecti sub ratione <corr>
      <del rend="strikethrough">boni</del>
      <add place="inLine">mali</add>
    </corr> seclusa omnia existentia vel apparentia bonitatis
    <lb ed="#L" n="6"/>voluntas posset tale obiectum velle vel diligere consequentia nota sed
    consequens est contra <ref>
      <name ref="#Aristotle">philosophum</name>
    </ref> et <ref>
      <name ref="#Averroes">commentatorem</name>
      <lb ed="#L" n="7"/>primo <name ref="#Ethics">ethicorum</name>
    </ref> quia omnia bonum appetunt
  <p xml:id="pgb1q2-d1e3692">
    <g ref="#pilcrow"/>primum corollarium 
    <lb ed="#L" n="8"/>

Any advice is much appreciated.

Chris Wallace · Answer 1 · 7 February 2019

This algorithm, although about 3 times slower than the Java code, works in memory:

(:~  trim the XML from $nodes $start to $end 
 :   The algorithm is 
 : 1) find  all the ancestors of the start node - $startParents
 : 2) find  all the ancestors of the end node- $endParents
 : 3) recursively, starting with the common top we create a new element which is a copy of the element being trimmed by 
 :    3.1 copying all attributes 
 :    3.2 there are four cases depending on the node and the start and end edge nodes of the tree
 :     a) left and right nodes are the same - nothing else to copy
 :     b) both nodes are in the node's children - trim the start one, copy the intervening children and trim the end one
 :     c) only the start node is in the node's children - trim this node and copy the following siblings
 :     d) only the end node is in the node's children  - copy the preceding siblings and trim the node
 :    attributes (currently in the fb namespace since it's not a TEI attribute) are added to trimmed nodes  
 : @param start  - the element bounding the start of the subtree
 : @param end - the element bounding the end of the subtree
:)

declare function fb:trim-node($start as node() ,$end as node()) {
let $startParents := $start/ancestor-or-self::*
let $endParents := $end/ancestor-or-self::*
let $top := $startParents[1]
return
   fb:trim-node($top,subsequence($startParents,2),subsequence($endParents,2))
};

declare function fb:trim-node($node as node(), $start as node()*, $end as node()*) {
       if (empty($start) and empty($end)) 
       then $node                                                       (: leaf is untrimmed :)
       else 
          let $startNode := $start[1]
          let $endNode:= $end[1]
          let $children := $node/node()
          return
             element {QName (namespace-uri($node), name($node))} {       (: preserve the namespace :)
              $node/@* ,                                                 (: copy all the attributes :)
              if ($startNode is $endNode)                                (: edge node  is common :)
              then fb:trim-node($startNode, subsequence($start,2),subsequence($end,2))
              else 
              if ($startNode = $children and $endNode = $children)       (: both in same subtree :)
              then (fb:trim-node($startNode, subsequence($start,2),()),  (: first the trimmed start node :)
                                                                         (: then the siblings between start and end nodes :)                                                                     
                    $startNode/following-sibling::node() 
                           except $endNode/following-sibling::node() 
                           except $endNode,

                    fb:trim-node($endNode, (), subsequence($end,2))      (: then the trimmed end node :)      
                   )
              else if ($startNode = $children)                           (: start node is in the children :)
              then 
                 ( fb:trim-node($startNode, subsequence($start,2),()),  (: first the trimmed start node :)
                   $startNode/following-sibling::node()                 (: then  the following siblings :)
                 )
              else if ($endNode = $children)                            (: end node is in the children :)
              then 
                 (  $endNode/preceding-sibling::node(),                  (: the preceding siblings :)
                    fb:trim-node($endNode, (), subsequence($end,2))      (: then the trimmed end node :)              
                 )
              else ()      
            }
};
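
A minimal usage sketch for the functions above, applied to the line-splitting step from the question. It assumes the two fb:trim-node functions have been saved in a library module; the module namespace URI http://example.org/fb, the file name fb.xqm, and the small in-memory sample page are all hypothetical, and the tei prefix is assumed to be bound to the standard TEI namespace. The cast error in the question suggests that util:get-fragment-between expects nodes stored in the database, whereas util:parse returns in-memory nodes; a pure XQuery function such as fb:trim-node has no such restriction.

```xquery
xquery version "3.1";

(: Hypothetical module URI and location; assumes the two fb:trim-node
   functions above have been saved there as a library module. :)
import module namespace fb = "http://example.org/fb" at "fb.xqm";

(: Assumption: the tei prefix used in the question is the standard TEI namespace :)
declare namespace tei = "http://www.tei-c.org/ns/1.0";

(: Hypothetical in-memory sample standing in for the parsed page fragment :)
let $page :=
    <div xmlns="http://www.tei-c.org/ns/1.0">
        <p><lb n="1"/>quia tunc voluntas <ref><name>augustinum</name></ref> patet
           <lb n="2"/>quisitis
           <lb n="3"/>omni alio secluso</p>
    </div>

for $line in $page//tei:lb
let $next := $line/following::tei:lb[1]
(: the last lb has no following lb, and fb:trim-node requires an end node :)
where exists($next)
return fb:trim-node($line, $next)
```

Each returned item should be a copy of the div/p ancestor chain holding only the content from one lb up to and including the next one. The final line on the page has no closing milestone, so it is skipped here and needs separate handling.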

Here is a comparison of four algorithms, including the Java one, using the original demo application from joewiz: http://kitwallace.co.uk/Book/set/fragment-between/page

Error using util:get-fragment-between in eXist-db
