I need to index five different types of XML files. They share a similar structure, with small differences between the types.
Example 1:
<?xml version="1.0"?>
<manifest>
  <metadata>
    <isbn>9780815341291</isbn>
    <title>Essential Cell Biology,Third Edition</title>
    <authors>
      <author>Alberts;Bruce</author>
      <author>Bray;Dennis</author>
    </authors>
    <categories>
      <category>SCABC</category>
      <category>SCDEF</category>
    </categories>
  </metadata>
  <resources>
    <audioresource>
      <uuid>123456789</uuid>
      <source>03_Mutations_Origin_Cancer.mp3</source>
      <mimetype>audio/mpeg</mimetype>
      <title>Part Three - Mutations and the Origin of Cancer</title>
      <description>123</description>
      <chapters>
        <chapter>1</chapter>
      </chapters>
    </audioresource>
  </resources>
</manifest>
Example 2:
<?xml version="1.0"?>
<manifest>
  <metadata>
    <isbn>9780815341291</isbn>
    <title>Essential Cell Biology,Third Edition</title>
    <authors>
      <author>FN:Alberts;Bruce</author>
      <author>FN:Bray;Dennis</author>
    </authors>
    <categories>
      <category>SCABC</category>
      <category>SCGHI</category>
    </categories>
  </metadata>
  <resources>
    <glossaryresource>
      <uuid>123456789</uuid>
      <term>A subunit </term>
      <definition>The portion of a bacterial exotoxin that interferes with normal host cell function. </definition>
      <chapters>
        <chapter>10</chapter>
      </chapters>
    </glossaryresource>
  </resources>
</manifest>
My dih-config.xml looks like this:
<dataConfig>
  <dataSource name="fileReader" type="FileDataSource" encoding="UTF-8"/>
  <document>
    <entity name="dir" rootEntity="false" dataSource="null" processor="FileListEntityProcessor" fileName="^.*\.xml$" recursive="true" baseDir="X:/tmp/npr">
      <entity name="audioresource"
              rootEntity="true"
              dataSource="fileReader"
              url="${dir.fileAbsolutePath}"
              stream="false"
              logTemplate=" processing ${dir.fileAbsolutePath}"
              logLevel="debug"
              processor="XPathEntityProcessor"
              forEach="/manifest/metadata | /manifest/metadata/authors | /manifest/metadata/categories | /manifest/metadata/resources | /manifest/resources/audioresource | /manifest/resources/audioresource/chapters"
              transformer="DateFormatTransformer">
        <field column="category" xpath="/manifest/metadata/categories/category" />
        <field column="author" xpath="/manifest/metadata/authors/author" />
        <field column="book_title" xpath="/manifest/metadata/title" />
        <field column="isbn" xpath="/manifest/metadata/isbn"/>
        <field column="id" xpath="/manifest/resources/audioresource/uuid"/>
        <field column="mimetype" xpath="/manifest/resources/audioresource/mimetype" />
        <field column="title" xpath="/manifest/resources/audioresource/title"/>
        <field column="description" xpath="/manifest/resources/audioresource/description"/>
        <field column="chapter" xpath="/manifest/resources/audioresource/chapters/chapter"/>
        <field column="source" xpath="/manifest/resources/audioresource/source"/>
      </entity>
    </entity>
  </document>
</dataConfig>
I'm not very familiar with XPath. I can't use a wildcard in place of an element name, can I? I tried it and it didn't work.
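Without a wildcard, the obvious fallback would be one XPathEntityProcessor entity per resource type, each placed as a sibling of the audioresource entity inside the dir entity. Below is a minimal, untested sketch for the glossary case from example 2. It copies the existing entity and replaces audioresource with glossaryresource; the forEach list and the metadata fields are carried over unchanged. The term and definition column names are hypothetical and assume matching fields exist in the Solr schema.

```xml
<!-- Sketch only: goes inside <entity name="dir" ...>, next to the audioresource entity. -->
<entity name="glossaryresource"
        rootEntity="true"
        dataSource="fileReader"
        url="${dir.fileAbsolutePath}"
        stream="false"
        processor="XPathEntityProcessor"
        forEach="/manifest/metadata | /manifest/metadata/authors | /manifest/metadata/categories | /manifest/metadata/resources | /manifest/resources/glossaryresource | /manifest/resources/glossaryresource/chapters"
        transformer="DateFormatTransformer">
  <!-- Metadata fields: same xpaths as in the audioresource entity. -->
  <field column="category" xpath="/manifest/metadata/categories/category" />
  <field column="author" xpath="/manifest/metadata/authors/author" />
  <field column="book_title" xpath="/manifest/metadata/title" />
  <field column="isbn" xpath="/manifest/metadata/isbn"/>
  <!-- Resource fields: "term" and "definition" are assumed schema field names. -->
  <field column="id" xpath="/manifest/resources/glossaryresource/uuid"/>
  <field column="term" xpath="/manifest/resources/glossaryresource/term"/>
  <field column="definition" xpath="/manifest/resources/glossaryresource/definition"/>
  <field column="chapter" xpath="/manifest/resources/glossaryresource/chapters/chapter"/>
</entity>
```

That would mean five nearly identical entities, one per file type, which is the duplication a wildcard would have avoided.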
Thanks a lot in advance.