Apache Lucene Indexer Search with CJKAnalyzer
July 2, 2018
I am using Apache Lucene to index and search text, with CJKAnalyzer as the analyzer. The search matches the query character by character: if I search for the Japanese word "ぁxまn", it returns every entry that contains any character of that word.

That is not what I want. I want to match only the whole word, or entries that contain the whole word.

For example, suppose I have indexed 3 words: "ぁxまn", "ぁxま", "まn".

 case 1: If I search for "ぁxまn", it should return only one result.
 case 2: If I search for "ぁx", it should return two results.

In my case, however, searching for the word "ぁxまn" returns three results, which is wrong.
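
To make the expected behaviour concrete, here is a small plain-Java sketch of the two matching rules described above. It does not use Lucene at all; the class and method names (`MatchSemanticsSketch`, `wholeQueryMatches`, `anyCharacterMatches`) are made up for illustration, and the "any character" rule is only my reading of what the search appears to be doing, not a statement about Lucene internals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; hypothetical names, no Lucene involved.
final class MatchSemanticsSketch {

    // The three example values from the question.
    static final List<String> INDEXED = Arrays.asList("ぁxまn", "ぁxま", "まn");

    // Desired rule: keep values that contain the whole query as a substring.
    static List<String> wholeQueryMatches(List<String> values, String query) {
        List<String> hits = new ArrayList<>();
        for (String value : values) {
            if (value.contains(query)) {
                hits.add(value);
            }
        }
        return hits;
    }

    // Observed rule (assumption): keep values sharing at least one character with the query.
    static List<String> anyCharacterMatches(List<String> values, String query) {
        List<String> hits = new ArrayList<>();
        for (String value : values) {
            for (int i = 0; i < query.length(); i++) {
                if (value.indexOf(query.charAt(i)) >= 0) {
                    hits.add(value);
                    break; // one shared character is enough
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(wholeQueryMatches(INDEXED, "ぁxまn").size());   // case 1: 1
        System.out.println(wholeQueryMatches(INDEXED, "ぁx").size());      // case 2: 2
        System.out.println(anyCharacterMatches(INDEXED, "ぁxまn").size()); // what I get now: 3
    }
}
```

The first two calls correspond to case 1 and case 2; the third reproduces the three results I currently get.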

------------------- Indexing code ---------------------------------

IndexWriter writer = createWriter();
Document document1 = createDocument(1, "ぁxまn", "Richard");
writer.addDocument(document1);
writer.commit();



private static Document createDocument(Integer id, String firstName, String lastName)
{
    Document document = new Document();
    // StringField is indexed as a single untokenized term; TextField values go through the analyzer.
    document.add(new StringField("id", id.toString(), Field.Store.YES));
    document.add(new TextField("firstName", firstName, Field.Store.YES));
    document.add(new TextField("lastName", lastName, Field.Store.YES));
    return document;
}


private static IndexWriter createWriter() throws IOException
{
    FSDirectory dir = FSDirectory.open(Paths.get(INDEX_DIR).toFile());
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_44, new CJKAnalyzer());
    IndexWriter writer = new IndexWriter(dir, config);
    return writer;
}

-------- Search call ------

TopDocs foundDocs2 = searchByFirstName("*ぁxまn*", searcher);
-------------------------------------------------------------
private static TopDocs searchByFirstName(String firstName, IndexSearcher searcher) throws Exception
{
    MultiFieldQueryParser mqp = new MultiFieldQueryParser(new String[]{"firstName"}, new CJKAnalyzer());
    mqp.setAllowLeadingWildcard(true);
    Query q = mqp.parse(firstName);
    TopDocs hits = searcher.search(q, 10);
    return hits;
}
...