Index and full-text-search with Jackrabbit, Lucene using Tika


Full-text search is not working.

I am building a document management system using Apache Jackrabbit 2.9.0 and tika-parsers 1.3.

In workspace.xml and repository.xml I added the Tika configuration path (tikaConfigPath) to the SearchIndex element:

<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
    <param name="path" value="${wsp.home}/index"/>
    <param name="supportHighlighting" value="true"/>
    <param name="tikaConfigPath" value="${rep.home}/tika-config.xml"/> 
</SearchIndex>

tika-config.xml contains:

<properties>
  <mimeTypeRepository resource="/org/apache/tika/mime/tika-mimetypes.xml" magic="false"/>
  <parsers>
    <parser name="parse-html" class="org.apache.tika.parser.html.HtmlParser">
      <mime>text/html</mime>
      <mime>application/xhtml+xml</mime>
      <mime>application/x-asp</mime>
    </parser>
  </parsers>

</properties>
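
One sanity check that can be run outside Jackrabbit is to confirm that the Tika config file parses as XML and to list the MIME types it declares. The sketch below uses only the JDK's XML parser. The class name `TikaConfigCheck`, the method `declaredMimeTypes`, and the `SAMPLE` string are hypothetical names introduced for illustration, and the check assumes the file is a standalone document with a `properties` root element. It does not load Tika itself, so it says nothing about whether the parser class is on the classpath.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of Jackrabbit or Tika.
class TikaConfigCheck {

    // Sample mirroring the parser section from the question, wrapped in a <properties> root.
    static final String SAMPLE =
        "<properties>"
        + "<parsers>"
        + "<parser name=\"parse-html\" class=\"org.apache.tika.parser.html.HtmlParser\">"
        + "<mime>text/html</mime>"
        + "<mime>application/xhtml+xml</mime>"
        + "<mime>application/x-asp</mime>"
        + "</parser>"
        + "</parsers>"
        + "</properties>";

    /**
     * Returns the text of every <mime> element in the given config.
     * Throws IllegalArgumentException if the XML is not well-formed
     * or the root element is not <properties>.
     */
    static List<String> declaredMimeTypes(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("config is not well-formed XML", e);
        }
        String root = doc.getDocumentElement().getTagName();
        if (!"properties".equals(root)) {
            throw new IllegalArgumentException("unexpected root element: " + root);
        }
        List<String> mimes = new ArrayList<>();
        NodeList nodes = doc.getElementsByTagName("mime");
        for (int i = 0; i < nodes.getLength(); i++) {
            mimes.add(nodes.item(i).getTextContent().trim());
        }
        return mimes;
    }

    public static void main(String[] args) throws IOException {
        // Pass the path to tika-config.xml as the first argument, or run without
        // arguments to check the built-in sample.
        String xml = args.length > 0
                ? new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8)
                : SAMPLE;
        System.out.println(declaredMimeTypes(xml));
    }
}
```

If the call throws, the file on disk is not the document Jackrabbit is expected to read, which would be worth ruling out before looking at the index itself.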

I added an HTML file to the repository with JcrUtils.putFile() (shown below), using the MIME type "text/html":

    public static Node putFile(
            Node parent, String name, String mime,
            InputStream data, Calendar date) throws RepositoryException {
        Binary binary = parent.getSession().getValueFactory().createBinary(data);
        try {
            Node file = getOrAddNode(parent, name, NodeType.NT_FILE);
            Node content = getOrAddNode(file, Node.JCR_CONTENT, NodeType.NT_RESOURCE);

            content.setProperty(Property.JCR_MIMETYPE, mime);
            content.setProperty(Property.JCR_LAST_MODIFIED, date);
            content.setProperty(Property.JCR_DATA, binary);
            return file;
        } finally {
            binary.dispose();
        }
    }

The file is added successfully and I can read the same content back. Versioning also works fine, but full-text search does not work. Is the problem in the indexing?

The JCR-SQL2 query is:

"select * from [nt:resource] as x WHERE contains(x.*, '*session*')"
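
For reference, here is a minimal sketch of how that statement could be assembled in Java so the search term can be varied while testing. The class `FullTextQuery` and its methods are hypothetical helpers, not part of the JCR API. Only SQL string-literal quoting (doubling single quotes) is handled; escaping of characters that are special to the full-text search expression itself is not covered.

```java
// Hypothetical helper for building the JCR-SQL2 statement used in this question.
class FullTextQuery {

    // JCR-SQL2 string literals are single-quoted; an embedded quote is written twice.
    static String escapeLiteral(String term) {
        return term.replace("'", "''");
    }

    // Builds a full-text query over all properties of the given node type.
    static String build(String nodeType, String term) {
        return "select * from [" + nodeType + "] as x WHERE contains(x.*, '"
                + escapeLiteral(term) + "')";
    }

    public static void main(String[] args) {
        String statement = build("nt:resource", "*session*");
        System.out.println(statement);
        // With the javax.jcr API on the classpath and an open Session, the statement
        // would be executed along these lines:
        //   session.getWorkspace().getQueryManager()
        //          .createQuery(statement, Query.JCR_SQL2).execute();
    }
}
```

Trying the same statement with a plain term such as `session` (no wildcards) is one way to tell whether the wildcard pattern or the index content is at fault.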

Please help me solve this problem. I searched online but could not find a matching issue. Thank you.

