How to make uploaded PDF text searchable in Apache Sling


I am exploring Apache Sling 11 to build a content-driven web application. I have a page where files (PDF/TXT/DOC) can be uploaded to the path /content/company/uploads as nt:file nodes. In the search module I use a JCR query to search for a particular text, and I want the text inside PDF and TXT files to be searchable. Right now the search picks up text in TXT files but not in PDF files. The PDF file I used for testing contains only text.

I have configured Tika in oak:index/lucene and ran a re-index, but the query results did not change.

Apache Sling version: 11
Backend: MongoDB (oak-mongo)

The query I am using:

SELECT * FROM [nt:base] WHERE ISDESCENDANTNODE('/content/company/uploads') AND lower([*]) LIKE 'test word'

[Screenshot: Tika configuration]

I am just starting to learn Sling; any help is highly appreciated. Thanks.


There is 1 answer

java_dev

Instead of using LIKE, I used CONTAINS(*, '%test word%') in the query. But now the problem is that the text inside TXT files is not picked up.
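
For reference, here is a minimal sketch of building such a full-text query string in Java. It assumes the Lucene index actually covers the binary content of the uploaded files; it does not address the index configuration itself. The class and method names (FullTextQuery, build, quote) are hypothetical helpers, not part of Sling or Oak. CONTAINS takes a full-text search expression rather than a LIKE pattern, so the term is passed without % wildcards, and whitespace-separated words are combined with AND.

```java
// Hypothetical helper: builds a JCR-SQL2 full-text query string.
// Uses only the Java standard library; no Sling or Oak classes involved.
final class FullTextQuery {

    // Doubles single quotes so the value stays inside its SQL2 string literal.
    static String quote(String value) {
        return value.replace("'", "''");
    }

    // Builds a query that full-text searches all nodes below the given path.
    static String build(String path, String term) {
        return "SELECT * FROM [nt:base] AS s"
             + " WHERE ISDESCENDANTNODE(s, '" + quote(path) + "')"
             + " AND CONTAINS(s.*, '" + quote(term) + "')";
    }

    public static void main(String[] args) {
        // Path and search term are taken from the question above.
        System.out.println(build("/content/company/uploads", "test word"));
    }
}
```

The resulting string could then be passed to QueryManager.createQuery(statement, Query.JCR_SQL2) from the javax.jcr.query API.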