Solr's TikaEntityProcessor in the Data Import Handler does not get the filename from the DB entity


I have a DIH configuration in which I want to combine data from a database and Tika by passing the filename from the database to Tika. The problem is that the filename arrives at Tika empty. The log says:

ERROR (Thread-16) [   ] o.a.s.h.d.DataImporter Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.RuntimeException: java.io.FileNotFoundException: Could not find file:  (resolved to: C:\Users\jimbo\Desktop\solr-8.9.0\server\.

This is my configuration XML file:

<dataConfig>
    <dataSource name="ds-db" driver="org.mariadb.jdbc.Driver" url="jdbc:mysql://localhost:3306/eepyakm?user=root" user="root" password="wpadmin"/>
    <dataSource name="ds-file" type="BinFileDataSource"/>
    <document>
        <entity name="supplier" query="select * from suppliers_tmp_view" dataSource="ds-db" 
                deltaQuery="select id from suppliers_tmp_view where last_modified > '${dataimporter.last_index_time}'"
                deltaImportQuery="select * from suppliers_tmp_view where id='${dataimporter.delta.id}'">
             
            <entity name="attachment" dataSource="ds-db" 
                    query="select * from suppliers_tmp_files_view where supplier_tmp_id='${supplier.id}'"
                    deltaQuery="select id,supplier_tmp_id from suppliers_tmp_files_view where last_modified > '${dataimporter.last_index_time}'"
                    parentDeltaQuery="select id from suppliers_tmp_view where id='${attachment.supplier_tmp_id}'">
            
                <field name="path" column="path"/>
                
                <entity name="file" processor="TikaEntityProcessor" url="${attachment.path}" format="text" dataSource="ds-file">
                    
                    <field column="text"/>
                </entity>
            </entity>
        </entity>
    </document>
</dataConfig>
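
To narrow this down, here is a minimal sketch of how the value handed to Tika could be logged, assuming DIH's LogTransformer is usable in this setup. The transformer, logTemplate, and logLevel attributes are the only additions to the attachment entity; the rest is copied from the configuration above. If the logged value is empty, one thing worth checking (an assumption, not something I have confirmed) is whether the JDBC driver returns the column under a different name or case than path, since ${attachment.path} must match the returned column name exactly.

```xml
<!-- Debugging sketch: log the resolved path for each attachment row -->
<entity name="attachment" dataSource="ds-db"
        transformer="LogTransformer"
        logTemplate="attachment path is: '${attachment.path}'"
        logLevel="info"
        query="select * from suppliers_tmp_files_view where supplier_tmp_id='${supplier.id}'">

    <field name="path" column="path"/>

    <!-- Unchanged: Tika reads the file named by the parent row's path column -->
    <entity name="file" processor="TikaEntityProcessor" url="${attachment.path}" format="text" dataSource="ds-file">
        <field column="text"/>
    </entity>
</entity>
```

With this in place, the Solr log should show one line per attachment row, which makes it possible to tell whether the path is already empty when it leaves the DB entity or is lost on the way to the Tika entity.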

I found a similar problem in a very old post: "Solr's TikaEntityProcessor not working".
