I am using Duke (https://github.com/larsga/Duke) for data deduplication.
I have set up Duke: the Duke jar and the Lucene jars are on the classpath.
I am trying the sample from the project wiki: https://github.com/larsga/Duke/wiki/SemanticDogfood
When I run it, I get this error:
soundaryat@IMCHLT132:~/Duke$ java no.priv.garshol.duke.Duke --testfile=doc/example-data/dogfood-test.txt --testdebug --showmatches doc/example-data/dogfood.xml
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.lucene.analysis.standard.StandardAnalyzer.<init>(Lorg/apache/lucene/util/Version;)V
at no.priv.garshol.duke.databases.LuceneDatabase.<init>(LuceneDatabase.java:77)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at java.lang.Class.newInstance(Class.java:442)
at no.priv.garshol.duke.ConfigLoader.instantiate(ConfigLoader.java:292)
at no.priv.garshol.duke.ConfigLoader.access$100(ConfigLoader.java:31)
at no.priv.garshol.duke.ConfigLoader$ConfigHandler.startElement(ConfigLoader.java:199)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(AbstractSAXParser.java:509)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:380)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2787)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:118)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1140)
at no.priv.garshol.duke.ConfigLoader.load(ConfigLoader.java:49)
at no.priv.garshol.duke.Duke.main_(Duke.java:64)
at no.priv.garshol.duke.Duke.main(Duke.java:35)
The other example from the same wiki works fine: https://github.com/larsga/Duke/wiki/LinkingCountries
Can anyone help? Thanks in advance.
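I had the same problem, and after some googling I found out that Duke is not compatible with the latest versions of Lucene. The NoSuchMethodError means the StandardAnalyzer class was found, but it has no constructor taking a Version argument, which points to a Lucene jar of a different version than the one Duke was compiled against. Are you using Lucene 5.x? If so, download the older Lucene jars (4.0.0) and put those on the classpath instead. It worked for me!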
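
If you want to confirm which case you are in before swapping jars, a small reflection check can help. This is only a rough sketch, not part of Duke or Lucene: the class name `LuceneCheck` and the method `describe` are made up for illustration. It looks up the same constructor named in the stack trace and reports whether it exists and which jar the class was loaded from.

```java
import java.security.CodeSource;

public class LuceneCheck {

    // Hypothetical helper: reports whether className has a public
    // constructor that takes exactly one argument of type paramClassName.
    static String describe(String className, String paramClassName) {
        ClassLoader loader = LuceneCheck.class.getClassLoader();
        try {
            // 'false' = load the classes without running their static initializers
            Class<?> cls = Class.forName(className, false, loader);
            Class<?> param = Class.forName(paramClassName, false, loader);
            cls.getConstructor(param); // throws NoSuchMethodException if absent

            // CodeSource is null for classes that ship with the JDK
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            String where = (src == null || src.getLocation() == null)
                    ? "the JDK itself"
                    : src.getLocation().toString();
            return "constructor found, class loaded from " + where;
        } catch (ClassNotFoundException e) {
            return "class not found on the classpath: " + e.getMessage();
        } catch (NoSuchMethodException e) {
            return "constructor missing: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // The constructor from the stack trace: StandardAnalyzer.<init>(Version)
        System.out.println(describe(
                "org.apache.lucene.analysis.standard.StandardAnalyzer",
                "org.apache.lucene.util.Version"));
    }
}
```

Run it with the same classpath you use for Duke. If it reports "constructor missing", the Lucene jar on the classpath does not have the constructor Duke is calling, so an older Lucene jar is the thing to try. If it reports "constructor found", the printed location shows which jar is actually being picked up.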