RSolr::Error::Http - 400 Bad Request Error: 'Exception writing document id jd472w44j to the index; possible analysis error: Document contains at least one immense term in field="suggest" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: \'[10, 114, 116, 105, -62, -80, 49, 52, 32, 9, 32, 49, 49, 48, 49, 49, 49, 49, 49, 102, 105, 108, 108, 105, 108, 105, 108, 108, 32, 49]...\', original message: bytes can be at most 32766 in length; got 36558. Perhaps the document has an indexed string field (solr.StrField) which is too large','code'=>400}}
I get this error while uploading a large file (2.4 MB, 200 pages).
Asked by Ranjeev Letsalign
There are 2 answers
It says that one of your terms ("words") is longer than 32766 bytes, roughly 32 KB. Common reasons for this error are that you are adding the full text to a StrField, or using a TextField with a tokenizer that does not split the text into words (e.g. KeywordTokenizer).
Check your schema to see which field(s) handle the bulk of your text; the error message names the field "suggest". Ensure it is a TextField and that it has a fitting tokenizer. ASCII 32 is a space and it occurs in the term prefix that you pasted, so WhitespaceTokenizer (solr.WhitespaceTokenizerFactory in a Solr schema) is probably what you need.
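To see why the prefix points at a missing tokenizer, you can decode the byte list from the error message. The following is a minimal Python sketch, not part of Solr or RSolr; the variable names are made up for illustration. It assumes the numbers are signed Java bytes (so negative values need masking) encoding UTF-8 text, which is what the error message describes.

```python
# Byte prefix copied from the error message. Java prints bytes as signed
# values, so -62 and -80 are really 0xC2 and 0xB0 (the UTF-8 encoding of "°").
prefix = [10, 114, 116, 105, -62, -80, 49, 52, 32, 9, 32, 49, 49, 48, 49,
          49, 49, 49, 49, 102, 105, 108, 108, 105, 108, 105, 108, 108, 32, 49]

# Mask each value to the 0..255 range and decode as UTF-8.
raw = bytes(b & 0xFF for b in prefix)
text = raw.decode("utf-8")

print(repr(text))    # the single "term", including newline, tab and spaces
print(text.split())  # what a whitespace-based split would produce instead
```

The decoded prefix contains a newline, a tab and several spaces inside one term, which suggests the field value is being indexed as a single token rather than split into words.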
This happens because your document has a field value larger than the Lucene limit on term length (32766 bytes). Change the field type in your schema file.
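Before changing the schema, it can help to check whether splitting on whitespace would actually bring every term under the limit. This is a rough Python sketch under stated assumptions: the helper names are hypothetical, the sample text is invented, and Python's str.split() only approximates what Lucene's whitespace tokenizer does.

```python
LUCENE_MAX_TERM_BYTES = 32766  # limit quoted in the error message


def utf8_len(s: str) -> int:
    """Length of s in bytes once encoded as UTF-8."""
    return len(s.encode("utf-8"))


def oversized_terms(text: str, limit: int = LUCENE_MAX_TERM_BYTES) -> list:
    """Whitespace-separated tokens whose UTF-8 encoding exceeds the limit."""
    return [t for t in text.split() if utf8_len(t) > limit]


# Invented stand-in for the extracted text of a large document.
page = "lorem ipsum " * 4000

# Indexed as one untokenized term, the whole value is over the limit...
print(utf8_len(page) > LUCENE_MAX_TERM_BYTES)
# ...but no individual whitespace-separated word is.
print(oversized_terms(page))
```

If oversized_terms still returns tokens for your real text, a whitespace tokenizer alone will not be enough and the field needs a different analyzer or a length filter.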