Getting this error while uploading a large file (2.4 MB, 200 pages)


RSolr::Error::Http - 400 Bad Request Error: 'Exception writing document id jd472w44j to the index; possible analysis error: Document contains at least one immense term in field="suggest" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: \'[10, 114, 116, 105, -62, -80, 49, 52, 32, 9, 32, 49, 49, 48, 49, 49, 49, 49, 49, 102, 105, 108, 108, 105, 108, 105, 108, 108, 32, 49]...\', original message: bytes can be at most 32766 in length; got 36558. Perhaps the document has an indexed string field (solr.StrField) which is too large','code'=>400}}

2

There are 2 answers

0
Vinod On

This happens because your document has a field value larger than the Lucene term-length limit (32766 bytes, as the error message states).

Change the field type for that field in your schema file.

3
Toke Eskildsen On

The error says that one of your terms ("words") is larger than 32 KB (the exact limit is 32766 bytes in UTF-8). Common reasons for this error are adding the full text to a StrField, or using a TextField with a tokenizer that does not split the text into words (e.g. KeywordTokenizer).
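To make the difference concrete, here is a minimal sketch in plain Python. It does not use Solr or Lucene; it only imitates the two tokenization behaviours and applies the 32766-byte limit quoted in the error. The function names and the sample text are made up for illustration.

```python
MAX_TERM_BYTES = 32766  # limit quoted in the error message


def keyword_terms(text):
    """Imitate a StrField or KeywordTokenizer: the whole value is one term."""
    return [text]


def whitespace_terms(text):
    """Imitate a whitespace tokenizer: split on runs of whitespace."""
    return text.split()


def immense_terms(terms):
    """Return the terms whose UTF-8 encoding exceeds the limit."""
    return [t for t in terms if len(t.encode("utf-8")) > MAX_TERM_BYTES]


# Hypothetical stand-in for the extracted text of a long document.
sample_text = "lorem ipsum " * 4000  # 48000 bytes of ASCII

# Whole text as a single term: one term is over the limit.
print(len(immense_terms(keyword_terms(sample_text))))
# Split on whitespace: every term is short, none is over the limit.
print(len(immense_terms(whitespace_terms(sample_text))))
```

With the unsplit value the single term is 48000 bytes and would be rejected, while the whitespace-split version produces only short terms.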

Check your schema to see which field(s) handle the bulk of your text. The error message names the field "suggest", so start there. Ensure that it is a TextField and that it has a fitting tokenizer. ASCII 32 is the space character, and it occurs in the term prefix that you pasted, so WhitespaceTokenizer is probably what you need.
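If you want to see what the term prefix in the error actually contains, you can decode it yourself. The sketch below is plain Python and assumes only that the numbers are signed Java bytes of UTF-8 text, which is what the error message describes. The variable names are made up for illustration.

```python
# Byte prefix copied from the error message (Java prints bytes as signed values).
prefix = [10, 114, 116, 105, -62, -80, 49, 52, 32, 9, 32, 49, 49, 48, 49,
          49, 49, 49, 49, 102, 105, 108, 108, 105, 108, 105, 108, 108, 32, 49]

# Map signed bytes (-128..127) to unsigned (0..255) before decoding as UTF-8.
decoded = bytes(b & 0xFF for b in prefix).decode("utf-8")

# repr() makes the newline, tab and spaces inside the "term" visible.
print(repr(decoded))

# Splitting on whitespace shows the separate words hidden in the single term.
print(decoded.split())
```

The decoded prefix contains a newline, a tab and several spaces, which supports the reading that the text is reaching the index as one unsplit term.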