I'm using Xapian and Haystack in my Django app. I have a model with a text field that I want to index for searching. This field stores all sorts of content: words, URLs, HTML, etc.
I'm using the default document-based index template:
text = indexes.CharField(document=True, use_template=True)
This sometimes yields the following error when someone has pasted a particularly long link:
InvalidArgumentError: Term too long (> 245)
Now I understand the error. I've gotten around it before for other fields in other situations.
My question is, what's the preferred way to handle this exception?
It seems that handling this exception requires me to use a prepare_text() method:
def prepare_text(self, obj):
    content = []
    for word in obj.body.split(' '):
        if len(word) <= 245:
            content += [word]
    return ' '.join(content)
It just seems clunky and prone to problems. Plus I can't use the search templates.
How have you handled this problem?
I think you've got it right. There's a patch in the inkscape fork of xapian_backend, inspired by the Xapian Omega project.
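One detail worth checking in the filtering approach from the question: as far as I know, Xapian's limit applies to the encoded term, so it is safer to measure UTF-8 bytes than characters, and to split on any whitespace rather than only on spaces (a long link right after a newline would otherwise be glued to the previous word). A minimal sketch of that idea, where `drop_long_terms` and `max_bytes` are names I made up for illustration and are not part of Haystack or Xapian:

```python
def drop_long_terms(text, max_bytes=245):
    """Remove whitespace-separated tokens whose UTF-8 encoding exceeds max_bytes.

    Hypothetical helper: the default of 245 comes from the error message in
    the question. A backend that prefixes terms may need a lower value.
    """
    kept = []
    # split() with no argument splits on any run of whitespace,
    # including newlines and tabs.
    for token in text.split():
        if len(token.encode("utf-8")) <= max_bytes:
            kept.append(token)
    # Note: original whitespace is collapsed to single spaces.
    return " ".join(kept)
```

To keep `use_template=True`, one option (an assumption on my side, so check it against your Haystack version) is to override the index's `prepare()` method, call the parent implementation, and pass the rendered `text` value through a helper like this before returning the prepared data.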
I've done something similar to what you did in my project, with a trick that lets me keep using the search index template: