Document Clustering in Objective-C

I am building an application that organizes a set of documents (anywhere from ~10 to ~2000 of them) into groups, based on the word/phrase content of each document. Each document ranges from a single paragraph to about a page and a half.

I'm not looking for a document clustering library that clusters results based on an initial search term, but a library that clusters without a search term.

Are there any libraries out there that do document clustering that can easily integrate with an Objective-C project?

There is 1 answer

Michael Nett

I'm not very well-read in Objective-C, but if you can import native C code, then you could use the greedyRSC heuristic. We had very nice results with it on the Reuters and LA-Times corpora.

A description of the method and the C code are available here: http://research.nii.ac.jp/~meh/greedyRSC/rscpage.html
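On the "import native C code" part: Objective-C is a superset of C, so a plain C function can be declared in a header and called directly from a `.m` file. The sketch below only illustrates that shape. It is not greedyRSC and does not use its API. The function name `weighted_jaccard` and the term-count-vector input format are made up for the example, which assumes each document has already been turned into word counts over a shared vocabulary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of greedyRSC.
 * Weighted Jaccard similarity between two documents, each given as a
 * vector of term counts over the same vocabulary:
 *     sum_i min(a[i], b[i]) / sum_i max(a[i], b[i])
 * The result lies in [0, 1]; two empty documents are scored 0. */
double weighted_jaccard(const unsigned *a, const unsigned *b, size_t vocab_size)
{
    assert(a != NULL && b != NULL);

    unsigned long min_sum = 0;
    unsigned long max_sum = 0;

    for (size_t i = 0; i < vocab_size; ++i) {
        min_sum += (a[i] < b[i]) ? a[i] : b[i];
        max_sum += (a[i] > b[i]) ? a[i] : b[i];
    }

    if (max_sum == 0) {
        return 0.0; /* neither document contains any vocabulary term */
    }
    return (double)min_sum / (double)max_sum;
}
```

From Objective-C you would put the prototype in a header, `#import` it, and call `weighted_jaccard(countsA, countsB, n)` like any other C function. A pairwise score of this kind is the sort of input a clustering routine typically needs, though whatever library you pick will define its own input format.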