Problems with SVD in Java


I have gone through JAMA and Colt (I code in Java). Both of them expect me to use arrays in which the number of rows is greater than the number of columns.

But in the case of latent semantic analysis (LSA) I have 5 books and a total of 1000-odd words. When I build a term-document matrix I get a 5*1000 matrix.

Since this does not work, I am forced to transpose the matrix, which gives me a 1000*5 matrix. When I perform an SVD on the 1000*5 matrix I get a 5*5 S matrix. This 5*5 matrix looks too small for dimensionality reduction.
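As a side note on the transpose step, the identity below is standard linear algebra rather than anything specific to JAMA or Colt. Here A is assumed to be the 5*1000 matrix from the question, and U, Σ, V are the usual SVD factors (these symbols are not used in the original post).

```latex
A = U \Sigma V^{T}
\quad\Longrightarrow\quad
A^{T} = V \Sigma^{T} U^{T},
\qquad
\operatorname{rank}(A) = \operatorname{rank}(A^{T}) \le \min(5,\, 1000) = 5 .
```

Transposing only swaps the roles of U and V. In either orientation there are at most five nonzero singular values, so the small S matrix follows from having only five documents, not from the transposition.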

What can be done?


1 answer

John Lehmann

The text segment size you are using is far too large. A document (column) should represent a page or a few pages of text, perhaps a chapter at most. I have seen paragraph-sized segments used as well.
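A minimal sketch of that idea in plain Java, assuming each book is available as a single string: split every book into fixed-size word segments and treat each segment as one column of the term-document matrix. The class name `SegmentedTermDocumentMatrix`, its method names, the whitespace-only tokenization, and the segment size are all illustrative assumptions, not part of JAMA or Colt.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: builds a terms-by-segments count matrix.
class SegmentedTermDocumentMatrix {

    // Split text into consecutive segments of at most wordsPerSegment words.
    // Tokenization is a simplification: lower-case, split on whitespace only.
    static List<String[]> splitIntoSegments(String text, int wordsPerSegment) {
        if (wordsPerSegment <= 0) {
            throw new IllegalArgumentException("wordsPerSegment must be positive");
        }
        List<String[]> segments = new ArrayList<>();
        String trimmed = text.trim().toLowerCase();
        if (trimmed.isEmpty()) {
            return segments;
        }
        String[] words = trimmed.split("\\s+");
        for (int start = 0; start < words.length; start += wordsPerSegment) {
            int end = Math.min(start + wordsPerSegment, words.length);
            segments.add(Arrays.copyOfRange(words, start, end));
        }
        return segments;
    }

    // Assign each distinct word a row index, in first-seen order.
    static Map<String, Integer> buildVocabulary(List<String[]> segments) {
        Map<String, Integer> vocab = new LinkedHashMap<>();
        for (String[] segment : segments) {
            for (String word : segment) {
                vocab.putIfAbsent(word, vocab.size());
            }
        }
        return vocab;
    }

    // Rows are terms, columns are segments; entries are raw counts.
    static double[][] termDocumentMatrix(List<String[]> segments,
                                         Map<String, Integer> vocab) {
        double[][] matrix = new double[vocab.size()][segments.size()];
        for (int col = 0; col < segments.size(); col++) {
            for (String word : segments.get(col)) {
                matrix[vocab.get(word)][col] += 1.0;
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        // Toy stand-ins for the books; real input would be far longer.
        String[] books = {"the cat sat on the mat", "the dog sat on the log"};
        List<String[]> segments = new ArrayList<>();
        for (String book : books) {
            segments.addAll(splitIntoSegments(book, 3));
        }
        Map<String, Integer> vocab = buildVocabulary(segments);
        double[][] matrix = termDocumentMatrix(segments, vocab);
        // Prints "7 terms x 4 segments"
        System.out.println(matrix.length + " terms x " + matrix[0].length + " segments");
    }
}
```

With real books and a segment size of roughly a page, the number of columns grows from 5 to however many segments the books contain, so S has correspondingly more singular values to truncate. The resulting `double[][]` could then be handed to whichever SVD routine is in use, as long as it still has at least as many rows as columns, which is the constraint described in the question.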