I take part in some Django poroject and we use geo data (with GeoDjango). I have installed PostGis as it described on AWS docs.
We have a lot of some points (markers) on the map. And we need to cluster them.
I found one library anycluster. This library need the PostgreSQL extension named kmeans-postgresql be installed on the Postgre database.
But my database is located on Amazon RDS. And I can't connect to it by SSH in order to install an extension...
Anybody knows how can I install kmeans-postgresql extension on my Amazon RDS database?
Or maybe you can advise me other ways of clustering?
The K-Means It is a really complex calculation that is useful to data mining and cluster analysis ( you can see more about it in the wikipedia page https://en.wikipedia.org/wiki/K-means_clustering ). It have a big complexity when have to deal with many points. The K-means extension to postgresql http://pgxn.org/dist/kmeans/doc/kmeans.html it is written in C and compiled in the database machine. This brings a better performance compared to an procedure in plpgsql. Unfortunately as @estevao_lucas answered, this extension it is not enabled into Amazon RDS.
If you really need the k-means result, I translated this implementation of it, created by Joni Salonen in http://jonisalonen.com/2012/k-means-clustering-in-mysql/ and changed to plpgsql https://gist.github.com/thiagomata/a9737c3455d6248bef9f. This function uses temporary table. It is possible change it to use only arrays of Pins, if you wanna to.
But, if you only need to show some pins in a map, you will probably be happy with a really faster and simpler function that groups the results into an [x,y] matrix. I have created such function because the kmeans function was taking too much time to process my database (with a lot more than 400K elements). So this implementation is really faster, but does not have all the features you would expect from the K-means module. Besides that, this grid function https://gist.github.com/thiagomata/18ea14853998468c1a1d returns very good results, when the goal it is to show a big number of pins in a map.