I have elements of different categories that need to be clustered separately (according to their category) and then all together. Each element has a location (latitude, longitude).
My goal is to determine clusters of clusters: first group the elements of each category into clusters, then group those clusters together, like in the following picture: https://i.stack.imgur.com/B5uej.png
In my case the distance threshold between two elements that should end up in the same cluster is the same as the threshold between two clusters of clusters. Take the blue cluster in the picture: since every element in it is at most a distance d away from some other element of the cluster, they all belong to the blue cluster. The same goes for the red cluster, which contains the elements that are within a distance d of each other.
With DBSCAN I can easily find the clusters of all these elements if I provide them all together as input. And if I want the clusters of each category, I have to provide only that category as input and run DBSCAN once per category. But I guess there should be something much faster than running DBSCAN many times to get these clusters of clusters.
Why do you think it would be faster to mix categories that you want to be separate?
Do the cheap operations first, such as splitting your data set. Then process each partition independently.
As far as I know, scipy cannot accelerate geodetic distances, so you will have to do O(n^2) distance computations. If you have 10 categories and can split your data into such partitions, running DBSCAN 10 times makes the whole problem 10x faster, because each individual run is 10^2 times cheaper!
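A minimal sketch of the split-then-cluster approach, assuming scikit-learn is available (the coordinates and category labels here are made-up example data; scikit-learn's DBSCAN does support the haversine metric directly, which expects coordinates in radians):

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical data: (lat, lon) in degrees plus a category per element.
rng = np.random.default_rng(0)
coords = rng.uniform(low=[40.0, -75.0], high=[41.0, -74.0], size=(300, 2))
categories = rng.integers(0, 3, size=300)

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres
d = 5_000                   # your distance threshold d, in metres

# Cheap operation first: partition by category, then run DBSCAN per partition.
labels = np.full(len(coords), -1)  # -1 = noise, as in DBSCAN itself
offset = 0
for cat in np.unique(categories):
    mask = categories == cat
    # haversine works on [lat, lon] in radians; eps is an angle in radians
    db = DBSCAN(eps=d / EARTH_RADIUS_M, min_samples=2,
                metric="haversine").fit(np.radians(coords[mask]))
    sub = db.labels_.copy()
    sub[sub >= 0] += offset  # keep cluster ids unique across categories
    labels[mask] = sub
    offset = labels.max() + 1
```

Each partition of size n/10 costs (n/10)^2 pairwise distances instead of n^2, which is where the speedup in the answer comes from. The resulting per-category clusters could then themselves be clustered (e.g. by their centroids) with the same threshold d to get the clusters of clusters.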