I want to plot a dendrogram for a clustering result. Right now I am using ELKIBuilder from ELKI 0.7.5 for clustering.
Ideally, I'd like to plot the dendrogram directly.
If that's not possible, I'd like to extract information (distances) from the clustering so that I can create a dendrogram with another library (e.g. using the Newick format).
Therefore my questions:
Is it possible to create dendrograms with ELKI?
Is it possible to access the distances which have been calculated during the clustering? (the distances used when two clusters were merged)
Right now I am using the following code for clustering:
    public Clustering<?> createClustering() {
        double[][] distanceMatrix = new double[][]{
            {0.0, 1.0, 3.0},
            {1.0, 0.0, 4.0},
            {3.0, 4.0, 0.0}
        };
        int noOfClusters = 2;
        // Adapter to load data from an existing array.
        DatabaseConnection dbc = new ArrayAdapterDatabaseConnection(distanceMatrix);
        // Create a database (which may contain multiple relations!)
        Database db = new StaticArrayDatabase(dbc, null);
        // Load the data into the database (do NOT forget to initialize...)
        db.initialize();
        Clustering<?> clustering = new ELKIBuilder<>(CutDendrogramByNumberOfClusters.class) //
            .with(CutDendrogramByNumberOfClusters.Parameterizer.MINCLUSTERS_ID, noOfClusters) //
            .with(AbstractAlgorithm.ALGORITHM_ID, AnderbergHierarchicalClustering.class) //
            .with(AGNES.Parameterizer.LINKAGE_ID, WardLinkage.class) //
            .build().run(db);
        return clustering;
    }
The AGNES class (I recommend using AnderbergHierarchicalClustering instead; it is much faster but gives exactly the same result) returns the clustering in a standard form called a "pointer hierarchy" (PointerHierarchyRepresentationResult). The merge of i and j at height h is represented as a pointer from i to j, with height h. Afterwards, j represents the merged cluster. This basic form was introduced by Sibson with the SLINK algorithm in 1973.
In particular, this contains the y information, i.e. the merge heights (getParentDistanceStore), and the merges (given by getParentStore), and it can compute an order in which to arrange the points for visualization (getPositions).
You may want to have a look at the code of DendrogramVisualization, which is responsible for creating the SVG dendrogram in the GUI.
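If you would rather export the hierarchy to another tool, the pointer representation can be turned into a Newick string fairly easily. The following is only a rough sketch, not ELKI code: it assumes you have already copied the values from getParentStore and getParentDistanceStore into plain arrays (parent and height, both hypothetical names), that exactly one object points to itself and acts as the root, and that merges can be replayed in order of increasing height. The class name, the labels, and the example values in main are made up for illustration; they are hand-written single-linkage heights for the 3x3 matrix in the question, not actual output of the Ward run.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Sketch: convert a pointer hierarchy (parent + merge height per object) to Newick. */
class PointerHierarchyToNewick {
    /** A subtree built so far: its Newick text and the height at which it was formed. */
    private static final class Node {
        final String newick;
        final double height;

        Node(String newick, double height) {
            this.newick = newick;
            this.height = height;
        }
    }

    /** Newick entry for a child; branch length = merge height minus the child's own height. */
    private static String branch(Node child, double mergeHeight) {
        return child.newick + ":" + (mergeHeight - child.height);
    }

    /**
     * @param parent parent[i] = j means "i was merged into j"; the root points to itself
     * @param height height[i] is the height of that merge (ignored for the root)
     * @param labels leaf names, one per object
     */
    static String toNewick(int[] parent, double[] height, String[] labels) {
        int n = parent.length;
        Node[] cluster = new Node[n];
        List<Integer> merges = new ArrayList<>();
        int root = -1;
        for (int i = 0; i < n; i++) {
            cluster[i] = new Node(labels[i], 0.0); // every object starts as a leaf at height 0
            if (parent[i] == i) {
                root = i;
            } else {
                merges.add(i);
            }
        }
        // Replay the merges from the lowest to the highest (stable sort keeps index order on ties).
        merges.sort(Comparator.comparingDouble(idx -> height[idx]));
        for (int i : merges) {
            int j = parent[i];
            // If j was itself merged away already (possible with tied heights),
            // follow the pointers to the object that currently represents its cluster.
            while (cluster[j] == null) {
                j = parent[j];
            }
            double h = height[i];
            String text = "(" + branch(cluster[i], h) + "," + branch(cluster[j], h) + ")";
            cluster[j] = new Node(text, h); // j now represents the merged cluster
            cluster[i] = null;              // i no longer represents a cluster
        }
        return cluster[root].newick + ";";
    }

    public static void main(String[] args) {
        // Illustrative values: 0 joins 1 at height 1.0, then 1 joins 2 at height 3.0.
        int[] parent = {1, 2, 2};
        double[] height = {1.0, 3.0, Double.POSITIVE_INFINITY};
        String[] labels = {"A", "B", "C"};
        System.out.println(toNewick(parent, height, labels));
    }
}
```

The resulting string can be fed to any library that reads Newick. If your heights contain many ties, you may prefer to replay the merges in the order ELKI reports them instead of sorting by height alone.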