Extrapolation of sample to population


How can I extrapolate a clustering result from a sample of 10,000 rows to the entire population of 100,000 rows in Python? I ran agglomerative clustering on the sample, but I am stuck on extending the result to the entire population.


There is 1 answer

Answer by Has QUIT--Anony-Mousse (best answer)

There is no general rule.

For hierarchical clustering, the result depends heavily on the linkage you chose, and clustering a different sample or the whole population may give a very different result. (For a start, cluster a different sample and compare!)
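One possible way to run that comparison is sketched below. It is only an illustration under assumptions that are not in the question: the data is a numeric array (replaced here by synthetic blobs from `make_blobs`), Ward linkage with four clusters is used, and the two samples are compared on the points they happen to share using the adjusted Rand index from scikit-learn. The helper name `cluster_sample` is made up for this sketch.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

# Synthetic stand-in for the population (assumption: real data is a numeric array).
centers = [[0, 0], [10, 0], [0, 10], [10, 10]]
X, _ = make_blobs(n_samples=5000, centers=centers, cluster_std=0.5, random_state=0)

# Draw two independent samples; by chance they overlap in some rows.
rng = np.random.default_rng(0)
idx_a = rng.choice(len(X), size=1000, replace=False)
idx_b = rng.choice(len(X), size=1000, replace=False)


def cluster_sample(indices, n_clusters=4, linkage="ward"):
    """Cluster the rows at `indices` and return {row index: cluster label}."""
    model = AgglomerativeClustering(n_clusters=n_clusters, linkage=linkage)
    labels = model.fit_predict(X[indices])
    return dict(zip(indices.tolist(), labels.tolist()))


labels_a = cluster_sample(idx_a)
labels_b = cluster_sample(idx_b)

# Compare the two clusterings on the rows that appear in both samples.
# The adjusted Rand index ignores how the clusters are numbered.
shared = sorted(set(labels_a) & set(labels_b))
ari = adjusted_rand_score(
    [labels_a[i] for i in shared],
    [labels_b[i] for i in shared],
)
print(f"{len(shared)} shared rows, adjusted Rand index = {ari:.3f}")
```

A value close to 1 suggests the two samples produce the same grouping; a low value suggests the clustering is sensitive to the sample. Changing `linkage` (for example to `"single"` or `"complete"`) shows how much the outcome depends on that choice.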

Generalizing a clustering result to new data usually contradicts the very assumptions made for the clustering. Clustering is not classification but exploratory data analysis.

However, if you have found good clustering results, and you have verified them to be desirable, then you can train a classifier on the cluster labels to predict the cluster label of new data.
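A minimal sketch of that last step follows, assuming scikit-learn and a numeric feature matrix. The synthetic data, the reduced sizes (10,000 rows with a 1,000-row sample instead of 100,000 and 10,000), the Ward linkage, and the choice of a k-nearest-neighbors classifier are all assumptions made for illustration; any classifier could take its place.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the full population (sizes reduced for illustration).
centers = [[0, 0], [10, 0], [0, 10]]
X_population, _ = make_blobs(
    n_samples=10000, centers=centers, cluster_std=0.5, random_state=0
)

# Draw the sample that will actually be clustered.
rng = np.random.default_rng(0)
sample_idx = rng.choice(len(X_population), size=1000, replace=False)
X_sample = X_population[sample_idx]

# Step 1: cluster the sample only.
clusterer = AgglomerativeClustering(n_clusters=3, linkage="ward")
sample_labels = clusterer.fit_predict(X_sample)

# Step 2: treat the cluster labels as class labels and train a classifier.
classifier = KNeighborsClassifier(n_neighbors=5)
classifier.fit(X_sample, sample_labels)

# Step 3: predict a cluster label for every row in the population.
population_labels = classifier.predict(X_population)
print(np.bincount(population_labels))
```

The predicted labels are only as trustworthy as the sample clustering they were trained on, so this step makes sense only after the clusters have been checked as described above.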