I want to cluster data consisting of object names, x_coordinate, y_coordinate and a corresponding temperature. I am trying the mean shift clustering algorithm to group nearby objects by location and temperature, i.e. to identify hot and cold areas. My code and a small data sample follow. With the default settings it produces only a single cluster, and it does not show the graph. What might be wrong in the following code?
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.cluster import MeanShift, estimate_bandwidth
import matplotlib.pyplot as plt
from itertools import cycle
data = pd.read_csv("data.csv")
centers = [[1, 1, 1], [0,0,0], [0,0,0]]
X= data._get_numeric_data()
bandwidth = estimate_bandwidth()
ms = MeanShift()
ms.fit(X)
labels = ms.labels_
cluster_centers = ms.cluster_centers_
print labels
print cluster_centers
fig = plt.figure()
ax = plt.axes(projection='3d')
x = data['x_cordinate']
y=data['y_cordinate']
z=data['tpa']
c=labels
ax.scatter(x,y,z, c=c)
plt.show()
data.csv:
name,x_cordinate,y_cordinate,temperature
Ctrs3,5189200,6859000,0.3998434286
Ctrs4,5173360,6812800,0.4779542857
Ctrs5,5660440,6812800,0.7044195918
Cstrs3,1935400,5929720,0
Cstrs4,1953880,5929720,0
Cstrs5,491320,2689120,0
Cltrs3,3436240,5884840,0.3998434286
Cltrs4,3296320,5884840,0.4779542857
Cltrs5,5426800,5725120,0.7044195918
estimate_bandwidth needs an argument (your data). Does this code run?
Anyway, when this happens to me, I pass a smaller value of the quantile parameter to estimate_bandwidth than the default 0.3 (and pass that bandwidth estimate to the MeanShift constructor!). If you already know a good bandwidth a priori, you are best off using that instead.
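To make that concrete, here is a minimal sketch of both options: estimating the bandwidth with an explicit quantile and handing it to MeanShift, or passing a bandwidth you already trust. The helper name mean_shift_with_bandwidth and the value 500000.0 are my own placeholders, not something from your code. The sample rows are inlined so the snippet runs without data.csv. One caveat for very small samples: estimate_bandwidth uses int(n_samples * quantile) neighbours, so with nine rows a quantile much below 0.3 leaves only each point itself as a neighbour and the estimate can collapse to zero, which is why the helper checks for that.

```python
import io

import pandas as pd
from sklearn.cluster import MeanShift, estimate_bandwidth

# Sample rows copied from the question's data.csv.
CSV_TEXT = """name,x_cordinate,y_cordinate,temperature
Ctrs3,5189200,6859000,0.3998434286
Ctrs4,5173360,6812800,0.4779542857
Ctrs5,5660440,6812800,0.7044195918
Cstrs3,1935400,5929720,0
Cstrs4,1953880,5929720,0
Cstrs5,491320,2689120,0
Cltrs3,3436240,5884840,0.3998434286
Cltrs4,3296320,5884840,0.4779542857
Cltrs5,5426800,5725120,0.7044195918
"""


def mean_shift_with_bandwidth(X, quantile=0.3, bandwidth=None):
    """Hypothetical helper: fit MeanShift with an explicit bandwidth.

    If no bandwidth is supplied, one is estimated from X using `quantile`.
    Returns (labels, cluster_centers, bandwidth_used).
    """
    if bandwidth is None:
        # estimate_bandwidth needs the data as its first argument.
        bandwidth = estimate_bandwidth(X, quantile=quantile)
    if not bandwidth > 0:
        # Happens when the quantile is too small for the sample size.
        raise ValueError("bandwidth is not positive; raise quantile or pass one")
    ms = MeanShift(bandwidth=bandwidth)  # pass the estimate to the constructor
    ms.fit(X)
    return ms.labels_, ms.cluster_centers_, bandwidth


data = pd.read_csv(io.StringIO(CSV_TEXT))
X = data[["x_cordinate", "y_cordinate", "temperature"]].to_numpy(dtype=float)

# Option 1: estimate the bandwidth from the data with an explicit quantile.
labels, centers, bw = mean_shift_with_bandwidth(X, quantile=0.3)
print("estimated bandwidth:", bw)
print("labels:", labels)

# Option 2: use a bandwidth you know a priori (500000.0 is only a placeholder).
labels_fixed, centers_fixed, bw_fixed = mean_shift_with_bandwidth(X, bandwidth=500000.0)
print("labels with fixed bandwidth:", labels_fixed)
```

On your full data set you would try a few quantile values below 0.3 and compare the resulting labels. Also note that the coordinates are in the millions while the temperature lies between 0 and 1, so on the raw columns the distances are dominated by location.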