I am implementing my own k-means algorithm on a set of data. When I select random points from the dataset as the initial centroids, I get very poor accuracy, but when I select one centroid randomly from each class of the data, I get good accuracy. Please help me figure out where I am going wrong. Below is my implementation:
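For concreteness, the two initialization strategies I compared look roughly like this (a sketch; `features` and `classLabels` here are made-up stand-ins for my actual data and ground-truth labels):

```
import numpy as np

rng = np.random.default_rng(0)
# Made-up stand-ins: 6 points in 3-D belonging to two classes
features = rng.normal(size=(6, 3))
classLabels = np.array([0, 0, 0, 1, 1, 1])
k = 2

# Strategy 1: k random points from the whole dataset (poor accuracy for me)
randomCentroids = features[rng.choice(len(features), size=k, replace=False)]

# Strategy 2: one random point drawn from each class (good accuracy for me)
perClassCentroids = np.array([
    features[rng.choice(np.flatnonzero(classLabels == c))]
    for c in np.unique(classLabels)
])
```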

First, I generate random centroids and pass them to a function that assigns each point to a cluster based on which centroid it is closest to:

```
from collections import defaultdict
from scipy.spatial.distance import euclidean

def assignClustersKNN(features, centroids, labels):
    # Note: labels is currently unused inside this function
    assignments = defaultdict(list)
    distances = [[0 for x in range(len(centroids))] for y in range(len(features))]
    # Iterating over all data points
    for i in range(len(features)):
        # Iterating over all centroids
        for j in range(len(centroids)):
            distances[i][j] = euclidean(features[i], centroids[j])
        # Getting the index of the centroid which is the closest
        clusterAssigned = distances[i].index(min(distances[i]))
        # Adding the point to the closest cluster
        assignments[clusterAssigned].append(features[i])
    return assignments
```
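(Here I assume `euclidean` is `scipy.spatial.distance.euclidean`, which computes the straight-line distance between two points; for example:)

```
from scipy.spatial.distance import euclidean

# Distance between two 3-D points: sqrt(3^2 + 4^2 + 0^2) = 5
d = euclidean([0.0, 0.0, 0.0], [3.0, 4.0, 0.0])
# → 5.0
```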

Then, I update the centroid of each cluster by computing the mean of the points in that cluster:

```
import numpy as np

def updateCentroids(assignments):
    # One row per cluster; 3 is hard-coded as the number of features per point
    newCentroids = np.zeros(shape=(len(assignments.keys()), 3))
    for i in assignments.keys():
        # Getting the data points of each cluster
        clusterMembers = assignments[i]
        # The new centroid is the mean of the cluster's data points
        newCentroids[i] = np.mean(clusterMembers, axis=0)
    return newCentroids
```
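As a quick check that the mean with `axis=0` really gives the centroid, it averages the cluster's points coordinate-wise:

```
import numpy as np

# Three 3-D points belonging to one cluster
clusterMembers = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [2.0, 2.0, 2.0]]
# axis=0 averages each coordinate across the points
centroid = np.mean(clusterMembers, axis=0)
# → array([2., 2., 2.])
```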

I have chosen the following stopping condition: when the centroids in an iteration do not differ from the centroids of the previous iteration, the clusters have not changed, and I stop the process.
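Putting the steps together, the overall loop with that stopping condition can be sketched like this (a minimal NumPy version on made-up toy data, not my actual dataset):

```
import numpy as np

rng = np.random.default_rng(0)
# Toy data: two well-separated 2-D blobs (made up for illustration)
features = np.vstack([rng.normal(0.0, 0.5, (50, 2)),
                      rng.normal(5.0, 0.5, (50, 2))])
k = 2

# Random initialization: k distinct data points as centroids
centroids = features[rng.choice(len(features), size=k, replace=False)]

while True:
    # Assignment step: index of the nearest centroid for every point
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # Update step: each centroid moves to the mean of its assigned points
    newCentroids = centroids.copy()
    for j in range(k):
        members = features[labels == j]
        if len(members) > 0:  # keep the old centroid if a cluster empties out
            newCentroids[j] = members.mean(axis=0)
    # Stopping condition: centroids unchanged from the previous iteration
    if np.allclose(newCentroids, centroids):
        break
    centroids = newCentroids
```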