I'm a beginner here and would appreciate your advice. I'm trying to apply K-means clustering to a solar energy data set. Observations were taken every hour for 30 days from 20 different strings. I want to cluster each day (or even each hour) separately, so that I can see which strings perform better than the others at a given time. I thought that converting my data frame into a 3D array would make this easier; however, from the sklearn.cluster.KMeans documentation I understand that this method expects a 2D array. Do you have any suggestions for how I can slice my data so that I can pass it through the K-means algorithm and then through t-SNE to illustrate my results? Thank you! This is what the original data frame looks like.
from sklearn.cluster import KMeans

# Remove night-time hours (21:00-05:59), leaving 15 daytime hours per day
nighttime_hours = list(range(21, 24)) + list(range(0, 6))
data_202203 = data_202203[~data_202203.index.hour.isin(nighttime_hours)]
# Reshape the data frame into a 3D NumPy array: (days, hours, strings)
data_np = data_202203.to_numpy()
num_data = len(data_np) // 15
data_3d = data_np.reshape((num_data, 15, -1))
data_t = data_3d.swapaxes(1, 2)  # now (days, strings, hours)
# Specify the number of clusters
num_clusters = 3
# Create and fit the KMeans model
kmeans = KMeans(n_clusters=num_clusters, random_state=42)
cluster_labels = kmeans.fit_predict(data_t)  # fails here: data_t is 3D
# Add cluster labels as a new column in your DataFrame
data_t['Cluster'] = cluster_labels
Currently this gives the following error: ValueError: Found array with dim 3. KMeans expected <= 2.
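
One possible way around the error, offered as a rough sketch rather than a definitive answer: since `data_t` has shape (days, strings, hours), each `data_t[day]` is already a 2D array with one row per string and one column per hour, so a separate KMeans can be fitted per day. The data below is random and only stands in for the real measurements, and `cluster_each_day` is a hypothetical helper name, not something from scikit-learn. The `perplexity=5` value is an assumption chosen because t-SNE needs a perplexity smaller than the number of samples (20 strings here).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

# Synthetic stand-in for the real data, shaped (days, strings, daytime hours).
rng = np.random.default_rng(0)
n_days, n_strings, n_hours = 30, 20, 15
data_t = rng.random((n_days, n_strings, n_hours))


def cluster_each_day(data, n_clusters=3, seed=42):
    """Fit one KMeans per day; each string is a sample, each hour a feature."""
    labels = np.empty(data.shape[:2], dtype=int)
    for day, day_slice in enumerate(data):  # day_slice is 2D: (strings, hours)
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
        labels[day] = km.fit_predict(day_slice)
    return labels


daily_labels = cluster_each_day(data_t)
print(daily_labels.shape)  # (30, 20): one label per string per day

# 2D t-SNE embedding of the 20 strings for the first day, e.g. for a scatter plot
# coloured by daily_labels[0].
embedding = TSNE(n_components=2, perplexity=5, random_state=42).fit_transform(data_t[0])
```

Note that cluster numbers are assigned independently for each day, so cluster 0 on one day is not necessarily the same group as cluster 0 on another. If comparable labels are needed, one alternative would be to flatten the array with `data_t.reshape(-1, n_hours)` and fit a single model on all (day, string) rows.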