K-means clustering time series data

32 views Asked by Monika At 13 March 2024 at 12:30

I'm a beginner here and would appreciate your advice. I'm trying to apply K-means clustering to a solar energy data set. Observations were taken every hour for 30 days from 20 different strings. I want to cluster each day (or even each hour) separately, so that I can see which string performs better than the others at a given time.

I thought that converting my data frame into a 3D array would make this easier, but from the sklearn.cluster.KMeans documentation I understand that this method expects a 2D array. Do you have any suggestions on how I can slice my data so that I can pass it to the K-means algorithm, and then to t-SNE to illustrate the results? Thank you!

This is what the original data frame looks like.

from sklearn.cluster import KMeans

# Remove night-time hours (21:00-05:59), leaving 15 daytime hours per day
nighttime_hours = list(range(21, 24)) + list(range(0, 6))
data_202203 = data_202203[~data_202203.index.hour.isin(nighttime_hours)]

# Reshape the data frame into a 3D NumPy array: (days, hours, strings)
data_np = data_202203.to_numpy()
num_data = len(data_np) // 15
data_3d = data_np.reshape((num_data, 15, -1))
# Swap the last two axes to get (days, strings, hours)
data_t = data_3d.swapaxes(1, 2)

# Specify the number of clusters
num_clusters = 3

# Create and fit the KMeans model
kmeans = KMeans(n_clusters=num_clusters, random_state=42)
cluster_labels = kmeans.fit_predict(data_t)  # fails here: data_t is 3D

# Add cluster labels as a new column (data_t is a NumPy array at this point, not a DataFrame)
data_t['Cluster'] = cluster_labels

Currently this gives the following error: ValueError: Found array with dim 3. KMeans expected <= 2.
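
One possible way around the error, sketched under assumptions rather than tested on the real data: flatten the first two axes of the 3D array so that each row is one string's 15-hour profile on one day. KMeans then receives a 2D array of shape (days * strings, hours), and the labels can be folded back into a day-by-string table. The synthetic `data` frame and the `string_1` ... `string_20` column names below are hypothetical stand-ins for `data_202203`; `n_init=10` and the default t-SNE settings are illustrative choices, not recommendations.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

# Synthetic stand-in for data_202203 (assumption): hourly readings,
# 30 days, 20 strings, random values instead of real measurements.
rng = np.random.default_rng(42)
index = pd.date_range("2022-03-01", periods=30 * 24, freq="h")
columns = [f"string_{i + 1}" for i in range(20)]  # hypothetical names
data = pd.DataFrame(rng.random((len(index), 20)), index=index, columns=columns)

# Remove night-time hours (21:00-05:59), leaving 15 daytime hours per day.
nighttime_hours = list(range(21, 24)) + list(range(0, 6))
data = data[~data.index.hour.isin(nighttime_hours)]

hours_per_day = 15
num_days = len(data) // hours_per_day
num_strings = data.shape[1]

# (days, hours, strings) -> (days, strings, hours)
data_3d = data.to_numpy().reshape(num_days, hours_per_day, num_strings)
data_t = data_3d.swapaxes(1, 2)

# Flatten days and strings into one axis: each row is the 15-hour
# profile of one string on one day, which is the 2D input KMeans expects.
profiles = data_t.reshape(num_days * num_strings, hours_per_day)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
cluster_labels = kmeans.fit_predict(profiles)

# Fold the labels back into a table: one row per day, one column per string.
days = data.index.normalize().unique()
labels_df = pd.DataFrame(
    cluster_labels.reshape(num_days, num_strings), index=days, columns=columns
)

# 2D t-SNE embedding of the same profiles, one point per (day, string) pair.
embedding = TSNE(n_components=2, random_state=42).fit_transform(profiles)
```

`labels_df.loc[day, string]` then gives the cluster of a string on a given day, and `embedding[:, 0]` against `embedding[:, 1]`, coloured by `cluster_labels`, can be passed to any scatter plot. To cluster a single day on its own, `data_t[d]` is already a 2D (strings, hours) array and could be passed to KMeans directly; with only 20 rows per day, the number of clusters has to stay small. Scaling the profiles first (for example with `sklearn.preprocessing.StandardScaler`) may also be worth trying, since K-means is sensitive to feature magnitude.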
