I've installed Dask. My main aim is clustering a large dataset, but before starting work on it, I want to run a few tests. However, whenever I run a piece of Dask code, it takes a very long time and ends with a memory error. I tried their Spectral Clustering Example and the short code below.
What do you think the problem is?
```python
from dask.distributed import Client
from joblib import parallel_backend  # sklearn.externals.joblib was removed in scikit-learn 0.23
from sklearn.datasets import make_blobs
from sklearn.cluster import DBSCAN
import datetime

X, y = make_blobs(n_samples=150000, n_features=2, centers=3, cluster_std=2.1)

client = Client()

now = datetime.datetime.now()
model = DBSCAN(eps=0.5, min_samples=30)
with parallel_backend('dask'):
    model.fit(X)
print(datetime.datetime.now() - now)
```
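For what it's worth, one way I've tried to narrow this down is to time plain scikit-learn DBSCAN (no Dask, no joblib backend) on increasing sample sizes, to see whether the time and memory blow-up comes from DBSCAN itself rather than from Dask. This is just a diagnostic sketch; the sample sizes are arbitrary:

```python
import datetime
from sklearn.datasets import make_blobs
from sklearn.cluster import DBSCAN

# Time DBSCAN alone on growing subsets to see how fit time scales
# with n_samples, independent of any Dask/joblib backend.
for n in (10000, 20000, 40000):
    X, _ = make_blobs(n_samples=n, n_features=2, centers=3,
                      cluster_std=2.1, random_state=0)
    model = DBSCAN(eps=0.5, min_samples=30)
    start = datetime.datetime.now()
    model.fit(X)
    print(n, datetime.datetime.now() - start)
```

On my machine the fit time grows much faster than linearly in `n`, which makes me suspect the scaling of DBSCAN's neighborhood queries, not the Dask setup.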