DTW distance matrix - tsclust taking too long


I'm trying to find an answer, but I'm still new to dynamic time warping in R. I have a data set with over 20,000 observations, 20 IDs, and an outcome that was measured two to three times per hour. My data looks something like this:

#ID Hour outcome
#1 00:30 3.4
#1 00:50 2.3
#...     ......
#1 23:40 0.5
#2 00:21 2.3
#...     ......

So for each ID I have around 1,500 time points, but the time series are not the same length (some IDs start sooner or later, and the time series have different time intervals).
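For reference, a minimal sketch of how a long table like this could be turned into the list of series that tsclust expects. The column names `ID`, `Hour`, and `outcome` come from the sample above; the toy values and the assumption that rows are already in time order within each ID are illustrative only.

```r
# Toy data in the same long format as the sample (values are made up)
df <- data.frame(
  ID      = c(1, 1, 1, 2, 2),
  Hour    = c("00:30", "00:50", "23:40", "00:21", "01:05"),
  outcome = c(3.4, 2.3, 0.5, 2.3, 1.1)
)

# One numeric vector per ID; split() keeps the row order within each group,
# so this assumes the rows are already sorted by time for every ID
time_series_list <- split(df$outcome, df$ID)

# The series may have different lengths, which DTW can handle
lengths(time_series_list)
```

The result is a named list with one element per ID, which can be passed as the first argument of tsclust.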

I tried a custom distance function that drops NAs before computing the DTW distance:

library(dtw)

dtwOmitNA <- function(x, y) {
    # drop missing values before computing the DTW distance
    a <- na.omit(x)
    b <- na.omit(y)
    return(dtw(a, b, distance.only = TRUE)$normalizedDistance)
}
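A custom distance is normally looked up by name in the proxy package's registry, so a function like this usually has to be registered before it can be named in `distance = "..."`. The following is a sketch of that registration step, assuming the dtw and proxy packages are installed; the `as.numeric()` wrapper and the description string are additions for illustration.

```r
library(dtw)
library(proxy)

# Same idea as the function above; as.numeric() strips the attributes
# that na.omit() attaches (an illustrative addition)
dtwOmitNA <- function(x, y) {
  a <- as.numeric(na.omit(x))
  b <- as.numeric(na.omit(y))
  dtw(a, b, distance.only = TRUE)$normalizedDistance
}

# Register the function under the name "dtwOmitNA" unless it already exists.
# loop = TRUE tells proxy to call it once per pair of series.
if (!pr_DB$entry_exists("dtwOmitNA")) {
  pr_DB$set_entry(
    FUN         = dtwOmitNA,
    names       = "dtwOmitNA",
    loop        = TRUE,
    type        = "metric",
    distance    = TRUE,
    description = "Normalized DTW distance with NAs removed"
  )
}
```

Because a function registered with `loop = TRUE` is called in plain R for every pair of series, it is likely to be slower than the compiled distances that ship with dtwclust.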

I want to use this distance function in tsclust with a DBA centroid, which looks something like this:

clustering_result <- tsclust(time_series_list,
                             k = 2L:19L,              # number of clusters
                             distance = "dtwOmitNA",  # dissimilarity function
                             centroid = "dba",        # DTW Barycenter Averaging
                             trace = FALSE,
                             seed = seed,
                             norm = "L2", window.size = NULL,  # for DBA
                             args = tsclust_args(cent = list(trace = FALSE, window.size = 18L),
                                                 dist = list(window.size = 18L))
                             # , normalize = TRUE  # distance normalized
                             # , sqrt.dist = FALSE
                             )

The problem is that tsclust takes too long to run, and I don't know whether I made a mistake somewhere. Maybe the problem is that I have too many observations (because I measure each ID multiple times per hour)?
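To put rough numbers on that suspicion, here is a back-of-the-envelope sketch of the work needed for one pairwise distance matrix. The counts (20 IDs, about 1,500 points each, k from 2 to 19, a window of 18) come from the question; treating every series as exactly 1,500 points long and counting a band of 2 * window + 1 cells per row are simplifying assumptions.

```r
n_series    <- 20     # number of IDs
series_len  <- 1500   # approximate points per ID (assumed equal for all)
window_size <- 18     # half-width of the warping window
k_values    <- 2:19   # each k is a separate clustering run

# Number of distinct pairs of series in one distance matrix
n_pairs <- choose(n_series, 2)

# Cost-matrix cells per pair: full DTW versus a banded window (approximate)
cells_full     <- series_len^2
cells_windowed <- series_len * (2 * window_size + 1)

# Total cells for one distance matrix
total_full     <- n_pairs * cells_full
total_windowed <- n_pairs * cells_windowed

# How much smaller the windowed computation is, and how many runs are requested
speedup <- total_full / total_windowed
n_runs  <- length(k_values)
```

As written, dtwOmitNA calls dtw() without any window argument, so the unconstrained count applies to it unless the window is forwarded into that call. DBA adds further DTW computations between every series and each centroid on every iteration, and all of this is repeated for each value of k.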

I tried searching for other examples, but with the information I found I could only set window.size.
