I would like to compare two eye-tracking scanpaths. The eye tracking produces a sequence of labels of the tiles the observer looked at, for a division of an image into labeled tiles (rectangular regions). From the eye tracking we also know at what time, and for how long, the eye looked at tile N.
The Levenshtein (string edit) distance works fine as long as the timing of the fixations is not taken into account. For example, if user 1 looks at the tiles "AKPLA" and user 2 looks at the tiles "ATPLB", the string edit distance is 2, but user 2 might look at "P" for much longer than user 1.
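For concreteness, this is roughly the edit distance computation I have in mind (a standard dynamic-programming sketch in Python; it works on strings and on lists of integers alike):

```python
def levenshtein(a, b):
    """Classic edit distance; a and b can be strings or lists of ints."""
    prev = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        curr = [i]
        for j, item_b in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                        # deletion
                            curr[j - 1] + 1,                    # insertion
                            prev[j - 1] + (item_a != item_b)))  # substitution
        prev = curr
    return prev[-1]

print(levenshtein("AKPLA", "ATPLB"))  # -> 2
```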
Any ideas on how to improve the distance measure so that it captures timing differences as well? (Note that the algorithm is not restricted to character strings; it works equally well with arrays of integers.)
An eye-tracking scanpath is originally a time series. Transforming your time series into a string containing only the labels of where the person looked loses the information about time.
Thus, if you want to take the time into account, you have to either work with the original time series or take the time into account in your transformation.
For example: for every interval of, say, ten seconds, you could emit the label of the tile the person predominantly looked at during that interval. One scanpath could then be "AAAAKPLAA" as compared to "AATTTPLBB". In this case you could use the edit distance on the expanded strings, and it would take into account how long someone looked where.
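A minimal sketch of that transformation, assuming fixations come as (label, duration) pairs; the bin size and the fixation data below are made up so that the output reproduces the example strings:

```python
def expand_by_time(fixations, bin_size=0.1):
    """Turn a scanpath of (label, duration_seconds) pairs into a string
    in which each label is repeated once per time bin it covers."""
    return "".join(label * max(1, round(duration / bin_size))
                   for label, duration in fixations)

# Hypothetical fixation data: (tile label, duration in seconds)
user1 = [("A", 0.4), ("K", 0.1), ("P", 0.1), ("L", 0.1), ("A", 0.2)]
user2 = [("A", 0.2), ("T", 0.3), ("P", 0.1), ("L", 0.1), ("B", 0.2)]

print(expand_by_time(user1))  # AAAAKPLAA
print(expand_by_time(user2))  # AATTTPLBB
# Feeding these expanded strings into an edit-distance routine (such as
# the levenshtein() sketch in the question) now penalizes timing
# differences as well as label differences.
```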
You could also simply work on the original eye-tracking time series, which, I assume, contains a time stamp and a position. Then you could use Dynamic Time Warping (DTW) to estimate the dissimilarity between the two scanpaths.
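A minimal DTW sketch under that assumption, with each sample reduced to an (x, y) gaze position (the sample paths below are made up):

```python
import math

def dtw(series_a, series_b, dist=math.dist):
    """Plain O(n*m) dynamic time warping; returns the accumulated
    cost of the cheapest alignment between the two sequences."""
    n, m = len(series_a), len(series_b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(series_a[i - 1], series_b[j - 1])
            # Extend the best of the three allowed warping steps.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]

# Made-up gaze samples as (x, y) screen coordinates, ordered in time.
path1 = [(10, 10), (12, 11), (80, 40), (82, 41)]
path2 = [(11, 10), (79, 39), (81, 42), (83, 40)]
print(dtw(path1, path2))
```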
Anyhow, this is a very broad question and probably no longer relevant for you. If you could post the answer that you found yourself, that would be great.