Dynamic Time Warp with Speech Signal Processing Toolkit (SPTK) output


I'm an IT student with an assignment on Dynamic Time Warping (DTW) using the Speech Signal Processing Toolkit (SPTK): comparing words spoken by two speakers and finding the similarities. I got SPTK working, recruited 8 people (4 female, 4 male) who each recorded the same 8 words for me, and saved the recordings as .wav files.

My .wav files are: RIFF (little-endian) data, WAVE audio, mono 16000 Hz. I converted every .wav file into a .short data file, and then converted every .short file to a .mcep file with this command:

x2x +sf < source_maleA.short | frame -l 400 -p 80 | window -l 400 -L 512 | mcep -l 512 -m 20 -a 0.42 > source_maleA.mcep

After that, I compared the .mcep files with this command:

dtw -m 24 target_maleB.mcep < source_maleA.mcep > source_maleA_target_maleB.dtw

The output of that command should be a numeric value (probably a float, double, or int) or a few values. The problem is that I'm not sure how to open the .dtw file, and the documentation I have doesn't say much about it. When I try to open it in an editor or cat it in the terminal, I get strange characters as output [picture 1].

However, the documentation says that the -s [Score] parameter gives the score of the DTW process, so I tried this command:

dtw -m 24 -s Scorefile target_maleB.mcep < source_maleA.mcep > source_maleA_target_maleB.dtw

I get a value, but in a strange format.

I searched online and through a lot of documentation for the .dtw file format and couldn't find anything. I tried to convert the result into another format, but had no luck. I also tried to contact my mentor about it, but have had no answer so far, and it's been a while.

Could anyone give me a suggestion on what to do? The documentation can be found at http://sp-tk.sourceforge.net/ (sorry for not linking it, I don't have enough reputation yet; I'll remove it if I have to). I don't think it's needed that much, though, since I think I understood the DTW process and did it correctly; it's just the output that is causing me problems.

Thanks in advance,

Marco.

picture 1


1 answer

Lorenz G

The score file is written as raw binary floats, which is why it looks like garbage in an editor. You have to convert it to ASCII with SPTK's x2x command:

x2x +fa scorefile.bin > scorefile.txt
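If you want to cross-check the conversion without x2x, the standard od utility can also print raw floats as text. The sketch below is not part of SPTK: dump_floats is a hypothetical helper name, and it assumes the file contains native-endian 4-byte floats (which I believe is SPTK's default float type) and that your od supports -t f4.

```shell
#!/bin/sh
# Hypothetical helper (not an SPTK command): print every 4-byte float
# in a raw binary file as text, one value per line.
# Assumption: the file holds native-endian 32-bit floats.
dump_floats() {
    # -A n : suppress the offset column
    # -v   : do not collapse repeated lines into "*"
    # -t f4: interpret the input as 4-byte floating-point values
    od -A n -v -t f4 "$1" | tr -s ' \t' '\n' | sed '/^$/d'
}
```

Usage would be something like `dump_floats Scorefile` (or whatever name you passed to -s). The same helper should also work on the .dtw file if it is written as raw floats too, which is my understanding of the default, but check the SPTK reference manual for the exact layout of the vectors in that file.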