I'm looking for a means of hash coding audio speech files for comparison via SQL


I've been developing a tool to compare an audio file recorded on day one to another recorded thirty days later. My training is in linguistics and this tool will be used to catalogue, index, and compare a database of unique vocal recordings. I am aware of commercial grade APIs such as MusicBrainz or EchoNest, but cannot use them for this project. All the files must be locally stored and cannot be contributed to an online database.

At present, I have spectrograms of each file and a batch converter that can convert them to almost any sound file format. I use a spectrum analyzer to match the spectrograms exactly (like a hash map overlay) and am able to match my results with 96% accuracy. However, as my project grows, the storage this method requires will become far too large.

My thought is this: if I can adjust the audio files to a similar frame speed, I should be able to hash code the acoustic data and store the hash strings in a simple SQL table rather than whole audio files or spectrograms. I don't want to hash the whole file, just the acoustics, for matching. I've found a few overkill solutions via Python (dejavu, libmo, etc.), but as a linguist, not a computer person, I am unsure whether a novice can wrangle the code for hashing audio data.
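To make the idea concrete, here is a minimal sketch of what "hash the acoustics and store the strings in SQL" could look like. It is a stripped-down take on peak-pair fingerprinting, not a tested solution for speech. It assumes the audio has already been decoded to a list of mono float samples at a common sample rate, it uses only the Python standard library (so the DFT is naive and slow), and every name in it (`peak_bins`, `fingerprint`, `store`, `best_match`, `synthetic_tones`, the `fingerprints` table) is made up for illustration.

```python
import cmath
import hashlib
import math
import sqlite3


def peak_bins(samples, frame_size=256):
    """Return the strongest frequency bin of each non-overlapping frame (naive DFT)."""
    peaks = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        best_bin, best_mag = 0, -1.0
        for k in range(1, frame_size // 2):  # skip DC, positive frequencies only
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n, x in enumerate(frame))
            if abs(acc) > best_mag:
                best_bin, best_mag = k, abs(acc)
        peaks.append(best_bin)
    return peaks


def fingerprint(samples, frame_size=256, fan_out=3):
    """Hash pairs of nearby peaks (bin_a, bin_b, frame gap) into short hex strings."""
    peaks = peak_bins(samples, frame_size)
    hashes = []
    for i, a in enumerate(peaks):
        for gap in range(1, fan_out + 1):
            if i + gap < len(peaks):
                key = f"{a}|{peaks[i + gap]}|{gap}".encode()
                hashes.append((hashlib.sha1(key).hexdigest()[:16], i))
    return hashes


def store(conn, recording_id, hashes):
    """Save (hash, frame offset) rows for one recording in a hypothetical table."""
    conn.execute("CREATE TABLE IF NOT EXISTS fingerprints "
                 "(recording_id TEXT, hash TEXT, frame_offset INTEGER)")
    conn.executemany("INSERT INTO fingerprints VALUES (?, ?, ?)",
                     [(recording_id, h, off) for h, off in hashes])
    conn.commit()


def best_match(conn, hashes):
    """Return (recording_id, shared_hash_count) for the closest recording, or None."""
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS query_hashes (hash TEXT)")
    conn.execute("DELETE FROM query_hashes")
    conn.executemany("INSERT INTO query_hashes VALUES (?)",
                     [(h,) for h in {h for h, _ in hashes}])
    return conn.execute(
        "SELECT f.recording_id, COUNT(DISTINCT f.hash) AS n "
        "FROM fingerprints f JOIN query_hashes q ON q.hash = f.hash "
        "GROUP BY f.recording_id ORDER BY n DESC LIMIT 1").fetchone()


def synthetic_tones(bins, frame_size=256):
    """Stand-in for real audio: one pure tone per frame, centred on the given bin."""
    out = []
    for k in bins:
        out.extend(math.sin(2 * math.pi * k * n / frame_size)
                   for n in range(frame_size))
    return out
```

Usage would be along the lines of `store(conn, "speaker_day1", fingerprint(samples))` for each catalogued recording, then `best_match(conn, fingerprint(new_samples))` for the day-thirty file. Note that this is a different thing from a checksum such as MD5 over the raw audio: a checksum changes completely if a single sample differs, so it can only detect identical files, whereas hashes built from spectral peaks can still overlap between two similar recordings. Whether one peak per frame is discriminating enough for speech is an open assumption here.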

I'm looking to have a way to create hash values (or another checksum) within the next week or so. Thoughts from the interwebz?
