PROBLEM
I want to implement a specific kind of approximate matching between two sentences in Python.
Example -
s_1 = "I hope you are safe from COVID-19 today"
s_2 = "I hope you're safe from COVID 19 today"
score = get_similarity(s_1, s_2)
OR
s_1 = "I allow account access to facebook"
s_2 = "I allow an account access to face book"
score = get_similarity(s_1, s_2)
APPROACH
I tried using FuzzyWuzzy to get a partial ratio for the match, but I observed that even if s_2 is "I allow an account access to", without the "face book", it still gives a high similarity score.
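To make the observation concrete without depending on FuzzyWuzzy itself, here is a rough standard-library sketch. `whole_ratio` and `partial_like` are hypothetical helpers written for illustration: `partial_like` only mimics the general idea of a partial ratio (best match of the shorter string against any same-length window of the longer one) and is not FuzzyWuzzy's actual implementation.

```python
from difflib import SequenceMatcher


def whole_ratio(a: str, b: str) -> float:
    """Similarity of the two full strings, in the range 0.0 to 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def partial_like(a: str, b: str) -> float:
    """Best similarity of the shorter string against any equally long
    window of the longer string (an assumed stand-in for a partial ratio)."""
    shorter, longer = (a, b) if len(a) <= len(b) else (b, a)
    n = len(shorter)
    if n == 0:
        return 0.0
    return max(
        SequenceMatcher(None, shorter, longer[i:i + n]).ratio()
        for i in range(len(longer) - n + 1)
    )


s_1 = "I allow account access to facebook"
truncated = "I allow an account access to"  # s_2 with "face book" missing

# The window-based score ignores the missing tail, so it comes out
# higher than the whole-string score for the same pair.
print(partial_like(s_1, truncated), whole_ratio(s_1, truncated))
```

Because the window-based score only looks at the best-matching slice, the missing "face book" costs nothing, which seems to be the effect described above.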
ASK
Is there a better way to take the similarity of the entire sentence into account?
NOTE - s_2 might or might not be a transcription from a video file, so I will have to account for that delta when getting precise text. For example, FACEBOOK can be transcribed as FACE BOOK.
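One possible way to absorb this kind of transcription delta, sketched here with only the standard library, is to normalize both sentences before comparing them as whole strings. The `normalize` helper and this body for `get_similarity` are assumptions made for illustration, not an established solution; dropping every non-alphanumeric character is a blunt choice that makes FACEBOOK and FACE BOOK identical but also erases word boundaries.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase the text and keep only ASCII letters and digits, so that
    spacing, hyphens, and apostrophes no longer affect the comparison."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def get_similarity(s_1: str, s_2: str) -> float:
    """Whole-string similarity (0.0 to 1.0) of the normalized sentences."""
    return SequenceMatcher(None, normalize(s_1), normalize(s_2)).ratio()


full = get_similarity(
    "I allow account access to facebook",
    "I allow an account access to face book",
)
truncated = get_similarity(
    "I allow account access to facebook",
    "I allow an account access to",  # "face book" missing
)
print(full, truncated)
```

Since the whole normalized strings are compared, the truncated sentence scores lower than the complete one. Contractions such as "you're" versus "you are" are only partly handled here (the apostrophe is removed, the missing letter is not restored), so the score for that pair stays slightly below 1.0.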