I currently have a function that performs a Levenshtein comparison between two values. I am interested in using this type of function in two ways.
Option 1: I would like to first go through an existing database field (Product Description) and compare all of the existing products against each other and return a calculated percent value that indicates the likelihood that a similar product already exists. The results could look something like this:
[product1] [percentmatch] [product2]
'Large Cup' 66.66% 'Lg Cup'
I would assume that my select statement would then loop to the next product in the table and perform the same comparison to find a similar product. Therefore, eventually I would see a duplicate result in the reverse order from what is listed above.
Option 2: I would also like to use a script that searches the database for similar products when I provide a variable to search for, and shows the same kind of results for any product that exceeds, for example, a 50% match.
[My New Variable] [percentmatch] [existingproductname]
X-Large Cup 81.81% Large Cup
X-Large Cup 54.54% Lg Cup
Thank you for your help with this in advance. I look forward to seeing how I can do this.
Thanks, D
You probably want to calculate percentage based on the number of edit operations vs. length of the 1st string:
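As a rough sketch of that calculation, assuming the percentage is (1 - edits / length of the 1st string) * 100: the function names `levenshtein` and `percent_match` are made up for illustration, and the clamp to 0 for very dissimilar strings is my own addition, not something the question asks for.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def percent_match(first: str, second: str) -> float:
    """Similarity relative to the length of the first string, clamped at 0."""
    if not first:
        return 100.0 if not second else 0.0
    return max(0.0, (1 - levenshtein(first, second) / len(first)) * 100)


print(percent_match("Large Cup", "Lg Cup"))       # 3 edits out of 9 characters
print(percent_match("X-Large Cup", "Large Cup"))  # 2 edits out of 11 characters
print(percent_match("X-Large Cup", "Lg Cup"))     # 5 edits out of 11 characters
```

These come out at roughly 66.67, 81.82 and 54.55, which agrees with the figures in the question if those were truncated rather than rounded. Note that the measure is not symmetric: swapping the two arguments changes the length you divide by.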
Option #1 will be a CROSS JOIN:
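The question does not say which database is in use, so here is a minimal sketch using Python with an in-memory SQLite database. The table `products`, the column `description`, the sample rows and the registered function `percent_match` are all assumptions for illustration; in your database you would call your existing Levenshtein function in the same place.

```python
import sqlite3


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def percent_match(first: str, second: str) -> float:
    """Similarity relative to the length of the first string, clamped at 0."""
    if not first:
        return 100.0 if not second else 0.0
    return max(0.0, (1 - levenshtein(first, second) / len(first)) * 100)


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL as a two-argument scalar function.
conn.create_function("percent_match", 2, percent_match)
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, description TEXT)")
conn.executemany("INSERT INTO products (description) VALUES (?)",
                 [("Large Cup",), ("Lg Cup",), ("Small Plate",)])

# Pair every product with every other product; skip pairing a row with itself.
pairs = conn.execute("""
    SELECT p1.description AS product1,
           percent_match(p1.description, p2.description) AS percentmatch,
           p2.description AS product2
    FROM products AS p1
    CROSS JOIN products AS p2
    WHERE p1.id <> p2.id
    ORDER BY percentmatch DESC
""").fetchall()

for product1, percentmatch, product2 in pairs:
    print(f"{product1!r} {percentmatch:.2f}% {product2!r}")
```

You do get each pair twice, as you expected, but the reversed pair can score differently because the percentage is relative to the first string: 'Lg Cup' against 'Large Cup' is 3 edits out of 6 characters, so 50% rather than 66.67%. If you only want each pair once, `p1.id < p2.id` in place of `p1.id <> p2.id` should do it. Keep in mind that a CROSS JOIN compares every row with every other row, so the work grows with the square of the table size.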
Option #2:
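Again only a sketch under the same assumptions (SQLite via Python, a hypothetical `products` table with a `description` column, and a made-up helper `find_similar`): the search value is passed in as a query parameter and compared against every row, keeping only matches above the threshold.

```python
import sqlite3


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def percent_match(first: str, second: str) -> float:
    """Similarity relative to the length of the first string, clamped at 0."""
    if not first:
        return 100.0 if not second else 0.0
    return max(0.0, (1 - levenshtein(first, second) / len(first)) * 100)


def find_similar(conn, search_term, min_percent=50.0):
    """Return (search_term, percentmatch, description) rows above min_percent."""
    return conn.execute("""
        SELECT :term AS search_term,
               percent_match(:term, description) AS percentmatch,
               description
        FROM products
        WHERE percent_match(:term, description) > :min_percent
        ORDER BY percentmatch DESC
    """, {"term": search_term, "min_percent": min_percent}).fetchall()


conn = sqlite3.connect(":memory:")
conn.create_function("percent_match", 2, percent_match)
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, description TEXT)")
conn.executemany("INSERT INTO products (description) VALUES (?)",
                 [("Large Cup",), ("Lg Cup",), ("Small Plate",)])

matches = find_similar(conn, "X-Large Cup")
for search_term, percentmatch, description in matches:
    print(f"{search_term} {percentmatch:.2f}% {description}")
```

With the sample rows this returns 'Large Cup' and 'Lg Cup' and leaves out 'Small Plate', which falls well below 50%. Since the search value is the first argument, the percentage is relative to its length, which is what your example figures imply.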