I'm fairly new to SQL and I'm struggling with a sub query. I've got a table that looks like this:
sss | mm | sid
------------------+----+-----
IBM LTD | | 003
I.B.M. | | 003
A.BM LTD | | 004
IMB LTD | | 004
IMB UK | | 005
IBMUK LTD | | 006
IBMUKLTD | | 007
IBM LIMITED | | 008
IBM U.K. | | 008
IBM U.K. LIMITED | | 009
I.B.M UK LTD | 1 | 001
IBM | 1 | 001
IBM UK | 1 | 001
IBM UK LTD | 1 | 001
The rows with a 1
in the mm
column are rows that have been manually matched and the ID is correct. I want to use the similarity
function to try and find close matches to the records that have already been matched manually.
I know that normally you put a sub-query in the from statement but I can't do that with the similarity
function, but I'm not sure how to do it, this is what I'd like to do:
SELECT
sss,
similarity(sss, select(sss from tusm where mm = 1) as match_score)
from
tusm
where
mm is null and
where match_score >= 0.5
;
You have a few problems here. You can't use column aliases in the where clause, you are trying to assign a column alias on the wrong side of the parenthesis, you can't feed a set to the second argument of
similarity
, and you've just generally mangled the syntax in several places.To overcome the inability of
similarity
to deal with sets, do a self join:giving
If the table gets large, it will be hard to make this perform well.