I'm using SQL Server 2008 R2, and using CONTAINSTABLE ISABOUT() to identify key phrases in document titles. However, one specific key phrase is yielding false positives. They keyphrase is: "p-h-c industry" (without the quotes). Here's some sample code that illustrates the problem:-
CREATE TABLE tmp_test (myidentifier nvarchar(30) NOT NULL, mytext varchar(max));
CREATE UNIQUE CLUSTERED INDEX test_key ON tmp_test(myidentifier);
CREATE FULLTEXT CATALOG tmp_test_cat;
CREATE FULLTEXT INDEX ON tmp_test(mytext) KEY INDEX test_key ON tmp_test_cat;
GO
INSERT tmp_test VALUES ('good case','Western p-h-c industry');
INSERT tmp_test VALUES ('bad case','competitive advertising industry');
GO
SELECT * FROM CONTAINSTABLE (tmp_test, mytext, 'ISABOUT ("p-h-c industry" WEIGHT (.8))') ORDER BY RANK DESC;
go
KEY RANK
------------------------------ -----------
Informational: The full-text search condition contained noise word(s).
good case 15
bad case 15
(2 row(s) affected)
I would expect the SELECT * FROM CONTAINSTABLE() to only return a row for the 'good case' since there is in an exact match. But it's also returning a row for the 'bad case' even though the keyphase "p-h-c industry" does not appear in the bad case text.
Can anyone suggest a fix or workaround? I have to do it using CONTAINSTABLE() since the code architecture is built around that. Thanks