Ignore Dash (-) from Full Text Search (FREETEXTTABLE) search column in SQL Server

1.6k views Asked by At

I use CONTAINSTABLE for my searching algorithm. I want to search column value with ignoring dash in particular column value. for example, column contains '12345-67' then it should search with '1234567' as below query.

SELECT * 
FROM table1 AS FT_Table
INNER JOIN CONTAINSTABLE(table2, columnname, '1234567') AS Key_Table ON FT_Table.ID = Key_Table.[Key]

Is there any way to ignore dash (-) while searching with string that doesn't contain a dash (-)?

1

There are 1 answers

1
Emre Kabaoglu On

I did some digging and spent a few hours time :) Unfortunately, there is no way to perform it. Looks like Sql Server FTS populate the words by breaking words (except whitespaces) also special characters( -, {, ( etc.) But it doesn't populate complete word and my understanding there is no way to provide some population rules for satisfy the need. (I mean, telling to population service, If the word contains "-" replace it with "".) I provided an example for clarify the situation.

Firstly, create table, FTS catalog, Full text index and insert sample row for table.

CREATE TABLE [dbo].[SampleTextData] 
(
  [Id] int identity(1,1) not null,
  [Text] varchar(max) not null,
  CONSTRAINT [PK_SampleTextData] PRIMARY KEY CLUSTERED 
  (
        [Id] ASC
  )
);  
CREATE FULLTEXT CATALOG ftCatalog AS DEFAULT;  
CREATE FULLTEXT INDEX ON SampleTextData
(Text)
KEY INDEX PK_SampleTextData
ON ft 

INSERT INTO [SampleTextData] values ('samp-le text')

Then, provide sample queries;

select * from containstable(SampleTextData,Text,'samp-le') --Success

select * from containstable(SampleTextData,Text,'samp') --Success

select * from containstable(SampleTextData,Text,'le') --Success

select * from containstable(SampleTextData,Text,'sample') -- Fail

These samples are successfully except one 'Samp-le'. For investigating the situtation, execute this query;

SELECT display_term, column_id, document_count
FROM sys.dm_fts_index_keywords (DB_ID('YourDatabase'), OBJECT_ID('SampleTextData')) 

Output :

le          2   1
samp        2   1
samp-le     2   1
text        2   1
END OF FILE 2   1

The query gives us word results which are populated by FTS population service. As you see, the population results contain 'le', 'samp', 'samp-le' but not 'sample'. This is the reason how sample query getting failed.