SQL Server: Why is SELECT on case insensitive column faster than on case sensitive?

1.9k views Asked by At

I use SQL Server 2016 Express and a Java application with JDBC driver version 4.2.

My database has a collation of Latin1_General_CI_AS (case insensitive). My table has a column of type VARCHAR(128) NOT NULL. There is an unique index on that column.

My test scenario is as follows:

After inserting 150000 strings of 48 characters length I do 200 selects of randomly chosen, existing strings. I measure the total execution time of all queries.

Then I drop the index, alter the table to change the columns collation to Latin1_General_CS_AS (case sensitive) and create the unique index again.

Then 200 selects take in total more time.

In both cases (CI and CS) the execution plans are simple and identical (search by using the index).

The query execution time not only depends on case sensitivity. With collation CS it grows faster if the strings have identical prefixes. Here are my results (execution time in seconds):

+----+---------+------------------+-------------------+-------------------+
|    + RND(48) + CONST(3)+RND(45) + CONST(10)+RND(38) + CONST(20)+RND(28) +
+----+---------+------------------+-------------------+-------------------+
| CI +       6 +                6 +                 7 +                 9 +
| CS +      10 +               20 +                45 +                78 +
+----+---------+------------------+-------------------+-------------------+

The longer the identical prefix of the random strings is the more time the case sensitive queries take.

  1. Why is the search on case insensitive column faster than on case sensitive column?
  2. What is the reason for the identical prefix behavior?
1

There are 1 answers

1
Lee James On

The reason is because your SQL installation (I am guessing) was done with CI collation. This means your tempdb and master databases are using CI and currently so is your own database. Therefore, even though you changed your character column to be CS, when it is used in tempdb for sorting/merging operations, that is executed in a CI context. To get an accurate comparison, you need to change your installation collation to be CS or make these comparisons side by side on different SQL instances - one using CS and one using CI.