I have difficulties trying to understand the behavior of PATINDEX when matching the caret symbol "^". I am both using it in it's exclusionary function, and matching it itself. I am trying different combinations of varchar and nvarchar for match pattern and input. The results are not what I expect. I could not find helpful info online.

Sample code, run on SQL Server 2019, collation is SQL_Latin1_General_CP1_CI_AS:

    -- Using patindex to match carets and non-carets confuses me. 
    -- I expect no matches, all results should be zero, 
    -- but only the first field returns a zero, the others get ones/find matches.
    SELECT
       'match_^__v_v'     = PATINDEX(  '^[^^]%',  '^^Z' COLLATE Latin1_GENERAL_CI_AS)
      ,'match_^__v_nv'    = PATINDEX(  '^[^^]%', N'^^Z' COLLATE Latin1_GENERAL_CI_AS)
      ,'match_^__nv_nv'   = PATINDEX( N'^[^^]%', N'^^Z' COLLATE Latin1_GENERAL_CI_AS)
      ,'match_^__nv_v'    = PATINDEX( N'^[^^]%',  '^^Z' COLLATE Latin1_GENERAL_CI_AS)
    
    -- Same issues for this attempt with additional square brackets, 
    -- with the same unexpected results as above:
    SELECT
       'match_[^]__v_v'   = PATINDEX(  '[^][^^]%',  '^^Z' COLLATE Latin1_GENERAL_CI_AS)
      ,'match_[^]__v_nv'  = PATINDEX(  '[^][^^]%', N'^^Z' COLLATE Latin1_GENERAL_CI_AS)
      ,'match_[^]__nv_nv' = PATINDEX( N'[^][^^]%', N'^^Z' COLLATE Latin1_GENERAL_CI_AS)
      ,'match_[^]__nv_v'  = PATINDEX( N'[^][^^]%',  '^^Z' COLLATE Latin1_GENERAL_CI_AS)
    
    -- Similar operation for "normal" letter shows none of the issues. 
    -- All fields return zero, as expected.
    SELECT
       'match_a__v_v'     = PATINDEX(  'a[^a]%',  'aaZ' COLLATE Latin1_GENERAL_CI_AS)
      ,'match_a__v_nv'    = PATINDEX(  'a[^a]%', N'aaZ' COLLATE Latin1_GENERAL_CI_AS)
      ,'match_a__nv_nv'   = PATINDEX( N'a[^a]%', N'aaZ' COLLATE Latin1_GENERAL_CI_AS)
      ,'match_a__nv_v'    = PATINDEX( N'a[^a]%',  'aaZ' COLLATE Latin1_GENERAL_CI_AS)

Questions:

  1. There should be no match, all fields should get zero. Why are the results not as expected?
  2. Why is PATINDEX's behavior different for varchar and nvarchar?
  3. Where exactly should I use varchar and/or nvarchar? Should I simply make match pattern and input the same type?

Clearly this has to do with the special function of the caret symbol. What am I missing?

I added COLLATE Latin1_GENERAL_CI_AS, after learning about the significance of the relationship of collation and data type, and after the provided sql fiddle in the comments. Thanks @Martin-Smith and @Thom-A! This makes all formulas with carets fail the same way.

I found pointers to a workaround here. This gets a zero, as expected. It uses a binary collation for exact matches by ASCII code and a range within the square brackets. The range can be defined with the same character as first and last. The meaning of the square brackets is [not the characters between ^ and ^].

SELECT "match_nv_nv" = PATINDEX( N'^[^^-^]%', N'^^Z' COLLATE Latin1_General_BIN2)
0

There are 0 answers