SQL Server 2008 query to find rows containing non-alphanumeric characters in a column

95.8k views Asked by At

I was actually asked this myself a few weeks ago, whereas I know exactly how to do this with a SP or UDF but I was wondering if there was a quick and easy way of doing this without these methods. I'm assuming that there is and I just can't find it.

A point I need to make is that although we know what characters are allowed (a-z, A-Z, 0-9) we don't want to specify what is not allowed (#@!$ etc...). Also, we want to pull the rows which have the illegal characters so that it can be listed to the user to fix (as we have no control over the input process we can't do anything at that point).

I have looked through SO and Google previously, but was unable to find anything that did what I wanted. I have seen many examples which can tell you if it contains alphanumeric characters, or doesn't, but something that is able to pull out an apostrophe in a sentence I have not found in query form.

Please note also that values can be null or '' (empty) in this varchar column.

4

There are 4 answers

6
beach On BEST ANSWER

Won't this do it?

SELECT * FROM TABLE
WHERE COLUMN_NAME LIKE '%[^a-zA-Z0-9]%'

Setup

use tempdb
create table mytable ( mycol varchar(40) NULL)

insert into mytable VALUES ('abcd')
insert into mytable VALUES ('ABCD')
insert into mytable VALUES ('1234')
insert into mytable VALUES ('efg%^&hji')
insert into mytable VALUES (NULL)
insert into mytable VALUES ('')
insert into mytable VALUES ('apostrophe '' in a sentence') 

SELECT * FROM mytable
WHERE mycol LIKE '%[^a-zA-Z0-9]%'

drop table mytable 

Results

mycol
----------------------------------------
efg%^&hji
apostrophe ' in a sentence
3
Adriaan Stander On

Sql server has very limited Regex support. You can use PATINDEX with something like this

PATINDEX('%[a-zA-Z0-9]%',Col)

Have a look at PATINDEX (Transact-SQL)

and Pattern Matching in Search Conditions

1
Ayme On
SELECT * FROM TABLE_NAME WHERE COL_NAME LIKE '%[^0-9a-zA-Z $@$.$-$''''$,]%'

This works best for me when I'm trying to find any special characters in a string

0
wwmbes On

I found this page with quite a neat solution. What makes it great is that you get an indication of what the character is and where it is. Then it gives a super simple way to fix it (which can be combined and built into a piece of driver code to scale up it's application).

DECLARE @tablename VARCHAR(1000) ='Schema.Table'
DECLARE @columnname VARCHAR(100)='ColumnName'
DECLARE @counter INT = 0
DECLARE @sql VARCHAR(MAX)

WHILE @counter <=255
BEGIN

SET @sql=

'SELECT TOP 10 '+@columnname+','+CAST(@counter AS VARCHAR(3))+' as CharacterSet, CHARINDEX(CHAR('+CAST(@counter AS VARCHAR(3))+'),'+@columnname+') as LocationOfChar
FROM '+@tablename+'
WHERE CHARINDEX(CHAR('+CAST(@counter AS VARCHAR(3))+'),'+@columnname+') <> 0'

PRINT (@sql)
EXEC (@sql)
SET @counter = @counter + 1
END

and then...

UPDATE Schema.Table
SET ColumnName= REPLACE(Columnname,CHAR(13),'')

Credit to Ayman El-Ghazali.