I didn't find this so I thought I would post it. I needed a function to strip out disallowed characters in X12 fields.
Basic character set allowed:
Uppercase letters A B C D E F G H I J K L M N O P Q R S T U V W X Y Z Numeric digits 0 1 2 3 4 5 6 7 8 9 Special characters ! " & ' ( ) * + , - . / : ; ? = The space character
Control characters: BEL HT LF VT FF CR FS GS RS US NL ASCII hex values: 07 09 0A 0B 0C 0D 1C 1D 1E 1F
Extended character set alowed:
Lowercase letters a b c d e f g h i j k l m n o p q r s t u v w x y z Other special chars % ~ @ [ ] _ { } \ | < > National characters # $
Control characters: SOH STX ETX EOT ENQ ACK DC1 DC2 DC3 DC4 NAK SYN ETB ASCII hex values: 01 02 03 04 05 06 11 12 13 14 15 16 17
Some times I want to allow only the Basic character set and some times I want to allow the extended character set depending on the file and the partner. This is the code I came up with:
-- This function is for stripping out non-X12 characters
-- with the option of also stripping out X12 extended characters
-- which are %~@[]_{}\|<>
/*
Declare @text as nvarchar(max) = N'% ~ @ [ ] _ { } \ | < > A B C D E F G H I J K L M N O P Q R S T U V W X Y Z / { } *' + char(27) + char(189) + char(191) + '* 0 1 2 3 4 5 6 7 8 9 ! " & ( ) * + , - . / : ; ? = _ ^ %'
Print @text
Print [dbo].[f_remove_non_x12_chars_extended](@text,'')
Print [dbo].[f_remove_non_x12_chars_extended](@text,'~@_{}\|<>')
Print [dbo].[f_remove_non_x12_chars_extended](@text,'[]~@_{}\|<>')
Print [dbo].[f_remove_non_x12_chars_extended](@text,'[]~@_{}\|<>%^"')
Print [dbo].[f_remove_non_x12_chars_extended](@text,'[]~@_{}\|<>%^-"*')
Print [dbo].[f_remove_non_x12_chars_extended](@text,'[]~@_{}\|<>%^-"* ')
*/
CREATE Function [dbo].[f_remove_non_x12_chars_extended](@text VarChar(MAX), @extraChars varchar(100) = '')
Returns VarChar(1000)
AS
Begin
-- PATINDEX treats the []%_^- charcters as wildcards so we have to treat them differently
Declare @hasOpenBracket as bit = 0
Declare @hasClosedBracket as bit = 0
Declare @hasPercent as bit = 0
Declare @hasUnderscore as bit = 0
Declare @hasCaret as bit = 0
Declare @hasDash as bit = 0
-- If the wildcards are in the extra character set remove them from @extraCharacters and set a flag that they are in there
if CHARINDEX('[', @extraChars) > 1 begin set @hasOpenBracket = 1; set @extraChars = REPLACE(@extraChars, '[', ''); end
if CHARINDEX(']', @extraChars) > 1 begin set @hasClosedBracket = 1;set @extraChars = REPLACE(@extraChars, ']', ''); end
if CHARINDEX('%', @extraChars) > 1 begin set @hasPercent = 1; set @extraChars = REPLACE(@extraChars, '%', ''); end
if CHARINDEX('_', @extraChars) > 1 begin set @hasUnderscore = 1; set @extraChars = REPLACE(@extraChars, '_', ''); end
if CHARINDEX('^', @extraChars) > 1 begin set @hasCaret = 1; set @extraChars = REPLACE(@extraChars, '^', ''); end
if CHARINDEX('-', @extraChars) > 1 begin set @hasDash = 1; set @extraChars = REPLACE(@extraChars, '-', ''); end
Declare @removeValues as varchar(50) = '%[' + CHAR(0) + '-' + CHAR(31) + CHAR(127) + '-' + CHAR(255) + @extraChars + ']%'
Declare @ptr int = PatIndex(@removeValues, @text COLLATE Latin1_General_100_BIN2)
While @ptr > 0
BEGIN
Set @text = Stuff(@text, @ptr, 1, '')
Set @ptr = PatIndex(@removeValues, @text COLLATE Latin1_General_100_BIN2)
END
-- Remove the extra characters if the flag is set
if @hasOpenBracket = 1 begin set @text = replace(@text, '[', '') end
if @hasClosedBracket = 1 begin set @text = replace(@text, ']', '') end
if @hasPercent = 1 begin set @text = replace(@text, '%', '') end
if @hasUnderscore = 1 begin set @text = replace(@text, '_', '') end
if @hasCaret = 1 begin set @text = replace(@text, '^', '') end
if @hasDash = 1 begin set @text = replace(@text, '-', '') end
Return @text
--Return CONCAT(@hasOpenBracket, @hasClosedBracket, @hasPercent, @hasUnderscore, @hasCaret, @extraChars, @text)
End
Just a tip to save some others some time. If you would like to suggest improvements feel free! If you want to be condensending go else where.