I am trying to process a string with CHAR(int) and NCHAR(int) to convert those instances with their ASCII counter-parts. An example would be something like this:
CHAR(124) + (SELECT TOP 1 CAST(name AS VARCHAR(8000)) FROM (SELECT TOP 1 colid, name FROM [Projects]..[syscolumns]
WHERE xtype=char(85)
AND id = OBJECT_ID(NCHAR(69)+NCHAR(78)+NCHAR(95)+NCHAR(69)+NCHAR(109)+NCHAR(112)+NCHAR(108))
Note that I don't want to do anything to VARCHAR(int), and just to the CHAR(int) and NCHAR(int) parts only. The above should translate to:
|(SELECT TOP 1 CAST(name AS VARCHAR(8000)) FROM (SELECT TOP 1 colid, name FROM [Projects]..[syscolumns] WHERE xtype=U AND id = OBJECT_ID(EN_Empl)
Note that any "+" on either side of CHAR(int) or NCHAR(int) should be removed. I tried the the following:
def conv(m):
return chr(int(m.group(2)))
print re.sub(r'([\+ ]?n?char\((.*?)\)[\+ ]?)', conv, str, re.IGNORECASE)
where str
=the raw string that must be processed.
Somehow, the VARCHAR(8000) is being picked up. If I tweak the regex, the "=" after xtype is going away, rather than just the space and the "+" on either side of a CHAR(int) or NCHAR(int) instance.
Hope someone can pull me out of this.
ADDITIONAL SAMPLE STRINGS:
String "char(124)+(Select Top 1 cast(name as varchar(8000)) from (Select Top 1 colid,name From [Projects]..[syscolumns]
Where id = OBJECT_ID(NCHAR(69)+NCHAR(78)+NCHAR(95)+NCHAR(69)+NCHAR(109)+NCHAR(112)+NCHAR(108)))"
Regex: r'(\bn?char\((\d+)\)(?:\s*\+\s*)?)'
Result: "|(Select Top 1 cast(name as varchar(8000)) from (Select Top 1 colid,name From [Projects]..[syscolumns] Where id = OBJECT_ID(ENCHAR(78)+NCHAR(95)+NCHAR(69)+NCHAR(109)+NCHAR(112)+NCHAR(108)))"
You can go a long way just by adding the word boundary (
\b
) assertion, but I'd like to suggest that you (1) usere.VERBOSE
to write a regexp someone can understand later; (2) compile the regexp to reduce clutter at the call site; and, (3) tighten some of the matching criteria. Like so:Note that I changed your
str
todata
:str
is the name of a heavily used builtin function, and it's a Really Bad Idea to create a variable with the same name.