Handling backslashes in plpython

160 views Asked by At
CREATE OR REPLACE FUNCTION CLEAN_STRING(in_str varchar) returns varchar 
AS
$$
def strip_slashes(in_str):

    while in_str.endswith("\\") or in_str.endswith("/"):
            in_str = in_str[:-1]

    in_str = in_str.replace("\\", "/")
    return in_str
clean_str = strip_slashes(in_str)
return clean_str
$$
LANGUAGE plpythonu ;

This gives me IndentationError . However, If I remove backslashes, it works fine. How can I handle backslashes inside plpythonu?

2

There are 2 answers

0
Deepak K M On

I used a below work-around for now.

CREATE OR REPLACE FUNCTION CLEAN_STRING(in_str varchar) returns varchar 
AS
$$
def strip_slashes(in_str):

    while in_str.endswith(chr(92)) or in_str.endswith("/"):
            in_str = in_str[:-1]

    in_str = in_str.replace(chr(92), "/")
    return in_str
clean_str = strip_slashes(in_str)
return clean_str
$$
LANGUAGE plpythonu ;
0
Filipe On

I think it's related to the fact that when inside a string you need to escape special characters like backslashes.

I went trough something similar, I will describe my situation and hope it shed some light to you.

I was writing a greenplum function, and needed to create a SQL command and aply it to a variable, like this:

v_cmd := 'insert into tst_regexp2 (codigo) select cast(t.codigo as  integer) from (select regexp_replace(''([:space:]|\u00a0|)+$'', '''') as codigo from tst_regexp) t;';

But the backlash on "\u00a0" was throwing an annoying warning on the function creation. I tried the normal escaping solutions like duplicating the backlashes "\\u00a0", and nothing solved the issue.

After some research I came accross this special "E" command that informs that all special characters should be escaped when you use it before a string, so the following command did the trick:

v_cmd := 'insert into tst_regexp2 (codigo) select cast(t.codigo as  integer) from (select regexp_replace(''([:space:]|'||E'\u00a0|)+$'', '''') as codigo from tst_regexp) t;';

I hope that helps a little bit.

Finally, if you're writing a function to clean a string from special characters, I recommend you take a look at regexp. The below code I use to clean everything (so far) but numbers from a string, I'm sure you will find an expression that suits you:

regexp_replace(regexp_replace(regexp_replace(upper(codigo), ''[[:alpha:][:blank:][:punct:][:cntrl:]]'', '''', ''g''), ''[ÀÈÌÒÙÁÉÍÓÚÂÔÊÎÛÃÕ]'', '''', ''g''), ''([:space:]|'||E'\u00a0|)+$'', '''') as codigo

Regards