Django - mark only certain part of string / user input as safe?


The user text input is as follows:

Test 'post'. Post at 8:52 on Feb 3rd. /u/username created it.
This <a href="link">link</a> should not be displayed as a link.

I send the user's input through a custom filter when showing it on the template. This is the custom filter:

word_split_re = re.compile(r'(\s+)')

@register.filter
@stringfilter
def customUrlize(value):
    words = word_split_re.split(force_text(value))
    for i, b in enumerate(words):
        if b.startswith('/u/'):
            username = b[3:]
            if re.match("^[A-Za-z0-9_-]*$", username):
                b = "<a href='testLink'>" + b + "</a>"
                words[i] = mark_safe(b)
    return ''.join(words)

As you can see, what I want to do is wrap the words which start with '/u/' (and contain only letters, numbers, underscores and dashes) in an <a> tag. With the current filter, all the code is escaped and it is displayed as:

Test 'post'. Post at 8:52 on Feb 3rd. <a href='testLink'>/u/username</a> created it.
This <a href="link">link</a> should not be displayed as a link.

What I want is for the text to be displayed normally but for /u/username to be a link.
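The escaping described above is most likely a consequence of how Python joins strings: mark_safe() returns a str subclass, but ''.join() always produces a plain str, so the "safe" marker is lost and the template escapes everything. A minimal framework-free sketch of that effect, where SafeString is a hypothetical stand-in for Django's safe string type and not the real class:

```python
class SafeString(str):
    """Hypothetical stand-in for the str subclass that mark_safe() returns."""


# One plain word, one separator, one word that was "marked safe".
words = ["created", " ", SafeString("<a href='testLink'>/u/username</a>")]

joined = ''.join(words)

# str.join builds a brand new plain str; the subclass marker does not survive.
print(type(joined).__name__)
print(isinstance(joined, SafeString))
```

This is why marking individual words as safe has no visible effect once the list is joined.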

If I try doing:

return mark_safe(''.join(words))

then it displays even the <a href="link">link</a> as a link, along with /u/username.

How do I make it so that it only displays /u/username as a link?

Edit: I am using Django 1.5.

In my template, assuming comment is a CharField, I display the comment like so:

{{ comment|customUrlize }}

There are 2 answers

7
AudioBubble On BEST ANSWER

Unless there is some additional formatting in the text that you want to keep, you can just escape the text before altering it, using escape() from django.utils.html. It does the following:

Returns the given text with ampersands, quotes and angle brackets encoded for use in HTML. The input is first passed through force_text() and the output has mark_safe() applied.

From the Django documentation

So this line:

words = word_split_re.split(force_text(value))

Becomes this:

words = word_split_re.split(escape(value))

The complete filter is:

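import re

from django import template
from django.template.defaultfilters import stringfilter
from django.utils.html import escape
from django.utils.safestring import mark_safe

register = template.Library()

word_split_re = re.compile(r'(\s+)')

@register.filter
@stringfilter
def customUrlize(value):
    # Escape the whole input first so any user-supplied HTML is neutralised.
    words = word_split_re.split(escape(value))
    for i, b in enumerate(words):
        if b.startswith('/u/'):
            username = b[3:]
            # Only link tokens made of letters, digits, underscores and dashes.
            if re.match("^[A-Za-z0-9_-]*$", username):
                b = "<a href='testLink'>" + b + "</a>"
                words[i] = mark_safe(b)
    # Everything is now either escaped text or our own markup, so mark it safe.
    return mark_safe(''.join(words))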

And should give (on Django 1.5, where escape() also encodes quotes, as &#39; and &quot;):

Test &#39;post&#39;. Post at 8:52 on Feb 3rd. <a href='testLink'>/u/username</a> created it.
This &lt;a href=&quot;link&quot;&gt;link&lt;/a&gt; should not be displayed as a link.

Which renders as:

Test 'post'. Post at 8:52 on Feb 3rd. /u/username created it. This <a href="link">link</a> should not be displayed as a link.
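The same escape-first idea can be tried outside Django. The sketch below is not the filter itself: it swaps in Python's standard html.escape for Django's escape(), and the names link_mentions and USERNAME_RE are made up for illustration. It also assumes a bare "/u/" with no name should not become a link, so it uses + where the filter uses *.

```python
import html
import re

WORD_SPLIT_RE = re.compile(r'(\s+)')
# Assumption: at least one character is required, so "/u/" alone is not linked.
USERNAME_RE = re.compile(r'[A-Za-z0-9_-]+')


def link_mentions(text):
    """Escape all of text, then wrap /u/<name> tokens in an anchor tag."""
    # Escape first: any markup the user typed becomes inert entities.
    words = WORD_SPLIT_RE.split(html.escape(text))
    for i, word in enumerate(words):
        if word.startswith('/u/') and USERNAME_RE.fullmatch(word[3:]):
            # The token passed a strict whitelist, so wrapping it is safe.
            words[i] = "<a href='testLink'>" + word + "</a>"
    return ''.join(words)


print(link_mentions("Hi /u/username see <b>bold</b>"))
```

Because the whole input is escaped before any markup is added, the only tags left in the result are the ones the function wrote itself.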

0
Lucas Veiga On

Maybe it won't help depending on your needs, but you could just split the string in two and in the template mark only the first one with |safe.

For example:

a = "Test 'post'. Post at 8:52 on Feb 3rd. <a href='testLink'>/u/username</a> created it. This <a href='link'>link</a> should not be displayed as a link."

b = a.split('it.')

Then just pass it to the template as

'string1': b[0]
'string2': b[1]

or whatever, and then {{string1|safe}} <br> {{string2}} in the template.

The output will be as you wanted. Without the "it.", of course. But that's easy to fix.
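One possible way to keep the "it." is str.partition, which returns the separator along with the two halves. This is only a sketch of that fix; the variable names string1 and string2 mirror the template context keys above, and it assumes the separator occurs in the text.

```python
a = ("Test 'post'. Post at 8:52 on Feb 3rd. "
     "<a href='testLink'>/u/username</a> created it. "
     "This <a href='link'>link</a> should not be displayed as a link.")

# partition splits at the first occurrence and hands back the separator too.
head, sep, tail = a.partition('it.')

string1 = head + sep   # the part that would be rendered with |safe
string2 = tail         # the part left to normal autoescaping

print(string1)
print(string2)
```

Note that anything the user typed in the first half is rendered unescaped with this approach, so it only fits cases where that half is trusted.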