Automatically closing tags in form input?

I'm creating a BBcode type of function which takes out all html code from a form input and then converts [b][/b] to actual bold tags, [u] to actual u tags, and [i] to actual i tags.

What concerns me, however, is if whoever writes and submits the input doesn't close all the tags. I don't want that to mess up the entire page when the info is displayed later.

How would you recommend I automatically close all the tags (only b, i, and u are allowed) with the function? Is there a way to count how many [b] and how many [/b] there are and if there's a difference add that many [/b] to the end? Or is there an easier way?

BTW, I haven't tried anything yet because the only thing I can think of is to count how many [b] there are, count how many [/b] there are, get the difference between the two and make a loop that many times adding the closing tag. But I don't know how to do the first part of that (returning how many [b] there are).

If someone is willing to enlighten me on how to do that (I'm a noob I know) I will get right on trying it and let you know how it goes. :)

There are 3 answers

Steini On

There are different possibilities.

  • You can run the content through a library such as "HTMLTidy", which repairs the markup, including unclosed tags.

  • You could also count the opening and closing tags and dynamically append one closing tag to the content for each tag left unclosed. preg_match_all could help you here...

  • Another idea would be to isolate the user-written content in an iframe, so that broken HTML wouldn't affect any elements outside the iframe.
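
The counting idea from the second bullet can be sketched roughly as below. This is only a minimal illustration, not code from any of the answers: the function name closeByCount and its $allowed parameter are made up for the example, and it uses substr_count rather than a regular expression because the tags are fixed strings.

```php
<?php
// Hypothetical helper: for each allowed tag, count openers and closers
// and append one closer per opener that has no counterpart.
function closeByCount($text, array $allowed = array('b', 'i', 'u')) {
    foreach ($allowed as $tag) {
        $opened = substr_count($text, '[' . $tag . ']');
        $closed = substr_count($text, '[/' . $tag . ']');
        if ($opened > $closed) {
            $text .= str_repeat('[/' . $tag . ']', $opened - $closed);
        }
    }
    return $text;
}
```

Note the limitation: counting knows nothing about nesting. closeByCount('[b]bold [i]both') returns '[b]bold [i]both[/b][/i]', so the closers come out in the wrong order, and a surplus closing tag is left untouched. A stack-based approach avoids the ordering problem.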

Armage On

Use a simple array as a stack. Push each allowed opening tag onto it, then "array_pop" it when the matching closing tag comes along. At the end of processing the input text, if the array is not empty, the tags still waiting in it are the ones you need to close.

And please, show us that you tried to find a solution before asking, show us your code :)

EDIT:

Ok, here is a draft (not a polished one). I'm using a FILO (first in, last out) stack to store tags.

The first loop (for) parses the text and stores the unclosed tags. The second loop (foreach) appends the closing tags for whatever is still waiting to the end of the input text.

If an error is found, the code returns false; it should really return more info about the error :)

$text = "[u]hop[u]text[b]bar[/b][/u][b][i]foo";

echo closeTags($text);

function closeTags($text) {
    $tags = array();
    $currentTag = '';
    $tagOn = false;
    $closingTagOn = false;
    $lastPos = 2;

    $len = strlen($text);
    for ($i=0 ; $i < $len ; $i++) {
        // reading tag ?
        if ($tagOn or (!$tagOn and '[' === $text[$i])) {
            $currentTag .= $text[$i];
            $tagOn = true;
        }

        // closing tag ?
        if (isset($currentTag[1]) and '/' === $currentTag[1]) {
            $closingTagOn = true;
            $lastPos = 3;
        }

        // tag ending ?
        if (isset($currentTag[$lastPos])) {
            if (']' !== $currentTag[$lastPos]) {
                return false; // malformed text
            }
            else {
                if ($closingTagOn) {
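                    // quick & dirty: pop only if this closer matches the innermost open tag
                    // (the !empty() guard avoids reading an undefined offset on an empty stack)
                    if (!empty($tags) and $tags[count($tags)-1][1] === $currentTag[2]) {
                        array_pop($tags);
                    }
                    else {
                        // malformed: closer without an opener, or markups crossing over each other
                        return false;
                    }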
                }
                else {
                    // adding the tag
                    $tags[] = $currentTag;
                }

                // re-init
                $currentTag = '';
                $tagOn = false;
                $closingTagOn = false;
                $lastPos = 2;
            }
        }
    }

    $tags = array_reverse($tags);
    foreach($tags as $tag) {
        $text .= '[/' . $tag[1] . ']';
    }
    return $text;
}
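
For comparison, here is a shorter variant of the same stack idea. It is a sketch under assumptions, not a drop-in replacement: the name closeTagsLenient is invented for this example, it only recognises [b], [i] and [u], and instead of returning false it silently skips anything it cannot match (other bracketed text, stray or crossing closers).

```php
<?php
// Hypothetical lenient variant: find the allowed tags with a regex,
// track openers on a stack, and append closers for what remains open.
function closeTagsLenient($text) {
    $stack = array();
    preg_match_all('/\[(\/?)([biu])\]/', $text, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        if ($m[1] === '') {
            $stack[] = $m[2];              // opening tag: remember it
        } elseif (end($stack) === $m[2]) {
            array_pop($stack);             // closer matches the innermost opener
        }
        // any other closer is ignored in this sketch
    }
    // close the remaining tags, innermost first
    while ($stack) {
        $text .= '[/' . array_pop($stack) . ']';
    }
    return $text;
}
```

With the sample input from the draft above, both functions append [/i][/b][/u]. The trade-off is that a closer that matches nothing stays in the text, so it would still need to be stripped before converting to HTML.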
thinkofacard On

Okay, I discovered that PHP has a nifty little "tidy" extension bundled with PHP 5+ (it has to be enabled in your build)! So, this is the function I came up with and it seems to work!

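function bbcode($data) {
    // Strip angle brackets so no raw HTML from the user survives.
    $new = str_replace(array('<', '>'), '', $data);

    // sanitize() is not a PHP built-in; it is assumed to be a helper defined elsewhere.
    $newer = sanitize($new);

    // Convert the three allowed BBCode tags to their HTML equivalents.
    $search  = array('[b]', '[/b]', '[i]', '[/i]', '[u]', '[/u]');
    $replace = array('<b>', '</b>', '<i>', '</i>', '<u>', '</u>');
    $newest = str_replace($search, $replace, $newer);

    $data1 = nl2br($newest);

    // Let tidy close whatever was left open. 'show-body-only' keeps tidy
    // from wrapping the fragment in a complete <html> document.
    $tidy = tidy_parse_string($data1, array('show-body-only' => true));
    $tidy->cleanRepair();

    // Return the repaired markup as a string rather than the tidy object.
    return tidy_get_output($tidy);
}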