How to sanitize form values to allow text-only

I understand that if a user needs to supply HTML code as part of a form input (e.g. in a textarea), I should use an AntiSamy policy to filter out any hazardous HTML that isn't permitted.

However, I have some text fields and textareas which should be text-only. No HTML code at all should be inserted into the DB from these fields.

I am therefore trying to sanitize the inputs so that only raw text is inserted into the database. I believe I can do this in one of two ways:

  1. Use a regular expression to filter out HTML code, e.g. #REReplaceNoCase(FORM.InputField, "[^a-zA-Z\d\s:]", "", "ALL")#
  2. Use a strict text-only AntiSamy policy

Which option is the correct, good-practice way to remove any user-entered HTML code from a text field? Or are there further options available to me?

There is 1 answer

Answer by Tony Junkes (best answer)

While you could use AntiSamy to do it, I don't know how sensible that would be. It kind of defeats the purpose of its flexibility, I think. I'd also be curious about the overhead, even if minimal, of running that as a filter compared with just a regex.

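Personally, I'd probably opt for the regex route in this scenario. Your example strips every character other than letters, digits, whitespace and colons, so it removes the angle brackets (along with all other punctuation) but leaves the tag names and attributes behind as plain text. Is that acceptable in your situation? (Understandable if it was just an example.) Perhaps use something like this instead, which removes whole tags: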

reReplace(string, "<[^>]*>", "", "ALL");
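
To make the difference between the two patterns concrete, here is a minimal cfscript sketch. The helper name stripTags and the sample value are made up for illustration and are not part of the original answer; the final line assumes ColdFusion 10+ or Lucee, where encodeForHTML is available.

```cfml
<cfscript>
// Hypothetical helper: removes anything that looks like an HTML tag.
function stripTags(required string input) {
    // Delete every "<...>" run, then trim leading/trailing whitespace.
    return trim(reReplace(arguments.input, "<[^>]*>", "", "ALL"));
}

// Illustrative sample value standing in for FORM.InputField.
sample = "<b>Hello, world!</b>";

// Tag-stripping pattern from the answer: the tags go, the text stays.
stripped = stripTags(sample);            // "Hello, world!"

// Character whitelist from the question: brackets and punctuation go,
// but the tag names survive as ordinary letters.
whitelisted = reReplaceNoCase(sample, "[^a-zA-Z\d\s:]", "", "ALL");  // "bHello worldb"

// Assumption: still encode when displaying the stored value, because the
// regex only matches complete tags and ignores a "<" with no closing ">".
writeOutput(encodeForHTML(stripped));
</cfscript>
```

The stripped value is what you would insert into the database (through cfqueryparam as usual); encoding at output time is a separate, complementary step rather than a replacement for the filter.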