Differentiate properly escaped HTML metacharacters from improperly escaped ones

I'm working on a replacement for a desktop Java app: a single-page app written in Scala and Lift.

I have a situation where some of the data in the database uses escape sequences properly, such as Unicode escape sequences for accented characters in non-English names. At the same time, other data contains unescaped HTML metacharacters, such as ampersands in the names of people or organizations.

  • Good (don't escape): Universita\u0027
  • Bad (needs escape): Bob & Jim

How do I determine whether or not the data needs to be fixed before I send it to the client?

There are two ways to approach this. One is a function that takes a string and returns the indices of any improperly escaped HTML metacharacters (which I can then fix myself). Alternatively, it could be a function that takes a string and returns a copy with the improperly escaped metacharacters fixed, leaving the properly escaped ones alone.
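Both variants can be sketched with a single regular expression. This is only a rough illustration, under the assumption that "improperly escaped" means a bare `<` or `>`, or an `&` that does not begin something shaped like an entity reference (`&name;`, `&#233;`, `&#xE9;`). The object and method names (`HtmlEscapeSketch`, `findUnescaped`, `fixUnescaped`) are hypothetical and not part of Lift or any other library.

```scala
import scala.util.matching.Regex

object HtmlEscapeSketch {
  // Matches '<', '>', or an '&' that is NOT followed by an entity-shaped
  // token (&name; / &#123; / &#x1F;). The check is purely syntactic:
  // names are not validated against the real HTML entity list.
  private val Unescaped: Regex =
    """[<>]|&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)""".r

  // Maps a matched metacharacter to its entity.
  private def entityFor(ch: String): String = ch match {
    case "&" => "&amp;"
    case "<" => "&lt;"
    case ">" => "&gt;"
  }

  // Variant 1: indices of metacharacters that still need escaping.
  def findUnescaped(s: String): List[Int] =
    Unescaped.findAllMatchIn(s).map(_.start).toList

  // Variant 2: a copy of the string with only those characters escaped.
  def fixUnescaped(s: String): String =
    Unescaped.replaceAllIn(s, m => Regex.quoteReplacement(entityFor(m.matched)))
}
```

With this sketch, `HtmlEscapeSketch.findUnescaped("Bob & Jim")` yields `List(4)` and `HtmlEscapeSketch.fixUnescaped("Bob & Jim")` yields `Bob &amp; Jim`, while a string that already contains `&amp;` or `&#233;` is returned unchanged. A value with no HTML metacharacters at all, such as the `\u0027` example above, also passes through untouched. Because the test is heuristic, literal text that merely looks like an entity (for example `AT&T;`) would be left alone, so the data may still need a manual review.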

There are 0 answers