9 Dec
2003
6:54 a.m.
bulia byak wrote:
On the other hand, why do we have to guard against any control chars at all? If we get a string in UTF-8, it suffices to check that it's valid UTF-8, and complain if it's not. If it is valid, we can insert it without any further checks. Any objections?
Not sure.
One general concern with UTF-8 is security. The RFC on UTF-8 says a little about it:
http://community.roxen.com/developers/idocs/rfc/rfc3629.html (See "Security Considerations")
So it's often a good idea to sanitize input. In this case, where it's direct user input, speed should not even be a concern.
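
To make the distinction concrete, here is a rough sketch of the two checks being discussed: strict UTF-8 validation first, then a separate pass for control characters. It is in Python purely for illustration and is not the project's code; `sanitize_utf8` and `ALLOWED_CONTROLS` are hypothetical names, and which control characters to allow is an assumption that would depend on where the string ends up.

```python
import unicodedata

# Assumption: tab, newline and carriage return are acceptable in user text.
ALLOWED_CONTROLS = {"\t", "\n", "\r"}


def sanitize_utf8(raw: bytes) -> str:
    """Decode raw as strict UTF-8, then reject disallowed control characters."""
    try:
        # Python's strict decoder rejects overlong forms such as C0 80,
        # which is the kind of input the RFC's security section warns about.
        text = raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError(f"invalid UTF-8: {exc}") from exc

    # Valid UTF-8 can still contain control characters (e.g. NUL),
    # so this check is separate from the validity check above.
    for ch in text:
        if unicodedata.category(ch) == "Cc" and ch not in ALLOWED_CONTROLS:
            raise ValueError(f"disallowed control character U+{ord(ch):04X}")
    return text
```

Note that a byte string like `b"abc\x00"` passes the validity check and is only caught by the second pass, which is why checking for valid UTF-8 alone may not be enough.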