[Ur] UTF-8 in xml

Adam Chlipala adamc at impredicative.com
Sun Nov 22 11:33:21 EST 2009


Vladimir Shabanov wrote:
> When I'm writing some utf-8 text inside <xml/> Ur generates escape
> sequences, while it should just leave text as is since Ur generates
> xml in utf-8 encoding (at least it is written at first line of
> html-s).
>
> Could this be fixed?

Support for non-ASCII characters is certainly in my plans, but I don't
feel I know enough right now to implement it securely.  I've read scare
stories about Unicode normalization and the problems it can cause.
In particular, I want to be sure that escaping never leaves behind
characters in CDATA text that browsers will interpret as anything other
than literal text to display.

Does anyone understand the issue well enough to suggest the right way to 
implement this?
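
For concreteness, here is a minimal sketch in Python of one conservative
approach, not a vetted answer and not how Ur does it: decode the input
strictly as UTF-8, escape only the ASCII characters that carry markup
meaning, replace stray control characters, and pass everything else
through unchanged.  The names escape_utf8 and MARKUP are hypothetical,
and the choice of U+FFFD as a replacement is an assumption.

```python
import unicodedata

# ASCII characters that have markup meaning in text and attribute values.
MARKUP = {"&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;"}

def escape_utf8(raw: bytes) -> str:
    """Hypothetical helper: escape markup, keep valid non-ASCII text as is."""
    # Strict decoding raises UnicodeDecodeError on malformed or overlong
    # byte sequences instead of guessing at their meaning.
    text = raw.decode("utf-8")
    out = []
    for ch in text:
        if ch in MARKUP:
            out.append(MARKUP[ch])
        elif ch in "\t\n\r":
            # Ordinary whitespace is passed through.
            out.append(ch)
        elif unicodedata.category(ch) == "Cc":
            # Assumption: other control characters become U+FFFD.
            out.append("\ufffd")
        else:
            # All other characters, including non-ASCII, pass through.
            out.append(ch)
    return "".join(out)
```

The sketch applies no Unicode normalization.  Running compatibility
normalization (NFKC) after escaping would be risky, since it maps
U+FF1C FULLWIDTH LESS-THAN SIGN to a plain "<".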


