[Ur] XHTML character entities (was: File I/O)

Sat Nov 5 15:07:59 EDT 2011

Adam Chlipala wrote:
> Marc Weber wrote:
>> Excerpts from Adam Chlipala's message of Tue Nov 01 14:34:11 +0100 2011:
>>> The error message isn't meant to suggest it's invalid HTML, but merely
>>> that the Ur/Web lexer doesn't support it yet.  I could copy-and-paste a
>>> table of all valid HTML entities into the Ur/Web lexer/parser source
>>> code.  Is that the best way to support all these little shorthands?
>>> (You're not losing expressive power, as far as I know, since the
>>> "&#NNN;" form is already supported.)
>> The best way is to make urweb read and understand xml dtd files or such.
>> (my 2 cents)
>
> I'm looking at an HTML 4 DTD at w3.org, and the text "copy" doesn't 
> appear in it anywhere.  Are these character entities (e.g., 
> "&copy;") really part of the DTD?

I found some files at w3.org that seem to specify this information.  I 
don't know if they're called "DTDs," but they seem to get the job done. :)

The latest Mercurial repo version of Ur/Web now supports all the entities:
     http://hg.impredicative.com/urweb
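
For anyone stuck on an older release, here is a rough sketch of the point made earlier in the thread, that each named entity is just shorthand for a "&#NNN;" reference. It is written in Python rather than Ur/Web and is not how the Ur/Web lexer does it; the function name to_numeric_entities is made up for illustration, and the table comes from Python's standard html.entities module (HTML 4 names only).

```python
import re
from html.entities import name2codepoint  # HTML 4 entity name -> code point

# Matches named references such as &copy; but not numeric ones such as &#169;
_NAMED_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def to_numeric_entities(text: str) -> str:
    """Rewrite known named entities as decimal "&#NNN;" references.

    Hypothetical helper for illustration; unknown names are left untouched.
    """
    def replace(match: re.Match) -> str:
        codepoint = name2codepoint.get(match.group(1))
        if codepoint is None:
            return match.group(0)  # not in the HTML 4 table, keep as written
        return f"&#{codepoint};"

    return _NAMED_ENTITY.sub(replace, text)


if __name__ == "__main__":
    print(to_numeric_entities("&copy; 2011 caf&eacute;"))
```

Running something like this over a source file before compiling would be one way to get by with only the numeric form.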

James, could you verify that this feature now works as you expect?  Thanks!