<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">

  <title></title>

</head>

<body bgcolor="#ffffff" text="#000000">

On 10/12/2012 08:46 AM, Alexei Golovko wrote:

<blockquote cite="mid:764171350045978@web30d.yandex.ru" type="cite">

  <div>Yes, they are not legal XML; they are parts corresponding to the

user's selection.</div>

  <div>I need the following:</div>

  <div>1) to get the length of a fragment (to store a fragment's position as

the sum of the lengths of all preceding fragments)</div>

  <div>2) to compare fragments (hence attributes should be ordered in some

way, whitespace inside tags normalised, etc.)</div>

  <div>3) to check that a string forms a valid fragment (no "partial-tag

/>")</div>

  <div> </div>

  <div>An additional bonus would be to be sure that one fragment may be safely

(w.r.t. a fixed XML schema) replaced by another (suggested by the application

user).</div>

  <div> </div>

  <div>Currently, I treat the length of a tag (open: &lt;em attr1="value1"

... &gt;, or closed: &lt;/em&gt;) as 1 (because a tag is atomic for

selection and because it looks more natural where I use the DOM API).</div>

</blockquote>

<br>

OK, then I'd expect to parse fragments into a simple tree datatype.<br>

<br>

Maybe if you point us to an example of your parsing code, I can give

some advice on making it faster.  (Linear-time/space parsing of strings

in pure Ur code should be pretty easy, if you use the right standard

library functions.)<br>
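<br>
A minimal sketch of the tokenizing idea, written in Python rather than Ur purely for illustration. All names here (Text, Open, Close, tokenize, length) are hypothetical and not part of any Ur/Web library. It keeps a flat token list instead of a tree, on the assumption that a selection may contain unmatched tags. It also assumes double-quoted attribute values with no "&gt;" inside them; self-closing tags, comments, and entities are not handled.<br>
<pre>
```python
import re
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Text:
    value: str


@dataclass(frozen=True)
class Open:
    name: str
    attrs: Tuple[Tuple[str, str], ...]  # kept sorted by attribute name


@dataclass(frozen=True)
class Close:
    name: str


Token = Union[Text, Open, Close]

NAME = re.compile(r"[\w:-]+")
ATTR = re.compile(r'\s*([\w:-]+)\s*=\s*"([^"]*)"')  # assumption: double quotes only


def tokenize(fragment: str) -> List[Token]:
    """Split a fragment into tokens in one left-to-right pass.

    Raises ValueError when the string starts or ends inside a tag.
    """
    tokens: List[Token] = []
    i = 0
    while i < len(fragment):
        if fragment[i] != "<":
            # Text runs up to the next tag start (or the end of the string).
            j = fragment.find("<", i)
            if j == -1:
                j = len(fragment)
            chunk = fragment[i:j]
            if ">" in chunk:
                raise ValueError("partial tag: '>' without a matching '<'")
            tokens.append(Text(chunk))
            i = j
            continue
        j = fragment.find(">", i)
        if j == -1:
            raise ValueError("partial tag: '<' without a matching '>'")
        body = fragment[i + 1:j].strip()
        if body.startswith("/"):
            name = body[1:].strip()
            if not NAME.fullmatch(name):
                raise ValueError("malformed closing tag")
            tokens.append(Close(name))
        else:
            parts = body.split(None, 1)
            rest = parts[1] if len(parts) == 2 else ""
            # Anything left over after removing well-formed attributes is an error.
            if not parts or not NAME.fullmatch(parts[0]) or ATTR.sub("", rest).strip():
                raise ValueError("malformed opening tag")
            # Sorting the attributes normalises their order for comparison.
            tokens.append(Open(parts[0], tuple(sorted(ATTR.findall(rest)))))
        i = j + 1
    return tokens


def length(tokens: List[Token]) -> int:
    """Each tag counts as 1; text counts one per character."""
    return sum(len(t.value) if isinstance(t, Text) else 1 for t in tokens)
```
</pre>
With these definitions, the fragment "bla-bla&lt;/em&gt; baz-baz-&lt;strong&gt;baz&lt;/strong&gt;" tokenizes to six tokens of total length 22 (each tag counts as 1), and two fragments can be compared with == on their token lists regardless of attribute order or whitespace inside tags.<br>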

<br>

<blockquote cite="mid:764171350045978@web30d.yandex.ru" type="cite">

  <div>  12.10.2012, 03:09, "Adam Chlipala" <a class="moz-txt-link-rfc2396E" href="mailto:adamc@csail.mit.edu">&lt;adamc@csail.mit.edu&gt;</a>:</div>

  <blockquote type="cite">On 10/10/2012 01:22 PM, Alexei Golovko wrote:

    <blockquote cite="mid:162981349889763@web17g.yandex.ru" type="cite">

      <div>What is the best way to parse XML on the client side? More

precisely, I need to process not only full XML data, but also

fragments like <em>"bla-bla&lt;/em&gt;

baz-baz-&lt;strong&gt;baz&lt;/strong&gt;"</em> with bounds in the text

nodes (that is, not inside a tag, as in <em>"end-of-tag-name&gt; text"</em>).</div>

      <div> </div>

      <div>I have some (quick and dirty) parsec-like combinators, but

they are buggy and too slow.</div>

    </blockquote>

    <br>

So you want fragments that are not legal XML on their own?  Well, which

type do you want to target with your translation?</blockquote>

  <div> </div>

  <div>Thanks.</div>

  <div>Fragments are not HTML, so the first does not solve the problem.</div>

  <div>The second, I thought, doesn't work on the client side, does it?</div>

</blockquote>

<br>

Right; the feed library is server-side.<br>

<br>

<blockquote cite="mid:764171350045978@web30d.yandex.ru" type="cite">

  <blockquote type="cite">Two bits of related library code:<br>

- A basic &amp; configurable HTML parser (only does legal fragments,

though): <a moz-do-not-send="true"

 href="http://hg.impredicative.com/meta/file/7530b2b54353/html.urs">http://hg.impredicative.com/meta/file/7530b2b54353/html.urs</a><br>

- The XML feed processing library: <a moz-do-not-send="true"

 href="http://hg.impredicative.com/feed">http://hg.impredicative.com/feed</a><br>

  </blockquote>

</blockquote>

</body>

</html>