[libxml-devel] Strange node being created from white space

Charlie Savage cfis at savagexi.com
Mon Feb 2 02:08:07 EST 2009


> Regarding the use of NOBLANKS, I am actually glad it is not required 
> (and even discouraged). I only tried it once the initial code did not 
> function as I expected. I cannot think of a scenario where you do want a 
> single "\n" to become a separate node, so I think it should be default 
> behaviour to not make it so.
> 
> Do you have any other suggestions what might be wrong?

As I read David's post (see link I posted), there is nothing wrong at 
all.  This is the way an xml parser is supposed to work and it's up to 
you as the developer to know what to do with the blank nodes.  Quoting:

------
There is no way to know for sure whether a blank node is significant
or not. The standard says *always keep them*, the XML_PARSE_NOBLANKS 
options still tries to do some guesses, but since there is no proven 
algorithm which works (c.f. decades of SGML experience) this is an 
heuristic, and there is no guaranteed result. Don't expect a reliable 
behaviour from this option, do not use it, only the application can tell 
if a blank node is significant or not.
-------------

I'd say don't worry about it, and use xpath to grab the elements you 
need out of the document.
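
To illustrate, here is a minimal sketch of that approach. It assumes REXML 
from the standard Ruby distribution so it runs without extra gems (in 
libxml-ruby, Document#find takes an XPath expression in a similar way), and 
the sample document is made up for the example:

```ruby
require 'rexml/document'

# Hypothetical sample document. The newlines and indentation between the
# elements are kept by the parser as whitespace-only text nodes.
xml = "<list>\n  <item>one</item>\n  <item>two</item>\n</list>"

doc = REXML::Document.new(xml)

# Walking the children directly sees the whitespace text nodes as well
# as the two <item> elements.
child_count = doc.root.children.size

# An XPath query returns only the elements asked for, so the blank text
# nodes never get in the way.
items = REXML::XPath.match(doc, '//item').map(&:text)

puts "children of root: #{child_count}"
puts "items found by xpath: #{items.inspect}"
```

The point is that the application decides which nodes matter by asking for 
them explicitly, rather than relying on the parser to guess which blanks 
to drop.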

Is this on Windows, by the way?  It could be a CRLF issue (search Google; 
someone had an issue with that a few years ago).

Charlie