From danj at 3skel.com Thu Nov 1 16:44:58 2007
From: danj at 3skel.com (Dan Janowski)
Date: Thu, 1 Nov 2007 16:44:58 -0400
Subject: [libxml-devel] xpath searching without specifying namespace?
In-Reply-To:
References:
Message-ID: <68EF391F-3D09-4B98-A4FC-D443D0AB77E1@3skel.com>

Hi,

The namespace code within xpath.find is rather oddly written and struck
me as such at the time I was re-writing xpath.find. I had to leave it
alone, as scope expansion during such an operation is hazardous. Now
that you are having trouble with it, maybe I can figure out what it is
supposed to do (and is clearly not).

To do this, please send a sample xml document and ruby code using
libxml that illustrates the problem clearly and (hopefully) compactly
and indicate what you are expecting but not getting and I will peer
into the fog of namespaces (a component I have not updated).

Dan

On Oct 26, 2007, at 12:26, mortee wrote:

>
> Hi,
>
> I just wanted to give libxml a go, because it seemed quite a bit faster
> than my current solution.
>
> The docs for XML::XPath.find suggest that if I omit the namespace
> specification, then "matching nodes from any namespace will be
> included".
>
> My problem is that when I fed it an XML file with a DOCTYPE declaration
> and a namespace spec on the root node, no matter what element name I was
> trying to search for, it returned an empty result set.
>
> I could work around the problem by removing *both* the DOCTYPE and the
> NS attribute from the root node; in this case simple unqualified element
> names were found.
>
> I could also have it to find what I'm looking for by specifying a
> "prefix:ns_uri" as the second parameter to #find, and have any element
> names in my xpath be qualified by the specified prefix. This however is
> an overkill for me, as the whole document has one single namespace, so I
> really don't want any NS-based filtering on my xpath query.
>
> So the question is, why doesn't it find unqualified element names when I
> omit the second parameter from #find, and how could I achieve it?
>
> thx
> mortee
>
> _______________________________________________
> libxml-devel mailing list
> libxml-devel at rubyforge.org
> http://rubyforge.org/mailman/listinfo/libxml-devel

From mortee.lists at kavemalna.hu Sun Nov 4 04:16:03 2007
From: mortee.lists at kavemalna.hu (mortee)
Date: Sun, 04 Nov 2007 10:16:03 +0100
Subject: [libxml-devel] xpath searching without specifying namespace?
In-Reply-To: <68EF391F-3D09-4B98-A4FC-D443D0AB77E1@3skel.com>
References: <68EF391F-3D09-4B98-A4FC-D443D0AB77E1@3skel.com>
Message-ID:

Dan Janowski wrote:
> To do this, please send a sample xml document and ruby code using
> libxml that illustrates the problem clearly and (hopefully) compactly
> and indicate what you are expecting but not getting and I will peer
> into the fog of namespaces (a component I have not updated).

OK, here it is. I have attached two example XML files - one with a
namespace declared on the root node (test-ns.xml), and one without
(test-nons.xml). This is the only difference between them.

Here's what I get for them. Everything's just as expected, except that
the first find should return 2 as well.

$ irb
irb(main):001:0> require 'xml/libxml'
=> true
irb(main):002:0> XML::Document.file('test-ns.xml').find('nodes/subnode').size
=> 0
irb(main):003:0> XML::Document.file('test-nons.xml').find('nodes/subnode').size
=> 2
irb(main):004:0> XML::Document.file('test-ns.xml').find('myns:nodes/myns:subnode', 'myns:http://some-ns.com/').size
=> 2

The docs here say that if I don't specify a namespace on the search,
then any NS may be matched:
http://web.archive.org/web/20070208211112/http://libxml.rubyforge.org/doc/classes/XML/XPath.html

As a side note, the API Docs link is broken on the new front page,
that's why I included this cached copy.

mortee
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test-nons.xml
Type: text/xml
Size: 244 bytes
Desc: not available
Url : http://rubyforge.org/pipermail/libxml-devel/attachments/20071104/f2307e95/attachment.xml
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test-ns.xml
Type: text/xml
Size: 286 bytes
Desc: not available
Url : http://rubyforge.org/pipermail/libxml-devel/attachments/20071104/f2307e95/attachment-0001.xml

From transfire at gmail.com Sun Nov 4 13:48:38 2007
From: transfire at gmail.com (Trans)
Date: Sun, 4 Nov 2007 13:48:38 -0500
Subject: [libxml-devel] xpath searching without specifying namespace?
In-Reply-To:
References: <68EF391F-3D09-4B98-A4FC-D443D0AB77E1@3skel.com>
Message-ID: <4b6f054f0711041048n560ca74anb092b01f3e2dd698@mail.gmail.com>

On Nov 4, 2007 4:16 AM, mortee wrote:
> As a side note, the API Docs link is broken on the new front page,
> that's why I included this cached copy.

We need to merge to trunk so we can generate new rdocs. It might be
easier to replace trunk and I'll just copy the stuff I added from
the old trunk to the new one. That might save a bunch of merging
headache at this point.

Dan, let me know which.

T.

From erik at hollensbe.org Sun Nov 4 13:59:41 2007
From: erik at hollensbe.org (Erik Hollensbe)
Date: Sun, 4 Nov 2007 10:59:41 -0800
Subject: [libxml-devel] API docs (was: xpath searching without specifying namespace?)
In-Reply-To: <4b6f054f0711041048n560ca74anb092b01f3e2dd698@mail.gmail.com>
References: <68EF391F-3D09-4B98-A4FC-D443D0AB77E1@3skel.com> <4b6f054f0711041048n560ca74anb092b01f3e2dd698@mail.gmail.com>
Message-ID: <83455E52-D40B-4C26-B1B4-AC6A845DF3FF@hollensbe.org>

On Nov 4, 2007, at 10:48 AM, Trans wrote:

> On Nov 4, 2007 4:16 AM, mortee wrote:
>
>> As a side note, the API Docs link is broken on the new front page,
>> that's why I included this cached copy.
>
> We need to merge to trunk so we can generate new rdocs. It might be
> easier to replace trunk and I'll just copy the stuff I added from
> the old trunk to the new one. That might save a bunch of merging
> headache at this point.

I don't know if this has already been discussed or not (my apologies
for being too lazy to read the archives), but is there any chance of
retaining backwards compat with the API in 0.3? A lot of us are
depending on that functionality.

-Erik

From anelson at apocryph.org Sun Nov 4 16:27:05 2007
From: anelson at apocryph.org (Adam Nelson)
Date: Sun, 4 Nov 2007 17:27:05 -0400
Subject: [libxml-devel] "[BUG] XmlNode Doc is not bound! (ruby_xml_node.c:1270)" when using Reader.expand
Message-ID:

I'm using libxml-ruby 0.5.2 on Ubuntu 7.10 x64 with Ruby 1.8.6. I'm
trying to use the XML::Reader method expand to extract a full
XML::Node object when I find one my program is interested in. This
works OK except when Ruby tries to garbage collect the Node object
returned by Reader.expand. I get the following error:

[BUG] XmlNode Doc is not bound! (ruby_xml_node.c:1270)

I was able to reproduce this using the 'simple.xml' file in the
libxml-ruby test suite. Run the following code:

#!/usr/bin/env ruby
require 'xml/libxml'
rdr = XML::Reader.file("simple.xml")
rdr.read
rdr.expand
GC.start
rdr.close

The GC.start line will output the BUG message above then do a core
dump.

The message from ruby_xml_node.c line 1270 comes from the
ruby_xml_node_mark_common function. Apparently the libxml xmlNode
object has a document associated with it, but that document doesn't
have a Ruby wrapper object associated with it.

I'm afraid Linux isn't my native development platform so I don't know
what if any additional debugging steps I can take, so I hope the code
above demonstrates the problem.

Thanks,
Adam

From danj at 3skel.com Sun Nov 4 20:07:45 2007
From: danj at 3skel.com (Dan Janowski)
Date: Sun, 4 Nov 2007 20:07:45 -0500
Subject: [libxml-devel] API docs (was: xpath searching without specifying namespace?)
In-Reply-To: <83455E52-D40B-4C26-B1B4-AC6A845DF3FF@hollensbe.org>
References: <68EF391F-3D09-4B98-A4FC-D443D0AB77E1@3skel.com> <4b6f054f0711041048n560ca74anb092b01f3e2dd698@mail.gmail.com> <83455E52-D40B-4C26-B1B4-AC6A845DF3FF@hollensbe.org>
Message-ID:

I am only aware of one instance ([] operator missing on XPath::Object)
that is an interface change. All the other changes are intrinsically
necessary to fix the bugs in the implementation. What are you dependent
on that has changed?

Dan

From danj at 3skel.com Sun Nov 4 20:11:38 2007
From: danj at 3skel.com (Dan Janowski)
Date: Sun, 4 Nov 2007 20:11:38 -0500
Subject: [libxml-devel] "[BUG] XmlNode Doc is not bound! (ruby_xml_node.c:1270)" when using Reader.expand
In-Reply-To:
References:
Message-ID:

Reader is a feature yet to be updated. The [BUG] is working as
intended, mostly because I was unsure if anything could do what you
have done. I have not used expand, so I will have to look at it. In
all likelihood it should be an easy fix.

Dan

From erik at hollensbe.org Mon Nov 5 02:35:22 2007
From: erik at hollensbe.org (Erik Hollensbe)
Date: Sun, 4 Nov 2007 23:35:22 -0800
Subject: [libxml-devel] API docs (was: xpath searching without specifying namespace?)
In-Reply-To:
References: <68EF391F-3D09-4B98-A4FC-D443D0AB77E1@3skel.com> <4b6f054f0711041048n560ca74anb092b01f3e2dd698@mail.gmail.com> <83455E52-D40B-4C26-B1B4-AC6A845DF3FF@hollensbe.org>
Message-ID: <64082F92-95BE-49F8-B36C-63F4BAB48B96@hollensbe.org>

On Nov 4, 2007, at 5:07 PM, Dan Janowski wrote:

> I am only aware of one instance ([] operator missing on
> XPath::Object) that is an interface change. All the other changes are
> intrinsically necessary to fix the bugs in the implementation. What
> are you dependent on that has changed?

Heh, that. :)

That's good to know -- I got an error on that and basically decided
that development wasn't worth mucking with yet if there were going to
be heavy API changes. I will work around it and attempt to make
libxml-xmlrpc portable across both the stable and development branch.

Thanks,

-Erik

From tony.primerano at gmail.com Wed Nov 7 13:23:20 2007
From: tony.primerano at gmail.com (Tony Primerano)
Date: Wed, 7 Nov 2007 13:23:20 -0500
Subject: [libxml-devel] Where are the API docs
Message-ID: <600555f80711071023i525324f1v908aac7079f657b4@mail.gmail.com>

http://libxml.rubyforge.org/ points to
http://libxml.rubyforge.org/rdoc/ but that gives a 404.

Google has several pages as http://libxml.rubyforge.org/doc/ but that
is gone now too.

Are there alternate sites that have the documentation or any good
examples?

From transfire at gmail.com Wed Nov 7 13:31:09 2007
From: transfire at gmail.com (Trans)
Date: Wed, 7 Nov 2007 13:31:09 -0500
Subject: [libxml-devel] Where are the API docs
In-Reply-To: <600555f80711071023i525324f1v908aac7079f657b4@mail.gmail.com>
References: <600555f80711071023i525324f1v908aac7079f657b4@mail.gmail.com>
Message-ID: <4b6f054f0711071031l1332185er9b0bc6e623a5666b@mail.gmail.com>

On Nov 7, 2007 1:23 PM, Tony Primerano wrote:
> http://libxml.rubyforge.org/ points to
> http://libxml.rubyforge.org/rdoc/ but that gives a 404

I will generate some, but I can't promise you they will be very helpful.
We really need a documentation wiki.

T.

From mortee.lists at kavemalna.hu Wed Nov 7 13:49:47 2007
From: mortee.lists at kavemalna.hu (mortee)
Date: Wed, 07 Nov 2007 19:49:47 +0100
Subject: [libxml-devel] weird delay
Message-ID:

I just set out to do some simple measurements to see how fast libxml may
be compared to hpricot.

I made a little script with a ~4 megs XML document appended after __END__.

$ uname -s
CYGWIN_NT-5.1
$ gem list libxml

*** LOCAL GEMS ***

libxml-ruby (0.5.2.0)
    LibXML2 bindings for Ruby
$ head -18 ./xml-bm2.rb
#!/usr/bin/env ruby
require 'benchmark'
require 'hpricot'
require 'xml/libxml'

xml = DATA.read

Benchmark.bmbm { |b|
  b.report('hpricot') do
    Hpricot::XML(xml).search('data').each{}
  end
  b.report('libxml') do
    XML::Parser.string(xml).parse.find('//data').each{}
  end
}

__END__

$ ./xml-bm2.rb
Rehearsal -------------------------------------------
hpricot  14.407000   0.157000  14.564000 ( 15.686000)
libxml    0.796000   0.093000   0.889000 ( 42.462000)
--------------------------------- total: 15.453000sec

              user     system      total        real
hpricot  13.797000   0.000000  13.797000 ( 15.578000)
libxml    0.859000   0.016000   0.875000 ( 41.091000)

As you can see, hpricot has finished with parsing the XML previously
loaded into memory about three times faster in real time than libxml.
Also the other figures for libxml are pretty interesting. To this comes
the fact that while hpricot processes the document, my CPU maxes out all
the way through - however during the libxml phase, it's virtually idle.

Does anyone have any clue as to why this may happen, and how to have
libxml live up to its potential?...

thx
mortee

From transfire at gmail.com Wed Nov 7 14:35:28 2007
From: transfire at gmail.com (Trans)
Date: Wed, 7 Nov 2007 14:35:28 -0500
Subject: [libxml-devel] weird delay
In-Reply-To:
References:
Message-ID: <4b6f054f0711071135r5863176cwfc53885f2233255e@mail.gmail.com>

On Nov 7, 2007 1:49 PM, mortee wrote:
> I just set out to do some simple measurements to see how fast libxml may
> be compared to hpricot.
>
> I made a little script with a ~4 megs XML document appended after __END__.
>
> $ uname -s
> CYGWIN_NT-5.1
> $ gem list libxml
>
> *** LOCAL GEMS ***
>
> libxml-ruby (0.5.2.0)
>     LibXML2 bindings for Ruby
> $ head -18 ./xml-bm2.rb
> #!/usr/bin/env ruby
> require 'benchmark'
> require 'hpricot'
> require 'xml/libxml'
>
> xml = DATA.read
>
> Benchmark.bmbm { |b|
>   b.report('hpricot') do
>     Hpricot::XML(xml).search('data').each{}
>   end
>   b.report('libxml') do
>     XML::Parser.string(xml).parse.find('//data').each{}
>   end
> }
>
> __END__
>
> $ ./xml-bm2.rb
> Rehearsal -------------------------------------------
> hpricot  14.407000   0.157000  14.564000 ( 15.686000)
> libxml    0.796000   0.093000   0.889000 ( 42.462000)
> --------------------------------- total: 15.453000sec
>
>               user     system      total        real
> hpricot  13.797000   0.000000  13.797000 ( 15.578000)
> libxml    0.859000   0.016000   0.875000 ( 41.091000)
>
> As you can see, hpricot has finished with parsing the XML previously
> loaded into memory about three times faster in real time than libxml.
> Also the other figures for libxml are pretty interesting. To this comes
> the fact that while hpricot processes the document, my CPU maxes out all
> the way through - however during the libxml phase, it's virtually idle.
>
> Does anyone have any clue as to why this may happen, and how to have
> libxml live up to its potential?...

I've seen this kind of thing before too. I was getting faster speeds
out of REXML! libxml is fast, but something in the Ruby binding is
putting the kibosh on it. We really need a performance analysis run.
How hard is it to profile your code? Have you tried profile.rb?

I would like to get a release of Dan's great work out soon. But maybe
we should address this issue first.

T.

From danj at 3skel.com Wed Nov 7 17:04:10 2007
From: danj at 3skel.com (Dan Janowski)
Date: Wed, 7 Nov 2007 17:04:10 -0500
Subject: [libxml-devel] weird delay
In-Reply-To:
References:
Message-ID: <843BA934-CCF7-4798-A3D7-4F6685E58C9F@3skel.com>

The timing info is curious because libxml does the job with very
little CPU time; it is the real-time delay that is the problem.

On identifying the location of the delay (since it is wall-clock time
we are talking about), it should be sufficient to do splits between
each method call (could just be t1=Time.now, etc) to identify where
the delay is.

Dan

From cfis at savagexi.com Wed Nov 7 17:05:39 2007
From: cfis at savagexi.com (Charlie Savage)
Date: Wed, 07 Nov 2007 15:05:39 -0700
Subject: [libxml-devel] weird delay
In-Reply-To: <843BA934-CCF7-4798-A3D7-4F6685E58C9F@3skel.com>
References: <843BA934-CCF7-4798-A3D7-4F6685E58C9F@3skel.com>
Message-ID: <473236B3.7000007@savagexi.com>

> The timing info is curious because libxml does the job with very
> little cpu time, it is the real-time delay that is the problem.
>
> On identifying the location of the delay (since it is wall-clock time
> we are talking about), it should be sufficient to do splits between
> each method call (could just be t1=Time.now, etc) to identify where
> the delay is.

Or just run it in ruby-prof and you'll see what each method is doing...

Charlie

From marc at bloodnok.com Thu Nov 8 18:55:04 2007
From: marc at bloodnok.com (Marc Munro)
Date: Thu, 08 Nov 2007 15:55:04 -0800
Subject: [libxml-devel] Another segfault
In-Reply-To: <843BA934-CCF7-4798-A3D7-4F6685E58C9F@3skel.com>
References: <843BA934-CCF7-4798-A3D7-4F6685E58C9F@3skel.com>
Message-ID: <1194566104.25099.6.camel@bloodnok.com>

Dan,
I have managed to capture another segfault in MEM2 release 0.5.2.

Here is a gdb session. Let me know what else I can do to help track
this down.

marc:[skit]$ gdb ruby
GNU gdb 6.6.90.20070912-debian
Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu"...
(no debugging symbols found)
Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) run ./skit --diff tests/REGRESS_SRC tests/REGRESS_SRC2 --generate
Starting program: /usr/bin/ruby ./skit --diff tests/REGRESS_SRC tests/REGRESS_SRC2 --generate
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
[Thread debugging using libthread_db enabled]
[New Thread 0xb7cde8c0 (LWP 3549)]
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb7cde8c0 (LWP 3549)]
0xb7c50ea2 in ruby_xml_xpath_object_first (self=3080810320) at ruby_xml_xpath_object.c:165
165         switch(xpop->nodesetval->nodeTab[0]->type) {
(gdb) bt
#0  0xb7c50ea2 in ruby_xml_xpath_object_first (self=3080810320) at ruby_xml_xpath_object.c:165
#1  0xb7ebd145 in ?? () from /usr/lib/libruby1.8.so.1.8
#2  0xb7a16f50 in ?? ()
#3  0xbfa6ef20 in ?? ()
#4  0xb7a17b94 in ?? ()
#5  0xb7ebd13a in ?? () from /usr/lib/libruby1.8.so.1.8
#6  0xb7ca85fc in ?? ()
#7  0xb7a1842c in ?? ()
#8  0xb7a18454 in ?? ()
#9  0xb7cdb7e0 in ?? ()
#10 0x00000000 in ?? ()
(gdb)

__
Marc

From marc at bloodnok.com Thu Nov 8 18:57:49 2007
From: marc at bloodnok.com (Marc Munro)
Date: Thu, 08 Nov 2007 15:57:49 -0800
Subject: [libxml-devel] Another segfault
In-Reply-To: <1194566104.25099.6.camel@bloodnok.com>
References: <843BA934-CCF7-4798-A3D7-4F6685E58C9F@3skel.com> <1194566104.25099.6.camel@bloodnok.com>
Message-ID: <1194566269.25099.8.camel@bloodnok.com>

FYI, the xpath expression which seems to trigger this is "'1'='1'".

__
Marc

From transfire at gmail.com Tue Nov 13 11:49:24 2007
From: transfire at gmail.com (Trans)
Date: Tue, 13 Nov 2007 16:49:24 -0000
Subject: [libxml-devel] Sync Trunk to MEM2
Message-ID: <1194972564.174922.152490@d55g2000hsg.googlegroups.com>

Dan,

I want to bring MEM2 over to trunk. I know you believe in working in
branches. That's good of course, but I'm not sure we have enough
developers (uh... you) for it to matter though :-) In any case we need
to sync up trunk to MEM2. How should we go about it?

T.

From danj at 3skel.com Tue Nov 13 17:09:05 2007
From: danj at 3skel.com (Dan Janowski)
Date: Tue, 13 Nov 2007 17:09:05 -0500
Subject: [libxml-devel] Sync Trunk to MEM2
In-Reply-To: <1194972564.174922.152490@d55g2000hsg.googlegroups.com>
References: <1194972564.174922.152490@d55g2000hsg.googlegroups.com>
Message-ID:

I meant to work on this over the weekend, but I have been a bit under
the weather. Let me look at it to figure out the right way. I know you
suggested to just move trunk to something else and replace it with
MEM2. That may be the best way: since trunk is not required by svn,
just a convention, there should be no problem in relocating it to
something else.

Dan

From danj at 3skel.com Wed Nov 14 02:55:16 2007
From: danj at 3skel.com (Dan Janowski)
Date: Wed, 14 Nov 2007 02:55:16 -0500
Subject: [libxml-devel] Sync Trunk to MEM2
In-Reply-To: <1194972564.174922.152490@d55g2000hsg.googlegroups.com>
References: <1194972564.174922.152490@d55g2000hsg.googlegroups.com>
Message-ID: <6A475CEB-4375-470D-A346-D6DBACBDA84B@3skel.com>

T.

MEM2 is now fully merged into trunk by an actual svn merge. I can't
say I am overly impressed with svn's merge, mercurial has it better.

I will work in the trunk going forward unless I have to work on
something dangerous.

Dan

From danj at 3skel.com Wed Nov 14 03:37:55 2007
From: danj at 3skel.com (Dan Janowski)
Date: Wed, 14 Nov 2007 03:37:55 -0500
Subject: [libxml-devel] weird delay
In-Reply-To:
References:
Message-ID:

Did you have a location for the delay?

From danj at 3skel.com Wed Nov 14 03:37:21 2007
From: danj at 3skel.com (Dan Janowski)
Date: Wed, 14 Nov 2007 03:37:21 -0500
Subject: [libxml-devel] API docs (was: xpath searching without specifying namespace?)
In-Reply-To: <64082F92-95BE-49F8-B36C-63F4BAB48B96@hollensbe.org>
References: <68EF391F-3D09-4B98-A4FC-D443D0AB77E1@3skel.com> <4b6f054f0711041048n560ca74anb092b01f3e2dd698@mail.gmail.com> <83455E52-D40B-4C26-B1B4-AC6A845DF3FF@hollensbe.org> <64082F92-95BE-49F8-B36C-63F4BAB48B96@hollensbe.org>
Message-ID: <1F674B03-96D3-442A-8D46-78AFC77FE17D@3skel.com>

[] operator added to XPath::Object in svn trunk.

From danj at 3skel.com Wed Nov 14 03:56:40 2007
From: danj at 3skel.com (Dan Janowski)
Date: Wed, 14 Nov 2007 03:56:40 -0500
Subject: [libxml-devel] MEM2 to trunk
Message-ID:

All changes developed in the MEM2 branch have been merged back into trunk.
The trunk release version is unchanged (0.5.2.1) as of the merge.
Changes, patches, upgrades will be in the trunk until further notice.
In other words, for those of you getting updates via SVN, be sure to
check out 'trunk' for your use.

Dan

From danj at 3skel.com Wed Nov 14 04:11:57 2007
From: danj at 3skel.com (Dan Janowski)
Date: Wed, 14 Nov 2007 04:11:57 -0500
Subject: [libxml-devel] "[BUG] XmlNode Doc is not bound! (ruby_xml_node.c:1270)" when using Reader.expand
In-Reply-To:
References:
Message-ID: <1F5F70B0-D812-494D-B282-756521C09CDE@3skel.com>

This is fixed in the trunk version 210.

Dan

From danj at 3skel.com Wed Nov 14 04:55:02 2007
From: danj at 3skel.com (Dan Janowski)
Date: Wed, 14 Nov 2007 04:55:02 -0500
Subject: [libxml-devel] PATCH: Enable Text Node Creation
In-Reply-To: <000001c78c3c$1dca6bc0$595f4340$@net>
References: <000001c78c3c$1dca6bc0$595f4340$@net>
Message-ID:

(This is really old, but ...)

Node#new_text added at trunk version #211.

Dan

On May 1, 2007, at 18:00, Eric Schultz wrote:

> Hi everyone,
>
> I've been working on using the libxml-ruby for a project and I
> discovered there's not an easy way to create a new text node. Since
> I needed this, I used my rather limited C skills and copied the
> method to create a comment node and modified it so it makes text
> nodes. The patch is attached. (I hope that's alright.) To whomever
> takes care of adding things to the repository, you can take care of
> that I. Thanks for all the work that's already been do
>
> Eric Schultz

From transfire at gmail.com Wed Nov 14 06:28:41 2007
From: transfire at gmail.com (Trans)
Date: Wed, 14 Nov 2007 11:28:41 -0000
Subject: [libxml-devel] Sync Trunk to MEM2
In-Reply-To: <6A475CEB-4375-470D-A346-D6DBACBDA84B@3skel.com>
References: <1194972564.174922.152490@d55g2000hsg.googlegroups.com> <6A475CEB-4375-470D-A346-D6DBACBDA84B@3skel.com>
Message-ID: <1195039721.077421.70000@v65g2000hsc.googlegroups.com>

On Nov 14, 2:55 am, Dan Janowski wrote:
> T.
>
> MEM2 is now fully merged into trunk by an actual svn merge. I can't
> say I am overly impressed with svn's merge, mercurial has it better.

That's great.
I feared it would be basty which is why I suggested just copying, but I'm glad you worked through the merge, being the proper way and all. > I will work in the trunk going forward unless I have to work on > something dangerous. Good deal. I'll work on putting together a release. T. From transfire at gmail.com Wed Nov 14 06:32:19 2007 From: transfire at gmail.com (Trans) Date: Wed, 14 Nov 2007 11:32:19 -0000 Subject: [libxml-devel] weird delay In-Reply-To: References: Message-ID: <1195039939.766640.56980@57g2000hsv.googlegroups.com> On Nov 7, 1:49 pm, mortee wrote: > $ head -18 ./xml-bm2.rb > #!/usr/bin/env ruby > require 'benchmark' > require 'hpricot' > require 'xml/libxml' > > xml = DATA.read > > Benchmark.bmbm { |b| > b.report('hpricot') do > Hpricot::XML(xml).search('data').each{} > end > b.report('libxml') do > XML::Parser.string(xml).parse.find('//data').each{} > end > > } > > __END__ > > $ ./xml-bm2.rb > Rehearsal ------------------------------------------- > hpricot 14.407000 0.157000 14.564000 ( 15.686000) > libxml 0.796000 0.093000 0.889000 ( 42.462000) > --------------------------------- total: 15.453000sec > > user system total real > hpricot 13.797000 0.000000 13.797000 ( 15.578000) > libxml 0.859000 0.016000 0.875000 ( 41.091000) > > As you can see, hpricot has finished with parsing the XML previously > loaded into memory about three times faster in real time than libxml. > Also the other figures for libxml are pretty interesting. To this comes > the fact that while hpricot processes the document, my CPU maxes out all > the way through - however during the libxml phase, it's virtually idle. mortee, do you think you can run this through a profiler and see what you come up with? T. 
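A note on reading the figures quoted above: when the `real` column is far larger than the `total` CPU column, as in the libxml rows, the process spent most of its time waiting rather than computing. The following is a minimal Ruby sketch of that effect, not a diagnosis; `sleep` is only a stand-in for whatever the parser is blocked on, which is not known at this point in the thread, and `cpu_seconds` is a helper name made up for the illustration.

```ruby
require 'benchmark'

# Helper (name invented for this sketch): CPU seconds used by the
# process itself, i.e. user time plus system time.
def cpu_seconds(tms)
  tms.utime + tms.stime
end

# sleep stands in for an unknown blocking wait: it consumes wall-clock
# time but almost no CPU time.
waiting = Benchmark.measure { sleep 0.3 }

# A busy loop consumes CPU, so its CPU time tracks its wall-clock time.
busy = Benchmark.measure { 200_000.times { |i| Math.sqrt(i) } }

puts format('waiting: cpu=%.3f real=%.3f', cpu_seconds(waiting), waiting.real)
puts format('busy:    cpu=%.3f real=%.3f', cpu_seconds(busy), busy.real)
```

If a profiler only accounts for CPU time, a wait of this kind would not show up in its output, so comparing it against wall-clock timings is probably worthwhile.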
From hello at timperrett.com Wed Nov 14 20:09:06 2007 From: hello at timperrett.com (Tim Perrett) Date: Thu, 15 Nov 2007 01:09:06 +0000 Subject: [libxml-devel] MEM2 to trunk In-Reply-To: References: Message-ID: <29DEE5DA-66E5-4A47-81E6-1ED974797753@timperrett.com> Hey Chaps Excuse my ignorance, but ive been reading the posts on the list and the MEM2 branch seems to be a great improvement from where we were. Now its in the trunk, what's the ETA for it being rolled out as an official gem? Cheers Tim On 14 Nov 2007, at 08:56, Dan Janowski wrote: > All changes developed in the MEM2 branch have been merged back into > trunk. The trunk release version is unchanged (0.5.2.1) as of the > merge. Changes, patches, upgrades will be in the trunk until further > notice. In other words, for those of you getting updates via SVN, be > sure to check out 'trunk' for your use. > > Dan From transfire at gmail.com Wed Nov 14 22:00:07 2007 From: transfire at gmail.com (Trans) Date: Wed, 14 Nov 2007 19:00:07 -0800 (PST) Subject: [libxml-devel] MEM2 to trunk In-Reply-To: <29DEE5DA-66E5-4A47-81E6-1ED974797753@timperrett.com> References: <29DEE5DA-66E5-4A47-81E6-1ED974797753@timperrett.com> Message-ID: <185bdc3a-69ec-4f10-8222-6ef2eb98cb5b@f80g2000hsh.googlegroups.com> On Nov 14, 8:09 pm, Tim Perrett wrote: > Hey Chaps > > Excuse my ignorance, but ive been reading the posts on the list and > the MEM2 branch seems to be a great improvement from where we were. > Now its in the trunk, what's the ETA for it being rolled out as an > official gem? not too long (1-2 weeks?). i can probably put together a release in short order, but i would prefer that we address the performance issue first. T. 
From cfis at savagexi.com Wed Nov 14 22:07:29 2007 From: cfis at savagexi.com (Charlie Savage) Date: Wed, 14 Nov 2007 20:07:29 -0700 Subject: [libxml-devel] MEM2 to trunk In-Reply-To: <185bdc3a-69ec-4f10-8222-6ef2eb98cb5b@f80g2000hsh.googlegroups.com> References: <29DEE5DA-66E5-4A47-81E6-1ED974797753@timperrett.com> <185bdc3a-69ec-4f10-8222-6ef2eb98cb5b@f80g2000hsh.googlegroups.com> Message-ID: <473BB7F1.3050605@savagexi.com> > not too long (1-2 weeks?). i can probably put together a release in > short order, but i would prefer that we address the performance issue > first. Also - I'd like to update the gem package to include a Windows binary. I'll do that in the next couple days unless there are any objections. Charlie From transfire at gmail.com Wed Nov 14 22:28:41 2007 From: transfire at gmail.com (Trans) Date: Wed, 14 Nov 2007 22:28:41 -0500 Subject: [libxml-devel] MEM2 to trunk In-Reply-To: <473BB7F1.3050605@savagexi.com> References: <29DEE5DA-66E5-4A47-81E6-1ED974797753@timperrett.com> <185bdc3a-69ec-4f10-8222-6ef2eb98cb5b@f80g2000hsh.googlegroups.com> <473BB7F1.3050605@savagexi.com> Message-ID: <4b6f054f0711141928p5bdcec0ak781418dfc2d5f7a@mail.gmail.com> On Nov 14, 2007 10:07 PM, Charlie Savage wrote: > > not too long (1-2 weeks?). i can probably put together a release in > > short order, but i would prefer that we address the performance issue > > first. > > Also - I'd like to update the gem package to include a Windows binary. > I'll do that in the next couple days unless there are any objections. Hi Charlie, I'd prefer to take care of this simply b/c I'm going to do some work on the build process as a whole.
But I may need some help along the way and certainly will need help testing it on Windows. So if I can call on you then, that would be great. T. From cfis at savagexi.com Wed Nov 14 22:46:12 2007 From: cfis at savagexi.com (Charlie Savage) Date: Wed, 14 Nov 2007 20:46:12 -0700 Subject: [libxml-devel] MEM2 to trunk In-Reply-To: <4b6f054f0711141928p5bdcec0ak781418dfc2d5f7a@mail.gmail.com> References: <29DEE5DA-66E5-4A47-81E6-1ED974797753@timperrett.com> <185bdc3a-69ec-4f10-8222-6ef2eb98cb5b@f80g2000hsh.googlegroups.com> <473BB7F1.3050605@savagexi.com> <4b6f054f0711141928p5bdcec0ak781418dfc2d5f7a@mail.gmail.com> Message-ID: <473BC104.60709@savagexi.com> > Hi Charlie, I'd prefer to take care of this simply b/c I'm going to do > some work on the build process as a whole. But I may need some help > along the way and certainly will need help testing it on Windows. So > if I can call on you then, that would be great. Sure. Let me tell you how I did it for ruby-prof in case it helps. First, I've attached the Rakefile I created for ruby-prof which sets up gems for *nix and Windows. In particular, look at line 82 and on. It took me a while to puzzle through it - with a lot of inspiration from how RCov does it. Second, I created a separate directory for a MinGW build which contains a hand-crafted Makefile. You have to do this because the standard mkmf built into Ruby assumes you are linking against the same libraries Ruby itself was built with. That assumption fails because Ruby on Windows is built with VC++. And the reason not to use VC++ is that you then introduce dependencies on the right version of the C Runtime (MinGW uses a version which is on just about every Windows computer). Thus I separately run the make file to build the executable - the rakefile just copies it over for packaging. Third, for debugging purposes, I have a separate directory for a VC++ project file.
That isn't so important for building gems...but it sure makes it a lot easier to debug stuff on Windows. Maybe this is easier now with mkrf, but I haven't tried it so I'm not sure. If only Ruby had something like Python's distutils that handles all these various compiler/os combinations... Charlie How do you want to split it?Sure > > T. > _______________________________________________ > libxml-devel mailing list > libxml-devel at rubyforge.org > http://rubyforge.org/mailman/listinfo/libxml-devel -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Rakefile Url: http://rubyforge.org/pipermail/libxml-devel/attachments/20071114/af4d8d9c/attachment.pl From transfire at gmail.com Thu Nov 15 07:23:15 2007 From: transfire at gmail.com (Trans) Date: Thu, 15 Nov 2007 04:23:15 -0800 (PST) Subject: [libxml-devel] MEM2 to trunk In-Reply-To: <473BC104.60709@savagexi.com> References: <29DEE5DA-66E5-4A47-81E6-1ED974797753@timperrett.com> <185bdc3a-69ec-4f10-8222-6ef2eb98cb5b@f80g2000hsh.googlegroups.com> <473BB7F1.3050605@savagexi.com> <4b6f054f0711141928p5bdcec0ak781418dfc2d5f7a@mail.gmail.com> <473BC104.60709@savagexi.com> Message-ID: On Nov 14, 10:46 pm, Charlie Savage wrote: thanks!!! lots of good code here. it should help a lot. > If only Ruby had something like Python's distutils that handles > all these various compiler/os combinations... I'm actually working on something like that, which is part of why I want to do the packages. still pretty early in development (and the docs are a bit out of date at the moment) but if you want to get a taste, it's called Box (see proutils.rubyforge.org/box).
(of course the tricky part is generalizing the procedures so anyone can reuse them :) T. From mortee.lists at kavemalna.hu Thu Nov 15 17:12:46 2007 From: mortee.lists at kavemalna.hu (mortee) Date: Thu, 15 Nov 2007 23:12:46 +0100 Subject: [libxml-devel] MEM2 to trunk In-Reply-To: <185bdc3a-69ec-4f10-8222-6ef2eb98cb5b@f80g2000hsh.googlegroups.com> References: <29DEE5DA-66E5-4A47-81E6-1ED974797753@timperrett.com> <185bdc3a-69ec-4f10-8222-6ef2eb98cb5b@f80g2000hsh.googlegroups.com> Message-ID: Trans wrote: > > On Nov 14, 8:09 pm, Tim Perrett wrote: >> Hey Chaps >> >> Excuse my ignorance, but ive been reading the posts on the list and >> the MEM2 branch seems to be a great improvement from where we were. >> Now its in the trunk, what's the ETA for it being rolled out as an >> official gem? > > not too long (1-2 weeks?). i can probably put together a release in > short order, but i would prefer that we address the performance issue > first. Hi! If it's about the performance issue I mentioned earlier, then sorry... I've been after a job lately, so I couldn't really pay attention to this thing. Tomorrow I'll try to profile my test case and post the results. mortee From mortee.lists at kavemalna.hu Fri Nov 16 11:50:50 2007 From: mortee.lists at kavemalna.hu (mortee) Date: Fri, 16 Nov 2007 17:50:50 +0100 Subject: [libxml-devel] weird delay In-Reply-To: <1195039939.766640.56980@57g2000hsv.googlegroups.com> References: <1195039939.766640.56980@57g2000hsv.googlegroups.com> Message-ID: Trans wrote: > mortee, do you think you can run this through a profiler and see what > you come up with? Here's what I could produce. I'd gladly say I hope this provides some meaningful insight to some of you, but the results don't seem to be too informative... As you can see, the 40 secs delay (with virtually zero CPU activity) is spent during the XML::Parser#parse method - but the profiler data for the same operation shows just a fraction of a second... 
mortee $ head -41 xml-bm-libxml.rb #!/usr/bin/env ruby require 'xml/libxml' require 'benchmark' require 'ruby-prof' xml = DATA.read parserinit_prof = parse_prof = traverse_prof = parser = doc = nil Benchmark.bm 10 do |b| b.report('init') do parserinit_prof = RubyProf.profile do parser = XML::Parser.string(xml) end end b.report('parse') do parse_prof = RubyProf.profile do doc = parser.parse end end b.report('traverse') do traverse_prof = RubyProf.profile do doc.find('//data').each{} end end end puts 'init profile data:' RubyProf::FlatPrinter.new(parserinit_prof).print puts 'parse profile data:' RubyProf::FlatPrinter.new(parse_prof).print puts 'traverse profile data:' RubyProf::FlatPrinter.new(traverse_prof).print __END__ $ ./xml-bm-libxml.rb user system total real init 0.015000 0.000000 0.015000 ( 0.014000) parse 0.797000 0.031000 0.828000 ( 40.985000) traverse 0.094000 0.000000 0.094000 ( 0.101000) init profile data: Thread ID: 268683600 Total: 0.015 %self total self wait child calls name 100.00 0.01 0.01 0.00 0.00 1 #string 0.00 0.01 0.00 0.00 0.01 0 Global#[No method] parse profile data: Thread ID: 268683600 Total: 0.828 %self total self wait child calls name 100.00 0.83 0.83 0.00 0.00 1 XML::Parser#parse 0.00 0.83 0.00 0.00 0.83 0 Global#[No method] traverse profile data: Thread ID: 268683600 Total: 0.094 %self total self wait child calls name 67.02 0.06 0.06 0.00 0.00 1 XML::XPath::Object#each 32.98 0.03 0.03 0.00 0.00 1 XML::Document#find 0.00 0.09 0.00 0.00 0.09 0 Global#[No method] From mortee.lists at kavemalna.hu Fri Nov 16 13:15:09 2007 From: mortee.lists at kavemalna.hu (mortee) Date: Fri, 16 Nov 2007 19:15:09 +0100 Subject: [libxml-devel] xpath searching without specifying namespace? 
In-Reply-To: <68EF391F-3D09-4B98-A4FC-D443D0AB77E1@3skel.com> References: <68EF391F-3D09-4B98-A4FC-D443D0AB77E1@3skel.com> Message-ID: Dan Janowski wrote: > To do this, please send a sample xml document and ruby code using > libxml that illustrates the problem clearly and (hopefully) compactly > and indicate what you are expecting but not getting and I will peer > into the fog of namespaces (a component I have not updated). Any progress with this issue? mortee From danj at 3skel.com Fri Nov 16 13:57:45 2007 From: danj at 3skel.com (Dan Janowski) Date: Fri, 16 Nov 2007 13:57:45 -0500 Subject: [libxml-devel] xpath searching without specifying namespace? In-Reply-To: References: <68EF391F-3D09-4B98-A4FC-D443D0AB77E1@3skel.com> Message-ID: <059D1CA6-7495-44DA-9356-AF02FDB0EEC5@3skel.com> Not yet, but it will be shortly. I am dealing with a seg fault now. A few more days? Dan On Nov 16, 2007, at 13:15, mortee wrote: > Dan Janowski wrote: >> To do this, please send a sample xml document and ruby code using >> libxml that illustrates the problem clearly and (hopefully) compactly >> and indicate what you are expecting but not getting and I will peer >> into the fog of namespaces (a component I have not updated). > > Any progress with this issue? > > mortee > > _______________________________________________ > libxml-devel mailing list > libxml-devel at rubyforge.org > http://rubyforge.org/mailman/listinfo/libxml-devel From danj at 3skel.com Fri Nov 16 15:19:43 2007 From: danj at 3skel.com (Dan Janowski) Date: Fri, 16 Nov 2007 15:19:43 -0500 Subject: [libxml-devel] libxsl Message-ID: I have just updated the SVN version of libxsl so that it is compatible with libxml 0.5.2.1, in theory. I don't use it, so if someone who does wants to try it, please let me know. 
Dan From aktxyz at gmail.com Wed Nov 7 11:24:13 2007 From: aktxyz at gmail.com (uncle) Date: Wed, 07 Nov 2007 16:24:13 -0000 Subject: [libxml-devel] rdoc link is broken :( Message-ID: <1194452653.193335.289420@d55g2000hsg.googlegroups.com> http://libxml.rubyforge.org/rdoc/ From aktxyz at gmail.com Fri Nov 16 17:17:04 2007 From: aktxyz at gmail.com (uncle) Date: Fri, 16 Nov 2007 14:17:04 -0800 (PST) Subject: [libxml-devel] libxml-ruby-0.5.2.0 seg faulting :( Message-ID: <24aea40d-f4da-47b9-9bee-69a023f98808@y5g2000hsf.googlegroups.com> I just upgraded - feisty to gutsy - rails 1.2.3 to rails 1.2.5 - ruby 1.8.5 to 1.8.6 Turns out in the sudo gem update all, I also got gems/libxml- ruby-0.5.2.0 Then I started getting random/frequent segfaults in a pretty big rails app. The seg faults were happening in ostruct, so I went hunting all over, downgrading, etc. Finally replaced ostruct with a 5 line home grown class that no way had problems, and I am still seg faulting. Then I downgraded from 0.5.2.0 to 0.3.8.4 of libxml-ruby and poof, things are solid as a rock again. I notice on the libxml-ruby site news area it stops at 0.3.8.4 (not sure what mem2 is). What is the 0.5.2.0 thing, and is there anything I can provide that can help you all get to the bottom of the segfault. I am not a gdb/ linux guru, so getting out of my area there. 
-- Thanks From erik at hollensbe.org Fri Nov 16 17:37:00 2007 From: erik at hollensbe.org (Erik Hollensbe) Date: Fri, 16 Nov 2007 14:37:00 -0800 Subject: [libxml-devel] libxml-ruby-0.5.2.0 seg faulting :( In-Reply-To: <24aea40d-f4da-47b9-9bee-69a023f98808@y5g2000hsf.googlegroups.com> References: <24aea40d-f4da-47b9-9bee-69a023f98808@y5g2000hsf.googlegroups.com> Message-ID: <90DF9F51-1CBA-4BBA-8AA9-56B0D13E247A@hollensbe.org> On Nov 16, 2007, at 2:17 PM, uncle wrote: > I just upgraded > - feisty to gutsy > - rails 1.2.3 to rails 1.2.5 > - ruby 1.8.5 to 1.8.6 > > Turns out in the sudo gem update all, I also got gems/libxml- > ruby-0.5.2.0 > > Then I started getting random/frequent segfaults in a pretty big rails > app. The seg faults were happening in ostruct, so I went hunting all > over, downgrading, etc. Finally replaced ostruct with a 5 line home > grown class that no way had problems, and I am still seg faulting. You still have the old libxml-ruby on your machine. Look into the 'gem' ruby method and how to pin your rails app to a specific version. -Erik From danj at 3skel.com Fri Nov 16 18:40:48 2007 From: danj at 3skel.com (Dan Janowski) Date: Fri, 16 Nov 2007 18:40:48 -0500 Subject: [libxml-devel] libxml-ruby-0.5.2.0 seg faulting :( In-Reply-To: <24aea40d-f4da-47b9-9bee-69a023f98808@y5g2000hsf.googlegroups.com> References: <24aea40d-f4da-47b9-9bee-69a023f98808@y5g2000hsf.googlegroups.com> Message-ID: <8CDC341C-D64F-4690-B0E6-B60FBC2C70E5@3skel.com> There is a 0.5.2.1 that fixes a problem with xpath.find returns when they are empty. That is only available by an svn checkout, soon to be released as a gem. It is easy to build and after the pkg/ dir will have a gem that you can install. 
Dan On Nov 16, 2007, at 17:17, uncle wrote: > I just upgraded > - feisty to gutsy > - rails 1.2.3 to rails 1.2.5 > - ruby 1.8.5 to 1.8.6 > > Turns out in the sudo gem update all, I also got gems/libxml- > ruby-0.5.2.0 > > Then I started getting random/frequent segfaults in a pretty big rails > app. The seg faults were happening in ostruct, so I went hunting all > over, downgrading, etc. Finally replaced ostruct with a 5 line home > grown class that no way had problems, and I am still seg faulting. > > Then I downgraded from 0.5.2.0 to 0.3.8.4 of libxml-ruby and poof, > things are solid as a rock again. > > I notice on the libxml-ruby site news area it stops at 0.3.8.4 (not > sure what mem2 is). > > What is the 0.5.2.0 thing, and is there anything I can provide that > can help you all get to the bottom of the segfault. I am not a gdb/ > linux guru, so getting out of my area there. > > -- Thanks > _______________________________________________ > libxml-devel mailing list > libxml-devel at rubyforge.org > http://rubyforge.org/mailman/listinfo/libxml-devel From transfire at gmail.com Fri Nov 16 20:29:37 2007 From: transfire at gmail.com (Trans) Date: Fri, 16 Nov 2007 17:29:37 -0800 (PST) Subject: [libxml-devel] rdoc link is broken :( In-Reply-To: <1194452653.193335.289420@d55g2000hsg.googlegroups.com> References: <1194452653.193335.289420@d55g2000hsg.googlegroups.com> Message-ID: On Nov 7, 11:24 am, uncle wrote: > http://libxml.rubyforge.org/rdoc/ we will fix after the next release. T. 
From aktxyz at gmail.com Sun Nov 18 00:11:53 2007 From: aktxyz at gmail.com (uncle) Date: Sat, 17 Nov 2007 21:11:53 -0800 (PST) Subject: [libxml-devel] libxml-ruby-0.5.2.0 seg faulting :( In-Reply-To: <8CDC341C-D64F-4690-B0E6-B60FBC2C70E5@3skel.com> References: <24aea40d-f4da-47b9-9bee-69a023f98808@y5g2000hsf.googlegroups.com> <8CDC341C-D64F-4690-B0E6-B60FBC2C70E5@3skel.com> Message-ID: <24ece017-4ea4-4c63-b3c6-3ac451127fc1@s6g2000prc.googlegroups.com> thanks for the responses, I do alot of xpath stuff so this may be my issue, will wait for the gem though :) thanks guys, this libxml ruby interface saved me when I needed xsd support ! On Nov 16, 5:40 pm, Dan Janowski wrote: > There is a 0.5.2.1 that fixes a problem with xpath.find returns when > they are empty. That is only available by an svn checkout, soon to be > released as a gem. It is easy to build and after the pkg/ dir will > have a gem that you can install. > > Dan > > On Nov 16, 2007, at 17:17, uncle wrote: > > > > > I just upgraded > > - feisty to gutsy > > - rails 1.2.3 to rails 1.2.5 > > - ruby 1.8.5 to 1.8.6 > > > Turns out in the sudo gem update all, I also got gems/libxml- > > ruby-0.5.2.0 > > > Then I started getting random/frequent segfaults in a pretty big rails > > app. The seg faults were happening in ostruct, so I went hunting all > > over, downgrading, etc. Finally replaced ostruct with a 5 line home > > grown class that no way had problems, and I am still seg faulting. > > > Then I downgraded from 0.5.2.0 to 0.3.8.4 of libxml-ruby and poof, > > things are solid as a rock again. > > > I notice on the libxml-ruby site news area it stops at 0.3.8.4 (not > > sure what mem2 is). > > > What is the 0.5.2.0 thing, and is there anything I can provide that > > can help you all get to the bottom of the segfault. I am not a gdb/ > > linux guru, so getting out of my area there. > > > -- Thanks > > _______________________________________________ > > libxml-devel mailing list > > libxml-de... 
at rubyforge.org > > http://rubyforge.org/mailman/listinfo/libxml-devel > > _______________________________________________ > libxml-devel mailing list > libxml-de... at rubyforge.org http://rubyforge.org/mailman/listinfo/libxml-devel From yzheng3000 at gmail.com Fri Nov 23 14:36:46 2007 From: yzheng3000 at gmail.com (Yan Zheng) Date: Fri, 23 Nov 2007 11:36:46 -0800 Subject: [libxml-devel] api docs urgently needed Message-ID: I just started using libxml a few weeks ago on my new project. The old API site http://libxml.rubyforge.org/rdoc/ doesn't contain any API content. Can anyone please fix this issue or point to an alternative location for the API docs? Thanks. Yan From transfire at gmail.com Fri Nov 23 22:27:45 2007 From: transfire at gmail.com (Trans) Date: Fri, 23 Nov 2007 19:27:45 -0800 (PST) Subject: [libxml-devel] api docs urgently needed In-Reply-To: References: Message-ID: <5a8554cf-6bbd-4d66-accc-55bcffa1e169@e6g2000prf.googlegroups.com> On Nov 23, 2:36 pm, "Yan Zheng" wrote: > I just started using libxml a few weeks ago on my new project. The old API > site http://libxml.rubyforge.org/rdoc/ doesn't contain any API content. Can > anyone please fix this issue or point to an alternative location for the API docs? Well they ain't nothing to write home about. But there you go. T. From transfire at gmail.com Sat Nov 24 11:00:37 2007 From: transfire at gmail.com (Trans) Date: Sat, 24 Nov 2007 08:00:37 -0800 (PST) Subject: [libxml-devel] libxsl In-Reply-To: References: Message-ID: On Nov 16, 3:19 pm, Dan Janowski wrote: > I have just updated the SVN version of libxsl so that it is > compatible with libxml 0.5.2.1, in theory. I don't use it, so if > someone who does wants to try it, please let me know. If we bring over libxsl into libxml do we need a separate ext/ subdir for it?
Or can we combine the two into a single dir? I'm looking at doing the latter, but I'm not sure how it should affect extconf.rb, and it makes me wonder if it's even possible to have a single ext dir. T. From sean at chittenden.org Sat Nov 24 11:26:28 2007 From: sean at chittenden.org (Sean Chittenden) Date: Sat, 24 Nov 2007 08:26:28 -0800 Subject: [libxml-devel] libxsl In-Reply-To: References: Message-ID: >> I have just updated the SVN version of libxsl so that it is >> compatible with libxml 0.5.2.1, in theory. I don't use it, so if >> someone who does wants to try it, please let me know. > > If we bring over libxsl into libxml do we need a separate ext/ subdir > for it? Or can we combine the two into a single dir? I'm looking at > doing the latter, but I'm not sure how it should affect extconf.rb, and > it makes me wonder if it's even possible to have a single ext dir. You want/need two different directories since not all systems that have libxml will have libxslt. -sc -- Sean Chittenden sean at chittenden.org From transfire at gmail.com Sat Nov 24 12:13:28 2007 From: transfire at gmail.com (Trans) Date: Sat, 24 Nov 2007 12:13:28 -0500 Subject: [libxml-devel] libxsl In-Reply-To: References: Message-ID: <4b6f054f0711240913w4de96d96h3f5720523f3cc0e3@mail.gmail.com> On Nov 24, 2007 11:26 AM, Sean Chittenden wrote: > >> I have just updated the SVN version of libxsl so that it is > >> compatible with libxml 0.5.2.1, in theory. I don't use it, so if > >> someone who does wants to try it, please let me know. > > > > If we bring over libxsl into libxml do we need a separate ext/ subdir > > for it? Or can we combine the two into a single dir? I'm looking at > > doing the latter, but I'm not sure how it should affect extconf.rb, and > > it makes me wonder if it's even possible to have a single ext dir. > > You want/need two different directories since not all systems that > have libxml will have libxslt. -sc Ok. Thanks. I'll have to move to ext/xsl/ then...
hmm..actually if we want to keep it in xml/ then we'd have to make two layers. ext/xml/xml/ ext/xsl/xml/ I'm wondering if we really need the xml/ require-space since we only have two possible files to load (libxml and libxsl). T. From sean at chittenden.org Sat Nov 24 13:03:43 2007 From: sean at chittenden.org (Sean Chittenden) Date: Sat, 24 Nov 2007 10:03:43 -0800 Subject: [libxml-devel] libxsl In-Reply-To: <4b6f054f0711240913w4de96d96h3f5720523f3cc0e3@mail.gmail.com> References: <4b6f054f0711240913w4de96d96h3f5720523f3cc0e3@mail.gmail.com> Message-ID: <910DCCFE-DE0C-4FD4-B08D-08BD3692F609@chittenden.org> > Ok. Thanks. I'll have to move to ext/xsl/ then... hmm..actually if we > want to keep it in xml/ then we'd have to make two layers. > > ext/xml/xml/ > ext/xsl/xml/ > > I'm wondering if we really need the xml/ require-space since we only > have two possible files to load (libxml and libxsl). Naw, you don't it. Way back when, that was me being anal and trying to emulate/improve some perlisms. I'd break out the build process and keep them as two separate packages, but 'require "libxml"' or 'require "libxslt" is sufficiently unique in the universe. :) -sc -- Sean Chittenden sean at chittenden.org From transfire at gmail.com Sun Nov 25 08:27:30 2007 From: transfire at gmail.com (Trans) Date: Sun, 25 Nov 2007 05:27:30 -0800 (PST) Subject: [libxml-devel] libxsl In-Reply-To: <910DCCFE-DE0C-4FD4-B08D-08BD3692F609@chittenden.org> References: <4b6f054f0711240913w4de96d96h3f5720523f3cc0e3@mail.gmail.com> <910DCCFE-DE0C-4FD4-B08D-08BD3692F609@chittenden.org> Message-ID: <2db48ce4-bfa8-4712-8f84-1df0abe8133e@e25g2000prg.googlegroups.com> On Nov 24, 1:03 pm, Sean Chittenden wrote: > > Ok. Thanks. I'll have to move to ext/xsl/ then... hmm..actually if we > > want to keep it in xml/ then we'd have to make two layers. 
> > > ext/xml/xml/ > > ext/xsl/xml/ > > > I'm wondering if we really need the xml/ require-space since we only > > have two possible files to load (libxml and libxsl). > > Naw, you don't it. Way back when, that was me being anal and trying > to emulate/improve some perlisms. I'd break out the build process and > keep them as two separate packages, but 'require "libxml"' or 'require > "libxslt" is sufficiently unique in the universe. :) -sc Really? Damn. I thought you had suggested we make one package out of it, and I was coming around to that idea. T. From danj at 3skel.com Mon Nov 26 10:45:16 2007 From: danj at 3skel.com (Dan Janowski) Date: Mon, 26 Nov 2007 10:45:16 -0500 Subject: [libxml-devel] libxsl In-Reply-To: <2db48ce4-bfa8-4712-8f84-1df0abe8133e@e25g2000prg.googlegroups.com> References: <4b6f054f0711240913w4de96d96h3f5720523f3cc0e3@mail.gmail.com> <910DCCFE-DE0C-4FD4-B08D-08BD3692F609@chittenden.org> <2db48ce4-bfa8-4712-8f84-1df0abe8133e@e25g2000prg.googlegroups.com> Message-ID: <1F6D98F0-FF5A-476C-AB4E-2CB4A2E4DF8C@3skel.com> Are we then talking about: ext/xml ext/xsl ? On Nov 25, 2007, at 08:27, Trans wrote: > > > On Nov 24, 1:03 pm, Sean Chittenden wrote: >>> Ok. Thanks. I'll have to move to ext/xsl/ then... hmm..actually >>> if we >>> want to keep it in xml/ then we'd have to make two layers. >> >>> ext/xml/xml/ >>> ext/xsl/xml/ >> >>> I'm wondering if we really need the xml/ require-space since we only >>> have two possible files to load (libxml and libxsl). >> >> Naw, you don't it. Way back when, that was me being anal and trying >> to emulate/improve some perlisms. I'd break out the build process >> and >> keep them as two separate packages, but 'require "libxml"' or >> 'require >> "libxslt" is sufficiently unique in the universe. :) -sc > > Really? Damn. I thought you had suggested we make one package out of > it, and I was coming around to that idea. > > T. 
> _______________________________________________ > libxml-devel mailing list > libxml-devel at rubyforge.org > http://rubyforge.org/mailman/listinfo/libxml-devel From sean at chittenden.org Mon Nov 26 11:02:22 2007 From: sean at chittenden.org (Sean Chittenden) Date: Mon, 26 Nov 2007 08:02:22 -0800 Subject: [libxml-devel] libxsl In-Reply-To: <1F6D98F0-FF5A-476C-AB4E-2CB4A2E4DF8C@3skel.com> References: <4b6f054f0711240913w4de96d96h3f5720523f3cc0e3@mail.gmail.com> <910DCCFE-DE0C-4FD4-B08D-08BD3692F609@chittenden.org> <2db48ce4-bfa8-4712-8f84-1df0abe8133e@e25g2000prg.googlegroups.com> <1F6D98F0-FF5A-476C-AB4E-2CB4A2E4DF8C@3skel.com> Message-ID: <4F8ED2FC-7016-4ED9-B8F6-AE7FE6F31C00@chittenden.org> > Are we then talking about: > > ext/xml > ext/xsl > >> Really? Damn. I thought you had suggested we make one package out of >> it, and I was coming around to that idea. Hrm... I think I was advocating for a single repo, but not single module/package. Regardless of anything I've said in the past, they should definitely be two different modules that get required independently. To maintain the illusion of a single require, however, it would be very appropriate to have libxslt check and see if libxml is loaded, and if it isn't, have libxslt could require libxml automagically. Having developers requiring libxml and libxslt seems redundant, but since libxslt implicitly requires libxml, why not have libxslt manage that dependency behind the scenes? Those that need only libxml can require only libxml. I'd be shocked if libxslt became a self sufficient XML parser without dependencies on libxml (to the point that I'll call it an impossibility). 
-sc -- Sean Chittenden sean at chittenden.org http://sean.chittenden.org/ From danj at 3skel.com Mon Nov 26 11:33:28 2007 From: danj at 3skel.com (Dan Janowski) Date: Mon, 26 Nov 2007 11:33:28 -0500 Subject: [libxml-devel] libxsl In-Reply-To: <4F8ED2FC-7016-4ED9-B8F6-AE7FE6F31C00@chittenden.org> References: <4b6f054f0711240913w4de96d96h3f5720523f3cc0e3@mail.gmail.com> <910DCCFE-DE0C-4FD4-B08D-08BD3692F609@chittenden.org> <2db48ce4-bfa8-4712-8f84-1df0abe8133e@e25g2000prg.googlegroups.com> <1F6D98F0-FF5A-476C-AB4E-2CB4A2E4DF8C@3skel.com> <4F8ED2FC-7016-4ED9-B8F6-AE7FE6F31C00@chittenden.org> Message-ID: <9F1EE93D-E6E5-49E5-9CB9-8EF03272A14E@3skel.com> On Nov 26, 2007, at 11:02, Sean Chittenden wrote: >> Are we then talking about: >> >> ext/xml >> ext/xsl >> >>> Really? Damn. I thought you had suggested we make one package out of >>> it, and I was coming around to that idea. > > > Hrm... I think I was advocating for a single repo, but not single > module/package. > The packages should be separate, but since xsl may or may not be available, it seemed like an easier build config to manage if the two are separated, with different extconf.rb files. Since they build different .so objects, the directory partition reflects the library output. > Regardless of anything I've said in the past, they should definitely > be two different modules that get required independently. > libxslt should, without doubt, internally require 'libxml' and raise a LoadError (i think) if it cannot, since all document objects are dependent on libxml. 
Dan From transfire at gmail.com Mon Nov 26 11:53:05 2007 From: transfire at gmail.com (Trans) Date: Mon, 26 Nov 2007 11:53:05 -0500 Subject: [libxml-devel] libxsl In-Reply-To: <9F1EE93D-E6E5-49E5-9CB9-8EF03272A14E@3skel.com> References: <4b6f054f0711240913w4de96d96h3f5720523f3cc0e3@mail.gmail.com> <910DCCFE-DE0C-4FD4-B08D-08BD3692F609@chittenden.org> <2db48ce4-bfa8-4712-8f84-1df0abe8133e@e25g2000prg.googlegroups.com> <1F6D98F0-FF5A-476C-AB4E-2CB4A2E4DF8C@3skel.com> <4F8ED2FC-7016-4ED9-B8F6-AE7FE6F31C00@chittenden.org> <9F1EE93D-E6E5-49E5-9CB9-8EF03272A14E@3skel.com> Message-ID: <4b6f054f0711260853s4fc578ebx44849f84a4beed83@mail.gmail.com> On Nov 26, 2007 11:33 AM, Dan Janowski wrote: > > On Nov 26, 2007, at 11:02, Sean Chittenden wrote: > The packages should be separate, but since xsl may or may not be > available, it seemed like an easier build config to manage if the two > are separated, with different extconf.rb files. Since they build > different .so objects, the directory partition reflects the library > output. Okay. I misunderstood then. I thought we wanted to move to a single package. If it's two packages, then it is better to keep them in separate projects or subprojects. Do you want to go the subproject route then (versus the separate projects we have now)? > > Regardless of anything I've said in the past, they should definitely > > be two different modules that get required independently. > > > libxslt should, without doubt, internally require 'libxml' and raise > a LoadError (i think) if it cannot, since all document objects are > dependent on libxml. Yep. T. From danj at 3skel.com Mon Nov 26 13:47:43 2007 From: danj at 3skel.com (Dan Janowski) Date: Mon, 26 Nov 2007 13:47:43 -0500 Subject: [libxml-devel] xpath.find fix Message-ID: The SVN repository now includes version 0.5.2.2 (as of #212) that fixes a major problem with .find expressions which return non-nodeset results like a boolean, string or number. 
This seems to fix the only remaining memory fault (segv) that I am
aware of. The only two outstanding service issues that stand between
here and the next release are:

1. correct find operation with namespaces
2. wall-clock delay when using Reader

Dan

From hello at timperrett.com Mon Nov 26 16:05:38 2007
From: hello at timperrett.com (Tim Perrett)
Date: Mon, 26 Nov 2007 21:05:38 +0000
Subject: [libxml-devel] problem with UTF-16 encoding
Message-ID:

Hey Chaps

There seems to be some kind of issue with UTF-16 encoding in
libxml-ruby version 0.5.2.0.

When I do this:

doc = XML::Document.new()
# doc.encoding = 'utf-16'
doc.root = XML::Node.new('root_node')
root = doc.root
puts doc
## =>

Uncomment the encoding however and you get this:

doc = XML::Document.new()
doc.encoding = 'utf-16'
doc.root = XML::Node.new('root_node')
root = doc.root
puts doc
## => ??<

Any idea what's going on here and how to fix it? The encoding features
used to work no problem at all. I'm running ruby 1.8.6 (2007-06-07
patchlevel 36) [universal-darwin9.0]

Cheers

Tim

From paul at aps.org Mon Nov 26 16:18:02 2007
From: paul at aps.org (Paul Dlug)
Date: Mon, 26 Nov 2007 16:18:02 -0500
Subject: [libxml-devel] Load external DTD true/false not working?
Message-ID: <8E13C165-1E53-4D3A-A4CC-F4C0866E2416@aps.org>

It doesn't appear to me that the flag on XML::Parser
'default_load_external_dtd' works.

Looking at the source:

VALUE
ruby_xml_parser_default_load_external_dtd_get(VALUE class) {
  if (xmlSubstituteEntitiesDefaultValue)
    return(Qtrue);
  else
    return(Qfalse);
}

I think the variable to set here should be xmlLoadExtDtdDefaultValue,
not xmlSubstituteEntitiesDefaultValue.
This can be verified with a small test:

require 'xml/libxml'

puts "Load DTD: #{XML::Parser.default_load_external_dtd}"
XML::Parser.default_load_external_dtd = true
puts "Load DTD: #{XML::Parser.default_load_external_dtd}"

Which outputs (incorrectly I believe):

Load DTD: false
Load DTD: false

However, changing this variable still does not make the above test
case work and the DTD is still loaded when I parse the document.

Any suggestions?

--Paul

From danj at 3skel.com Mon Nov 26 21:34:44 2007
From: danj at 3skel.com (Dan Janowski)
Date: Mon, 26 Nov 2007 21:34:44 -0500
Subject: [libxml-devel] Load external DTD true/false not working?
In-Reply-To: <8E13C165-1E53-4D3A-A4CC-F4C0866E2416@aps.org>
References: <8E13C165-1E53-4D3A-A4CC-F4C0866E2416@aps.org>
Message-ID: <0A52C48F-1CCE-49B0-81AC-76C4474D4DE4@3skel.com>

Hi,

You are at least half correct. xmlSubstituteEntitiesDefaultValue has
nothing to do with DTD. However, while the _get method you have
illustrated here makes reference to the wrong variable, the _set
method does not suffer the same problem. So, while the script return
value interrogating the variable is not correct, the functionality
should be. Does the DTD entity loading work when you set it?

The correction is committed in svn #216

Dan

From danj at 3skel.com Mon Nov 26 22:38:11 2007
From: danj at 3skel.com (Dan Janowski)
Date: Mon, 26 Nov 2007 22:38:11 -0500
Subject: [libxml-devel] problem with UTF-16 encoding
In-Reply-To:
References:
Message-ID: <4EE9078B-E17A-4928-BD02-0223C23FE3D1@3skel.com>

I don't have 0.3x on my system anymore, but I do not think UTF-16 will
behave any differently. .to_s is written incorrectly, from what I can
tell, since it just feeds the encoding of the document back into the
formatter. But in either case, if you want the as-encoded document,
you really want to use doc.dump.

Encoding has never worked correctly within the library. It only
functions properly when fed UTF-8, as I have had to employ Iconv for
anything else.

Dan

From danj at 3skel.com Mon Nov 26 23:39:56 2007
From: danj at 3skel.com (Dan Janowski)
Date: Mon, 26 Nov 2007 23:39:56 -0500
Subject: [libxml-devel] problem with UTF-16 encoding
In-Reply-To: <4EE9078B-E17A-4928-BD02-0223C23FE3D1@3skel.com>
References: <4EE9078B-E17A-4928-BD02-0223C23FE3D1@3skel.com>
Message-ID: <108BE818-C391-4569-B38E-6AE57B428DF4@3skel.com>

I have modified Document#to_s to permit the inclusion of a second
encoding argument (didn't know there was a first one, eh?). It will
not change the document encoding, but will cause libxml to produce a
representation of the document in the requested encoding (transcoding
it if necessary). The default for it is nil, and results in the
document's encoding.

A few other notes about UTF-16 specifically: UTF-16 will result in a
two-byte lead-in, UTF-16BE will not, nor will UTF-16LE. These latter
encodings are not familiar, but may or may not be of interest.

You were getting two 8-bit chars and nothing else because of the
UTF-16 lead-in, but it was also getting truncated because the wrong
ruby string constructor was being called (which did not use the
length returned by the libxml dump, so an ^@ was stopping the string).

In other words, it was always broken (I had not previously modified
this code), now it is less broken.

Dan

From keisukefukuda at gmail.com Tue Nov 27 00:29:44 2007
From: keisukefukuda at gmail.com (keisuke fukuda)
Date: Tue, 27 Nov 2007 14:29:44 +0900
Subject: [libxml-devel] Specifying namespace on XPath?
Message-ID:

Hi,

How do I specify a namespace on XPath in version 5.2.0?
Someone told me that you can do it like this in version 3.8.4:

---
require "rubygems"
require "xml/libxml" # 3.8.4

doc = XML::Parser.string(<
My Blog Entries
EOS

namespaces = {
  "app" => "http://www.w3.org/2007/app",
  "atom" => "http://www.w3.org/2005/Atom"
}.to_a

p doc.find('/app:service/app:workspace/app:collection/atom:title', namespaces)
__END__

=> My Blog Entries

---

But in 5.2.0, it says
" in `find': nested array must be an array of strings, prefix and
href/uri (ArgumentError) "

(In addition, it sometimes segfaults in the normal interpreter and
always in irb.)

Is this a bug? (Or is there any other way to do this in 5.2.0?)

I took a look at the source code and found that there are almost no
unit tests for namespaces on XPath.

--
FUKUDA, Keisuke

From keisukefukuda at gmail.com Tue Nov 27 00:34:47 2007
From: keisukefukuda at gmail.com (keisuke fukuda)
Date: Tue, 27 Nov 2007 14:34:47 +0900
Subject: [libxml-devel] Specifying namespace on XPath?
In-Reply-To:
References:
Message-ID:

I'm willing to do the work (adding unit tests and fixing the bug if I
can) if you don't mind :-)

--
FUKUDA, Keisuke <福田圭祐>
http://d.hatena.ne.jp/keisukefukuda/

From hello at timperrett.com Tue Nov 27 05:08:30 2007
From: hello at timperrett.com (Tim Perrett)
Date: Tue, 27 Nov 2007 10:08:30 +0000
Subject: [libxml-devel] problem with UTF-16 encoding
In-Reply-To: <108BE818-C391-4569-B38E-6AE57B428DF4@3skel.com>
References: <4EE9078B-E17A-4928-BD02-0223C23FE3D1@3skel.com> <108BE818-C391-4569-B38E-6AE57B428DF4@3skel.com>
Message-ID: <21403713-3E93-47F6-B8B2-5653138D3B00@timperrett.com>

Interesting stuff. Just changed back to utf-16, and using doc.dump I
see the byte order mark and the rest of the xml - result :)

Cheers guys

Tim

From keisukefukuda at gmail.com Tue Nov 27 08:39:29 2007
From: keisukefukuda at gmail.com (keisuke fukuda)
Date: Tue, 27 Nov 2007 22:39:29 +0900
Subject: [libxml-devel] Specifying namespace on XPath?
In-Reply-To:
References:
Message-ID:

So, this should be the patch.

Index: ext/xml/ruby_xml_xpath.c
===================================================================
--- ext/xml/ruby_xml_xpath.c (revision 218)
+++ ext/xml/ruby_xml_xpath.c (working copy)
@@ -76,9 +76,9 @@
       }
       else {
         // tuples of prefix/uri
-        if (RARRAY(RARRAY(nslist)->ptr[i])->len == 2) {
-          rprefix = RARRAY(RARRAY(nslist)->ptr[i])->ptr[0];
-          ruri = RARRAY(RARRAY(nslist)->ptr[i])->ptr[1];
+        if (RARRAY(nslist)->len == 2) {
+          rprefix = RARRAY(nslist)->ptr[0];
+          ruri = RARRAY(nslist)->ptr[1];
           ruby_xml_xpath_context_register_namespace(xxpc, rprefix, ruri);
         } else {
           rb_raise(rb_eArgError, "nested array must be an array of strings, prefix and href/uri");
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: xpath_ns_fix.patch.txt
Url: http://rubyforge.org/pipermail/libxml-devel/attachments/20071127/ea3fc128/attachment.txt

From danj at 3skel.com Tue Nov 27 09:42:11 2007
From: danj at 3skel.com (Dan Janowski)
Date: Tue, 27 Nov 2007 09:42:11 -0500
Subject: [libxml-devel] problem with UTF-16 encoding
In-Reply-To: <21403713-3E93-47F6-B8B2-5653138D3B00@timperrett.com>
References: <4EE9078B-E17A-4928-BD02-0223C23FE3D1@3skel.com> <108BE818-C391-4569-B38E-6AE57B428DF4@3skel.com> <21403713-3E93-47F6-B8B2-5653138D3B00@timperrett.com>
Message-ID: <8AF345F3-EB1A-4F2F-97A8-D675791CC0B3@3skel.com>

Last night I could not see what BE and LE could stand for?! Well, of
course, Big Endian and Little Endian. When there is no lead-in to
indicate the byte order, the encoding name can specify it.

Dan

On Nov 27, 2007, at 05:08, Tim Perrett wrote:

>> A few other notes about UTF-16 specifically; UTF-16 will result in a
>> two byte lead in, UTF-16BE will not, nor will UTF-16LE. These latter
>> encodings are not familiar, but may or may not be of interest.
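
The difference between the three UTF-16 variants mentioned in this thread can be seen in a few lines of plain Ruby. This is a sketch under one loud assumption: it uses String#encode, which only exists in Ruby 1.9 and later, so it would not run on the 1.8.6 interpreter used by the posters, and it does not touch libxml at all.

```ruby
# Minimal sketch, assuming Ruby 1.9 or later (String#encode is not in 1.8.6).
text = "a"

# Generic UTF-16: Ruby writes a two-byte lead-in (byte order mark) first.
utf16 = text.encode("UTF-16").bytes.to_a

# UTF-16BE / UTF-16LE: the byte order is fixed by the encoding name,
# so no lead-in is written.
utf16be = text.encode("UTF-16BE").bytes.to_a
utf16le = text.encode("UTF-16LE").bytes.to_a

p utf16    # => [254, 255, 0, 97]
p utf16be  # => [0, 97]
p utf16le  # => [97, 0]
```

The bytes 254, 255 (0xFE 0xFF) are the lead-in; the zero byte that follows in the big-endian form is the kind of embedded NUL that would cut short a string built without an explicit length.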
From erik at hollensbe.org Tue Nov 27 09:44:30 2007
From: erik at hollensbe.org (Erik Hollensbe)
Date: Tue, 27 Nov 2007 06:44:30 -0800
Subject: [libxml-devel] problem with UTF-16 encoding
In-Reply-To: <8AF345F3-EB1A-4F2F-97A8-D675791CC0B3@3skel.com>
References: <4EE9078B-E17A-4928-BD02-0223C23FE3D1@3skel.com> <108BE818-C391-4569-B38E-6AE57B428DF4@3skel.com> <21403713-3E93-47F6-B8B2-5653138D3B00@timperrett.com> <8AF345F3-EB1A-4F2F-97A8-D675791CC0B3@3skel.com>
Message-ID: <12604605-9459-417D-8E31-5DF54A0199A3@hollensbe.org>

On Nov 27, 2007, at 6:42 AM, Dan Janowski wrote:

> Last night I could not see what could BE and LE stand for?! Well, of
> course, Big Endian and Little Endian. When there is no lead in to
> indicate, the encoding can specify.

That's what the byte order mark is for. When we were battling unicode
issues at work, I found the wikipedia articles on the subject very
helpful:

http://en.wikipedia.org/wiki/Unicode is a good starter.

HTH,

-Erik

From paul at aps.org Tue Nov 27 10:17:57 2007
From: paul at aps.org (Paul Dlug)
Date: Tue, 27 Nov 2007 10:17:57 -0500
Subject: [libxml-devel] Load external DTD true/false not working?
In-Reply-To: <0A52C48F-1CCE-49B0-81AC-76C4474D4DE4@3skel.com>
References: <8E13C165-1E53-4D3A-A4CC-F4C0866E2416@aps.org> <0A52C48F-1CCE-49B0-81AC-76C4474D4DE4@3skel.com>
Message-ID:

On Nov 26, 2007, at 9:34 PM, Dan Janowski wrote:

> Hi,
>
> You are at least half correct. xmlSubstituteEntitiesDefaultValue has
> nothing to do with DTD. However, while the _get method you have
> illustrated here makes reference to the wrong variable, the _set
> method does not suffer the same problem. So, while the script return
> value interrogating the variable is not correct, the functionality
> should be. Does the DTD entity loading work when you set it?
>
> The correction is committed in svn #216

Two problems here:

1) As I said in the original message, even changing that variable in
the _get doesn't seem to cause its value to change at all (try the
test I posted earlier).

2) Even when set to false, the DTD is still loaded when the document
is parsed, which does not appear to be correct behavior.

The real problem I'm trying to get around is a possible bug with XPath
on documents that have DTDs specifying namespaces. This may be the
same as another thread I just saw, so I'll post my reply to that one.

Thanks,
Paul

From paul at aps.org Tue Nov 27 10:42:26 2007
From: paul at aps.org (Paul Dlug)
Date: Tue, 27 Nov 2007 10:42:26 -0500
Subject: [libxml-devel] Specifying namespace on XPath?
In-Reply-To:
References:
Message-ID: <302E8E65-6C26-413C-BE8C-37DA50D92007@aps.org>

I think I have a related bug that your patch doesn't fix. If I have a
document with a DTD declaration specifying a namespace and an
identical document without it, the XPath expression finds the node in
the document without the DTD but not in the one with the DTD.

The attached test case illustrates the problem. To replicate:

1) Run dtdtest.rb; you will see that the test fails, being unable to
find a node via XML::Node.find()

2) Comment out the ATTLIST spec in a.dtd and re-run; the test will
pass this time (or remove the DTD declaration from the file dtd.xml).

--Paul

-------------- next part --------------
A non-text attachment was scrubbed...
Name: a.dtd
Type: application/octet-stream
Size: 143 bytes
Desc: not available
Url : http://rubyforge.org/pipermail/libxml-devel/attachments/20071127/3492d043/attachment.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dtd.xml
Type: text/xml
Size: 110 bytes
Desc: not available
Url : http://rubyforge.org/pipermail/libxml-devel/attachments/20071127/3492d043/attachment.xml
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dtdtest.rb
Type: text/x-ruby-script
Size: 420 bytes
Desc: not available
Url : http://rubyforge.org/pipermail/libxml-devel/attachments/20071127/3492d043/attachment.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: nodtd.xml
Type: text/xml
Size: 64 bytes
Desc: not available
Url : http://rubyforge.org/pipermail/libxml-devel/attachments/20071127/3492d043/attachment-0001.xml
-------------- next part --------------

From danj at 3skel.com Tue Nov 27 10:50:48 2007
From: danj at 3skel.com (Dan Janowski)
Date: Tue, 27 Nov 2007 10:50:48 -0500
Subject: [libxml-devel] Specifying namespace on XPath?
In-Reply-To:
References:
Message-ID: <9C4B756D-20AA-41A2-BDC3-7094AB083E45@3skel.com>

Patch applied, svn #219.

Note that 'p' of the .find result will not result in the xml segment
that it used to in 0.3.8. The .find now returns an XPath::Object type
and it has no to_s method defined. You can get the same effect by
using .to_a on XPath::Object.

I had not looked closely enough at this code segment when I
encapsulated the namespace recursion.

Thanks for the patch. You have credit in the svn log.

Dan

From danj at 3skel.com Tue Nov 27 11:05:37 2007
From: danj at 3skel.com (Dan Janowski)
Date: Tue, 27 Nov 2007 11:05:37 -0500
Subject: [libxml-devel] Load external DTD true/false not working?
In-Reply-To:
References: <8E13C165-1E53-4D3A-A4CC-F4C0866E2416@aps.org> <0A52C48F-1CCE-49B0-81AC-76C4474D4DE4@3skel.com>
Message-ID: <3919323D-CBE1-4E5D-83CC-BAC5844FD5B1@3skel.com>

The method mapping was transposed and is fixed in svn #220

See if that works now.

Dan

From paul at aps.org Tue Nov 27 11:11:25 2007
From: paul at aps.org (Paul Dlug)
Date: Tue, 27 Nov 2007 11:11:25 -0500
Subject: [libxml-devel] Load external DTD true/false not working?
In-Reply-To: <3919323D-CBE1-4E5D-83CC-BAC5844FD5B1@3skel.com>
References: <8E13C165-1E53-4D3A-A4CC-F4C0866E2416@aps.org> <0A52C48F-1CCE-49B0-81AC-76C4474D4DE4@3skel.com> <3919323D-CBE1-4E5D-83CC-BAC5844FD5B1@3skel.com>
Message-ID: <125C98C5-804C-4491-BE17-84C7D297CA56@aps.org>

On Nov 27, 2007, at 11:05 AM, Dan Janowski wrote:

> The method mapping was transposed and is fixed in svn #220
>
> See if that works now.

It's working great now. Thanks!

--Paul

From paul at aps.org Tue Nov 27 11:41:20 2007
From: paul at aps.org (Paul Dlug)
Date: Tue, 27 Nov 2007 11:41:20 -0500
Subject: [libxml-devel] Disabling substitution of UTF-8 chars with entities
Message-ID:

There is a serious inconsistency when "round tripping" XML containing
UTF-8 characters. If you output the document to a string after
parsing, you get the UTF-8 back out; if you just grab a node and
convert it to a string, you get the UTF-8 characters substituted with
entities:

utf8test.rb:

require 'xml/libxml'

xml = <
This is a UTF-8 pi: π
XML

parser = XML::Parser.new
parser.string = xml
doc = parser.parse

puts doc.to_s
puts doc.root.to_s

This outputs:

This is a UTF-8 pi: π
This is a UTF-8 pi: &#x3C0;

I would think that the behavior of to_s by default would be to write
the XML out as a string just as it was parsed. Another variant should
be provided if character conversion is desirable.
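
Until the two to_s variants agree, one workaround on the Ruby side is to turn numeric character references back into UTF-8 after serializing a node. The sketch below is plain Ruby and makes several assumptions: decode_char_refs is a hypothetical helper, not part of libxml-ruby; it assumes the substitution takes the form of numeric references such as &#x3C0;; and it uses String#start_with?, which needs Ruby 1.8.7 or later.

```ruby
# Hypothetical helper, not part of libxml-ruby: replace numeric character
# references (hex such as &#x3C0; or decimal such as &#960;) with the
# UTF-8 character they denote. Named entities like &amp; are left alone.
def decode_char_refs(text)
  text.gsub(/&#(x[0-9a-fA-F]+|[0-9]+);/) do
    ref = $1
    codepoint = ref.start_with?("x") ? ref[1..-1].hex : ref.to_i
    [codepoint].pack("U")  # pack("U") emits the code point as UTF-8
  end
end

puts decode_char_refs("This is a UTF-8 pi: &#x3C0;")
```

Applied to the output of a node's to_s, this would give back the same text as the document-level to_s in the case described above; it deliberately does not touch markup-significant entities, which must stay escaped.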
--Paul

From Piotr.Kopyt at gmail.com Tue Nov 27 06:51:29 2007
From: Piotr.Kopyt at gmail.com (optyk)
Date: Tue, 27 Nov 2007 03:51:29 -0800 (PST)
Subject: [libxml-devel] Segmentation fault when add the cloned/copied node
Message-ID: <17dac0df-ad74-4352-a50f-4b7f4662eb6f@t47g2000hsc.googlegroups.com>

hello,

I get a segmentation fault when I add a cloned/copied node to another
node; a script to reproduce the problem is below.

The segv appears with both the clone and copy methods. Interestingly,
with clone the segv is thrown at the div1.child_add(c) line (see
script), but with copy I get it at the printf root statement. Moreover,
copy seems to go wrong only for text nodes; when I use the 't3' div
everything works fine.

I get this error on versions 0.5.2.0, 0.5.2.1 and 0.5.2.2 (latest svn).

It looks like a fix is required in the ruby-libxml code.

BTW, it also looks like ruby_xml_node_copy() in ruby_xml_node.c calls
xmlCopyNode() with the wrong arguments; it should be 2 for a shallow
copy and 1 for a deep copy.

regards,
Piotrek

-------------------- SCRIPT ---------------------
require 'xml/libxml'

str = <<-STR
werwerwerwerwer
Quisque et diam dapibus nisi bibendum blandit.

aaaaaaaaa

STR

XML::Parser.default_keep_blanks = false
xp = XML::Parser.new
xp.string = str
doc = xp.parse

xpath = "//div[@id='t1']"
div1 = doc.find(xpath).to_a[0]
printf "xxx div1: #{div1}\n"

xpath = "//div[@id='t2']"
div2 = doc.find(xpath).to_a[0]
printf "xxx div2: #{div2}\n"

div2.each do |child|
  #c = child.clone
  c = child.copy(false)
  div1.child_add(c)
end

printf "xxx root: #{doc.root}\n"

From paul at aps.org Tue Nov 27 13:48:44 2007
From: paul at aps.org (Paul Dlug)
Date: Tue, 27 Nov 2007 13:48:44 -0500
Subject: [libxml-devel] Status of Patch #7758?
Message-ID: <7699BD5C-2284-4EA1-B5A8-DB32185939F0@aps.org>

I see patch #7758 hasn't been worked on or updated since submission
(long ago):
http://rubyforge.org/tracker/index.php?func=detail&aid=7758&group_id=494&atid=1973

This seems like a great idea and the new parse method eliminates the
need for part of the patch I submitted (#15807). Is there any interest
in getting this into the current library? I would be happy to modify
the patch to bring it up to date with the current trunk version. This
would certainly create a much more user friendly API than what
currently exists.

Thanks,
Paul

From danj at 3skel.com Tue Nov 27 15:08:21 2007
From: danj at 3skel.com (Dan Janowski)
Date: Tue, 27 Nov 2007 15:08:21 -0500
Subject: [libxml-devel] Disabling substitution of UTF-8 chars with entities
In-Reply-To:
References:
Message-ID:

The handling of encoding is not coherent in the extension, as my last
patch on the topic illustrates. While I have no doubt that there are
issues to resolve, in this particular instance I do not get the result
you do.

Anyone wanting to look at the way encoding is handled is welcome to
make a recommendation.

Dan

From paul at aps.org Tue Nov 27 15:24:48 2007
From: paul at aps.org (Paul Dlug)
Date: Tue, 27 Nov 2007 15:24:48 -0500
Subject: [libxml-devel] Disabling substitution of UTF-8 chars with entities
In-Reply-To:
References:
Message-ID:

On Nov 27, 2007, at 3:08 PM, Dan Janowski wrote:

> The handling of encoding is not coherent in the extension, as my
> last patch on the topic illustrates. While I have no doubt that
> there are issues to resolve, in this particular instance I do not
> get the result you do.
>
> Anyone wanting to look at the way encoding is handled is welcome to
> make a recommendation.

I just did a few more experiments; it seems I only get this on
Mac OS X. It works just fine on FreeBSD and Linux (Gentoo). I'll do
some more digging to see if I can identify the cause.

--Paul

From danj at 3skel.com Tue Nov 27 15:26:17 2007
From: danj at 3skel.com (Dan Janowski)
Date: Tue, 27 Nov 2007 15:26:17 -0500
Subject: [libxml-devel] Status of Patch #7758?
In-Reply-To: <7699BD5C-2284-4EA1-B5A8-DB32185939F0@aps.org>
References: <7699BD5C-2284-4EA1-B5A8-DB32185939F0@aps.org>
Message-ID: <2710B742-3FA3-4852-9AC0-EDB24BB95598@3skel.com>

I see the merit in this kind of approach but it cannot conflict with
the libxml work flow. I.e.:

instead of XML::Document.parse(xml) => Document
XML::Parser.parse(xml) => Document

If you want to update the patch for the current code base, I am
willing to apply and eval it.

Dan

From paul at aps.org Tue Nov 27 23:41:22 2007
From: paul at aps.org (Paul Dlug)
Date: Tue, 27 Nov 2007 23:41:22 -0500
Subject: [libxml-devel] Status of Patch #7758?
In-Reply-To: <2710B742-3FA3-4852-9AC0-EDB24BB95598@3skel.com>
References: <7699BD5C-2284-4EA1-B5A8-DB32185939F0@aps.org> <2710B742-3FA3-4852-9AC0-EDB24BB95598@3skel.com>
Message-ID: <790E6B34-DA37-47AA-9CFC-AEC505E28782@aps.org>

On Nov 27, 2007, at 3:26 PM, Dan Janowski wrote:

> I see the merit in this kind of approach but it cannot conflict with
> the libxml work flow. I.e.:
>
> instead of XML::Document.parse(xml) => Document
> XML::Parser.parse(xml) => Document
>
> If you want to update the patch for the current code base, I am
> willing to apply and eval it.

I updated the original patch from Tobias to work with the current
subversion trunk (220). I made the suggested modification above so it's
XML::Parser.parse(xml) rather than XML::Document.parse -- though I do
think XML::Document.parse is a little bit of a cleaner API.

I also found a bug with namespace assignments: if you assign a
namespace to a node not associated with a document, it segfaults:

doc = XML::Document.new
node = XML::Node.new('root')
node.namespace = "t:test"

I'm not sure what the best way to fix this is since I'm not that
familiar with the namespace code at this point.

Thanks,
Paul

-------------- next part --------------
A non-text attachment was scrubbed...
Name: libxml-patched.tar.gz
Type: application/x-gzip
Size: 5214 bytes
Desc: not available
Url : http://rubyforge.org/pipermail/libxml-devel/attachments/20071127/552b3363/attachment.gz

From mortee.lists at kavemalna.hu Wed Nov 28 08:44:44 2007
From: mortee.lists at kavemalna.hu (mortee)
Date: Wed, 28 Nov 2007 14:44:44 +0100
Subject: [libxml-devel] xpath searching without specifying namespace?
In-Reply-To: <059D1CA6-7495-44DA-9356-AF02FDB0EEC5@3skel.com>
References: <68EF391F-3D09-4B98-A4FC-D443D0AB77E1@3skel.com> <059D1CA6-7495-44DA-9356-AF02FDB0EEC5@3skel.com>
Message-ID:

May I ask again? You wrote this almost two weeks ago, I guess.

mortee

Dan Janowski wrote:
> Not yet, but it will be shortly. I am dealing with a seg fault now. A
> few more days?

From mortee.lists at kavemalna.hu Wed Nov 28 08:45:51 2007
From: mortee.lists at kavemalna.hu (mortee)
Date: Wed, 28 Nov 2007 14:45:51 +0100
Subject: [libxml-devel] weird delay
In-Reply-To: <1195039939.766640.56980@57g2000hsv.googlegroups.com>
References: <1195039939.766640.56980@57g2000hsv.googlegroups.com>
Message-ID:

I've posted what I've found out. Do you have any idea of a
cause/solution?

mortee

Trans wrote:
> mortee, do you think you can run this through a profiler and see what
> you come up with?
>
> T.

From transfire at gmail.com Wed Nov 28 09:07:42 2007
From: transfire at gmail.com (Trans)
Date: Wed, 28 Nov 2007 09:07:42 -0500
Subject: [libxml-devel] xpath searching without specifying namespace?
In-Reply-To:
References: <68EF391F-3D09-4B98-A4FC-D443D0AB77E1@3skel.com> <059D1CA6-7495-44DA-9356-AF02FDB0EEC5@3skel.com>
Message-ID: <4b6f054f0711280607x630ec2bdi62eea0caa7c85bbb@mail.gmail.com>

On Nov 28, 2007 8:44 AM, mortee wrote:
>
> May I ask again? You wrote this almost two weeks ago, I guess.

C.S. Time. A day means a week; a week means a month; a month means at
least 3 months; a year...well, no one thinks that far ahead; and most
importantly "two weeks" is secret code for "when it gets done" which
is usually never ;)

T.

From mortee.lists at kavemalna.hu Wed Nov 28 09:21:14 2007
From: mortee.lists at kavemalna.hu (mortee)
Date: Wed, 28 Nov 2007 15:21:14 +0100
Subject: [libxml-devel] xpath searching without specifying namespace?
In-Reply-To: <4b6f054f0711280607x630ec2bdi62eea0caa7c85bbb@mail.gmail.com>
References: <68EF391F-3D09-4B98-A4FC-D443D0AB77E1@3skel.com> <059D1CA6-7495-44DA-9356-AF02FDB0EEC5@3skel.com> <4b6f054f0711280607x630ec2bdi62eea0caa7c85bbb@mail.gmail.com>
Message-ID:

Trans wrote:
> C.S. Time. A day means a week; a week means a month; a month means at
> least 3 months; a year...

Sure, I'm in this business too ((: just this wasn't a job with a
deadline, so I just thought...

> well, no one thinks that far ahead; and most
> importantly "two weeks" is secret code for "when it gets done" which
> is usually never ;)

Erm, that's why I dare to ask from time to time ((:

mortee

From danj at 3skel.com Wed Nov 28 10:54:54 2007
From: danj at 3skel.com (Dan Janowski)
Date: Wed, 28 Nov 2007 10:54:54 -0500
Subject: [libxml-devel] weird delay
In-Reply-To:
References: <1195039939.766640.56980@57g2000hsv.googlegroups.com>
Message-ID: <0C368204-13DE-479C-8887-2EE0C7079DF7@3skel.com>

Please send a full script and data so I can try to reproduce the
problem, i.e. your xml-bm-libxml.rb script.

Dan

On Nov 28, 2007, at 08:45, mortee wrote:

> I've posted what I've found out. Do you have any idea of a cause/
> solution?
>
> mortee
>
> Trans wrote:
>> mortee, do you think you can run this through a profiler and see what
>> you come up with?
>>
>> T.

From keisukefukuda at gmail.com Thu Nov 29 08:39:40 2007
From: keisukefukuda at gmail.com (keisuke fukuda)
Date: Thu, 29 Nov 2007 22:39:40 +0900
Subject: [libxml-devel] Segmentation fault when add the cloned/copied node
In-Reply-To: <17dac0df-ad74-4352-a50f-4b7f4662eb6f@t47g2000hsc.googlegroups.com>
References: <17dac0df-ad74-4352-a50f-4b7f4662eb6f@t47g2000hsc.googlegroups.com>
Message-ID:

Hi,

I've been looking into this problem for the past few days.
I haven't reached an answer yet, but this may be a hint:

gdb told me that the problem is about memory allocation,
specifically a double free.

Changing optyk's script like this:

div2.each do |child|
  #c = child.clone
  c = child.copy(false)
  div1.child_add(c)
  exit # <--------- add this
end

causes

*** glibc detected *** ruby: double free or corruption (fasttop): 0x088b5db0 ***
======= Backtrace: =========
/lib/libc.so.6[0x4f2c0f5d]
/lib/libc.so.6(cfree+0x90)[0x4f2c45b0]
/usr/lib/libxml2.so.2(xmlFreeNode+0x14b)[0x4fa1ce8b]
./xml/libxml_so.so(ruby_xml_node2_free+0x5f)[0x4865ef]
/usr/lib/libruby.so.1.8(rb_gc_call_finalizer_at_exit+0xa7)[0x457f4a87]
/usr/lib/libruby.so.1.8[0x457dad07]
/usr/lib/libruby.so.1.8(ruby_cleanup+0xf9)[0x457e2ad9]
/usr/lib/libruby.so.1.8(ruby_stop+0x1d)[0x457e2bbd]
/usr/lib/libruby.so.1.8[0x457edae1]
ruby[0x80485f2]
/lib/libc.so.6(__libc_start_main+0xdc)[0x4f270dec]
ruby[0x8048521]


Next, I found 'double-wrapping' in ruby_xml_node2_wrap().
It happens because the xmlNode member "_private" is set to NULL by
the xmlAddChild() function.
'_private' is used to store the xmlNode object's wrapper VALUE object
(which is actually a Ruby object pointer).

You can see it by inserting the following code in
ruby_xml_node_child_set_aux():

fprintf(stderr, "--- %p\n", chld->_private);
ret = xmlAddChild(pnode->node, chld);
fprintf(stderr, "--- %p\n", chld->_private);

==> (Run optyk's script)
--- 0xb7f8b9b8
--- (nil)

Just restoring the '_private' value didn't fix the problem, but there
may be a 'set-to-null problem' elsewhere.
So, to conclude, I'm afraid it's not a good idea to store the VALUE
pointer in '_private'.
The libxml header file says it's for internal use related to CORBA ... :-(

I'd be glad if this is of some help to you experts :-)

--
FUKUDA, Keisuke <福田圭祐>
http://d.hatena.ne.jp/keisukefukuda/

From keisukefukuda at gmail.com Fri Nov 30 05:05:25 2007
From: keisukefukuda at gmail.com (keisuke fukuda)
Date: Fri, 30 Nov 2007 19:05:25 +0900
Subject: [libxml-devel] Segmentation fault when add the cloned/copied node
In-Reply-To:
References: <17dac0df-ad74-4352-a50f-4b7f4662eb6f@t47g2000hsc.googlegroups.com>
Message-ID:

Sorry, I misunderstood something.
The libxml source that I looked at was somewhat old, and the latest
source says that _private is for application data.
So using _private to store the VALUE is a valid approach.

But the set-to-null problem is still there... :-(

--
FUKUDA, Keisuke <福田圭祐>
http://d.hatena.ne.jp/keisukefukuda/

From danj at 3skel.com Fri Nov 30 20:51:38 2007
From: danj at 3skel.com (Dan Janowski)
Date: Fri, 30 Nov 2007 20:51:38 -0500
Subject: [libxml-devel] Segmentation fault when add the cloned/copied node
In-Reply-To:
References: <17dac0df-ad74-4352-a50f-4b7f4662eb6f@t47g2000hsc.googlegroups.com>
Message-ID:

I have not been able to reproduce this SEGV on two different processor
architectures. Any additional effort or debugging would be necessary.

Yes, _private is for application use.

Dan
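
The 'double-wrapping' failure mode described in this thread can be
modelled in a few lines of plain Ruby. This is only a toy sketch of the
mechanism, not libxml-ruby code: NativeNode, Wrapper, wrap and release
are hypothetical stand-ins for the xmlNode struct, its Ruby wrapper
object, ruby_xml_node2_wrap() and the free function run by the garbage
collector. It assumes, as reported above, that the back-pointer stored
in _private is lost during xmlAddChild(). Whether that is the actual
cause was not confirmed here; the libxml2 documentation for
xmlAddChild() also notes that an added text node may be merged with an
adjacent text node and freed, which might be related.

```ruby
# Toy model only: none of these names exist in libxml-ruby or libxml2.
NativeNode = Struct.new(:name, :private_slot, :free_count)

class Wrapper
  attr_reader :node

  def initialize(node)
    @node = node
    node.private_slot = self # back-pointer, like xmlNode._private
  end

  # Stand-in for the finalizer that would call xmlFreeNode().
  def release
    @node.free_count += 1
  end
end

# Reuse the existing wrapper when the back-pointer is set,
# otherwise create a fresh one.
def wrap(node)
  node.private_slot || Wrapper.new(node)
end

node  = NativeNode.new("text", nil, 0)
first = wrap(node)
same  = wrap(node)        # back-pointer intact: same wrapper returned

node.private_slot = nil   # models the reported reset of _private
second = wrap(node)       # back-pointer lost: a second wrapper appears

[first, second].each(&:release)

puts first.equal?(same)   # true
puts first.equal?(second) # false
puts node.free_count      # 2, i.e. one native node released twice
```

In the model, two wrappers end up owning one native node, so cleanup
runs twice for it, which matches the shape of the glibc "double free"
report in the backtrace.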