From erik at aldm.se Sun Feb 1 17:12:39 2009
From: erik at aldm.se (Erik Lindblad)
Date: Sun, 01 Feb 2009 23:12:39 +0100
Subject: [libxml-devel] Strange node being created from white space
Message-ID: <49861E57.50301@aldm.se>

Hi

I am experiencing some problems when constructing a document piece by
piece. I have boiled the problem down to the following code:

require 'rubygems'
require 'xml/libxml'

# Envelope XML
xml_string = %{ }

# Create Document
parser = XML::Parser.string(xml_string, :options => XML::Parser::Options::NOBLANKS)
doc = parser.parse

# Add body
envelope_node = doc.find_first("/ns:Envelope", "ns:http://schemas.xmlsoap.org/soap/envelope/")
body_node = XML::Node.new("soapenv:Body")
envelope_node << body_node
puts envelope_node.children.size # => 2, the extra node being a node with content "\n"

# Remove white space in XML string
xml_string.gsub!(/[\n]/, "")

# Create Document with clean string
parser = XML::Parser.string(xml_string, :options => XML::Parser::Options::NOBLANKS)
doc = parser.parse

# Add body
envelope_node = doc.find_first("/ns:Envelope", "ns:http://schemas.xmlsoap.org/soap/envelope/")
body_node = XML::Node.new("soapenv:Body")
envelope_node << body_node
puts envelope_node.children.size # => 1, the correct Body node

The code first creates an XML::Document from an XML string. It then adds
a Body node to the root Envelope node. The problem is that the first
example (where the line breaks are kept) results in an extra node being
created. I provided the NOBLANKS option to show that this does not solve
the problem, but the extra node is created whether it is provided or not.

Does anyone have some input on this? I am running the latest 0.9.8
release (great job with it btw).
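
For readers following the thread, here is a minimal sketch of the
behaviour described above. It is an illustration only and swaps in REXML
(bundled with Ruby) for libxml-ruby so that it runs without the gem; the
one-element envelope and the helper name significant_children are made up
for the example. It assumes that a whitespace-only run between tags is
kept as a text node, and shows one way an application might skip such
nodes when it only cares about elements.

```ruby
require 'rexml/document'

# Return the children an application would usually care about:
# everything except text nodes that contain only whitespace.
# (Hypothetical helper, not part of REXML or libxml-ruby.)
def significant_children(element)
  element.children.reject do |node|
    node.is_a?(REXML::Text) && node.to_s.strip.empty?
  end
end

# Hypothetical stand-in for the envelope in the post, without namespaces.
xml_with_breaks = "<Envelope>\n</Envelope>"

doc = REXML::Document.new(xml_with_breaks)
envelope = doc.root
envelope.add_element("Body")

puts envelope.children.size               # includes the "\n" text node
puts significant_children(envelope).size  # whitespace-only text skipped
puts envelope.elements.size               # element children only
```

With libxml-ruby the same idea would presumably be expressed by selecting
the elements you want with find or find_first instead of relying on
children.size.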
Kindest regards
Erik

From cfis at savagexi.com Sun Feb 1 23:40:26 2009
From: cfis at savagexi.com (Charlie Savage)
Date: Sun, 01 Feb 2009 21:40:26 -0700
Subject: [libxml-devel] Strange node being created from white space
In-Reply-To: <49861E57.50301@aldm.se>
References: <49861E57.50301@aldm.se>
Message-ID: <4986793A.2010307@savagexi.com>

> I am experiencing some problems when constructing a document piece by
> piece. I have boiled downed the problem to the following code:
>
> require 'rubygems'
> require 'xml/libxml'
>
> # Envelope XML
> xml_string = %{
> xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
> xmlns:xsd="http://www.w3.org/2001/XMLSchema"
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
> }
>
> # Create Document
> parser = XML::Parser.string(xml_string, :options =>
> XML::Parser::Options::NOBLANKS)

What happens if you first call XML.default_keep_blanks = false? Should
do the same thing as the parser option, but worth a quick double check.

And also, take a look at this message from the libxml creator. He
strongly discourages the use of NOBLANKS.

http://mail.gnome.org/archives/xml/2007-November/msg00022.html

Charlie
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/x-pkcs7-signature
Size: 3237 bytes
Desc: S/MIME Cryptographic Signature
URL:

From erik at aldm.se Mon Feb 2 01:21:02 2009
From: erik at aldm.se (Erik Lindblad)
Date: Mon, 02 Feb 2009 07:21:02 +0100
Subject: [libxml-devel] Strange node being created from white space
In-Reply-To: <4986793A.2010307@savagexi.com>
References: <49861E57.50301@aldm.se> <4986793A.2010307@savagexi.com>
Message-ID: <498690CE.4000406@aldm.se>

An HTML attachment was scrubbed...
URL:

From cfis at savagexi.com Mon Feb 2 02:08:07 2009
From: cfis at savagexi.com (Charlie Savage)
Date: Mon, 02 Feb 2009 00:08:07 -0700
Subject: [libxml-devel] Strange node being created from white space
In-Reply-To: <498690CE.4000406@aldm.se>
References: <49861E57.50301@aldm.se> <4986793A.2010307@savagexi.com> <498690CE.4000406@aldm.se>
Message-ID: <49869BD7.3040409@savagexi.com>

> Regarding the use of NOBLANKS, I am actually glad it is not required
> (and even discouraged). I only tried it once the initial code did not
> function as I expected. I cannot think of a scenario where you do want a
> single "\n" to become a separate node, so I think it should be default
> behaviour to not make it so.
>
> Do you have any other suggestions what might be wrong?

As I read David's post (see link I posted), there is nothing wrong at
all. This is the way an xml parser is supposed to work and it's up to
you as the developer to know what to do with the blank nodes. Quoting:

------
There is no way to know for sure whether a blank node is significant
or not. The standard says *always keep them*, the XML_PARSE_NOBLANKS
options still tries to do some guesses, but since there is no proven
algorithm which works (c.f. decades of SGML experience) this is an
heuristic, and there is no garanteed result. Don't expect a reliable
behaviour from this option, do not use it, only the application can
tell if a blank node is significant or not.
-------------

I'd say don't worry about it, and use xpath to grab the elements you
need out of the document.

Is this windows by the way? Could be a CRLF issue (search google,
someone once had an issue with that a few years ago).

Charlie

From goodieBoy at gmail.com Fri Feb 6 16:20:22 2009
From: goodieBoy at gmail.com (matt mitchell)
Date: Fri, 6 Feb 2009 13:20:22 -0800 (PST)
Subject: [libxml-devel] How to expand entities?
Message-ID: <536dbaa7-691b-4bad-8005-c3b7039f1b34@w1g2000prk.googlegroups.com>

Hi,

How can I expand the entities in a document using this library?

Thanks,
Matt

From cfis at savagexi.com Fri Feb 6 16:53:27 2009
From: cfis at savagexi.com (Charlie Savage)
Date: Fri, 06 Feb 2009 14:53:27 -0700
Subject: [libxml-devel] How to expand entities?
In-Reply-To: <536dbaa7-691b-4bad-8005-c3b7039f1b34@w1g2000prk.googlegroups.com>
References: <536dbaa7-691b-4bad-8005-c3b7039f1b34@w1g2000prk.googlegroups.com>
Message-ID: <498CB157.60901@savagexi.com>

> How can I expand the entities in a document using this library?

Globally:

XML.default_substitute_entities = true

Per parse run (only for 0.9.8 and higher):

parser = XML::Parser.string('...', :options => XML::Parser::Options::NOENT)
doc = parser.parse

See the rdocs for all the other parser options you can use.

Charlie

From nemesisdesign at gmail.com Sat Feb 7 20:08:29 2009
From: nemesisdesign at gmail.com (Michael Xavier)
Date: Sat, 7 Feb 2009 17:08:29 -0800 (PST)
Subject: [libxml-devel] Odd Segmentation Faults
Message-ID: <93e17ac0-a32e-4874-90ff-6b2449d51ab2@f40g2000pri.googlegroups.com>

Hi. I've been using libxml-ruby 0.9.8 recently to validate XML messages
that two of my apps send to each other against some dtds in a rails
project, which I have stored in lib/dtds. Everything has been working
great but I'm having some problems with nested attributes in my dtds.
Models can potentially be nested in each other so I'm trying to use a
parameter declaration to load in nested attributes. I have played
around with the path I specify and I found I have to reference it from
the rails root.

##Here's my XML:
1 new def reasoning new def body 2 1

##Here's my first attempt at definition.dtd:
%product;

From that I get:
./app/models/queue_item.rb:44: [BUG] Segmentation fault

Line 44 is:
dtd = XML::Dtd.new(File.read(Rails.root + "/lib/dtds/" + dtd_filename))
where dtd_filename is definition.dtd

I have tried validating this part of the dtd, pasting products.dtd in
and validating here and on any online validation service it shows up
fine, but it segfaults every time with libxml-ruby, bringing my testing
to a screeching halt.

So my question is, does anyone know if [BUG] is indicating a bug in
libxml-ruby or something that I'm doing wrong? Any way around it?

From cfis at savagexi.com Sun Feb 8 19:01:35 2009
From: cfis at savagexi.com (Charlie Savage)
Date: Sun, 08 Feb 2009 17:01:35 -0700
Subject: [libxml-devel] Odd Segmentation Faults
In-Reply-To: <93e17ac0-a32e-4874-90ff-6b2449d51ab2@f40g2000pri.googlegroups.com>
References: <93e17ac0-a32e-4874-90ff-6b2449d51ab2@f40g2000pri.googlegroups.com>
Message-ID: <498F725F.4060605@savagexi.com>

Hi Michael,

> %product;
>
> From that I get:
> ./app/models/queue_item.rb:44: [BUG] Segmentation fault
> Line 44 is: dtd = XML::Dtd.new(File.read(Rails.root + "/lib/dtds/"
> + dtd_filename))
> where dtd_filename is definition.dtd

Works fine here, but you didn't post lib/dtds/product.dtd, so the test
I have is different than yours.

> So my question is, does anyone know if [BUG] is indicating a bug in
> libxml-ruby or something that I'm doing wrong?

It sounds like a libxml bug - can you post a test case that shows it
happening? If so, it should be easy to fix.

Charlie

From nemesisdesign at gmail.com Mon Feb 9 13:24:42 2009
From: nemesisdesign at gmail.com (Michael Xavier)
Date: Mon, 9 Feb 2009 10:24:42 -0800 (PST)
Subject: [libxml-devel] Odd Segmentation Faults
In-Reply-To: <498F725F.4060605@savagexi.com>
References: <93e17ac0-a32e-4874-90ff-6b2449d51ab2@f40g2000pri.googlegroups.com> <498F725F.4060605@savagexi.com>
Message-ID:

Oh, sorry my bad. lib/dtds/product.dtd is:

For the test case I created a directory libxmltest with the path
lib/dtds/ containing product.dtd and definition.dtd

My script looks like:

##test_libxml.rb
require 'rubygems'
require 'xml'

xml = ' 1 new def reasoning new def body 2 1 '

begin
  doc = XML::Parser.string(xml).parse
rescue LibXML::XML::Error => e
  raise LibXML::XML::Error.new, "#{e.message}\nXML dump: #{xml}"
end

dtd = XML::Dtd.new(File.read("lib/dtds/definition.dtd"))
puts doc.validate(dtd) ? "Success!" : "Failure!"

##Output:
ruby test_libxml.rb
ruby 1.8.7 (2008-08-11 patchlevel 72) [i486-linux]

Aborted

On Feb 8, 4:01 pm, Charlie Savage wrote:
> Hi Michael,
>
> > %product;
> >
> > From that I get:
> > ./app/models/queue_item.rb:44: [BUG] Segmentation fault
> > Line 44 is: dtd = XML::Dtd.new(File.read(Rails.root + "/lib/dtds/"
> > + dtd_filename))
> > where dtd_filename is definition.dtd
>
> Works fine here, but you didn't post lib/dtds/product.dtd, so the test I
> have is different than yours.
>
> > So my question is, does anyone know if [BUG] is indicating a bug in
> > libxml-ruby or something that I'm doing wrong?
>
> It sounds like a libxml bug - can you post a test case that shows it
> happening? If so, it should be easy to fix.
>
> Charlie

From joe at collectivex.com Mon Feb 9 20:37:47 2009
From: joe at collectivex.com (Joe Khoobyar)
Date: Mon, 09 Feb 2009 20:37:47 -0500
Subject: [libxml-devel] Improving memory usage in libxml-devel
Message-ID: <4990DA6B.3060806@collectivex.com>

A few weeks ago, Chris and I fixed a significant memory leak in
libxml-ruby. However, we still observed significant memory usage when
running a simple test case. Since then, I've intended to try and
decrease the memory usage. Finally, this weekend, I made a small but
significant change which decreases memory usage by an order of
magnitude.

Revision 783 in subversion applies this change. With this change,
test08.rb holds about 13 MB on a Mac, where it held over 120 MB before.

We are simply using libxml2's memory management hooks to direct its
alloc/free functions and related functions to ruby's versions of those
functions. What I originally planned (which is why I'm using
xmlGcMemSetup) was to direct what libxml2 calls "atomic" memory
allocations to the version of ruby's allocator which is able to run
the garbage collector. Of course, what I shortly discovered was that
there was apparently no version of the ruby allocator that /didn't/
attempt to run the garbage collector. That considered, I can probably
safely change this to use xmlMemSetup without any change in behavior.
Honestly, though, it doesn't matter as far as the logic is concerned.

Basically what this does is have both libxml2 and ruby use the same
allocator and allow ruby to run its GC even in response to libxml2
allocations, which keeps memory usage down much more easily. I'm
already running this on all of our production servers and it yielded
instant benefits.

Just wondering: does anyone have any feedback on this change or on this
subject? FYI: I've been running it on 32-bit ruby on OS X Leopard and
on several 64-bit UNIX instances for half a week now.

--
*Joe Khoobyar *
Chief Technical Officer & Lead Developer
CollectiveX - /Groups that work!/
mobile: 585.245.2902
email: joe at collectivex.com
web: www.collectivex.com www.groupsites.com

The third-rate mind is only happy when it is thinking with the majority.
The second-rate mind is only happy when it is thinking with the minority.
The first-rate mind is only happy when it is thinking.
--- A.A. Milne

From rnajlis at gmail.com Mon Feb 9 13:54:47 2009
From: rnajlis at gmail.com (robert.najlis)
Date: Mon, 9 Feb 2009 10:54:47 -0800 (PST)
Subject: [libxml-devel] problem with version .98 on Mac OS X
Message-ID:

I just upgraded to version .98 on Mac OS X and when I tried running
db:migrate I got the following error.

dyld: NSLinkModule() error
dyld: Symbol not found: _htmlNewParserCtxt
  Referenced from: /usr/local/lib/ruby/gems/1.8/gems/libxml-ruby-0.9.8/lib/libxml_ruby.bundle
  Expected in: flat namespace

Trace/BPT trap

Thanks.
Robert

From cfis at savagexi.com Mon Feb 9 21:25:18 2009
From: cfis at savagexi.com (Charlie Savage)
Date: Mon, 09 Feb 2009 19:25:18 -0700
Subject: [libxml-devel] Improving memory usage in libxml-devel
In-Reply-To: <4990DA6B.3060806@collectivex.com>
References: <4990DA6B.3060806@collectivex.com>
Message-ID: <4990E58E.5010103@savagexi.com>

Hey Joe,

> Revision 783 in subversion applies this change. With this change,
> test08.rb holds about 13 MB on a Mac, where it held over 120 MB before.

Ah, interesting. It is amazing how much memory libxml seems to grab on
simple test cases.

> We are simply using libxml2's memory management hooks to direct it's
> alloc/free functions and related functions to ruby's versions of those
> functions. What I originally planned (which is why I'm using
> xmlGcMemSetup) was to direct what libxml2 calls "atomic" memory
> allocations to the version of the ruby's allocator which is able to run
> the garbage collector.

> Basically what this does is have both libxml2 and ruby use the same
> allocator and allow ruby to run it's GC even in response to libxml2
> allocations which keeps memory usage down much more easily. I'm already
> running this on all of our production servers and it yielded instant
> benefits.

What I gather from this is that somehow libxml is not freeing memory
that it allocates. Except that memory is freed, albeit indirectly, when
Ruby runs a garbage collection, freeing Ruby objects and then the
underlying libxml objects. And from using valgrind, I haven't seen any
large memory leaks in libxml-ruby for quite a while.

So I'm wondering what is going on. Does libxml just allocate a big
chunk of memory at startup, and then allocate/free from that memory
over the period that a process runs? And by telling libxml to use
Ruby's allocator, that issue goes away? If that's the case, is there
some setting in libxml to set the starting memory? Or is there
something else entirely different going on?

And is there any downside to having libxml get memory from Ruby?
Something along the lines that ruby's memory allocation wouldn't be as
efficient for libxml's usage patterns?

I'd just like to understand this a bit better since it's such a big
change (not code wise, but running wise). Should we hedge our bets and
make it settable somehow - is that even possible?

> Just wondering: does anyone have any feedback on this change or on this
> subject?
> FYI: I've been running it on 32-bit ruby on OS X Leopard and
> on several 64-bit UNIX instances for half a week now.

Yeah, I'd have to give it a try on Windows. Theoretically, it sounds
fine, but we'll have to see what really happens.

Anyway, excellent work to figure this out!

Charlie

From cfis at savagexi.com Mon Feb 9 21:30:22 2009
From: cfis at savagexi.com (Charlie Savage)
Date: Mon, 09 Feb 2009 19:30:22 -0700
Subject: [libxml-devel] problem with version .98 on Mac OS X
In-Reply-To:
References:
Message-ID: <4990E6BE.9080102@savagexi.com>

Hey Robert, and everyone,

So this seems to be a common problem on OS X. The switch to use
htmlNewParserCtxt happened in the last release and was done:

* To make the html parser api consistent with the xml parser api
* Expose some of the more advanced features of libxml for developers
  that need them.

So I'd prefer to keep this feature if at all possible.

Somehow the bindings compile and link, but don't actually run. There is
a long discussion about it here:

http://rubyforge.org/tracker/index.php?func=detail&aid=23743&group_id=494&atid=1971

Joe's proposed solution is to change the way libxml-ruby builds on OS X
to use --with-xml2-config. I'm not sure if that change is checked into
trunk or not, Joe?

Anyway, any help from the community to test this out and fix the
problem would be much appreciated.

Thanks,

Charlie

From joe at collectivex.com Mon Feb 9 22:06:11 2009
From: joe at collectivex.com (Joe Khoobyar)
Date: Mon, 09 Feb 2009 22:06:11 -0500
Subject: [libxml-devel] problem with version .98 on Mac OS X
In-Reply-To: <4990E6BE.9080102@savagexi.com>
References: <4990E6BE.9080102@savagexi.com>
Message-ID: <4990EF23.6080303@collectivex.com>

Yes, it's checked in. I've been using -- --with-xml2-config on the Mac
as the arguments to gem install or ruby setup.rb in the extension
directory.

I'll make this the default in HEAD since it seems like this is the
desirable behavior - especially with so many people using MacPorts, or
Gentoo Portage, to manage their development tools environment.

Charlie Savage wrote:
> Hey Robert, and everyone,
>
> So this seems to be a common problem on OS X. The switch to use
> htmlNewParserCtxt happened in the last release and was done:
>
> * To make the html parser api consistent with the xml parser api
> * Expose some of the more advanced features of libxml for developers
> that need them.
>
> So I'd prefer to keep this feature if at all possible.
>
> Somehow the bindings compile and link, but doesn't actually run.
> There is a long discussion about it here:
>
> http://rubyforge.org/tracker/index.php?func=detail&aid=23743&group_id=494&atid=1971
>
>
> Joe's proposed solution is to change the way libxml-ruby builds on OS
> X to use --with-xml2-config. I'm not sure if that change is checked
> into trunk or not, Joe?
>
> Anyway, any help from the community to test this out and fix the
> problem would be much appreciated.
>
> Thanks,
>
> Charlie

From joe at collectivex.com Mon Feb 9 22:16:12 2009
From: joe at collectivex.com (Joe Khoobyar)
Date: Mon, 09 Feb 2009 22:16:12 -0500
Subject: [libxml-devel] Improving memory usage in libxml-devel
In-Reply-To: <4990E58E.5010103@savagexi.com>
References: <4990DA6B.3060806@collectivex.com> <4990E58E.5010103@savagexi.com>
Message-ID: <4990F17C.9050201@collectivex.com>

The one thing I didn't include in the post is the output of "vmmap" on
OS X prior to this change.

It showed large amounts of memory both marked as "(freed)" and some
not. That pattern reminded me of internal fragmentation in a malloc
implementation, which is a standard problem to be overcome when
implementing one. If anyone is interested, I could post that as well.

Charlie Savage wrote:
> Hey Joe,
>
>> Revision 783 in subversion applies this change. With this change,
>> test08.rb holds about 13 MB on a Mac, where it held over 120 MB before.
>
> Ah, interesting. It is amazing how much memory libxml seems to grab
> on simple test cases.
>
>> We are simply using libxml2's memory management hooks to direct it's
>> alloc/free functions and related functions to ruby's versions of
>> those functions. What I originally planned (which is why I'm using
>> xmlGcMemSetup) was to direct what libxml2 calls "atomic" memory
>> allocations to the version of the ruby's allocator which is able to
>> run the garbage collector.
>
> > Basically what this does is have both libxml2 and ruby use the same
> > allocator and allow ruby to run it's GC even in response to libxml2
> > allocations which keeps memory usage down much more easily. I'm already
> > running this on all of our production servers and it yielded instant
> > benefits.
>
> What I gather from this is that somehow libxml is not freeing memory
> that it allocates. Except that memory is freed, albeit indirectly,
> when Ruby when it runs a garbage collection freeing Ruby objects and
> then the underlying libxml objects. And from using valgrind, I
> haven't seen any large memory leaks in libxml-ruby for quite a while.
>
> So I'm wondering what is going on. Does libxml just allocate a big
> chunk of memory at startup, and then allocates/frees from that memory
> over the period that a process runs? And by telling libxml to use
> Ruby's allocator, then that issue goes away? If that's the case, is
> there some setting to libxml to set the starting memory. Or is there
> something else entirely different going on?
>
> And is there any downside to having libxml get memory from Ruby?
> Something along the lines that ruby's memory allocation wouldn't be as
> efficient for libxml's usage patterns?
>
> I'd just like to understand this a bit better since its such a big
> change (not code wise, but running wise). Should we hedge our bets
> and make it settable somehow - is that even possible?
>
>> Just wondering: does anyone have any feedback on this change or on
>> this subject?
>> FYI: I've been running it on 32-bit ruby on OS X
>> Leopard and on several 64-bit UNIX instances for half a week now.
>
> Yeah, I'd have to give it a try on Windows. Theoretically, it sounds
> fine, but we'll have to see what really happens.
>
> Anyway, excellent work to figure this out!
>
> Charlie

From transfire at gmail.com Tue Feb 10 09:45:26 2009
From: transfire at gmail.com (Trans)
Date: Tue, 10 Feb 2009 06:45:26 -0800 (PST)
Subject: [libxml-devel] Better performance from find or parsing?
Message-ID: <47df3677-922d-495f-893c-328eee4be772@v13g2000yqm.googlegroups.com>

Say I have about 20 tags to find in a document and manipulate. What is
going to be the faster approach, reading through the entire tree and
handling the matching tags as it comes across them, or doing a #find on
the document for each tag.

Thanks.

From aaron at tenderlovemaking.com Tue Feb 10 11:36:40 2009
From: aaron at tenderlovemaking.com (Aaron Patterson)
Date: Tue, 10 Feb 2009 08:36:40 -0800
Subject: [libxml-devel] Better performance from find or parsing?
In-Reply-To: <47df3677-922d-495f-893c-328eee4be772@v13g2000yqm.googlegroups.com>
References: <47df3677-922d-495f-893c-328eee4be772@v13g2000yqm.googlegroups.com>
Message-ID: <20090210163640.GA17104@amac.local>

On Tue, Feb 10, 2009 at 06:45:26AM -0800, Trans wrote:
> Say I have about 20 tags to find in a document and manipulate. What is
> going to be the faster approach, reading through the entire tree and
> handling the matching tags as it comes across them, or doing a #find
> on the document for each tag.

I think it depends on the size of the document. I've found that there
is a certain point (around 8MB or so IIRC) where doing SAX parsing
becomes cheaper than a DOM + XPath search. I guess it also depends on
the searches you're doing.

I'm not being very helpful! My answer is "it depends".
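
One rough way to explore the trade-off discussed here is to write both
approaches against the same input and time them on your own documents.
The sketch below is only an illustration and uses REXML (bundled with
Ruby) rather than libxml-ruby so that it runs without the gem; the class
TagCounter, the two helper methods and the tiny sample document are all
made up for the example. It shows a single streaming pass that handles
every wanted tag as it is seen, next to a tree build followed by one
XPath query per tag.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Streaming listener: counts each start tag whose name is in `wanted`.
# (Hypothetical class for this example.)
class TagCounter
  include REXML::StreamListener
  attr_reader :counts

  def initialize(wanted)
    @wanted = wanted
    @counts = Hash.new(0)
  end

  def tag_start(name, _attrs)
    @counts[name] += 1 if @wanted.include?(name)
  end
end

# One streaming pass over the input; no tree is built.
def count_by_streaming(xml, wanted)
  listener = TagCounter.new(wanted)
  REXML::Document.parse_stream(xml, listener)
  listener.counts
end

# Build the tree once, then run one XPath query per wanted tag.
def count_by_xpath(xml, wanted)
  doc = REXML::Document.new(xml)
  wanted.each_with_object(Hash.new(0)) do |tag, counts|
    found = REXML::XPath.match(doc, "//#{tag}").size
    counts[tag] = found if found > 0
  end
end

xml = "<root><a/><b><a/></b><c/></root>"
wanted = %w[a c]

p count_by_streaming(xml, wanted)
p count_by_xpath(xml, wanted)
```

Wrapping each call in Benchmark.realtime and feeding in documents of
different sizes should give a feel for where, if anywhere, the crossover
lies for a given workload.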
:-(

--
Aaron Patterson
http://tenderlovemaking.com/

From sean at chittenden.org Tue Feb 10 11:49:48 2009
From: sean at chittenden.org (Sean Chittenden)
Date: Tue, 10 Feb 2009 08:49:48 -0800
Subject: [libxml-devel] Better performance from find or parsing?
In-Reply-To: <47df3677-922d-495f-893c-328eee4be772@v13g2000yqm.googlegroups.com>
References: <47df3677-922d-495f-893c-328eee4be772@v13g2000yqm.googlegroups.com>
Message-ID: <2B2F21FF-73D4-4A30-8610-26806AC11ABC@chittenden.org>

> Say I have about 20 tags to find in a document and manipulate. What is
> going to be the faster approach, reading through the entire tree and
> handling the matching tags as it comes across them, or doing a #find
> on the document for each tag.

Neither. Using a SAX interface is the fastest, actually. Both of the
methods you suggest require creating a DOM tree.

-sc

--
Sean Chittenden
sean at chittenden.org

From port001 at gmail.com Thu Feb 12 11:35:21 2009
From: port001 at gmail.com (ileitch)
Date: Thu, 12 Feb 2009 08:35:21 -0800 (PST)
Subject: [libxml-devel] Memory usage compared to REXML
Message-ID: <04047928-6487-4bb1-9936-fceeae1f5baf@a12g2000yqm.googlegroups.com>

Hey,

I just switched over a chunk of code to use libxml-ruby expecting to
see improvements in performance and memory usage. While performance is
drastically better, memory usage has increased quite significantly.

With REXML parsing a 3.8mb document would consume 259mb, however with
libxml-ruby parsing the same document consumes around 407mb.

Is this expected or am I doing something wrong? I've tried niling
objects and starting GC at various points in my code but to no effect.

I'm using 0.9.8 on OS X.
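
For anyone wanting to reproduce this kind of comparison, here is a
minimal sketch of measuring the resident size of the Ruby process around
a piece of work. It is an assumption-laden illustration, not how the
figures above were obtained: the helper names resident_kb and
rss_growth_kb are made up, it reads /proc/self/status where that exists
and otherwise shells out to ps, and it only works on Unix-like systems.
Process-level numbers are used because memory that a C extension
allocates outside Ruby's heap does not show up in Ruby's own object
counts.

```ruby
# Resident set size of the current process in kilobytes, or nil if it
# cannot be determined on this platform. (Hypothetical helper.)
def resident_kb
  status = "/proc/self/status"
  if File.readable?(status)
    line = File.foreach(status).find { |l| l.start_with?("VmRSS:") }
    return line[/\d+/].to_i if line
  end
  out = `ps -o rss= -p #{Process.pid}`.strip
  out.empty? ? nil : out.to_i
rescue SystemCallError
  nil
end

# Run the block and return [block result, change in resident size in kB].
# GC is requested before each measurement; the change may be nil if the
# resident size could not be read.
def rss_growth_kb
  GC.start
  before = resident_kb
  result = yield
  GC.start
  after = resident_kb
  [result, before && after ? after - before : nil]
end

# Stand-in workload; replace the block with the parse being measured.
_, growth = rss_growth_kb { Array.new(200_000) { |i| "node #{i}" } }
puts "resident size changed by about #{growth.inspect} kB"
```

Running the same block once per parser, each in a fresh process, keeps
the two measurements from influencing each other.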

Thanks
Ian

From cfis at savagexi.com Fri Feb 13 02:26:44 2009
From: cfis at savagexi.com (Charlie Savage)
Date: Fri, 13 Feb 2009 00:26:44 -0700
Subject: [libxml-devel] Improving memory usage in libxml-devel
In-Reply-To: <4990F17C.9050201@collectivex.com>
References: <4990DA6B.3060806@collectivex.com> <4990E58E.5010103@savagexi.com> <4990F17C.9050201@collectivex.com>
Message-ID: <499520B4.4050900@savagexi.com>

> The one thing I didn't include in the post is the output of "vmmap" on
> OS X prior to this change.
>
> It showed large amounts of memory both marked as "(freed)" and some
> not. That pattern reminded me of internal fragmentation in a malloc
> implementation, which is a standard problem to be overcome when
> implementing one. If anyone is interested, I could post that as well.

This seems to work fine on Windows. In my quick look, it improved
memory usage by about 15%....

Charlie

From cfis at savagexi.com Fri Feb 13 02:59:20 2009
From: cfis at savagexi.com (Charlie Savage)
Date: Fri, 13 Feb 2009 00:59:20 -0700
Subject: [libxml-devel] Memory usage compared to REXML
In-Reply-To: <04047928-6487-4bb1-9936-fceeae1f5baf@a12g2000yqm.googlegroups.com>
References: <04047928-6487-4bb1-9936-fceeae1f5baf@a12g2000yqm.googlegroups.com>
Message-ID: <49952858.8080902@savagexi.com>

Hi Ian,

> I just switched over a chunk of code to use libxml-ruby expecting to
> see improvements in performance and memory usage. While performance is
> drastically better, memory usage has increased quite significantly.
>
> With REXML parsing a 3.8mb document would consume 259mb, however with
> libxml-ruby parsing the same document consumes around 407mb.

Try trunk. Joe recently checked in a change that significantly reduces
memory usage on OS X.

See:

http://rubyforge.org/pipermail/libxml-devel/2009-February/001375.html

Charlie

From nemesisdesign at gmail.com Wed Feb 18 17:33:33 2009
From: nemesisdesign at gmail.com (Michael Xavier)
Date: Wed, 18 Feb 2009 14:33:33 -0800 (PST)
Subject: [libxml-devel] Odd Segmentation Faults
In-Reply-To: <498F725F.4060605@savagexi.com>
References: <93e17ac0-a32e-4874-90ff-6b2449d51ab2@f40g2000pri.googlegroups.com> <498F725F.4060605@savagexi.com>
Message-ID: <8a929634-31f3-4d10-ad60-eeab060a89cd@u18g2000pro.googlegroups.com>

On Feb 8, 4:01 pm, Charlie Savage wrote:
> Hi Michael,
>
> > %product;
> >
> > From that I get:
> > ./app/models/queue_item.rb:44: [BUG] Segmentation fault
> > Line 44 is: dtd = XML::Dtd.new(File.read(Rails.root + "/lib/dtds/"
> > + dtd_filename))
> > where dtd_filename is definition.dtd
>
> Works fine here, but you didn't post lib/dtds/product.dtd, so the test I
> have is different than yours.
>
> > So my question is, does anyone know if [BUG] is indicating a bug in
> > libxml-ruby or something that I'm doing wrong?
>
> It sounds like a libxml bug - can you post a test case that shows it
> happening? If so, it should be easy to fix.
>
> Charlie

From havanap at gmail.com Thu Feb 19 08:57:31 2009
From: havanap at gmail.com (Norihito YAMAKAWA)
Date: Thu, 19 Feb 2009 22:57:31 +0900
Subject: [libxml-devel] XML::Error patch for Merb app.
Message-ID: <8154f7ea0902190557h5bc9fe6elb1f6e9339b98e190@mail.gmail.com>

Hi.
While writing a Merb app that does XML Schema validation with LibXML 2.7.3, I found that the global error handler registered on XML::Error is never called back, even when validation fails.

I tried adding code that re-registers the error handler with libxml whenever XML::Error.set_handler is called. It seems to work well: the error handler is now called back.

This is strange, since a simple XML Schema validation script has no problem with the global error callback, but inside the Merb app the trouble occurs. It may be a multi-threading issue, since the Merb framework runs multi-threaded. (Of course I use a Mutex to synchronize libxml access.)

At first I tried to use a block with XML::Document#validate_schema, but the callback was never invoked when validation failed. I spent half a day trying to fix it, digging into the LibXML C API, but I couldn't, so I decided to use the global XML error handler instead. The trouble with the block for validate_schema occurs even in a minimal test script, so I gave up on it.

I also found that XML::Error#domain_to_s doesn't work; this patch includes additional constants for XML::Error.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: libxml-ruby-0.9.8.error_handler.patch
Type: application/octet-stream
Size: 2185 bytes
Desc: not available
URL:

From jacob.lauemoeller at iteray.com Fri Feb 20 09:03:28 2009
From: jacob.lauemoeller at iteray.com (Jacob Lauemøller)
Date: Fri, 20 Feb 2009 15:03:28 +0100
Subject: [libxml-devel] 0.9.8 crashes on Mac OS X
Message-ID: <645B2DCA-D605-4A32-9F5D-ACFF15FC3EC2@iteray.com>

Hi guys

I'm having major problems with libxml-ruby version 0.9.8 on Mac OS X 10.5.6. The gem installs without problems, but schema validation errors result in a Bus Error or Segmentation Fault.

The problem doesn't occur when the same code is run against 0.9.7, nor does it occur with 0.9.8 on Ubuntu.
I have constructed the following sample program which illustrates the problem:

-----------------------[ validate.rb ]----------------
require 'rubygems'
require 'libxml'

include LibXML

if ARGV.length < 2
  puts 'Wrong number of arguments'
  puts 'ruby validate.rb path/to/schema/file path/to/xml/file...'
  exit 1
end

schema = XML::Schema.document(XML::Document.file(ARGV.shift))

ARGV.each do | xml_file_name |
  instance = XML::Document.file(xml_file_name)
  result = instance.validate_schema(schema) do |message, is_error|
    puts "#{is_error ? 'ERROR' : 'WARNING'}: #{xml_file_name}: #{message}"
  end

  puts "#{xml_file_name} is #{result ? 'VALID' : 'INVALID'}"
end
--------------------------------------------------

-----------------------[ schema.rb ]----------------
---------------------------------------------------

-----------------------[ sample1.xml ]---------------
release
Ready
---------------------------------------------------

Running validate.rb with the schema and sample xml file should generate a validation error since the 'name' element is misspelled. But instead (on Mac OS X) I get

$ ruby validate.rb schema.xsd sample1.xml
validate.rb:16: [BUG] Segmentation fault
ruby 1.8.6 (2008-03-03) [universal-darwin9.0]

Abort trap

With 0.9.7 I get

Error: Element 'release' [CT 'Release']: The element content is not valid. at sample1.xml:0.
validate.rb:16:in `validate_schema': Error: Element 'release' [CT 'Release']: The element content is not valid. at sample1.xml:0. (LibXML::XML::Error)
        from validate.rb:16
        from validate.rb:14:in `each'
        from validate.rb:14

On Ubuntu, 0.9.8 gives me

Error: Element '{http://not.important/status}Name': This element is not expected. Expected is ( {http://not.important/status}name ). at sample1.xml:3.
validate.rb:16:in `validate_schema': Error: Element '{http://not.important/status}Name': This element is not expected. Expected is ( {http://not.important/status}name ). at sample1.xml:3. (LibXML::XML::Error)
        from validate.rb:16
        from validate.rb:14:in `each'
        from validate.rb:14

which is spot on.

ruby --version on the Mac yields

ruby 1.8.6 (2008-03-03 patchlevel 114) [universal-darwin9.0]

and on Ubuntu

ruby 1.8.7 (2008-08-11 patchlevel 72) [i486-linux]

The Mac OS X Problem Report for the 0.9.8 case looks like this:

Process:         ruby [26788]
Path:            /usr/bin/ruby
Identifier:      ruby
Version:         ??? (???)
Code Type:       X86 (Native)
Parent Process:  bash [24391]

Date/Time:       2009-02-20 13:27:09.694 +0100
OS Version:      Mac OS X 10.5.6 (9G55)
Report Version:  6

Exception Type:  EXC_BAD_ACCESS (SIGABRT)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000014000004
Crashed Thread:  0

Thread 0 Crashed:
0   libSystem.B.dylib    0x93d15e42 __kill + 10
1   libSystem.B.dylib    0x93d8823a raise + 26
2   libSystem.B.dylib    0x93d94679 abort + 73
3   libruby.1.dylib      0x000cdc10 rb_exc_new + 0
4   libruby.1.dylib      0x0013460f rb_gc_mark_trap_list + 508
5   libSystem.B.dylib    0x93d142bb _sigtramp + 43
6   ???                  0xffffffff 0 + 4294967295
7   libruby.1.dylib      0x000cfdfc rb_clear_cache_by_class + 141
8   libruby.1.dylib      0x000db655 rb_obj_respond_to + 129
9   libruby.1.dylib      0x000db6e1 rb_respond_to + 32
10  libxml_ruby.bundle   0x0019f4ce structuredErrorFunc + 110
11  libxml2.2.dylib      0x915402ba __xmlRaiseError + 1029
12  libxml2.2.dylib      0x915a4eef xmlSchemaSetValidOptions + 5875
13  libxml2.2.dylib      0x915aa076 xmlSchemaNewDocParserCtxt + 8158
14  libxml2.2.dylib      0x915a90fa xmlSchemaNewDocParserCtxt + 4194
15  libxml2.2.dylib      0x915b8733 xmlSchemaNewDocParserCtxt + 67227
16  libxml2.2.dylib      0x915b8841 xmlSchemaValidateDoc + 129
17  libxml_ruby.bundle   0x0019e462 rxml_document_validate_schema + 129
18  libruby.1.dylib      0x000da14c rb_eval_string_wrap + 16637
19  libruby.1.dylib      0x000dad2a rb_eval_string_wrap + 19675
20  libruby.1.dylib      0x000d809a rb_eval_string_wrap + 8267
21  libruby.1.dylib      0x000d70d7 rb_eval_string_wrap + 4232
22  libruby.1.dylib      0x000d89ec rb_eval_string_wrap + 10653
23  libruby.1.dylib      0x000de138 rb_thread_trap_eval + 2393
24  libruby.1.dylib      0x000defe6 rb_yield + 33
25  libruby.1.dylib      0x000bfc7d rb_ary_each + 30
26  libruby.1.dylib      0x000da14c rb_eval_string_wrap + 16637
27  libruby.1.dylib      0x000dad2a rb_eval_string_wrap + 19675
28  libruby.1.dylib      0x000d809a rb_eval_string_wrap + 8267
29  libruby.1.dylib      0x000d70d7 rb_eval_string_wrap + 4232
30  libruby.1.dylib      0x000e702e rb_load_protect + 298
31  libruby.1.dylib      0x000e705f ruby_exec + 22
32  libruby.1.dylib      0x000e708b ruby_run + 42
33  ruby                 0x00001fff 0x1000 + 4095
34  ruby                 0x00001fa6 start + 54

Thread 0 crashed with X86 Thread State (32-bit):
  eax: 0x00000000  ebx: 0x93d94639  ecx: 0xbfffd65c  edx: 0x93d15e42
  edi: 0xa04a2690  esi: 0x00000010  ebp: 0xbfffd678  esp: 0xbfffd65c
   ss: 0x0000001f  efl: 0x00000286  eip: 0x93d15e42   cs: 0x00000007
   ds: 0x0000001f   es: 0x0000001f   fs: 0x00000000   gs: 0x00000037
  cr2: 0xffe2aa74

Binary Images:
0x1000 - 0x1ffe +ruby ??? (???) <660a81a680415ef4ca4d85d3104eed85> /usr/bin/ruby
0x3a000 - 0x3bffc thread.bundle ??? (???) /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/universal-darwin9.0/thread.bundle
0x98000 - 0x9affd stringio.bundle ??? (???) <6ef963050f33481408e309d4ac8a06c7> /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/universal-darwin9.0/stringio.bundle
0xbf000 - 0x15cffb libruby.1.dylib ??? (???) /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/libruby.1.dylib
0x17e000 - 0x191ff7 syck.bundle ??? (???) <12c497c718eb3c5b47d3f286b531dfc4> /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/universal-darwin9.0/syck.bundle
0x197000 - 0x197ffc etc.bundle ??? (???) /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/universal-darwin9.0/etc.bundle
0x19b000 - 0x1b4ffc +libxml_ruby.bundle ??? (???) <7dadcfbc46249cc29bd4d9451269cba6> /Library/Ruby/Gems/1.8/gems/libxml-ruby-0.9.8/lib/libxml_ruby.bundle
0x8fe00000 - 0x8fe2db43 dyld 97.1 (???) <100d362e03410f181a34e04e94189ae5> /usr/lib/dyld
0x90853000 - 0x9085afe9 libgcc_s.1.dylib ??? (???) /usr/lib/libgcc_s.1.dylib
0x90ab6000 - 0x90beeff7 libicucore.A.dylib ??? (???) <18098dcf431603fe47ee027a60006c85> /usr/lib/libicucore.A.dylib
0x9151a000 - 0x915fbff7 libxml2.2.dylib ??? (???) /usr/lib/libxml2.2.dylib
0x93ca7000 - 0x93e0eff3 libSystem.B.dylib ??? (???) /usr/lib/libSystem.B.dylib
0x95416000 - 0x95424ffd libz.1.dylib ??? (???) <5ddd8539ae2ebfd8e7cc1c57525385c7> /usr/lib/libz.1.dylib
0x95a2f000 - 0x95a8cffb libstdc++.6.dylib ??? (???) <04b812dcec670daa8b7d2852ab14be60> /usr/lib/libstdc++.6.dylib
0x95d38000 - 0x95d3cfff libmathCommon.A.dylib ??? (???) /usr/lib/system/libmathCommon.A.dylib
0x95f97000 - 0x9608bff4 libiconv.2.dylib ??? (???) /usr/lib/libiconv.2.dylib
0xffff0000 - 0xffff1780 libSystem.B.dylib ??? (???) /usr/lib/libSystem.B.dylib

Any ideas?

Kind regards,
Jacob

From havanap at gmail.com Fri Feb 20 23:56:13 2009
From: havanap at gmail.com (Norihito YAMAKAWA)
Date: Sat, 21 Feb 2009 13:56:13 +0900
Subject: [libxml-devel] 0.9.8 crashes on Mac OS X
In-Reply-To: <645B2DCA-D605-4A32-9F5D-ACFF15FC3EC2@iteray.com>
References: <645B2DCA-D605-4A32-9F5D-ACFF15FC3EC2@iteray.com>
Message-ID: <8154f7ea0902202056x763c4eb5o101dd71b613d5473@mail.gmail.com>

Hi.

I tested on OS X 10.4.11 with LibXML 2.7.2 from Darwin ports, and it works fine. The validation error is printed just like your Ubuntu one, with no segmentation fault.

This might be a Leopard or LibXML2 issue. Which LibXML version do you use? You can get it from LibXML::XML::LIBXML_VERSION.

In addition, validate_schema's callback does not seem to work currently. LibXML2 has an error callback API for schema validation, and libxml-ruby uses it, but the callback is never invoked. It may be a LibXML2 bug.
You can use the global error handler, set with XML::Error.set_handler, to capture validation errors.

2009/2/20 Jacob Lauemøller :
> Hi guys
>
> I'm having major problems with libxml-ruby version 0.9.8 on Mac OS X 10.5.6.
> The gem installs without problems, but schema validation errors result in a
> Bus Error or Segmentation Fault.
> [...]

From jacob.lauemoeller at iteray.com Sat Feb 21 02:50:28 2009
From: jacob.lauemoeller at iteray.com (Jacob Lauemøller)
Date: Sat, 21 Feb 2009 08:50:28 +0100
Subject: [libxml-devel] 0.9.8 crashes on Mac OS X
In-Reply-To: <8154f7ea0902202056x763c4eb5o101dd71b613d5473@mail.gmail.com>
References: <645B2DCA-D605-4A32-9F5D-ACFF15FC3EC2@iteray.com> <8154f7ea0902202056x763c4eb5o101dd71b613d5473@mail.gmail.com>
Message-ID:

Hi,

LibXML::XML::LIBXML_VERSION reports

2.6.16

but ports reports

libxml2 @2.7.2 textproc/libxml2

Something is not quite right here.

Thanks for the tip about the validation call-back!

Cheers,
Jacob

On 21/02/2009, at 05.56, Norihito YAMAKAWA wrote:

> Hi.
>
> I tested on OSX 10.4.11 with LibXML 2.7.2 from Darwin ports,
> but just works fine.
> Validation error printed same as your Ubuntsu one,
> no segmentation fault.
>
> This might be Leopard or LibXML2 issue.
> Which LibXML version does you use ?
> It can be given by LibXML::XML::LIBXML_VERSION.
>
>
> In addition, validate_schema's callback seems not work currently.
> Though LibXML2 has error call back API on Schema validation,
> ruby-libxml also use it, but never invoked.
> It would be LibXML2 bug.
>
> You can use global error handler to
> capture validation errors, set by XML::Error#set_handler.
>
>
>
> 2009/2/20 Jacob Lauemøller :
>> Hi guys
>>
>> I'm having major problems with libxml-ruby version 0.9.8 on Mac OS
>> X 10.5.6.
>> The gem installs without problems, but schema validation errors
>> result in a
>> Bus Error or Segmentation Fault.
>>
>> The problem doesn't occur when the same code is run against 0.9.7,
>> nor does
>> it occur with 0.9.8. on Ubuntu.
From fritzvl at gmail.com Tue Feb 24 15:45:31 2009
From: fritzvl at gmail.com (fritzvl)
Date: Tue, 24 Feb 2009 22:45:31 +0200
Subject: [libxml-devel] Installation problem - ruby 1.9.0 on Ubuntu 8.10
Message-ID: <49A45C6B.80408@gmail.com>

After a pause I returned to the extension build problem. Version 0.9.7 now builds successfully: I just added an include of stdlib.h by default in my /usr/../i486-linux/rbconfig.rb, and that solved the problem.

With 0.9.8 I had to change the include path for st.h to "ruby/st.h" in ruby_xml_xpath_context.c and ruby_xml_document.c.

Then I ran make manually and got these error messages:

ruby_xml_node.c: In function ‘rxml_constant_stringref’:
ruby_xml_node.c:1248: error: lvalue required as left operand of assignment
ruby_xml_node.c:1249: warning: pointer targets in passing argument 1 of ‘strlen’ differ in signedness
ruby_xml_node.c:1249: error: lvalue required as left operand of assignment
ruby_xml_node.c:1250: error: ‘struct RString’ has no member named ‘aux’
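The st.h change described above follows from Ruby 1.9 moving its C headers under a ruby/ directory. As a rough illustration only, the choice could be written down as a small Ruby helper like the one below. The name st_header_for is hypothetical and not part of libxml-ruby; a real extconf.rb would more likely probe the header with mkmf's have_header than compare version strings.

```ruby
# Hypothetical helper (not part of libxml-ruby): pick the include path
# for st.h from a Ruby version string.
def st_header_for(ruby_version = RUBY_VERSION)
  if Gem::Version.new(ruby_version) >= Gem::Version.new('1.9.0')
    'ruby/st.h' # Ruby 1.9 keeps its headers under ruby/
  else
    'st.h'      # Ruby 1.8 layout
  end
end

# Print the include line that matches the running interpreter.
puts "#include <#{st_header_for}>"
```

This only covers the header path. The remaining errors in ruby_xml_node.c come from direct access to struct RString fields, which would need changes in the C source itself.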
From skamalakumar at yahoo.com Tue Feb 24 10:15:34 2009
From: skamalakumar at yahoo.com (Kamal)
Date: Tue, 24 Feb 2009 07:15:34 -0800 (PST)
Subject: [libxml-devel] validating a xml message with XML schema in multiple files
Message-ID:

Hi All,

I am validating an XML message against an XML schema that is split across multiple files. The XML message has no reference to the schema. I am getting the error message "Element : No matching global declaration available for the validation root. schema error". Can you please tell me how to solve it? Here is the source code.

    xmlDocPtr schema_doc = xmlReadFile("fixml-main-4-4.xsd", NULL, XML_PARSE_NONET);
    if (schema_doc == NULL) {
        /* the schema cannot be loaded or is not well-formed */
        return -1;
    }
    xmlSchemaParserCtxtPtr parser_ctxt = xmlSchemaNewDocParserCtxt(schema_doc);
    if (parser_ctxt == NULL) {
        /* unable to create a parser context for the schema */
        xmlFreeDoc(schema_doc);
        return -2;
    }
    xmlSchemaPtr schema = xmlSchemaParse(parser_ctxt);
    if (schema == NULL) {
        /* the schema itself is not valid */
        xmlSchemaFreeParserCtxt(parser_ctxt);
        xmlFreeDoc(schema_doc);
        return -3;
    }
    xmlSchemaValidCtxtPtr valid_ctxt = xmlSchemaNewValidCtxt(schema);
    if (valid_ctxt == NULL) {
        /* unable to create a validation context for the schema */
        xmlSchemaFree(schema);
        xmlSchemaFreeParserCtxt(parser_ctxt);
        xmlFreeDoc(schema_doc);
        return -4;
    }
    xmlSchemaSetValidErrors(valid_ctxt,
                            (xmlSchemaValidityErrorFunc) fprintf,
                            (xmlSchemaValidityWarningFunc) fprintf,
                            stderr);

    doc = xmlReadMemory(buffer, size, "noname.xml", NULL, 0);
    if (doc == NULL) {
        fprintf(stderr, "Failed to parse document\n");
        return;
    }

    #if 1
    int nRet;
    if ((nRet = xmlSchemaValidateDoc(valid_ctxt, doc)) != 0) {
        fprintf(stderr, "schema error\n");
        return;
    }
    #endif

    cur = xmlDocGetRootElement(doc);
    if (cur == NULL) {
        fprintf(stderr, "empty document\n");
        xmlFreeDoc(doc);
        return;
    }
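One common cause of the "No matching global declaration available for the validation root" message is that the instance's root element (namespace plus local name) has no global xs:element declaration in the schema that was loaded, for example because the message does not carry the schema's targetNamespace. The Ruby sketch below checks just that, using REXML rather than libxml. It is an illustration under assumptions, not a diagnosis of the message above: the helper names and the example.org namespace are made up, and it looks at a single schema file without following xs:include or xs:import.

```ruby
require 'rexml/document'

XSD_NS = 'http://www.w3.org/2001/XMLSchema'

# [namespace URI, local name] of the instance document's root element.
def root_qname(xml)
  root = REXML::Document.new(xml).root
  [root.namespace.to_s, root.name]
end

# [targetNamespace, names of global element declarations] for ONE schema
# file; included or imported files are not followed in this sketch.
def global_elements(xsd)
  schema = REXML::Document.new(xsd).root
  names = schema.elements.to_a
                .select { |e| e.name == 'element' && e.namespace == XSD_NS }
                .map { |e| e.attributes['name'] }
  [schema.attributes['targetNamespace'].to_s, names]
end

# True when the root element matches a global declaration in the schema.
def root_declared?(xml, xsd)
  ns, name = root_qname(xml)
  target_ns, names = global_elements(xsd)
  ns == target_ns && names.include?(name)
end

# Made-up schema and instances, for illustration only.
SCHEMA = <<XSD
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.org/orders"
           elementFormDefault="qualified">
  <xs:element name="Order" type="xs:string"/>
</xs:schema>
XSD

puts root_declared?('<Order xmlns="http://example.org/orders">x</Order>', SCHEMA) # prints true
puts root_declared?('<Order>x</Order>', SCHEMA)                                   # prints false
```

With a schema spread over several files, the same check would have to be run against whichever file actually declares the root element, and that file has to be reachable from the main schema through xs:include or xs:import.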