From rosco at roscopeco.co.uk Sat Dec 2 05:00:19 2006
From: rosco at roscopeco.co.uk (Ross Bamford)
Date: Sat, 02 Dec 2006 10:00:19 -0000
Subject: [libxml-devel] [PATCH] XML::Reader
In-Reply-To:
References:
Message-ID:

Hi Laurent,

On Thu, 30 Nov 2006 22:09:26 -0000, Laurent Sansonetti wrote:

>> There are just a couple of things I'd propose: maybe we should rename
>> XML::Reader.walker to XML::Reader.document, and also alias XML::Reader.new
>> to XML::Reader.string. IMO it seems more consistent (all constructors
>> named for the argument they take).
>
> Good ideas.
>
> Maybe we could perhaps add XML::Document#reader that would do
> XML::Reader.walker(self).
>

Good thinking, that'd be a handy addition.

>>> The patch contains some test cases, but everything is not
>>> covered (yet). Also, there is no RDoc comments yet, I would like to be
>>> sure that you agree with the API before starting to document it :)
>>
>> I wonder, any chance you might join the project, commit this patch, and
>> document it in CVS? That way, you'll be able to help maintain the code too
>> ;)
>
> Sure, I would be honored. My rubyforge username is 'lrz'.
>

Cool. I've added you to the project, and you should have commit rights
now, so please go ahead and merge your patch straight to HEAD. Welcome
aboard!

Cheers,
Ross

--
Ross Bamford - rosco at roscopeco.co.uk

From lrz at chopine.be Mon Dec 4 18:44:17 2006
From: lrz at chopine.be (Laurent Sansonetti)
Date: Tue, 5 Dec 2006 00:44:17 +0100
Subject: [libxml-devel] [PATCH] XML::Reader
In-Reply-To:
References:
Message-ID: <4B5F7AE5-0423-41AD-B2BC-302B975D2EDA@chopine.be>

Hi Ross,

On Dec 2, 2006, at 11:00 AM, Ross Bamford wrote:

> On Thu, 30 Nov 2006 22:09:26 -0000, Laurent Sansonetti wrote:
>
>>> There are just a couple of things I'd propose: maybe we should rename
>>> XML::Reader.walker to XML::Reader.document, and also alias XML::Reader.new
>>> to XML::Reader.string.
>>> IMO it seems more consistent (all constructors
>>> named for the argument they take).
>>
>> Good ideas.
>>
>> Maybe we could perhaps add XML::Document#reader that would do
>> XML::Reader.walker(self).
>
> Good thinking, that'd be a handy addition.

I added all of them.

>>>> The patch contains some test cases, but everything is not
>>>> covered (yet). Also, there is no RDoc comments yet, I would like to be
>>>> sure that you agree with the API before starting to document it :)
>>>
>>> I wonder, any chance you might join the project, commit this patch, and
>>> document it in CVS? That way, you'll be able to help maintain the code too
>>> ;)
>>
>> Sure, I would be honored. My rubyforge username is 'lrz'.
>>
>
> Cool. I've added you to the project, and you should have commit rights
> now, so please go ahead and merge your patch straight to HEAD. Welcome
> aboard!

Excellent! I had trouble checking out the project in the past days (it
looks like I got the same problem with another RubyForge project), but
now it's working. I merged the patch along with the suggestions. I
will add documentation soon.

I filed the CHANGELOG as well though it doesn't seem to be
maintained :-) What's the purpose of this file? Is it only for major
changes?

Cheers,
Laurent

From rosco at roscopeco.co.uk Tue Dec 5 11:34:00 2006
From: rosco at roscopeco.co.uk (Ross Bamford)
Date: Tue, 05 Dec 2006 16:34:00 -0000
Subject: [libxml-devel] [PATCH] XML::Reader
In-Reply-To: <4B5F7AE5-0423-41AD-B2BC-302B975D2EDA@chopine.be>
References: <4B5F7AE5-0423-41AD-B2BC-302B975D2EDA@chopine.be>
Message-ID:

On Mon, 04 Dec 2006 23:44:17 -0000, Laurent Sansonetti wrote:

> Hi Ross,
>
> On Dec 2, 2006, at 11:00 AM, Ross Bamford wrote:
>
>> On Thu, 30 Nov 2006 22:09:26 -0000, Laurent Sansonetti wrote:
>>
>>>> There are just a couple of things I'd propose: maybe we should rename
>>>> XML::Reader.walker to XML::Reader.document, and also alias XML::Reader.new
>>>> to XML::Reader.string.
>>>> IMO it seems more consistent (all constructors
>>>> named for the argument they take).
>>>
>>> Good ideas.
>>>
>>> Maybe we could perhaps add XML::Document#reader that would do
>>> XML::Reader.walker(self).
>>
>> Good thinking, that'd be a handy addition.
>
> I added all of them.
>
>>>>> The patch contains some test cases, but everything is not
>>>>> covered (yet). Also, there is no RDoc comments yet, I would like to be
>>>>> sure that you agree with the API before starting to document it :)
>>>>
>>>> I wonder, any chance you might join the project, commit this patch, and
>>>> document it in CVS? That way, you'll be able to help maintain the code too
>>>> ;)
>>>
>>> Sure, I would be honored. My rubyforge username is 'lrz'.
>>>
>>
>> Cool. I've added you to the project, and you should have commit rights
>> now, so please go ahead and merge your patch straight to HEAD. Welcome
>> aboard!
>
> Excellent! I had trouble checking out the project in the past days (it
> looks like I got the same problem with another RubyForge project), but
> now it's working. I merged the patch along with the suggestions. I
> will add documentation soon.
>

Excellent, it all looks good from this end. I can't tell you how great it
is that this is now in :)

> I filed the CHANGELOG as well though it doesn't seem to be
> maintained :-) What's the purpose of this file? Is it only for major
> changes?
>

It probably should be updated, but I tend to forget when I'm working on
something in public version control (it's easy to grab a changelog from
cvs or svn anyway). I guess it does make life easier for distribution
package maintainers though, so I'll get around to updating it from the CVS
logs in the next few days.
Cheers,
--
Ross Bamford - rosco at roscopeco.co.uk

From rosco at roscopeco.co.uk Thu Dec 21 19:10:09 2006
From: rosco at roscopeco.co.uk (Ross Bamford)
Date: Fri, 22 Dec 2006 00:10:09 -0000
Subject: [libxml-devel] 0.4 release
Message-ID:

Hi,

As you know, I've been working on a few things for the forthcoming 0.4
release, and recently there've been a few fixes and features gone into
CVS, including Laurent's XML::Reader and the HTML parser. There is more
that I had on my list of things to do, but unfortunately my time has
become very limited at the moment and it's looking like I won't be getting
the block of time I was hoping for over christmas. In light of this, I'm
now of the opinion that we should get the 0.4 release out, since most of
the remaining changes will be purely additive, and I feel it's best to get
the new features out there so people can start hammering on them.

I would like to have Laurent's documentation in, but I don't think it's
critical at this stage - the reader API is nice and self-explanatory I
think, and the test-cases for it show it in action, so I think it's okay
to work on that during 0.4 development.

Looking ahead a bit, I propose that we work on the remaining features and
fixes (win32 compatibility, namespaces, and relaxng being my personal big
three, but there are others...) during 0.4, heading toward stabilising the
code with a view to branching off to a 1.0 release eventually.

As a side-note to the win32 compatibility issue, I've recently made some
changes that make the codebase *theoretically* compatible (i.e. I can't
actually test it at the present time) with VC 6 on Win32, so anyone out
there with the appropriate microsoftware should be able to compile up
binaries for the one-click. I'm trying to work toward providing
pre-compiled binaries for win32/one-click, but I still have some figuring
out to do on that platform, and not much time to do it right now.

Any thoughts on any of this would be most welcome.
Cheers,
--
Ross Bamford - rosco at roscopeco.co.uk

From rosco at roscopeco.co.uk Fri Dec 22 05:49:05 2006
From: rosco at roscopeco.co.uk (Ross Bamford)
Date: Fri, 22 Dec 2006 10:49:05 -0000
Subject: [libxml-devel] 0.4 release
In-Reply-To:
References:
Message-ID:

Hi,

On Fri, 22 Dec 2006 00:10:09 -0000, Ross Bamford wrote:

> As you know, I've been working on a few things for the forthcoming 0.4
> release, and recently there've been a few fixes and features gone into
> CVS, including Laurent's XML::Reader and the HTML parser. There is more
> that I had on my list of things to do, but unfortunately my time has
> become very limited at the moment and it's looking like I won't be getting
> the block of time I was hoping for over christmas. In light of this, I'm
> now of the opinion that we should get the 0.4 release out, since most of
> the remaining changes will be purely additive, and I feel it's best to get
> the new features out there so people can start hammering on them.
>

As an update to this, I've been talking publicity and bug days with Pat
Eyler (check out [1] if you've not yet seen it) and he recommends that we
make the next release a 0.4 preview, and collect the small bug reports
from that for a bug day in the near future. I think that makes sense, so
I'm going to make a start on getting that set up.

We're also talking about other bug day activities that might help get
people excited about / involved in the project - like extending test
coverage and writing documentation. We're lucky to have Pat's help on
this, because it's hardly my strong suit :). Any other suggestions anyone
has would be most welcome.

I'm not sure yet exactly when this would happen, but it'd be soon - we
don't want to be keeping even small bugs hanging around for too long.
Thanks,
Ross

[1]: http://on-ruby.blogspot.com/2006/12/not-just-every-day-its-bug-day.html

--
Ross Bamford - rosco at roscopeco.co.uk

From rosco at roscopeco.co.uk Tue Dec 26 10:31:09 2006
From: rosco at roscopeco.co.uk (Ross Bamford)
Date: Tue, 26 Dec 2006 15:31:09 -0000
Subject: [libxml-devel] Libxml-ruby 0.4.0pre01
Message-ID:

Hi,

I've just released 0.4.0pre01 (in a new development package on Rubyforge).
This release is purely a snapshot from current CVS head. Don't use it in
production!

Happy holidays,
Ross

--
Ross Bamford - rosco at roscopeco.co.uk

From ast at atownley.org Wed Dec 27 05:22:21 2006
From: ast at atownley.org (Andrew S. Townley)
Date: Wed, 27 Dec 2006 10:22:21 +0000
Subject: [libxml-devel] libxml2 SAX2 support
Message-ID: <1167214940.9022.36.camel@macross>

Hi,

I'm looking for a good SAX implementation for Ruby to do some XML
parsing. I see from the mailing list archives you've just added the
pull-based XML::Reader based on the Microsoft C# API, but I want "real"
SAX-based event-driven parsing.

I also noticed that the current support for this in the library is based
on the now deprecated SAXv1 interface rather than the newer SAX2
interface. Are there any plans to migrate this or change it? Also, I
think it would be worthwhile implementing a filter mechanism similar to
the Java SAX API's XMLReader, XMLFilter and friends (originally, I saw
the XML::Reader subject in the archives and thought it may have been
already done).

Any info would be most appreciated. My C is really rusty (did a little
about 3 years ago, but haven't seriously done anything with it for about
6 years), and I've never written a Ruby extension before. However, I'd
really rather use Ruby for my project than Python (which already has all
of this stuff) because I just like Ruby better. I might be able to
help, but I probably won't be the most efficient person in the world to
do this.

Thanks in advance,

ast
--
Andrew S. Townley http://atownley.org

From transfire at gmail.com Thu Dec 28 07:26:03 2006
From: transfire at gmail.com (TRANS)
Date: Thu, 28 Dec 2006 07:26:03 -0500
Subject: [libxml-devel] libxml2 SAX2 support
In-Reply-To: <1167214940.9022.36.camel@macross>
References: <1167214940.9022.36.camel@macross>
Message-ID: <4b6f054f0612280426t5687be64w2630e1c896adb0dd@mail.gmail.com>

On 12/27/06, Andrew S. Townley wrote:
> Hi,
>
> I'm looking for a good SAX implementation for Ruby to do some XML
> parsing. I see from the mailing list archives you've just added the
> pull-based XML::Reader based on the Microsoft C# API, but I want "real"
> SAX-based event-driven parsing.

What are you planning to do with it? I'm curious, is it because you
prefer that way of doing it, or is there a technical reason?

> I also noticed that the current support for this in the library is based
> on the now deprecated SAXv1 interface rather than the newer SAX2
> interface. Are there any plans to migrate this or change it? Also, I
> think it would be worthwhile implementing a filter mechanism similar to
> the Java SAX API's XMLReader, XMLFilter and friends (originally, I saw
> the XML::Reader subject in the archives and thought it may have been
> already done).

I don't know enough about this to say, but since this is a binding to
libxml, what is libxml's support of this?

/trans

From ast at atownley.org Thu Dec 28 11:39:57 2006
From: ast at atownley.org (Andrew S. Townley)
Date: Thu, 28 Dec 2006 16:39:57 +0000
Subject: [libxml-devel] libxml2 SAX2 support
In-Reply-To: <4b6f054f0612280426t5687be64w2630e1c896adb0dd@mail.gmail.com>
References: <1167214940.9022.36.camel@macross> <4b6f054f0612280426t5687be64w2630e1c896adb0dd@mail.gmail.com>
Message-ID: <1167323996.9022.154.camel@macross>

On Thu, 2006-12-28 at 12:26, TRANS wrote:
> On 12/27/06, Andrew S. Townley wrote:
> > Hi,
> >
> > I'm looking for a good SAX implementation for Ruby to do some XML
> > parsing.
> > I see from the mailing list archives you've just added the
> > pull-based XML::Reader based on the Microsoft C# API, but I want "real"
> > SAX-based event-driven parsing.
>
> What are you planning to do with it? I'm curious, is it because you
> prefer that way of doing it, or is there a technical reason?

A bit of both :)

I want to be able to efficiently process some arbitrarily large
documents, but I also want to be able to provide some processing filters
to perform some normalization somewhat equivalent to the C14N/XCL-C14N
process as well as some other, non-XSLT types of transformations.

I've also done quite a bit of this with Java, so it is a model which is
familiar to me, and, once you get used to it, I find that it's easier to
deal with than pull parsing for these types of tasks.

There's also a desire to not tie myself too tightly to a particular XML
parser implementation because certain ones are better for certain
tasks. Ideally, I'd like to see a Ruby XML::Parser::SAX module which
would either be implemented by or at least support a number of different
parsers, based on what you needed at the time.

>
> > I also noticed that the current support for this in the library is based
> > on the now deprecated SAXv1 interface rather than the newer SAX2
> > interface. Are there any plans to migrate this or change it? Also, I
> > think it would be worthwhile implementing a filter mechanism similar to
> > the Java SAX API's XMLReader, XMLFilter and friends (originally, I saw
> > the XML::Reader subject in the archives and thought it may have been
> > already done).
>
> I don't know enough about this to say, but since this is a binding to
> libxml, what is libxml's support of this?

libxml2's support of SAX2 seems to be reasonably complete. I haven't
looked at it under the magnifying glass yet, though. As long as the base
events are there, implementing stuff like the SAX XMLReader/XMLFilter
Java API and friends would be done on top of the C code using Ruby.
There is some support for filtering, but I'd need to look at how hard it
would be to integrate what's there with Ruby. If I had to do it myself,
I'd be able to do it quicker in Ruby than in C because I'm out of
practice, and I've never worked with extending Ruby in C before.
However, from the base library support, there's no current code that I
could find in this project that uses the new SAX2 API in libxml2
(http://xmlsoft.org/html/libxml-SAX2.html).

I keep coming back to a personal Ruby vs. Python debate for some
projects that I want to work on. Python has pretty rich XML support,
but I don't like it as much as Ruby. However, I don't want to have to
start building the project from the ground up, including all of the
supporting libraries. There's also the Unicode thing in Ruby, but I can
probably live with the double-byte hacks. I'm also more current in Ruby
than Python since it's been 4 years since I did much with Python, but
it'll probably boil down to how much I'm really going to need to write
before I can focus on the problem I want to try and solve.

Cheers,

ast
--
Andrew S. Townley http://atownley.org

From rosco at roscopeco.co.uk Fri Dec 29 08:04:54 2006
From: rosco at roscopeco.co.uk (Ross Bamford)
Date: Fri, 29 Dec 2006 13:04:54 -0000
Subject: [libxml-devel] libxml2 SAX2 support
In-Reply-To: <1167214940.9022.36.camel@macross>
References: <1167214940.9022.36.camel@macross>
Message-ID:

On Wed, 27 Dec 2006 10:22:21 -0000, Andrew S. Townley wrote:

> Hi,
>
> I'm looking for a good SAX implementation for Ruby to do some XML
> parsing. I see from the mailing list archives you've just added the
> pull-based XML::Reader based on the Microsoft C# API, but I want "real"
> SAX-based event-driven parsing.
>

Are you sure? The new reader API is nice... :)

> I also noticed that the current support for this in the library is based
> on the now deprecated SAXv1 interface rather than the newer SAX2
> interface. Are there any plans to migrate this or change it?
> Also, I think it would be worthwhile implementing a filter mechanism
> similar to the Java SAX API's XMLReader, XMLFilter and friends
> (originally, I saw the XML::Reader subject in the archives and thought
> it may have been already done).
>
> Any info would be most appreciated. My C is really rusty (did a little
> about 3 years ago, but haven't seriously done anything with it for about
> 6 years), and I've never written a Ruby extension before. However, I'd
> really rather use Ruby for my project than Python (which already has all
> of this stuff) because I just like Ruby better. I might be able to
> help, but I probably won't be the most efficient person in the world to
> do this.
>

There are no current plans to integrate the SAX2 interface from libxml2,
but that's not to say there never will be. If you're willing to help
there's a better chance it'll happen, but right now I don't have the time
myself to work on it. I think it's a great idea to

Alternatively, you could file a feature request at [1], so anyone else who
wants this will know they're not alone (and so will we ;)).

Cheers,
Ross

[1] http://rubyforge.org/tracker/?atid=1974&group_id=494&func=browse

--
Ross Bamford - rosco at roscopeco.co.uk
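
The filter mechanism raised in this thread (event handlers chained in the
style of the Java SAX XMLFilter) could be sketched in plain Ruby roughly
as below. This is only an illustration under stated assumptions: every
class and method name here (PassThroughFilter, WhitespaceNormalizer,
DowncaseNames, EventRecorder, feed_sample_events) is hypothetical and is
not part of libxml-ruby or libxml2, and a hard-coded event stream stands
in for a real parser.

```ruby
# Hypothetical sketch: none of these names come from libxml-ruby or libxml2.
# They only illustrate how SAX-style handlers can be chained as filters.

# Base filter: forwards every event unchanged to the next handler.
class PassThroughFilter
  def initialize(downstream)
    @downstream = downstream
  end

  def start_element(name, attrs)
    @downstream.start_element(name, attrs)
  end

  def characters(text)
    @downstream.characters(text)
  end

  def end_element(name)
    @downstream.end_element(name)
  end
end

# Example filter: collapses runs of whitespace and drops blank text events.
class WhitespaceNormalizer < PassThroughFilter
  def characters(text)
    squeezed = text.gsub(/\s+/, " ").strip
    super(squeezed) unless squeezed.empty?
  end
end

# Example filter: lower-cases element names before passing them on.
class DowncaseNames < PassThroughFilter
  def start_element(name, attrs)
    super(name.downcase, attrs)
  end

  def end_element(name)
    super(name.downcase)
  end
end

# Terminal handler: records the events that reach the end of the chain.
class EventRecorder
  attr_reader :events

  def initialize
    @events = []
  end

  def start_element(name, attrs)
    @events << [:start, name, attrs]
  end

  def characters(text)
    @events << [:text, text]
  end

  def end_element(name)
    @events << [:end, name]
  end
end

# Stand-in for a parser (assumption): pushes a fixed event stream
# into whatever handler it is given.
def feed_sample_events(handler)
  handler.start_element("Doc", {})
  handler.characters("  hello   world \n")
  handler.characters("   ")
  handler.end_element("Doc")
end

recorder = EventRecorder.new
pipeline = DowncaseNames.new(WhitespaceNormalizer.new(recorder))
feed_sample_events(pipeline)
```

Because each filter only knows about the handler directly downstream of
it, filters can be reordered or dropped without touching the event
source, which is the property that would let the same chain sit on top
of different parsers.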