[libxml-devel] Improving memory usage in libxml-devel

Joe Khoobyar joe at collectivex.com
Mon Feb 9 22:16:12 EST 2009


The one thing I didn't include in the post is the output of "vmmap" on 
OS X prior to this change.

It showed large amounts of memory, some of it marked as "(freed)" and 
some not.  That pattern reminded me of internal fragmentation in a 
malloc implementation, which is a standard problem to overcome when 
writing one.  If anyone is interested, I could post that output as well.

Joe Khoobyar

Chief Technical Officer & Lead Developer

CollectiveX - Groups that work!

mobile:   585.245.2902

email:    joe at collectivex.com

web:      www.collectivex.com <http://www.collectivex.com/>
          www.groupsites.com <http://www.groupsites.com/>

The third-rate mind is only happy when it is thinking with the majority.
The second-rate mind is only happy when it is thinking with the minority.
The first-rate mind is only happy when it is thinking.

— A.A. Milne



Charlie Savage wrote:
> Hey Joe,
>
>> Revision 783 in subversion applies this change.  With this change, 
>> test08.rb holds about 13 MB on a Mac, where it held over 120 MB before.
>
> Ah, interesting.  It is amazing how much memory libxml seems to grab 
> on simple test cases.
>
>> We are simply using libxml2's memory management hooks to direct its 
>> alloc/free functions and related functions to ruby's versions of 
>> those functions.  What I originally planned (which is why I'm using 
>> xmlGcMemSetup) was to direct what libxml2 calls "atomic" memory 
>> allocations to the version of ruby's allocator which is able to 
>> run the garbage collector.
>
>> Basically what this does is have both libxml2 and ruby use the same
>> allocator and allow ruby to run its GC even in response to libxml2
>> allocations, which keeps memory usage down much more easily.  I'm 
>> already running this on all of our production servers and it yielded 
>> instant benefits.
>
> What I gather from this is that somehow libxml is not freeing memory 
> that it allocates.  Except that memory is freed, albeit indirectly, 
> when Ruby runs a garbage collection, freeing Ruby objects and then 
> the underlying libxml objects.  And from using valgrind, I haven't 
> seen any large memory leaks in libxml-ruby for quite a while.
>
> So I'm wondering what is going on.  Does libxml just allocate a big 
> chunk of memory at startup, and then allocate/free from that memory 
> over the period that a process runs?  And by telling libxml to use 
> Ruby's allocator, that issue goes away?  If that's the case, is 
> there some libxml setting for the starting memory?  Or is there 
> something else entirely going on?
>
> And is there any downside to having libxml get memory from Ruby? 
> Something along the lines that ruby's memory allocation wouldn't be as 
> efficient for libxml's usage patterns?
>
> I'd just like to understand this a bit better since it's such a big 
> change (not code wise, but running wise).  Should we hedge our bets 
> and make it settable somehow - is that even possible?
>
>> Just wondering:  does anyone have any feedback on this change or on 
>> this subject?   FYI:  I've been running it on 32-bit ruby on OS X 
>> Leopard and on several 64-bit UNIX instances for half a week now.
>
> Yeah, I'd have to give it a try on Windows.  Theoretically, it sounds 
> fine, but we'll have to see what really happens.
>
> Anyway, excellent work to figure this out!
>
> Charlie
>
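
For anyone following the thread, here is a rough, self-contained sketch of the hook arrangement discussed above.  It is not the code from revision 783.  In libxml2 the real entry point is xmlGcMemSetup(), which takes free, malloc, atomic-malloc, realloc and strdup hooks; the sketch below reduces that to three hooks on a toy library.  Every name in it (lib_gc_mem_setup, host_malloc, HOST_GC_THRESHOLD, run_demo and so on) is hypothetical, and the "host" allocator is only a stand-in that counts bytes where a runtime such as ruby could decide to collect.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hook types, shaped like libxml2's xmlFreeFunc and xmlMallocFunc. */
typedef void  (*free_fn)(void *);
typedef void *(*malloc_fn)(size_t);

/* Hypothetical stand-in for the library: it allocates only through
 * these pointers, which default to the C allocator. */
static free_fn   lib_free          = free;
static malloc_fn lib_malloc        = malloc;
static malloc_fn lib_malloc_atomic = malloc;

/* Hypothetical analogue of xmlGcMemSetup(), reduced to three hooks.
 * Returns 0 on success, -1 (leaving the hooks unchanged) on a NULL hook. */
int lib_gc_mem_setup(free_fn f, malloc_fn m, malloc_fn m_atomic)
{
    if (f == NULL || m == NULL || m_atomic == NULL)
        return -1;
    lib_free = f;
    lib_malloc = m;
    lib_malloc_atomic = m_atomic;
    return 0;
}

/* Hypothetical stand-in for the host runtime's allocator.  It counts
 * requested bytes; a real runtime would run its collector where the
 * counter below is bumped. */
#define HOST_GC_THRESHOLD 1024
static size_t host_bytes_requested = 0;
static size_t host_gc_runs = 0;

void *host_malloc(size_t n)
{
    host_bytes_requested += n;
    if (host_bytes_requested >= HOST_GC_THRESHOLD) {
        host_gc_runs++;             /* collection point (assumption) */
        host_bytes_requested = 0;
    }
    return malloc(n);
}

void host_free(void *p)
{
    free(p);
}

/* Library work: copy a string into "atomic" (pointer-free) memory. */
static char *lib_copy_text(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = lib_malloc_atomic(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Route the library through the host allocator, do some library work,
 * and report how many collection points the host reached. */
size_t run_demo(void)
{
    host_bytes_requested = 0;
    host_gc_runs = 0;
    lib_gc_mem_setup(host_free, host_malloc, host_malloc);
    for (int i = 0; i < 100; i++) {
        char *t = lib_copy_text("0123456789abcdef");
        assert(t != NULL);
        lib_free(t);
    }
    return host_gc_runs;
}
```

The point of the sketch is only that, once the hooks are installed, the host sees every library allocation and can react to it; with the default hooks the host would see none of them.  How much this helps with fragmentation in practice depends on the real allocators involved.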