Re: C-based garbage collection for Inkscape
Thanks Keith, passing along to the crew...
Bryce
On Tue, 30 Nov 2004, Keith Packard wrote:
I've been using a garbage collector in C for about twenty years which is different from the Boehm system.
advantages:
- Completely portable -- no machine dependent code at all
- Integrates with malloc/free using code
- Precise pointer knowledge - the set of referenced objects is known,
not discovered.
disadvantages:
- Requires stylized function call/return macros both where the
allocator is used as well as in at least one place above all allocation calls in the call graph.
It's part of the nickle language implementation (http://nickle.org) and has been used in several other projects to good effect.
-keith
On Tue, 30 Nov 2004, Keith Packard wrote:
I've been using a garbage collector in C for about twenty years which is different from the Boehm system.
It looks really interesting to me. I'm not sure it would address the general problems we've had with garbage collection[1], nor that it offers everything we need for C++.
advantages:
- Completely portable -- no machine dependent code at all
I have to say I find the use of a "reference stack" in avoiding the need to examine the stack or registers to be very elegant.
- Integrates with malloc/free using code
Hmm. We haven't really had many integration problems with the boehm collector. About 1/4 of our codebase uses the collector, and the remainder uses the system allocator (generally via glib).
The one issue that does come to mind is the problem of storing references to collector-managed objects in malloc-managed memory.
In Inkscape's case we have some collector-managed classes which have supplementary refcounts. Objects of these hybrid classes are pinned while they have a nonzero refcount, and can safely be referenced from untraced memory if refcounts are managed appropriately.
Of course, refcounts used in this fashion can create some interesting issues with cycles if they're used redundantly with references the collector knows about -- for example, leaks from incompletely removing internal refcounting from refcounted classes when they were rewritten to be such hybrids.
The only other major issue has been pre-existing refcount leaks whose effects were magnified when the collector was introduced, since the collector knows about a lot of references naive refcounting could not.
Is there any alternate way the nickle collector could assist us in marrying the collected and uncollected worlds?
- Precise pointer knowledge - the set of referenced objects is known,
not discovered.
That would be exceptionally nice to have, especially as large RGBA image buffers have a decent chance of looking a lot like arrays of valid pointers in places.
Right now we have to expose a distinction between "atomic" and "traced" classes to deal with that sort of thing; it would be nicer if classes intrinsically knew to mark themselves appropriately, but things are certainly managable as-is.
disadvantages:
- Requires stylized function call/return macros both where the
allocator is used as well as in at least one place above all allocation calls in the call graph.
For Inkscape that needn't be as ugly as it might be in C, since in C++ you could do things like this[2]:
class Thing : public Managed { ... };
void fleem() { MemStackFrame frame; ... Thing *blah = new Thing(); ... }
Thing *blah() { MemStackFrame frame; ... Thing *foo = new Thing(); Thing *bar = new Thing(); ... return frame.retain(something_p() ? foo : bar); // only one kept }
and would have the advantage of being exception-safe, obeying C++ scoping rules, and so forth.
There would be some issues with this that I'm not immediately sure how to solve, though:
* implementing other allocation interfaces: - array new/delete - an STL allocator - use in automatic variables
* finalization (required e.g. when collector-managed objects reference purely refcounted objects)
* situations where 'Managed' isn't the root of a class hierarchy (e.g. as a result of multiple inheritance) -- we'd basically need a facility to get the base address of a collector-managed object, given an interior pointer like boehm's GC_base()
Admittedly I'm not sure how necessary some of these are.
-mental
[1] As far as I can tell there's only one problem we've had that's specific to the Boehm collector (the allocation hangs), and that one appears likely to be a bug in the particular version of the collector rather than an issue with conservative collection.
[2] The implementation, so far as the example goes, could probably be something like the following...
class MemStackFrame;
class Managed { public: void *operator new(size_t size) throw(std::bad_alloc) { void *header=MemAllocateRef(datatype_(), size); if (header) { return object_(mem); } else { throw std::bad_alloc(); } } void operator delete(void *) { /* do something? */ }
protected: virtual void mark_() {}
private: Managed *object_(void *object) { return (Managed *)((char *)object + sizeof(struct bfree)); } void *header_(Managed *object) { return (char *)object - sizeof(struct bfree); } static void invoke_mark_(void *object) { object_(object)->mark_(); } static DataType *datatype_() { static DataType type = { &Managed::invoke_mark_, NULL, "C++" }; return &type; }
friend class MemStackFrame; };
class MemStackFrame { public: MemStackFrame() : stack_pointer_(STACK_TOP(MemStack)), retained_(NULL) {}
~MemStackFrame() { if (retained_) { STACK_RETURN(MemStack, stack_pointer_, retained_); } else { STACK_RESET(MemStack, stack_pointer_); } }
template <typename T> T *retain(T *object) { retained_ = Managed::header_(object); return object; }
private: StackPointer stack_pointer_; StackElement retained_; };
Around 23 o'clock on Nov 30, MenTaLguY wrote:
It looks really interesting to me. I'm not sure it would address the general problems we've had with garbage collection[1], nor that it offers everything we need for C++.
I don't know that it is a good solution, I've just heard from other people using the Boehm collector that it had "issues" which my little GC system appears to solve and was wondering if it might be of interest to you as well.
I have to say I find the use of a "reference stack" in avoiding the need to examine the stack or registers to be very elegant.
I was much more interested in portability than transparent use, and with some practice, it's not that hard to use the macros. One nice thing is that forgettin to use the macros causes no harm, it just delays the collection of garbage collected below the function failing to invoke the macros.
The one issue that does come to mind is the problem of storing references to collector-managed objects in malloc-managed memory.
And this is easy enough with the nickle collector -- you create a synthetic object and add it as another root in the system. The 'mark' function then walks the malloced memory to reference the gc'able objects. The malloc'ed storage need never know that it's being used in this fashion.
Is there any alternate way the nickle collector could assist us in marrying the collected and uncollected worlds?
I think the fact that the mark functions are under your control could yield some benefits, it permits the kind of redirection suggested above.
That would be exceptionally nice to have, especially as large RGBA image buffers have a decent chance of looking a lot like arrays of valid pointers in places.
heh. One wonders how much of the GC time is spent wandering through non-pointer data. Think of the cache thrashing when you hit something like this which isn't explicitly marked as non-pointer data...
- implementing other allocation interfaces:
- array new/delete
- an STL allocator
- use in automatic variables
I don't know enough about C++ to understand why this is a special case...
- finalization (required e.g. when collector-managed objects reference purely refcounted objects)
The nickle collector includes finalizers.
- situations where 'Managed' isn't the root of a class hierarchy (e.g. as a result of multiple inheritance) -- we'd basically need a facility to get the base address of a collector-managed object, given an interior pointer like boehm's GC_base()
The collector used to have this ability; each malloc'd block of memory (containing multiple objects) was stored in a balanced binary tree so that you could search for an object by interior address. I used to need this when the collector was part of a lisp implementation that exposed strings as cdr'able objects. Nickle no longer needs this, so I ripped it out.
Restoring this to the collector would be straightforward, it would mostly be a matter of going back to a previous version in CVS and stealing the AVL code. Although, I'd prefer to reimplement it using skip lists as AVL trees are difficult to walk in-order without recursion, while skip lists are trivial.
Thanks much for taking time to look at the code; the prototype C++ code looks a lot easier to use than the C macros. Note that you'll still want to use the macros in places like:
a () { return new a_type; } b () { return new b_type; } c () { return new c_type; } bar () { operate (a (), b(), c()); } foo () { for (i = 0; i < 100; i++) bar (); }
Failing to use the macros in bar will result in a large pile of uncollectable garbage until foo returns to some higher level function which does use the macros. This is probably the most annoying part of the whole system.
Oh, one thing I used to do was erase newly allocated memory. This was a huge simplification as it meant you didn't need to be particularily careful about allocation and initialization order. As it is, any potential pointer values in newly allocated data must be initialized before the next allocation call is made. This turned out to be a non-trivial source of bugs when I stopped calling memset. I'm not sure it's worth the modest performance improvements myself.
-keith
On Wed, 2004-12-01 at 01:07, Keith Packard wrote:
I have to say I find the use of a "reference stack" in avoiding the need to examine the stack or registers to be very elegant.
I was much more interested in portability than transparent use, and with some practice, it's not that hard to use the macros. One nice thing is that forgettin to use the macros causes no harm, it just delays the collection of garbage collected below the function failing to invoke the macros.
That's part of the beauty of it -- I am very fond of intrinsically fail-safe designs. Sadly they're rather rare in the software world ("defensive programming" doesn't count).
The one issue that does come to mind is the problem of storing references to collector-managed objects in malloc-managed memory.
And this is easy enough with the nickle collector -- you create a synthetic object and add it as another root in the system. The 'mark' function then walks the malloced memory to reference the gc'able objects. The malloc'ed storage need never know that it's being used in this fashion.
That wouldn't always be possible; in our case the referencing malloc()ed memory is often abstracted away in e.g. a glib or sigc++ closure. In those instances there is generally no way for such a 'mark' function to get at it.
The only general solution I've found so far is to create synthetic "anchor" objects which are attached to a root and contain a refcount and a pointer to the managed memory. Usually the interfaces provided by the closures or whatever are sufficient to manage a refcount.
That would be exceptionally nice to have, especially as large RGBA image buffers have a decent chance of looking a lot like arrays of valid pointers in places.
heh. One wonders how much of the GC time is spent wandering through non-pointer data. Think of the cache thrashing when you hit something like this which isn't explicitly marked as non-pointer data...
Supposedly it isn't a big deal in practice, but I still avoided using it as a wholesale malloc replacement for that reason.
As things stand right now, GTK and other libraries still use the system allocator for everything. In our own code we use either the system allocator or GC_malloc_atomic() for strings and image data (the latter function marks the allocated memory as containing non-pointer data).
The boehm collector does also provide a facility for typed allocation, but with this approach we've never needed to resort to it. It does some clever things like blacklisting unallocated ranges while bogus pointers or pointer-like values
- implementing other allocation interfaces:
- array new/delete
- an STL allocator
The fundamental problem in these two cases is that no type information is provided to the allocator -- the only information available is the raw size of the memory requested. Figuring out what's actually been put in the memory and how to mark it gets painful.
For example, if I use the new[] operator to allocate an array of 32 objects of some type (which happens to have a size of 4 bytes), all the function implementing the operator will be told is that 128 bytes were requested. For all it knows, the requested memory might get used for e.g. 16 8-byte objects.
In the case of an STL allocator, I don't think it's even guaranteed that the objects placed in the requested memory will be homogenously typed or sized. I know it's certainly possible that they will not be homogenously spaced nor the memory block fully used.
The simple new/delete case isn't really different, but there we could at least assume that the memory would only contain one object and that its address would lie at the start of the allocated block, which would let us find a pointer to its 'mark' function easily enough.
In a nutshell, C++ is really unkind to typed allocators.
- use in automatic variables
Whether stack-allocated objects are a problem depends on the approach taken in marrying the C++ type system to the allocator's.
Some approaches, like the sketch in my last email (which has some bugs, so don't read too much into it), place the struct bfree (at least the type pointer; I am assuming the 'next' pointer was safe to reuse when it wasn't in a free list) before the start of the object. In those cases, a stack-allocated object will be missing the bfree "header".
Thanks much for taking time to look at the code; the prototype C++ code looks a lot easier to use than the C macros. Note that you'll still want to use the macros in places like:
a () { return new a_type; } b () { return new b_type; } c () { return new c_type; } bar () { operate (a (), b(), c()); } foo () { for (i = 0; i < 100; i++) bar (); }
Failing to use the macros in bar will result in a large pile of uncollectable garbage until foo returns to some higher level function which does use the macros. This is probably the most annoying part of the whole system.
Hmm. One of the reasons I adopted a garbage collector was to facilitate writing in a functional style which would unfortunately be susceptible to this sort of problem.
Also in some cases the equivalent of bar() may be a generic function written by someone else (e.g. an STL generic algorithm). In those cases it might not always be appropriate to set up a reference stack frame, and we couldn't anyway if we wanted to.
In this specific example, though, I don't believe the macros are necessary. I would expect this to work:
void bar() { MemStackFrame frame; operate(a(), b(), c()); }
(MemStack would be reset by MemStackFrame::~MemStackFrame() when 'frame' goes out of scope)
Oh, one thing I used to do was erase newly allocated memory. This was a huge simplification as it meant you didn't need to be particularily careful about allocation and initialization order. As it is, any potential pointer values in newly allocated data must be initialized before the next allocation call is made. This turned out to be a non-trivial source of bugs when I stopped calling memset. I'm not sure it's worth the modest performance improvements myself.
At least it's more portable. As you probably know, at least on paper memset() isn't a portable way to initialize pointers.
[ For those following along at home: the compiler will maintain the invariant NULL == (void *)0 mandated by the C and C++ languages, but on some (rare) architectures the NULL pointer is not the zero address, requring gymnastics on the compiler's part which would be defeated by the use of memset(). ]
In my experience another problem with the allocator zeroing memory is social: bugs tend to creep in if folks are relying on the allocator for implicit initialization. I think this is partly because explicit initialization forces people to think through the initial state of the object, and serves something of a documentation role as well.
In Inkscape I actually fill memory allocated for some classes with garbage before their constructor is called. It's helped shake out some subtle bugs that were previously flying below the radar because zero was close to a sensible value (and pages claimed from the OS are often initally zero-filled).
Er, forgive my curmudgeonly ranting. ^^; I still wonder whether there might be another good conservative solution to the initialization/allocation ordering problem...
-mental
Around 1 o'clock on Dec 2, MenTaLguY wrote:
The only general solution I've found so far is to create synthetic "anchor" objects which are attached to a root and contain a refcount and a pointer to the managed memory. Usually the interfaces provided by the closures or whatever are sufficient to manage a refcount.
That's what I tried to suggest, but your wording is quite a bit clearer...
Hmm. One of the reasons I adopted a garbage collector was to facilitate writing in a functional style which would unfortunately be susceptible to this sort of problem.
Except for loops (and recursion), the overhead is all constant. For example, Nickle does lots of stuff like:
av = Reduce (NewInteger (sign, NaturalRsl (IntegerMag(IntegerRep.promote (av,0)), b)));
which does leave four extra values referenced temporarily, but those are dereferenced on exit from this function. When necessary, you can also just explicitly bracket sequences of code with ENTER()/EXIT() macros; obviously a pain, but at least you can manage recursion and iteration without running out of memory.
In this specific example, though, I don't believe the macros are necessary. I would expect this to work:
void bar() { MemStackFrame frame; operate(a(), b(), c()); }
Yes, that's clearly the way to manage this in C++.
At least it's more portable. As you probably know, at least on paper memset() isn't a portable way to initialize pointers.
When I find a modern machine for which memset doesn't work, I'll alert the media, but I agree it's nicer to just not assume things like this. The real reason for the change was to avoid ripping across large chunks of ram for no good reason.
In Inkscape I actually fill memory allocated for some classes with garbage before their constructor is called. It's helped shake out some subtle bugs that were previously flying below the radar because zero was close to a sensible value (and pages claimed from the OS are often initally zero-filled).
This was my basic reason -- it provided a clean invarient (all allocated memory is zero) which prevents surprises when you suddenly start re-using previously allocated memory instead of new memory.
Of course, I'd rather have the Nickle behaviour where using uninitialized values cause an exception to be raised, but C isn't that helpful.
Er, forgive my curmudgeonly ranting. ^^; I still wonder whether there might be another good conservative solution to the initialization/allocation ordering problem...
Yes, that would be nice. The real problem was across the transition where the code implicitly assumed that data were initialized to zero (and, of course, where NULL was represented by zero bits). New code is written in a more stylized fashion:
foo = ALLOCATE (&type, sizeof (type)); foo->value1 = 0; foo->pointer1 = NULL;
... code using foo
Which, as you say, provides some level of self-documentation.
-keith
participants (3)
-
Bryce Harrington
-
Keith Packard
-
MenTaLguY