On Tue, 30 Nov 2004, Keith Packard wrote:
>
> I've been using a garbage collector in C for about twenty years which is
> different from the Boehm system.
It looks really interesting to me. I'm not sure it would address the
general problems we've had with garbage collection[1], nor that it
offers everything we need for C++.
> advantages:
>
> 1) Completely portable -- no machine dependent code at all
I have to say I find the use of a "reference stack" in avoiding the need
to examine the stack or registers to be very elegant.
> 2) Integrates with malloc/free using code
Hmm. We haven't really had many integration problems with the Boehm
collector. About 1/4 of our codebase uses the collector, and the
remainder uses the system allocator (generally via glib).
The one issue that does come to mind is the problem of storing
references to collector-managed objects in malloc-managed memory.
In Inkscape's case we have some collector-managed classes which have
supplementary refcounts. Objects of these hybrid classes are pinned
while they have a nonzero refcount, and can safely be referenced from
untraced memory if refcounts are managed appropriately.
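As a rough illustration of that pinning scheme, here is a toy sketch. It is
not Inkscape's actual code: PinTable, pin_table() and Hybrid are made-up
names, and the table only stands in for whatever root-registration facility
a real collector would provide.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Hypothetical stand-in for the collector's table of extra roots.
class PinTable {
public:
    void pin(const void *object) { pinned_.insert(object); }
    void unpin(const void *object) { pinned_.erase(object); }
    bool is_pinned(const void *object) const {
        return pinned_.count(object) != 0;
    }
private:
    std::unordered_set<const void *> pinned_;
};

inline PinTable &pin_table() {
    static PinTable table;
    return table;
}

// A collector-managed class with a supplementary refcount: it is pinned on
// the 0 -> 1 transition and handed back to the collector on 1 -> 0.
class Hybrid {
public:
    Hybrid() : refcount_(0) {}
    void ref() {
        if (refcount_++ == 0) pin_table().pin(this);
    }
    void unref() {
        assert(refcount_ > 0);
        if (--refcount_ == 0) pin_table().unpin(this);
    }
    std::size_t refcount() const { return refcount_; }
private:
    std::size_t refcount_;
};
```

Code holding a Hybrid from malloc-managed memory would call ref() when it
stores the pointer and unref() when it drops it; while the count is nonzero
the object stays in the pin table regardless of what the collector can see.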
Of course, refcounts used in this fashion can create some interesting
issues with cycles if they're used redundantly with references the
collector knows about -- for example, leaks from incompletely removing
internal refcounting from refcounted classes when they were rewritten to
be such hybrids.
The only other major issue has been pre-existing refcount leaks whose
effects were magnified when the collector was introduced, since the
collector knows about a lot of references naive refcounting could not.
Is there any alternate way the Nickle collector could assist us in
marrying the collected and uncollected worlds?
> 3) Precise pointer knowledge - the set of referenced objects is known,
>    not discovered.
That would be exceptionally nice to have, especially as large RGBA image
buffers have a decent chance of looking a lot like arrays of valid
pointers in places.
Right now we have to expose a distinction between "atomic" and "traced"
classes to deal with that sort of thing; it would be nicer if classes
intrinsically knew to mark themselves appropriately, but things are
certainly manageable as-is.
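To show what "classes marking themselves" might look like, here is a toy
sketch of precise marking. The class names (Object, ImageBuffer, Group) are
invented for illustration and belong to neither collector; there is no
actual sweep phase here, only the mark traversal.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Object {
public:
    Object() : marked_(false) {}
    virtual ~Object() {}
    // Mark this object and everything it references.
    void mark() {
        if (marked_) return;  // already visited; this also terminates cycles
        marked_ = true;
        mark_children();
    }
    bool marked() const { return marked_; }
protected:
    // Default is "atomic": the object holds nothing that needs tracing.
    virtual void mark_children() {}
private:
    bool marked_;
};

// "Atomic": the pixel data is never interpreted as pointers.
class ImageBuffer : public Object {
public:
    explicit ImageBuffer(std::size_t pixels) : rgba_(pixels * 4) {}
private:
    std::vector<unsigned char> rgba_;
};

// "Traced": the class knows exactly which members are references.
class Group : public Object {
public:
    void add(Object *child) { children_.push_back(child); }
protected:
    void mark_children() {
        for (std::size_t i = 0; i < children_.size(); ++i) {
            children_[i]->mark();
        }
    }
private:
    std::vector<Object *> children_;
};
```

With this shape the atomic/traced distinction is a property of each class
rather than something the allocation site has to choose.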
> disadvantages:
>
> 1) Requires stylized function call/return macros both where the
> allocator is used as well as in at least one place above all
> allocation calls in the call graph.
For Inkscape that needn't be as ugly as it might be in C, since in C++
you could do things like this[2]:
    class Thing : public Managed { ... };

    void fleem() {
        MemStackFrame frame;
        ...
        Thing *blah = new Thing();
        ...
    }

    Thing *blah() {
        MemStackFrame frame;
        ...
        Thing *foo = new Thing();
        Thing *bar = new Thing();
        ...
        return frame.retain(something_p() ? foo : bar); // only one kept
    }

This would have the advantage of being exception-safe, obeying C++
scoping rules, and so forth.
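The frame idea above can be tried out in isolation with a toy reference
stack. This is only a sketch under my own assumptions: RefStack, ref_stack(),
Frame and allocate_rooted() are made-up names, nothing here is the Nickle
API, and since there is no collector behind it the toy never frees anything.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a collector's reference stack: every entry is a root.
class RefStack {
public:
    std::size_t top() const { return refs_.size(); }
    void push(void *object) { refs_.push_back(object); }
    void reset(std::size_t mark) { refs_.resize(mark); }
    bool contains(const void *object) const {
        for (std::size_t i = 0; i < refs_.size(); ++i) {
            if (refs_[i] == object) return true;
        }
        return false;
    }
private:
    std::vector<void *> refs_;
};

inline RefStack &ref_stack() {
    static RefStack stack;
    return stack;
}

// Records the stack top on entry. On exit it drops everything pushed since,
// except for the single object (if any) passed to retain().
class Frame {
public:
    Frame() : mark_(ref_stack().top()), retained_(0) {}
    ~Frame() {
        ref_stack().reset(mark_);
        if (retained_) ref_stack().push(retained_);
    }
    template <typename T>
    T *retain(T *object) {
        retained_ = object;
        return object;
    }
private:
    Frame(const Frame &);             // not copyable
    Frame &operator=(const Frame &);  // not assignable
    std::size_t mark_;
    void *retained_;
};

// Hypothetical allocation hook: each new object is pushed as a root.
template <typename T>
T *allocate_rooted() {
    T *object = new T();
    ref_stack().push(object);
    return object;
}
```

Because the cleanup lives in the destructor, the stack is restored on every
exit path from the scope, including an exception propagating through it.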
There would be some issues with this that I'm not immediately sure how
to solve, though:
 * implementing other allocation interfaces:
    - array new/delete
    - an STL allocator
    - use in automatic variables
 * finalization (required e.g. when collector-managed objects
   reference purely refcounted objects)
 * situations where 'Managed' isn't the root of a class hierarchy
   (e.g. as a result of multiple inheritance) -- we'd basically
   need a facility, like Boehm's GC_base(), to get the base address
   of a collector-managed object given an interior pointer
Admittedly I'm not sure how necessary some of these are.
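For the multiple-inheritance case specifically, standard C++ may already
cover part of this: dynamic_cast<void *> applied to a pointer to a
polymorphic type yields the address of the most-derived object. A small
sketch follows (Widget, ManagedWidget and base_address() are made-up names,
and this Managed is a stripped-down stand-in). It does not help with
arbitrary interior pointers the way GC_base() does, only with pointers to
base-class subobjects.

```cpp
#include <cassert>

class Managed {
public:
    virtual ~Managed() {}
    // Address of the complete (most-derived) object containing *this.
    // Requires Managed to be polymorphic, which the virtual destructor
    // ensures.
    void *base_address() { return dynamic_cast<void *>(this); }
};

class Widget {
public:
    virtual ~Widget() {}
    int id;
};

// Managed is the second base here, so a Managed * into a ManagedWidget
// typically does not point at the start of the object.
class ManagedWidget : public Widget, public Managed {};
```

Given a ManagedWidget w and a Managed *p = &w, p->base_address() compares
equal to static_cast<void *>(&w), whatever offset the compiler chose for the
Managed subobject.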
-mental
[1] As far as I can tell there's only one problem we've had that's
specific to the Boehm collector (the allocation hangs), and that one
appears likely to be a bug in the particular version of the collector
rather than an issue with conservative collection.
[2] The implementation, so far as the example goes, could probably be
something like the following...
    #include <cstddef>  // size_t
    #include <new>      // std::bad_alloc

    class MemStackFrame;

    class Managed {
    public:
        void *operator new(size_t size) {
            void *header = MemAllocateRef(datatype_(), size);
            if (header) {
                return object_(header);
            } else {
                throw std::bad_alloc();
            }
        }
        void operator delete(void *) { /* do something? */ }
    protected:
        virtual void mark_() {}
    private:
        // These helpers are static because they are called from static
        // member functions and from MemStackFrame without an instance.
        static Managed *object_(void *header) {
            return (Managed *)((char *)header + sizeof(struct bfree));
        }
        static void *header_(Managed *object) {
            return (char *)object - sizeof(struct bfree);
        }
        static void invoke_mark_(void *header) {
            object_(header)->mark_();
        }
        static DataType *datatype_() {
            static DataType type = { &Managed::invoke_mark_, NULL, "C++" };
            return &type;
        }
        friend class MemStackFrame;
    };

    class MemStackFrame {
    public:
        MemStackFrame()
        : stack_pointer_(STACK_TOP(MemStack)), retained_(NULL) {}
        ~MemStackFrame() {
            if (retained_) {
                // keep the one retained object live in the caller's frame
                STACK_RETURN(MemStack, stack_pointer_, retained_);
            } else {
                STACK_RESET(MemStack, stack_pointer_);
            }
        }
        template <typename T>
        T *retain(T *object) {
            retained_ = Managed::header_(object);
            return object;
        }
    private:
        StackPointer stack_pointer_;
        StackElement retained_;
    };