
Migrating to GIO for file handling means modifying the extension system rather extensively, so I think it might be worth refactoring it as well, keeping compatibility with existing extensions.
Currently there is a class for every type of extension, but they don't define much - everything of substance is handled by the "implementation" class which contains methods for everything. A subclass of it is mandatory for every extension. I understand the reason for this was to support XSLT and script extensions transparently, but I don't think this is the proper way to do it. Since each extension only provides one function (either an effect, or an input filter, etc. but not more than one of those types at a time) the proper inheritance would be e.g.: Extension -> Output -> XsltOutput Extension -> Effect -> ScriptEffect Extension -> Input -> SvgzInput
If there are common methods among XSLT or scripted extensions, those could be put in a separate class from which all other extensions implemented in this way derive (multiple inheritance). This wouldn't preclude putting more than one function/effect in the .inx file, since more than one object could be created for a given .inx file by the manager.
There should also be a proper manager object, instead of some things being done via Extension::DB (which BTW is not a singleton but a statically constructed object) and some via global functions.
GIO would be used in input and output extensions - parameters to save() and load() would be GInputStreams or GOutputStreams instead of filenames. This way extensions could save to memory and to remote locations without any special handling in the extensions themselves.
Do you have any other ideas for improvements or changes?
Regards, Krzysztof Kosiński

On Tue, 2009-01-13 at 17:42 -0800, Krzysztof Kosiński wrote:
Migrating to GIO for file handling means modifying the extension system rather extensively, so I think it might be worth refactoring it as well, keeping compatibility with existing extensions.
Of course, I think that is a good idea.
Currently there is a class for every type of extension, but they don't define much - everything of substance is handled by the "implementation" class which contains methods for everything. A subclass of it is mandatory for every extension. I understand the reason for this was to support XSLT and script extensions transparently, but I don't think this is the proper way to do it. Since each extension only provides one function (either an effect, or an input filter, etc. but not more than one of those types at a time) the proper inheritance would be e.g.: Extension -> Output -> XsltOutput Extension -> Effect -> ScriptEffect Extension -> Input -> SvgzInput
Well, if you go back on the Sodipodi archives you'll notice that I fretted about this for a while. It's not clear what the right answer is. For the most part, the larger implementations all implement most of the functions. The smaller (internal) implement just a few. So what's the overhead in the two cases?
In the large single object case we end up with a few extra function pointers that never get used. Let's say an extra 100 bytes of RAM per extension. On the other hand, we burn a lot of extra objects for things like XSLT which will have to create a meta object and have dual inheritance to work right. The complexity overhead is much greater along with the bytes that are wasted on a per object basis.
I'm not sure that there is a clear winner here, but in general, I try to choose simplicity. And the large object was MUCH simpler when doing this is GObject before the whole G_DEFINE_TYPE days.
There should also be a proper manager object, instead of some things being done via Extension::DB (which BTW is not a singleton but a statically constructed object) and some via global functions.
Sure, that's a more standard way to do it. I'm not sure of the gains, but why not, shouldn't be any different.
GIO would be used in input and output extensions - parameters to save() and load() would be GInputStreams or GOutputStreams instead of filenames. This way extensions could save to memory and to remote locations without any special handling in the extensions themselves.
I figured an easier change would be to just get the local filename from GIO because it's gotten by setting up a FUSE filesystem. This would take less code change (less bugs) and I'm not sure that there'd be any more gain.
--Ted

Ted Gould wrote:
Well, if you go back on the Sodipodi archives you'll notice that I fretted about this for a while. It's not clear what the right answer is. For the most part, the larger implementations all implement most of the functions. The smaller (internal) implement just a few. So what's the overhead in the two cases?
In the large single object case we end up with a few extra function pointers that never get used. Let's say an extra 100 bytes of RAM per extension. On the other hand, we burn a lot of extra objects for things like XSLT which will have to create a meta object and have dual inheritance to work right. The complexity overhead is much greater along with the bytes that are wasted on a per object basis.
I'm not sure that there is a clear winner here, but in general, I try to choose simplicity. And the large object was MUCH simpler when doing this is GObject before the whole G_DEFINE_TYPE days.
I think this new design wouldn't burn too much extra space: Let's say we have an XSLT implementation for extensions. This means there is a class for every extension point, currently XsltOutput, XsltInput, XsltEffect and XsltPathEffect (and probably XsltPrint, but it doesn't make much sense to me). All those derive from the appropriate extension point classes and from XsltExtension, which in turn derives from Extension and implements the common methods (probably delegating some of them to subclasses) as well as providing common functions to deal with the stylesheet.
In the current design we have 2 objects per extension (e.g. XSLT implementation and a "facade" class that stands for what this extension does, e.g. Output, and forwards all methods). In the new design we would have just one self-contained object. I'm not sure where's the wasted space - the protected variables can be reused, so the specialized Xslt* classes wouldn't need to define any additional member variables if all they need is an XSLT stylesheet pointer, whereas in the current design all variables any type of XSLT extension needs must be present in all XSLT extensions regardless of their type.
Sure, that's a more standard way to do it. I'm not sure of the gains, but why not, shouldn't be any different.
It would make the code that interfaces with extensions a lot more obvious. Currently it involves a lot of boilerplate, e.g. to find a single extension one has to iterate over a list. There could be methods e.g. getOutputByMimetype() or getInputByFilename() that would return the proper I/O extensions. It would also take care of some things that are done using separate objects, e.g. dependencies, unloading timers etc.
I figured an easier change would be to just get the local filename from GIO because it's gotten by setting up a FUSE filesystem. This would take less code change (less bugs) and I'm not sure that there'd be any more gain.
Using real GIO instead of the FUSE mount would simplify some codepaths, for instance clipboard and image embedding, that rely on processing some file to some other format in memory, because we could just pass a GMemoryOutputStream to them instead of creating a temporary file.
(To be correct, image embedding is not even in the codebase but rather done by a perl extension... I think it should be fleshed out more and moved into the C++ codebase)
Regards, Krzysztof Kosiński

On Wed, 2009-01-14 at 08:33 -0800, Krzysztof Kosiński wrote:
Ted Gould wrote:
Well, if you go back on the Sodipodi archives you'll notice that I fretted about this for a while. It's not clear what the right answer is. For the most part, the larger implementations all implement most of the functions. The smaller (internal) implement just a few. So what's the overhead in the two cases?
In the large single object case we end up with a few extra function pointers that never get used. Let's say an extra 100 bytes of RAM per extension. On the other hand, we burn a lot of extra objects for things like XSLT which will have to create a meta object and have dual inheritance to work right. The complexity overhead is much greater along with the bytes that are wasted on a per object basis.
I'm not sure that there is a clear winner here, but in general, I try to choose simplicity. And the large object was MUCH simpler when doing this is GObject before the whole G_DEFINE_TYPE days.
I think this new design wouldn't burn too much extra space: Let's say we have an XSLT implementation for extensions. This means there is a class for every extension point, currently XsltOutput, XsltInput, XsltEffect and XsltPathEffect (and probably XsltPrint, but it doesn't make much sense to me). All those derive from the appropriate extension point classes and from XsltExtension, which in turn derives from Extension and implements the common methods (probably delegating some of them to subclasses) as well as providing common functions to deal with the stylesheet.
In the current design we have 2 objects per extension (e.g. XSLT implementation and a "facade" class that stands for what this extension does, e.g. Output, and forwards all methods). In the new design we would have just one self-contained object. I'm not sure where's the wasted space - the protected variables can be reused, so the specialized Xslt* classes wouldn't need to define any additional member variables if all they need is an XSLT stylesheet pointer, whereas in the current design all variables any type of XSLT extension needs must be present in all XSLT extensions regardless of their type.
Oh, I thought you just wanted to rework the implementation class. No, what you're talking about is a bad idea. I can understand how you get that from looking at I/O, but the real intention there (thought I never did it) is to move all of src/file.cpp in to those two files. If you look at src/extension/effect.cpp I think you'll have a hard time adapting that into your new scheme without effectively making the same separation.
Sure, that's a more standard way to do it. I'm not sure of the gains, but why not, shouldn't be any different.
It would make the code that interfaces with extensions a lot more obvious. Currently it involves a lot of boilerplate, e.g. to find a single extension one has to iterate over a list. There could be methods e.g. getOutputByMimetype() or getInputByFilename() that would return the proper I/O extensions. It would also take care of some things that are done using separate objects, e.g. dependencies, unloading timers etc.
That's not dependent on whether it's a singleton or a static object. Sure, the DB object should have more intelligence in it's indexing. It should be a real DB. But, for the small gains I'm not sure of the effort tradeoff.
I figured an easier change would be to just get the local filename from GIO because it's gotten by setting up a FUSE filesystem. This would take less code change (less bugs) and I'm not sure that there'd be any more gain.
Using real GIO instead of the FUSE mount would simplify some codepaths, for instance clipboard and image embedding, that rely on processing some file to some other format in memory, because we could just pass a GMemoryOutputStream to them instead of creating a temporary file.
Sure, I can see that.
(To be correct, image embedding is not even in the codebase but rather done by a perl extension... I think it should be fleshed out more and moved into the C++ codebase)
Well, while I agree here, I'm against the sentiment that it being an extension is somewhat second class.
--Ted

Ted Gould wrote:
Oh, I thought you just wanted to rework the implementation class. No, what you're talking about is a bad idea. I can understand how you get that from looking at I/O, but the real intention there (thought I never did it) is to move all of src/file.cpp in to those two files. If you look at src/extension/effect.cpp I think you'll have a hard time adapting that into your new scheme without effectively making the same separation.
I looked there and I didn't see anything that would make my design awkward yet, but I'll investigate more and come up with an inheritance graph and some more detailed ideas. The only thing that may be problematic is interfacing with verbs - I don't know much about them yet.
Regards, Krzysztof Kosiński
participants (2)
-
Krzysztof Kosiński
-
Ted Gould