Proposal: use std::iostream and friends instead of Inkscape::IO
Hello,
I had a look at the stream classes used to output SVG to files and memory while working on fixing the clipboard bug. The functionality of those classes is largely the same as std::iostream and the related class hierarchy. I think Inkscape could switch from its own stream implementation to the standard library streams. This would, among other things, reduce the complexity of the application and the size of the codebase.
Is there some good reason Inkscape uses its own stream implementation instead of the standard streams?
BTW, you might want to look at this before replying: http://spec.winprog.org/streams/
Regards, Krzysztof (Chris) Kosiński
On Tue, 2008-03-11 at 04:22 -0700, Krzysztof Kosiński wrote:
I had a look at the stream classes used to output SVG to files and memory while working on fixing the clipboard bug. The functionality of those classes is largely the same as std::iostream and the related class hierarchy. I think Inkscape could switch from its own stream implementation to the standard library streams. This would, among other things, reduce the complexity of the application and the size of the codebase.
Is there some good reason Inkscape uses its own stream implementation instead of the standard streams?
I think for most things switching to gio probably makes the most sense, but I do agree that would be best. std where gio doesn't make sense. Problem is that GIO is rather new to the Glib stack, so I'm not sure when we'll update to it.
I think the best thing to do is propose a patch. Less code in Inkscape is always a good thing.
--Ted
On Wed, Mar 12, 2008 at 07:43:46PM -0700, Ted Gould wrote:
On Tue, 2008-03-11 at 04:22 -0700, Krzysztof Kosiński wrote:
I had a look at the stream classes used to output SVG to files and memory while working on fixing the clipboard bug. The functionality of those classes is largely the same as std::iostream and the related class hierarchy. I think Inkscape could switch from its own stream implementation to the standard library streams. This would, among other things, reduce the complexity of the application and the size of the codebase.
Is there some good reason Inkscape uses its own stream implementation instead of the standard streams?
I think for most things switching to gio probably makes the most sense, but I do agree that would be best. std where gio doesn't make sense. Problem is that GIO is rather new to the Glib stack, so I'm not sure when we'll update to it.
If gio is not up to the task, perhaps we should consider sending our changes upstream?
Bryce
On Wed, 2008-03-12 at 21:11 -0700, Bryce Harrington wrote:
On Wed, Mar 12, 2008 at 07:43:46PM -0700, Ted Gould wrote:
On Tue, 2008-03-11 at 04:22 -0700, Krzysztof Kosiński wrote:
I had a look at the stream classes used to output SVG to files and memory while working on fixing the clipboard bug. The functionality of those classes is largely the same as std::iostream and the related class hierarchy. I think Inkscape could switch from its own stream implementation to the standard library streams. This would, among other things, reduce the complexity of the application and the size of the codebase.
Is there some good reason Inkscape uses its own stream implementation instead of the standard streams?
I think for most things switching to gio probably makes the most sense, but I do agree that would be best. std where gio doesn't make sense. Problem is that GIO is rather new to the Glib stack, so I'm not sure when we'll update to it.
If gio is not up to the task, perhaps we should consider sending our changes upstream?
I didn't want to imply that it wasn't up to the task as much as mention that it was a bleeding edge version of Glib, probably newer than we want to depend on.
(Actually there were backports to earlier versions and some distros shipped it as an additional package to help developers, but for all practical purposes, it's very new.)
--Ted
Ted Gould wrote:
I think for most things switching to gio probably makes the most sense, but I do agree that would be best. std where gio doesn't make sense. Problem is that GIO is rather new to the Glib stack, so I'm not sure when we'll update to it.
I think the best thing to do is propose a patch. Less code in Inkscape is always a good thing.
I had a look at GIO, and it seems that it would be indeed the best choice. I thought about deriving char_traits and other stream classes and making them wrap the relevant Glib routines, but it seems that the design of the standard library makes this prohibitively laborious, if not downright impossible, specifically because of locale issues.
GIO is in Glib 2.16, and is required by Gnome 2.22, which had a stable release recently. I think we can move the Inkscape codebase to GIO when Ubuntu 8.04 and Fedora 9 hit the streets, because we won't be able to get the next Inkscape release into distributions before Ubuntu 8.10 and Fedora 10 anyway. Our only concern therefore is ensuring that people can develop on reasonably recent systems. The additional benefit of moving to GIO would be eliminating Gnome-VFS from the codebase (as GIO's primary objective is to provide a replacement of Gnome-VFS). Gnome-VFS is obsolete, doesn't work very well and suffers from design flaws as well as extreme dependency bloat, so it will be a relief to see it removed.
The mapping is as follows (same for output streams):
Inkscape::IO::InputStream -> Gio::InputStream
Inkscape::IO::BasicInputStream -> Gio::FilterInputStream
Inkscape::IO::URIInputStream -> Gio::FileInputStream, can only be acquired from a Gio::File object
Inkscape::IO::StringInputStream -> Gio::MemoryInputStream
Inkscape::IO::StdInputStream -> special case of Gio::UnixInputStream, I suppose, but not sure how this would work on Windows. However, most Inkscape operations involving stdio I can think of are of relevance to Unix users only.
Writers are not addressed specifically, but reimplementing the current ones should be easy. UTF-8 is not specifically covered, but it should be easy to derive a specialized seekable text stream to handle UTF-8 encoded data if it's required.
Regards, Krzysztof (Chris) Kosiński
On Mar 11, 2008, at 4:22 AM, Krzysztof Kosiński wrote:
Hello,
I had a look at the stream classes used to output SVG to files and memory while working on fixing the clipboard bug. The functionality of those classes is largely the same as std::iostream and the related class hierarchy. I think Inkscape could switch from its own stream implementation to the standard library streams. This would, among other things, reduce the complexity of the application and the size of the codebase.
Is there some good reason Inkscape uses its own stream implementation instead of the standard streams?
One main issue is that std::iostream is broken with regard to internationalization. One can't really use the wide APIs, and other behavior starts to get weird when one goes to use things in a non-American context.
Switching to std::iostream actually could *increase* both the size of the codebase and the complexity of the application. In order to safely use std::iostream one would need to add more code everywhere IO is done, and ensure it is done correctly everywhere.
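One concrete illustration of the kind of locale-dependent behavior at issue (an example chosen here, not one given in the thread): numeric formatting on a std::ostream follows the stream's locale, so a coordinate written under a decimal-comma locale would not be valid SVG number syntax. The sketch below uses a hypothetical CommaDecimal facet to stand in for such a locale, so it does not depend on which system locales are installed; format_coord and comma_locale are likewise names made up for the example.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet standing in for a locale that uses a decimal comma.
struct CommaDecimal : std::numpunct<char> {
    char do_decimal_point() const override { return ','; }
};

// Builds a locale that is the classic "C" locale except for the decimal point.
// The locale takes ownership of the facet.
std::locale comma_locale() {
    return std::locale(std::locale::classic(), new CommaDecimal);
}

// Formats a number with whatever locale the caller supplies. Code that must
// emit '.' regardless of the user's settings has to pass
// std::locale::classic() explicitly at every place it writes numbers.
std::string format_coord(double value, const std::locale &loc) {
    std::ostringstream out;
    out.imbue(loc);
    out << value;
    return out.str();
}
```

With the classic locale, format_coord(1.5, ...) yields "1.5"; with comma_locale() it yields "1,5". Remembering the imbue call at each output site is the sort of extra code the paragraph above refers to.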
If you look at the classes and API, you might see a similarity to Java's IO classes. That is largely because Java needed to properly handle Unicode and byte<->character conversions. Remember, bytes and characters do not always correspond one-to-one, so conceptually a byte stream is a *very* different thing from a character stream.
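The byte/character mismatch is easy to see with UTF-8. As a minimal sketch (count_code_points is a name invented for this example, and the input is assumed to be valid UTF-8), counting code points means skipping continuation bytes rather than counting chars:

```cpp
#include <cstddef>
#include <string>

// Counts Unicode code points in a UTF-8 string by counting every byte that
// is not a continuation byte (continuation bytes look like 10xxxxxx).
// Assumes the input is valid UTF-8; no validation is performed.
std::size_t count_code_points(const std::string &utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For the UTF-8 encoding of "Kosiński", std::string::size() reports 9 bytes while count_code_points reports 8 characters, because "ń" occupies two bytes.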
Additionally you can look at the top of inkscapestream.cpp:
/**
 * Our base input/output stream classes. These are is directly
 * inherited from iostreams, and includes any extra
 * functionality that we might need.
 *
On Mar 12, 2008, at 7:58 PM, Jon A. Cruz wrote: ...
/**
 * Our base input/output stream classes. These are is directly
 * inherited from iostreams, and includes any extra
 * functionality that we might need.
I am sorry. I would like to apologize for the tone of that last mail. After it was sent I realized that it might sound terse or rude (extra tired since I've been up way too late this week). That tone was definitely not my intention.
I forgot to explicitly state it, but your questions were all very good ones, and I was trying to give you facts on the points you raised. Please keep asking such questions. At the very least this lets us see areas where we need to add comments and clarifications to the appropriate places in code and in documentation.
Thanks again for your attention.
Jon A. Cruz wrote:
On Mar 12, 2008, at 7:58 PM, Jon A. Cruz wrote: ...
/**
 * Our base input/output stream classes. These are is directly
 * inherited from iostreams, and includes any extra
 * functionality that we might need.
I failed to notice that comment somehow. I was also a bit puzzled that there are "OutputStreams" and "Writers" instead of "stringbufs" and "streams". I'll look into that a bit more once I'm done with the clipboard.
I know that using bare iostreams would cause breakage if someone attempted to use them for console output and I have learned it the hard way :) (though I think Gettext would take care of the needed encoding conversions). But writing UTF-8 only files, like the SVG ones, should be fine.
Anyway, I think removing super-ancient and not very clean code (e.g. main.cpp, *-chemistry.cpp etc.) is more important than purism debates.
Regards, Krzysztof (Chris) Kosiński
On Mar 13, 2008, at 10:51 AM, Krzysztof Kosiński wrote:
I failed to notice that comment somehow. I was also a bit puzzled that there are "OutputStreams" and "Writers" instead of "stringbufs" and "streams". I'll look into that a bit more once I'm done with the clipboard.
I think most of that comes from Java successfully addressing such issues (using the research from Taligent), while C++ did not.
Just to go on naming, "stringbuf" sounds like it should be a buffer for a string. Perhaps some "writers" might write to those, but they might write to something else.
http://java.sun.com/docs/books/tutorial/essential/io/streams.html
http://java.sun.com/docs/books/tutorial/essential/io/bytestreams.html
http://java.sun.com/docs/books/tutorial/essential/io/charstreams.html
http://java.sun.com/j2se/1.5.0/docs/api/java/io/package-summary.html
For the naming Java uses, "Streams" deal with transport of *bytes* and "Writers" deal with sending of *characters*.
Also, in C there is no 'byte', only 'char', which means a byte but is often used for characters, and 'strings' are just arrays of 8-bit units that happen to have a zero value in one of their members. On the other hand, Java has a 'byte' type for 8-bit data, a 'char' type for character data, and 'String', which is actually a class. In fact, all instances of string literals in Java programs are really instances of java.lang.String, and you can do things like if ( "magic".equals(myVar) ) { }
Anyway, much of the reason to use terminology that sounds more like Java's than it sounds like C/C++ is that C/C++ terminology is very mismatched and ambiguous compared to Java's.
Jon,
I totally agree. Until iostreams treats a >21bit number as a unit, then to hell with it. Whenever we put(ch), ch = get(), or iterate over a stream, or randomly access a codepoint as stream[i], the codepoint needs to be able to handle a full Unicode character. People who arrived late into this might not appreciate why we did this. It's not "NIH", I swear :-)
The braindead morons who designed iostreams believe the 1960s' theory that a 7 bit byte is a character.
Even if you consider UTF8 good enough, then you still must be able to grab a single Unicode character, which can be up to 6 (count em, six) bytes.
Ref here: http://www.unicode.org/ucd/
(For those who don't know, Jon has actually worked on Unicode as a day job, so we shouldn't argue with him :-)
bob(ishmal)
Jon A. Cruz wrote:
On Mar 13, 2008, at 10:51 AM, Krzysztof Kosiński wrote:
I failed to notice that comment somehow. I was also a bit puzzled that
there are "OutputStreams" and "Writers" instead of "stringbufs" and
"streams". I'll look into that a bit more once I'm done with the clipboard.
I think most of that comes from Java successfully addressing such issues (using the research from Taligent), while C++ did not.
Just to go on naming, "stringbuf" sounds like it should be a buffer for a string. Perhaps some "writers" might write to those, but they might write to something else.
http://java.sun.com/docs/books/tutorial/essential/io/streams.html
http://java.sun.com/docs/books/tutorial/essential/io/bytestreams.html
http://java.sun.com/docs/books/tutorial/essential/io/charstreams.html
http://java.sun.com/j2se/1.5.0/docs/api/java/io/package-summary.html
For the naming Java uses, "Streams" deal with transport of *bytes* and "Writers" deal with sending of *characters*.
Also, in C there is no 'byte', only 'char', which means a byte but is often used for characters, and 'strings' are just arrays of 8-bit units that happen to have a zero value in one of their members. On the other hand, Java has a 'byte' type for 8-bit data, a 'char' type for character data, and 'String', which is actually a class. In fact, all instances of string literals in Java programs are really instances of java.lang.String, and you can do things like if ( "magic".equals(myVar) ) { }
Anyway, much of the reason to use terminology that sounds more like Java's than it sounds like C/C++ is that C/C++ terminology is very mismatched and ambiguous compared to Java's.
Thanks for all the comments and explanation. I forgot about UTF-8 support, I think that's sufficient to inherit. I have one more question about streams, but that's a separate issue.
participants (5)
- Bob Jamison
- Bryce Harrington
- Jon A. Cruz
- Krzysztof Kosiński
- Ted Gould