Query: I/O cleanup, reorganization?

Hi all,
I am back from holidays today, and have been catching up on my email.
I was wondering: would a good task for this cycle be a reorganization of our set of scattered, disparate I/O classes into a nice, clean heirarchy of its own, such as /src/io and the namespace Inkscape::IO? We have been talking about moving from filenames to URIs for months now, and changing slowly from FILE * to iostreams. Maybe now at the beginning of a release cycle would be a good time to bite the bullet. This would not be a roadmap thing, but a nice clean top-to-bottom refactoring that would finally take care of our recurring migraines with files and paths and localization. Consider it to be a big bugfix.
If we could upgrade from char * ---to--->FILE *
to a much cleaner URI ---to--->istream and URI ---to--->ostream
...with built-in support for localization, it would make things a LOT easier.
I have been looking around the net a lot, and there already exists a wonderful 'gzstreams' lib for supporting .svgz without pipes. I experimented with it by recoding my copy of repr-io.cpp to use iostreams instead of FILE *, and it worked wonderfully.
There is also a pair of iostream implementations I found, that provide printf() and scanf() style formatting, to alleviate the pain of conversion somewhat.
This would be a lot like Java's "java.io.*" heirarchy, putting streams, files, URI/URL's and paths into one place.
Any thoughts, opinions? This is just something I have been thinking about recently.
Bob

Bob Jamison wrote:
to a much cleaner URI ---to--->istream and URI ---to--->ostream
...with built-in support for localization, it would make things a LOT easier.
Ew ew ew.
There is also a pair of iostream implementations I found, that provide printf() and scanf() style formatting, to alleviate the pain of conversion somewhat.
Eww ew ew ew!
This would be a lot like Java's "java.io.*" heirarchy, putting streams, files, URI/URL's and paths into one place.
Kinda.
Any thoughts, opinions? This is just something I have been thinking about recently.
Well, as you can tell, I do have a few opinions. :-)
Seriously, though, there are some issues I have with things. Some is the 'built-in' localization in streams. Generally good for the first 80% of a project, but breaks down in the final 20%. Same with printf.
One huge beef I have with the latest Java is their introduction of printf, instead of sticking with the nice i18n stuff that Taligent figured out. Sure, it will lure in a few more C programmers, but things break down in the long run.
Soo... switching to some higher-level classes and possibly streams is good. Overloading operations may not be, and 'automatic' or 'transparent' localization is generally something I try to avoid at all costs.

Jon A. Cruz wrote:
Seriously, though, there are some issues I have with things. Some is the 'built-in' localization in streams. Generally good for the first 80% of a project, but breaks down in the final 20%. Same with printf.
Well, I'm not saying that the plain vanilla iostream or fstream packages are great, either. What we can do, though, is create our own classes that implement istream and ostream, but have the features that we need. For example, your Win9x fix for the WinAPI's incorrect mapping of fopen() can be wrapped in iostreams. And we can use the wonderful Glib::ustring for a nice stringstream. What we get by using the interfaces is the ability to extend/specialize then, and daisy-chain them together, pipe them, etc.
Bob

On Thu, 2 Dec 2004, Bob Jamison wrote:
Jon A. Cruz wrote:
Seriously, though, there are some issues I have with things. Some is the 'built-in' localization in streams. Generally good for the first 80% of a project, but breaks down in the final 20%. Same with printf.
Well, I'm not saying that the plain vanilla iostream or fstream packages are great, either. What we can do, though, is create our own classes that implement istream and ostream, but have the features that we need. For example, your Win9x fix for the WinAPI's incorrect mapping of fopen() can be wrapped in iostreams. And we can use the wonderful Glib::ustring for a nice stringstream. What we get by using the interfaces is the ability to extend/specialize then, and daisy-chain them together, pipe them, etc.
Can you show us an example of the before/after code? I'm curious what the existing code looks like, and how it'll get streamlined with the new stuff.
Also, take a look at my ftos code - this was written as a part of a iostream subclassing thing I did, and worked pretty good. I'd love to see this incorporated since it provides a better approach to formatting numbers (so you can say x="4.21" instead of x="4.209999999").
Bryce

Bryce Harrington wrote:
Can you show us an example of the before/after code? I'm curious what the existing code looks like, and how it'll get streamlined with the new stuff.
I'll paste below what repr-io.cpp's output section would look like with streams (omitting some stuff). It is just an example. I would expect that we would have something like Inkscape::IO::ostream or ozstream. But I have actually tested it, and it works, and does .svgz without a pipe. It looks very readable to me, at least (at least as well as the previous printfs). Note how save_file() and save_file_gzip() can now use the same code.
FYI: we have been using Mental's Inkscape::SVGOStringStream for inserting text data into Repr nodes for months now, and it works very, very well.
Also, take a look at my ftos code - this was written as a part of a iostream subclassing thing I did, and worked pretty good. I'd love to see this incorporated since it provides a better approach to formatting numbers (so you can say x="4.21" instead of x="4.209999999").
I think it would fit in easily as the code that implements the operator overload for <<(double) and <<(float)
Bob
===== SNIP SNIP ========
//######################################################################### //### W R I T E //#########################################################################
static void sp_repr_quote_write (std::ostream &outs, const gchar * val) { if (!val) return;
for (; *val != '\0'; val++) { switch (*val) { case '"': outs << """; break; case '&': outs << "&"; break; case '<': outs << "<"; break; case '>': outs << ">"; break; default : outs << *val; break; } } }
static void sp_repr_write_stream_element (SPRepr *repr, std::ostream &outs, gint indent_level, gboolean add_whitespace) { gint i;
g_return_if_fail (repr != NULL);
if ( indent_level > 16 ) indent_level = 16;
if (add_whitespace) { for ( i = 0 ; i < indent_level ; i++ ) { outs << " "; } }
outs << "<" << sp_repr_name (repr);
// if this is a <text> element, suppress formatting whitespace // for its content and children:
if (!strcmp(sp_repr_name(repr), "text")) { add_whitespace = false; }
SPReprAttr *attr; for ( attr = repr->attributes ; attr != NULL ; attr = attr->next ) { gchar const * const key = SP_REPR_ATTRIBUTE_KEY(attr); gchar const * const val = SP_REPR_ATTRIBUTE_VALUE(attr); outs << "\n"; for ( i = 0 ; i < indent_level + 1 ; i++ ) { outs << " "; } outs << " " << key << '"'; sp_repr_quote_write (outs, val); outs << '"'; }
gboolean loose = TRUE; SPRepr *child; for (child = repr->children; child != NULL; child = child->next) { if (child->type == SP_XML_TEXT_NODE) { loose = FALSE; break; } } if (repr->children) { outs << ">"; if (loose && add_whitespace) { outs << "\n"; } for (child = repr->children; child != NULL; child = child->next) { sp_repr_write_stream (child, outs, ( loose) ? (indent_level + 1) : 0, add_whitespace); }
if (loose && add_whitespace) { for (i = 0; i < indent_level; i++) { outs << " "; } } outs << "</" << sp_repr_name (repr) << ">"; } else { outs << " />"; }
// text elements cannot nest, so we can output newline // after closing text
if (add_whitespace || !strcmp (sp_repr_name (repr), "text")) { outs << "\n"; } }
static void sp_repr_write_stream (SPRepr *repr, std::ostream &outs, gint indent_level, gboolean add_whitespace) { if (repr->type == SP_XML_TEXT_NODE) { sp_repr_quote_write (outs, sp_repr_content (repr)); } else if (repr->type == SP_XML_COMMENT_NODE) { outs << "<!--" << sp_repr_content (repr) << "-->"; } else if (repr->type == SP_XML_ELEMENT_NODE) { sp_repr_write_stream_element(repr, outs, indent_level, add_whitespace); } else { g_assert_not_reached(); } }
void sp_repr_save_stream (SPReprDoc *doc, std::ostream &outs) {
outs << "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n";
const gchar *str = sp_repr_attr ((SPRepr *) doc, "doctype"); if (str) outs << str;
SPRepr *repr = sp_repr_document_first_child(doc); for ( repr = sp_repr_document_first_child(doc) ; repr ; repr = sp_repr_next(repr) ) { sp_repr_write_stream(repr, outs, 0, TRUE); if ( repr->type == SP_XML_COMMENT_NODE ) { outs << '\n'; } } }
/* Returns TRUE if file successfully saved; FALSE if not */ gboolean sp_repr_save_file (SPReprDoc *doc, const gchar *filename) { if (filename == NULL) { return false; }
std::ofstream outs(filename);
sp_repr_save_stream (doc, outs);
outs.close();
return true; }
/* Returns TRUE if file successfully saved; FALSE if not */ gboolean sp_repr_save_file_gzip (SPReprDoc *doc, const gchar *filename) { if (filename == NULL) { return false; }
ogzstream outs(filename);
sp_repr_save_stream (doc, outs);
outs.close();
return true; }
void sp_repr_print(SPRepr *repr) { sp_repr_write_stream (repr, std::cout, 0, true); }

FYI: we have been using Mental's Inkscape::SVGOStringStream for inserting text data into Repr nodes for months now, and it works very, very well.
It works, but the next thing I wanted to do (and I had thought it would be easy with the iostream and Bryce's ftos code) is to drop insignificant 0s from numeric values written to SVG. Can anyone suggest how to do this, if it is possible at all? Those 0s are very annoying.

bulia byak wrote:
FYI: we have been using Mental's Inkscape::SVGOStringStream for inserting text data into Repr nodes for months now, and it works very, very well.
It works, but the next thing I wanted to do (and I had thought it would be easy with the iostream and Bryce's ftos code) is to drop insignificant 0s from numeric values written to SVG. Can anyone suggest how to do this, if it is possible at all? Those 0s are very annoying.
Heh. Look at its definition at the bottom below. ^^
Bob
====== SNIP ========= #ifndef INKSCAPE_STRINGSTREAM_H #define INKSCAPE_STRINGSTREAM_H
#include <glib/gtypes.h> #include <sstream> #include "svg/ftos.h"
namespace Inkscape { class SVGOStringStream : public std::ostringstream {
public: SVGOStringStream() { this->imbue(std::locale::classic()); this->setf(std::ios::showpoint); this->precision(8); }
gchar const *gcharp() const { return reinterpret_cast<gchar const *>(str().c_str()); }
};
}
#ifdef BRYCE_FLOATS Inkscape::SVGOStringStream &operator<<(Inkscape::SVGOStringStream &os, double d) { return os << ftos(d, 'g', -1, -1, 0); } #endif
#endif

will now try to recompile with BRYCE_FLOATS and see if it works.
Here's what I get, I'm not sure how to deal with this:
In file included from desktop-style.cpp:19: svg/stringstream.h: In function `Inkscape::SVGOStringStream& operator<<(Inkscape::SVGOStringStream&, double)': svg/stringstream.h:31: error: could not convert `std::operator<<(std::basic_ostream<_CharT, _Traits>&, const std::basic_string<_CharT, _Traits, _Alloc>&) [with _CharT = char, _Traits = std::char_traits<char>, _Alloc = std::allocator<char>]((+os), (&ftos(double, char, int, int, int)(103, -1, -1, 0)))' to `Inkscape::SVGOStringStream&' desktop-style.cpp: In function `void sp_desktop_set_color(SPDesktop*, const ColorRGBA&, bool, bool)': desktop-style.cpp:47: error: ISO C++ says that `Inkscape::SVGOStringStream& operator<<(Inkscape::SVGOStringStream&, double)' and `std::basic_ostream<_CharT, _Traits>& std::basic_ostream<_CharT, _Traits>::operator<<(float) [with _CharT = char, _Traits = std::char_traits<char>]' are ambiguous even though the worst conversion for the former is better than the worst conversion for the latter

bulia byak wrote:
will now try to recompile with BRYCE_FLOATS and see if it works.
Here's what I get, I'm not sure how to deal with this:
In file included from desktop-style.cpp:19: svg/stringstream.h: In function `Inkscape::SVGOStringStream& operator<<(Inkscape::SVGOStringStream&, double)': svg/stringstream.h:31: error: could not convert `std::operator<<(std::basic_ostream<_CharT, _Traits>&, const std::basic_string<_CharT, _Traits, _Alloc>&) [with _CharT = char, _Traits = std::char_traits<char>, _Alloc = std::allocator<char>]((+os), (&ftos(double, char, int, int, int)(103, -1, -1, 0)))' to `Inkscape::SVGOStringStream&' desktop-style.cpp: In function `void sp_desktop_set_color(SPDesktop*, const ColorRGBA&, bool, bool)': desktop-style.cpp:47: error: ISO C++ says that `Inkscape::SVGOStringStream& operator<<(Inkscape::SVGOStringStream&, double)' and `std::basic_ostream<_CharT, _Traits>& std::basic_ostream<_CharT, _Traits>::operator<<(float) [with _CharT = char, _Traits = std::char_traits<char>]' are ambiguous even though the worst conversion for the former is better than the worst conversion for the latter
Maybe 'explicit' would help?

On Thu, 2 Dec 2004, bulia byak wrote:
FYI: we have been using Mental's Inkscape::SVGOStringStream for inserting text data into Repr nodes for months now, and it works very, very well.
It works, but the next thing I wanted to do (and I had thought it would be easy with the iostream and Bryce's ftos code) is to drop insignificant 0s from numeric values written to SVG. Can anyone suggest how to do this, if it is possible at all? Those 0s are very annoying.
Yes, the ftos code provides several mechanisms for trimming numbers.
First is a basic rounding thing. In other words, if you give it the number 4.20000 it returns a string "4.2". It also handles 4.200000001 and 4.199999999 this way.
Second is a precision mechanism. This is exactly analogous to printf's capability. You specify the exact number of digits you want and it rounds to that. So you can make it give you 4.20, 0.01, 1.95, 9.99, and 1.00.
Third is a significant figures mechanism. This is somewhat different from precision in that you say "I want N digits worth of information". So if you want to keep 4 sig figs, it will return 123.4, 0.000001111, and 1234000. This approach is common in engineering mathematics, or areas where unit conversions may make a precision-oriented approach cause problems.
You can of course use combinations of the above approaches if desired.
The one thing it doesn't do is localization. For SVG output, though, that's probably undesireable anyway. Would not be hard to add though.
Usage of all of this, plus more detailed examples, are documented in the comments at the top of the ftos.c file.
Bryce x

On Thu, 2 Dec 2004, Bob Jamison wrote:
Can you show us an example of the before/after code? I'm curious what the existing code looks like, and how it'll get streamlined with the new stuff.
I'll paste below what repr-io.cpp's output section would look like with streams (omitting some stuff). It is just an example. I would expect that we would have something like Inkscape::IO::ostream or ozstream. But I have actually tested it, and it works, and does .svgz without a pipe. It looks very readable to me, at least (at least as well as the previous printfs). Note how save_file() and save_file_gzip() can now use the same code.
Hmm, yeah... Looks like for this case the need is less about formatting output and more for passing streams around from function to function, so streams does better at that. Of course, performance may need to be considered, but this only occurs at file load/save, right? So slight performance changes there may not be a problem even if they exist.
Also, take a look at my ftos code - this was written as a part of a iostream subclassing thing I did, and worked pretty good. I'd love to see this incorporated since it provides a better approach to formatting numbers (so you can say x="4.21" instead of x="4.209999999").
I think it would fit in easily as the code that implements the operator overload for <<(double) and <<(float)
*Nod*
Actually I found it particularly useful to implement it such that you can pass parameters to it (e.g., for num of sig figs, precision, or how to handle exponents, or etc.)
Bryce

Bryce Harrington wrote:
Hmm, yeah... Looks like for this case the need is less about formatting output and more for passing streams around from function to function, so streams does better at that. Of course, performance may need to be considered, but this only occurs at file load/save, right? So slight performance changes there may not be a problem even if they exist.
Yes. That is a very reasonable focus.
Then again, we need to check for all the cases where we might want to use something, especially chaining.
It's not unheard of (in C++ and Java) to get in somewhere to add a final bit of funtionality, only to discover that you're blocked from some needed info by a stream chain abstraction.

Bob Jamison wrote: ...
If we could upgrade from char * ---to--->FILE *
to a much cleaner URI ---to--->istream and URI ---to--->ostream
In general a good idea, but I'd be very careful with using iostreams, they can be VERY slow, difficult to work with and can cause interesting problems, while their advanced features are usually hardly worth the trouble of using them (I've actually reverted to using FILE* or Win32 specific APIs as I find them much easier and more reliable).
I realize this might not all apply to InkScape and if you have to do a lot of "text" io it might a good idea. Also, if you're sure you're using a good implementation it might not be such a problem. In any case I'd thorougly test and benchmark relevant pieces of code before deciding on using it, as well as making sure it will really make things easier.
If you're worried about using FILE* I'd recommend creating a custom wrapper or looking whether you can find something better than iostreams to suit your needs. This is what I do and has saved me a lot of trouble as I have much more control over what it does (I need support for 64bit file sizes for example) and can extend it with useful features.
...with built-in support for localization, it would make things a LOT easier.
Or give a lot of problems, things like . vs. , as decimal separator can be extremely frustrating (especially when exchanging files). Things like this tend to also work extremely well in cases where they're NOT wanted (which is usually more crucial than whether they always work in places where they are wanted).
I have been looking around the net a lot, and there already exists a wonderful 'gzstreams' lib for supporting .svgz without pipes. I experimented with it by recoding my copy of repr-io.cpp to use iostreams instead of FILE *, and it worked wonderfully.
That is nice of course, but probably not worth switching for, as it probably isn't too difficult to implement for whatever system you end up using (if it isn't implemented already).
There is also a pair of iostream implementations I found, that provide printf() and scanf() style formatting, to alleviate the pain of conversion somewhat.
Personally I like printf (especially after having benchmarked similar conversions using streams), but again I doubt this is a decisive feature.

On Thu, 2 Dec 2004, Jasper van de Gronde wrote:
Bob Jamison wrote:
If we could upgrade from char * ---to--->FILE *
to a much cleaner URI ---to--->istream and URI ---to--->ostream
In general a good idea, but I'd be very careful with using iostreams, they can be VERY slow, difficult to work with and can cause interesting problems, while their advanced features are usually hardly worth the trouble of using them (I've actually reverted to using FILE* or Win32 specific APIs as I find them much easier and more reliable).
Personally I like printf (especially after having benchmarked similar conversions using streams), but again I doubt this is a decisive feature.
I've had similar experience with iostreams. I had gotten into using them a great deal (including deriving custom stream classes, making custom formatting operators, etc. etc.) and got pretty knowledegable and comfortable with the advanced features, but for the things I was typically using it for (generating tabulated reports of enginering data), making it print nicely took a lot more work than printf. I also found myself having to write my own number<->string converters, etc. because the iostream capabilities were limited in one way or another.
Anyway, when I finally saw that printf-style I/O was common in a lot of programming languages, but iostream was rather C++-unique, that was sort of the deciding factor, and I switched back.
However, I'm certain that for certain specific types of usage models, iostream can be superior. It's good for developers unfamiliar with pointers, or for doing logging/error printing, and interactive cmdline stuff. It doesn't seem to offer much for GUI stuff (sprintf is a workhorse) or formatted text file output.
I haven't really looked into how Inkscape is doing I/O, and don't know the comparitive support for XML that the two models have. Possibly iostream provides a better XML-writing interface?
Anyway, my thinking is, it's worth a shot. We always say 'patch first, discuss later', so I'd say despite my reservations I'd support plugging it in and trying it out. I would imagine it'd be a fairly easy thing to remove if we found it caused some problem. If Bob's right that it majorly cleans up the code, and doesn't hurt performance, that'd be great. If not, then at least we'd know, and perhaps have some better ideas of how to do it in the future. ;-)
Bryce
participants (5)
-
Bob Jamison
-
Bryce Harrington
-
bulia byak
-
Jasper van de Gronde
-
Jon A. Cruz