Parsing a command line to argc/argv
I'm getting back to work on the GIO transition.
The first thing that must change in order to move to GIO is the parameter parsing: to open files it's best to use Gio::File::create_for_commandline_arg, which needs filenames in the filename encoding. However, this will interfere with the shell mode. I don't think this is a useful feature - an embedded scripting language would be a lot better - but someone probably relies on it, so it should be preserved until there's a replacement. I would like to reuse the option parser from the main application, but then I have to parse the command line into argc and argv myself, because that's what g_option_context_parse accepts. Does anyone know whether there's an "officially blessed" way of parsing a command line into argc and argv?
Regards, Krzysztof Kosiński
On Dec 6, 2008, at 12:56 PM, Krzysztof Kosiński wrote:
> I'm getting back to work on the GIO transition.
> The first thing that must change in order to move to GIO is the parameter parsing: to open files it's best to use Gio::File::create_for_commandline_arg, which needs filenames in the filename encoding. However, this will interfere with the shell mode. I don't think this is a useful feature - an embedded scripting language would be a lot better - but someone probably relies on it, so it should be preserved until there's a replacement. I would like to reuse the option parser from the main application, but then I have to parse the command line into argc and argv myself, because that's what g_option_context_parse accepts. Does anyone know whether there's an "officially blessed" way of parsing a command line into argc and argv?
> Regards, Krzysztof Kosiński
One factor to consider is MS Windows.
If arguments come through a standard command line, they get converted to the "current ANSI code page", and any characters not in that code page get lost. Unless representative test characters (in Latin-1 but not Cp1252, in Cp1252 but not Latin-1, in neither Latin-1 nor Cp1252, etc.) are run through, it is hard to know whether things actually work.
-----Original Message-----
From: Krzysztof Kosiński [mailto:tweenk.pl@...400...]
Sent: Saturday, 6 December 2008 21:57
To: inkscape-devel@lists.sourceforge.net
Subject: [Inkscape-devel] Parsing a command line to argc/argv
> I'm getting back to work on the GIO transition.
> The first thing that must change in order to move to GIO is the parameter parsing: to open files it's best to use Gio::File::create_for_commandline_arg, which needs filenames in the filename encoding. However, this will interfere with the shell mode. I don't think this is a useful feature - an embedded scripting language would be a lot better - but someone probably relies on it, so it should be preserved until there's a replacement.
What do you mean by shell mode? Calling Inkscape from the command line? Judging from the mailing lists, this is used quite a lot.
Johan
J.B.C.Engelen wrote:
> What do you mean by shell mode? Calling Inkscape from the command line? Judging from the mailing lists, this is used quite a lot.
No, I mean this: http://wiki.inkscape.org/wiki/index.php/ReleaseNotes047#Shell_mode Without being able to pass parameters to Inkscape from the command line, it wouldn't be able to associate with any file types :)
The codepage issue: I don't know how Windows behaves when a filename is not representable in the system encoding, e.g. when you have Chinese characters in a filename on an English system - this is not documented on MSDN. I think the short name (e.g. INKSCA~1.EXE) may be passed to the program in this case. Anyway, this event should be quite rare.
Regards, Krzysztof Kosiński
I have a potential solution to the codepage issue. If non-representable filenames turn out to be a problem, we'll add code to the WinMain function to replace the filename arguments (anything which doesn't start with a "-") with file:// URIs. GIO will take care of the rest.
Regards, Krzysztof Kosiński
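A rough sketch of the argument replacement proposed above, in portable C. It assumes the arguments are already available as UTF-8, absolute paths (on Windows that would mean reading the wide command line first, which is not shown). The helper names path_to_file_uri and filenames_to_uris are hypothetical, and UNC and relative paths are not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: percent-encode an absolute UTF-8 path into a newly
 * allocated file:// URI. Backslashes are treated as path separators.
 * The caller frees the result. */
static char *path_to_file_uri(const char *path)
{
    static const char hex[] = "0123456789ABCDEF";
    assert(path != NULL);
    /* Worst case: "file:///" (8 bytes), every byte as "%XX", and the NUL. */
    char *uri = malloc(8 + 3 * strlen(path) + 1);
    if (uri == NULL)
        return NULL;
    char *out = uri;
    out += sprintf(out, "file://");
    if (path[0] != '/' && path[0] != '\\')
        *out++ = '/';               /* drive-letter paths: file:///C:/... */
    for (const unsigned char *p = (const unsigned char *)path; *p != '\0'; ++p) {
        unsigned char c = (*p == '\\') ? '/' : *p;
        if (c < 0x80 && (isalnum(c) || strchr("-._~/:", c) != NULL)) {
            *out++ = (char)c;       /* safe ASCII is copied unchanged */
        } else {
            *out++ = '%';           /* everything else, including UTF-8 bytes */
            *out++ = hex[c >> 4];
            *out++ = hex[c & 0x0F];
        }
    }
    *out = '\0';
    return uri;
}

/* Replace every argument that does not start with '-' by a file:// URI.
 * argv[0] (the program name) is left alone. Replacement strings are newly
 * allocated; the originals are not freed because argv usually owns them.
 * Returns 0 on success, -1 if an allocation fails. */
static int filenames_to_uris(int argc, char **argv)
{
    for (int i = 1; i < argc; ++i) {
        if (argv[i][0] == '-')
            continue;
        char *uri = path_to_file_uri(argv[i]);
        if (uri == NULL)
            return -1;
        argv[i] = uri;
    }
    return 0;
}
```

Because the URI is pure ASCII, it should pass through any code page unchanged, which is the point of the proposal.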
OK, I found the answer to my original question: the function I was looking for is g_shell_parse_argv. That's also what the shell mode uses.
Regards, Krzysztof Kosiński
On Dec 6, 2008, at 3:40 PM, Krzysztof Kosiński wrote:
> The codepage issue: I don't know how Windows behaves when a filename is not representable in the system encoding, e.g. when you have Chinese characters in a filename on an English system - this is not documented on MSDN. I think the short name (e.g. INKSCA~1.EXE) may be passed to the program in this case. Anyway, this event should be quite rare.
If the normal (ANSI) version of a function is called, Windows converts the 16-bit UTF-16 Unicode data to CP_ACP, which is the current ANSI code page.
E.g. when you have Chinese characters in a filename for a *user* that is running in English, the characters get replaced with the "replacement" character, which is normally '?'.
Also note that cases where this is an issue are not really rare. One of the earlier cases we hit was of a Japanese student in Germany. The computer he was using was set to Cp1252 (Windows Western), but the student's name included Kanji, so his username and user home directory had Kanji in the name.
With any system running Active Directory, any valid Unicode characters can show up in a username (well, any except for maybe half a dozen special ones).
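To make the loss concrete, here is a portable simulation of that narrowing step. It is not the Windows conversion itself: it takes UTF-8 rather than UTF-16 input, uses Latin-1 as a stand-in for the real ANSI code page, and assumes well-formed input. The function name narrow_to_latin1 is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Narrow a well-formed UTF-8 string to Latin-1, writing '?' for every code
 * point above U+00FF. Latin-1 only stands in for the real ANSI code page.
 * `out` must have room for strlen(utf8) + 1 bytes.
 * Returns the number of characters that were replaced. */
static size_t narrow_to_latin1(const char *utf8, char *out)
{
    assert(utf8 != NULL && out != NULL);
    const unsigned char *p = (const unsigned char *)utf8;
    size_t lost = 0;

    while (*p != '\0') {
        unsigned long cp;
        int extra;                  /* continuation bytes still to read */
        if (*p < 0x80)      { cp = *p;        extra = 0; }
        else if (*p < 0xE0) { cp = *p & 0x1F; extra = 1; }
        else if (*p < 0xF0) { cp = *p & 0x0F; extra = 2; }
        else                { cp = *p & 0x07; extra = 3; }
        ++p;
        while (extra-- > 0 && (*p & 0xC0) == 0x80)
            cp = (cp << 6) | (*p++ & 0x3F);

        if (cp <= 0xFF) {
            *out++ = (char)cp;      /* representable: keep it */
        } else {
            *out++ = '?';           /* not representable: information is gone */
            ++lost;
        }
    }
    *out = '\0';
    return lost;
}
```

A two-character Kanji directory name comes out as two question marks, and nothing in the result says which characters were there, so the narrowed name cannot be mapped back to the file.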
Jon A. Cruz wrote:
> If the normal (ANSI) version of a function is called, Windows converts the 16-bit UTF-16 Unicode data to CP_ACP, which is the current ANSI code page.
> E.g. when you have Chinese characters in a filename for a *user* that is running in English, the characters get replaced with the "replacement" character, which is normally '?'.
> Also note that cases where this is an issue are not really rare. One of the earlier cases we hit was of a Japanese student in Germany. The computer he was using was set to Cp1252 (Windows Western), but the student's name included Kanji, so his username and user home directory had Kanji in the name.
> With any system running Active Directory, any valid Unicode characters can show up in a username (well, any except for maybe half a dozen special ones).
OK, but can whatever is passed to the program be resolved to files? I know that the exact name won't be passed because not all characters are present in the charset, but whether what is passed can be opened as a file isn't documented. It seems the only way for me to finally resolve this issue is to perform some tests on Windows.
Regards, Krzysztof Kosiński
I learned that Windows + GOption is a combination full of fail.
http://bugzilla.gnome.org/show_bug.cgi?id=522131
Until this bug is fixed, we need to use a modified in-tree version of the GOption source code which accepts a UTF-8 encoded argv on Windows (a patch is attached to that bug). If I move the option parser into a separate object, put both the in-tree code and the parser object in the Inkscape namespace and only include the in-tree code's header on Windows, the modified implementation will only be used there with no additional changes. The GOption source is a single 56kB file + 6kB header.
Regards, Krzysztof Kosiński
On 12/9/2008 6:48 PM, Krzysztof Kosiński wrote:
> I learned that Windows + GOption is a combination full of fail.
> http://bugzilla.gnome.org/show_bug.cgi?id=522131
> Until this bug is fixed, we need to use a modified in-tree version of the GOption source code which accepts a UTF-8 encoded argv on Windows (a patch is attached to that bug). If I move the option parser into a separate object, put both the in-tree code and the parser object in the Inkscape namespace and only include the in-tree code's header on Windows, the modified implementation will only be used there with no additional changes. The GOption source is a single 56kB file + 6kB header.
> Regards, Krzysztof Kosiński
Actually, popt (which we have used for a long time) does handle utf8 (as far as I know), and it does use gettext for error messages.
bob
Bob Jamison-2 wrote:
> Actually, popt (which we have used for a long time) does handle utf8 (as far as I know), and it does use gettext for error messages.
I know that, I was just speculating on what using GOption would take (but in the wrong tense).
By the way, I forgot to mention I did some tests on Windows. I wrote a test program that reads filenames from the command line using GOption, passes each one to g_file_new_for_commandline_arg, queries whether the file exists, and then prints the filename and whether that file was found. When I run this program in a directory that contains files with e.g. Cyrillic characters (I have an English version of XP Tablet PC Edition) with a command like "optiontest *", the program prints "not found" for the files with Cyrillic letters.
Regards, Krzysztof Kosiński
participants (4)
- unknown@example.com
- Bob Jamison
- Jon A. Cruz
- Krzysztof Kosiński