Hey Devs,
I really do need some other developers to go over the proposed fix for 166371. I'd like to know it's not a crazy fix considering what it does if a file fails to load.
https://code.launchpad.net/~doctormo/inkscape/fix166371/+merge/185064
Don't worry, it's not a large change.
Best Regards, Martin Owens
2013/9/12 Martin Owens <doctormo@...400...>:
Hey Devs,
I really do need some other developers to go over the proposed fix for 166371. I'd like to know it's not a crazy fix considering what it does if a file fails to load.
https://code.launchpad.net/~doctormo/inkscape/fix166371/+merge/185064
Don't worry, it's not a large change.
This patch will break all Adobe SVG files that contain the string "SYSTEM" e.g. in text objects or IDs.
Regards, Krzysztof
On Thu, 2013-09-12 at 17:30 +0200, Krzysztof Kosiński wrote:
This patch will break all Adobe SVG files that contain the string "SYSTEM" e.g. in text objects or IDs.
Yep. Are you saying that's an unacceptable trade-off? Because it is a hypothetically tiny subset of documents.
Martin,
2013/9/12 Martin Owens <doctormo@...400...>:
On Thu, 2013-09-12 at 17:30 +0200, Krzysztof Kosiński wrote:
This patch will break all Adobe SVG files that contain the string "SYSTEM" e.g. in text objects or IDs.
Yep. Are you saying that's an unacceptable trade-off? Because it is a hypothetically tiny subset of documents.
This subset is not tiny by any reasonable standard. You don't need to try very hard to create a document that contains the word "SYSTEM", in fact you can trivially create them in Inkscape without even editing the XML, and breaking all such documents is unacceptable. BTW, it would also break all embedded images which contain the string "SYSTEM" in their base64 encoding, as well as any paths that contain the string "SYSTEM".
This problem could be solved correctly by writing a SAX parser; it was discussed some time ago on the list. However, that involves writing a nontrivial amount of code. A better quick fix is to use g_regex_replace to remove only system entity declarations, which would break only very exotic documents:
GRegex *entity_regex = g_regex_new("<!ENTITY\\s+[^>\s]+\s+SYSTEM\s+"[^>"]+"\s*>", G_REGEX_CASELESS, 0, NULL); gchar *fixed_buffer = g_regex_replace(entity_regex, buffer, len, 0, "", 0, NULL); g_regex_unref(entity_regex);
The above code will break documents that contain a system entity declaration enclosed in CDATA (not XML-encoded) as the content of a text element, but since Inkscape never produces this kind of XML, I guess we can live with that for now.
Regards, Krzysztof
On Thu, 2013-09-12 at 18:28 +0200, Krzysztof Kosiński wrote:
This subset is not tiny by any reasonable standard.
Teeny-tiny wibbly-wobbly subset. ;-)
BUT thanks to your fantastic code (with a few tweaks), even thus subset is doomed to have their files work properly in Inkscape.
The only regression from that to this is that it now fails to open files with both system entities and ns record entity replacement. replacing system with spaces allowed libxml2 to parse the entity as if it were a literal. But I'm not touching that regex, it's perfect enough for me.
Compiled, tested with files, if there's no other problems. I'll merge it and get this bug wrapped up.
Note: this new replacement code is ONLY run when the first open fails with a specific kind of problem seen in adobe files. Other files with entities will continue to not be replaced until libxml2 is fixed properly, but at least those files will still open.
Martin,
2013/9/12 Martin Owens <doctormo@...400...>:
The only regression from that to this is that it now fails to open files with both system entities and ns record entity replacement. replacing system with spaces allowed libxml2 to parse the entity as if it were a literal. But I'm not touching that regex, it's perfect enough for me.
Ah, I missed that part. Here is a small tweak which fixes that:
GRegex *entity_regex = g_regex_new("(<!ENTITY\\s+[^>\s]+\s+)SYSTEM\s+("[^>"]+"\s*>)", G_REGEX_CASELESS, 0, NULL); gchar *fixed_buffer = g_regex_replace(entity_regex, buffer, len, 0, "\1\2", 0, NULL); g_regex_unref(entity_regex);
Regards, Krzysztof
Looks like there is an additional problem. Since we are using xmlReadIO, there is the possibility that the entity declaration will be split between read blocks. By placing the entity declaration at this boundary it's still possible to exploit the vulnerability. So the bug is not fixed even with the regex.
To really fix it, we would have to temporarily load the entire document into memory and use xmlReadDoc.
Regards, Krzysztof
Isn't there a test file in the bug report?
Cheers, Josh On Sep 13, 2013 9:27 AM, "Martin Owens" <doctormo@...400...> wrote:
On Fri, 2013-09-13 at 16:20 +0200, Krzysztof Kosiński wrote:
To really fix it, we would have to temporarily load the entire document into memory and use xmlReadDoc.
I have a fix waiting, but no test svg, did your tests produce such a document?
Martin,
How ServiceNow helps IT people transform IT departments:
- Consolidate legacy IT systems to a single system of record for IT
- Standardize and globalize service processes across IT
- Implement zero-touch automation to replace manual, redundant tasks
http://pubads.g.doubleclick.net/gampad/clk?id=51271111&iu=/4140/ostg.clk... _______________________________________________ Inkscape-devel mailing list Inkscape-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/inkscape-devel
With my last revision I stab at thee! r12513.
We're now loading the file into memory for parsing and then controlling that back into the parser.
Martin,
On Fri, 2013-09-13 at 16:20 +0200, Krzysztof Kosiński wrote:
Looks like there is an additional problem. Since we are using xmlReadIO, there is the possibility that the entity declaration will be split between read blocks. By placing the entity declaration at this boundary it's still possible to exploit the vulnerability. So the bug is not fixed even with the regex.
To really fix it, we would have to temporarily load the entire document into memory and use xmlReadDoc.
Regards, Krzysztof
I get this error when compiling last rev (12510) under Windows:
Make error line 298: problem compiling: src/xml/repr-io.cpp: In static member function 'static int XmlSource::readCb(void*, char*, int)': src/xml/repr-io.cpp:229:36: error: 'g_match_info_unref' was not declared in this scope
Luca
-- View this message in context: http://inkscape.13.x6.nabble.com/Opening-Adobe-SVG-Fix-for-166371-tp4967901p... Sent from the Inkscape - Dev mailing list archive at Nabble.com.
What GLib version are you using?
Thanks for testing too, this would have been annoying if I'd had backported it.
Martin,
On Fri, 2013-09-13 at 05:45 -0700, LucaDC wrote:
I get this error when compiling last rev (12510) under Windows:
Make error line 298: problem compiling: src/xml/repr-io.cpp: In static member function 'static int XmlSource::readCb(void*, char*, int)': src/xml/repr-io.cpp:229:36: error: 'g_match_info_unref' was not declared in this scope
On 13-9-2013 15:31, Martin Owens wrote:
What GLib version are you using?
Windows devlibs uses GLib 2.28.8
Thanks for testing too, this would have been annoying if I'd had backported it.
This is not testing. Building trunk is broken which is always annoying. Fixed in rev12517
Martin,
On Fri, 2013-09-13 at 05:45 -0700, LucaDC wrote:
I get this error when compiling last rev (12510) under Windows:
Make error line 298: problem compiling: src/xml/repr-io.cpp: In static member function 'static int XmlSource::readCb(void*, char*, int)': src/xml/repr-io.cpp:229:36: error: 'g_match_info_unref' was not declared in this scope
participants (5)
-
Johan Engelen
-
Josh Andler
-
Krzysztof Kosiński
-
LucaDC
-
Martin Owens