Re: [Inkscape-user] eps error (bad encoding)

bulia byak

2 May 2005

2:58 p.m.

On 5/2/05, Aewyn <aewyn@...326...> wrote:

...

I am just trying to print from Inkscape; as far as I know, the only way to produce a reasonably sized document is to save as .eps.

That works, but all my accented characters (őúéáűó) are turned into their Unicode forms, so the result is totally unusable.

Convert text to path - this is perhaps the most reliable solution.

Other than that, I don't know what encoding the text in PS is supposed to be. I doubt it's Unicode. Latin-1? What about non-Western-European languages? Can any experts comment?

-- bulia byak Inkscape. Draw Freely. http://www.inkscape.org


MenTaLguY

7 May 2005

6:45 a.m.

On Mon, 2005-05-02 at 10:58, bulia byak wrote:

...

Other than that, I don't know what encoding the text in PS is supposed to be. I doubt it's Unicode. Latin-1? What about non-Western-European languages? Can any experts comment?

The encoding is arbitrary, really. For the PostScript code itself, it's ASCII by convention, but the characters in a PostScript string are just raw bytes in the 0-255 range.

PostScript fonts may have any number of glyphs, but the glyphs are identified by name rather than by integer. Byte values in strings are mapped to glyph names via the current font's Encoding array, which acts as a lookup table.
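To make the lookup concrete, here is a minimal Python sketch that simulates the byte-to-glyph-name step. It is only an illustration, not interpreter code: the helper names `make_encoding` and `glyph_names` and the particular byte assignments are hypothetical, and it assumes the usual 256-slot Encoding array of a base font, with unassigned slots pointing at `.notdef`.

```python
# Simulate how string bytes are resolved to glyph names through an
# Encoding array. The byte assignments below are made up for illustration.

NOTDEF = ".notdef"

def make_encoding(assignments):
    """Build a 256-slot Encoding array from {byte_code: glyph_name}."""
    encoding = [NOTDEF] * 256
    for code, name in assignments.items():
        if not 0 <= code <= 255:
            raise ValueError(f"byte code out of range: {code}")
        encoding[code] = name
    return encoding

def glyph_names(ps_string, encoding):
    """Map each byte of a PostScript string to the glyph name it selects."""
    return [encoding[b] for b in ps_string]

encoding = make_encoding({
    0x41: "A",
    0x80: "ohungarumlaut",  # glyph name for U+0151 (ő)
    0x81: "uhungarumlaut",  # glyph name for U+0171 (ű)
})

# 0x42 has no assignment here, so it falls through to .notdef.
print(glyph_names(b"A\x80\x81\x42", encoding))
```

The point is that the byte 0x80 has no inherent meaning; it draws ő only because this particular array says so.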

In general, for Unicode you will have to set up multiple Encoding arrays, and possibly split strings to switch between encodings[1], depending on how many distinct glyphs you need to draw.
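The splitting step could look roughly like the following Python sketch. It is an assumption about one simple way to do it (first come, first served slot assignment), not how any particular exporter works; `split_into_runs` and its `slots_per_encoding` parameter are hypothetical names.

```python
def split_into_runs(text, slots_per_encoding=256):
    """Assign each distinct character an (encoding_index, byte_code) slot,
    then split the text into runs that each stay within one encoding."""
    if not 1 <= slots_per_encoding <= 256:
        raise ValueError("an encoding holds between 1 and 256 slots")

    # Distinct characters get slots in order of first appearance.
    slot_of = {}
    for ch in text:
        if ch not in slot_of:
            slot_of[ch] = divmod(len(slot_of), slots_per_encoding)

    # Start a new run whenever the required encoding changes.
    runs = []
    for ch in text:
        enc_index, code = slot_of[ch]
        if runs and runs[-1][0] == enc_index:
            runs[-1][1].append(code)
        else:
            runs.append((enc_index, bytearray([code])))
    return slot_of, [(index, bytes(codes)) for index, codes in runs]

# A tiny slot count forces a split so the effect is visible.
slots, runs = split_into_runs("abcab", slots_per_encoding=2)
print(runs)
```

Each run would then be drawn with the font copy that carries the matching Encoding array. With the real limit of 256 slots, a short string such as "őúéáűó" fits in a single run.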

I am not sure how the encoding should properly be devised when generating PostScript. I know there are standard PostScript glyph names for Unicode characters, but they are not universally observed. For that reason I believe there is also an optional Unicode->glyph name table in some fonts.

At least PostScript does provide Type 0 and CID-Keyed fonts which can do arbitrary multi-byte mappings and so forth (they automate much of the above process, with an additional decoding step to allow multi-byte encodings). I don't fully understand them, however[2].

Yes, it's very ugly.

-mental

---

[1] The preferred method is to make multiple copies of a font object, give each its own associated Encoding array, and switch between those. See section 5.3 of the PostScript Language Reference Manual (Third Edition). Such copies are shallow copies, so it doesn't take a lot of extra memory.

[2] See sections 5.10 and 5.11 of the PLRM3.
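As a rough illustration of footnote [1], the Python sketch below generates PostScript that makes a shallow copy of a font dictionary and gives the copy its own Encoding array, following the usual copy-everything-except-FID idiom. The function name `reencode_font_ps`, the font names, and the byte assignments are hypothetical, and whether the base font actually contains the named glyphs depends on the font.

```python
def reencode_font_ps(base_font, new_name, assignments):
    """Return PostScript that copies base_font under new_name with a
    fresh Encoding array. assignments maps byte codes to glyph names."""
    lines = [
        f"/{base_font} findfont",
        "dup length dict begin",
        "  % shallow copy: every entry except FID goes into the new dict",
        "  { 1 index /FID ne { def } { pop pop } ifelse } forall",
        "  /Encoding 256 array def",
        "  0 1 255 { Encoding exch /.notdef put } for",
    ]
    for code, glyph in sorted(assignments.items()):
        if not 0 <= code <= 255:
            raise ValueError(f"byte code out of range: {code}")
        lines.append(f"  Encoding {code} /{glyph} put")
    lines += [
        "  currentdict",
        "end",
        f"/{new_name} exch definefont pop",
    ]
    return "\n".join(lines)

ps = reencode_font_ps(
    "Helvetica",          # assumed base font
    "Helvetica-Enc0",     # hypothetical name for the re-encoded copy
    {128: "ohungarumlaut", 129: "uhungarumlaut"},
)
print(ps)
```

One such copy per Encoding array, selected before each run of text, is the switching mechanism the footnote describes.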

Craig Bradney

8:35 a.m.

On Saturday 07 May 2005 08:45, MenTaLguY wrote:

...

http://partners.adobe.com/public/developer/opentype/index_glyph.html and http://partners.adobe.com/public/developer/opentype/index_glyph2.html might be of assistance here. In PDF, using the u12345 notation is only supported in Reader 6 and higher.
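For reference, the Unicode-based glyph naming convention ("uni" plus four uppercase hex digits, or "u" plus four to six) can be sketched in Python as below. The function name `unicode_glyph_name` is hypothetical, and a given font may well use a descriptive name such as "ohungarumlaut" instead, so this is only a fallback naming scheme, not a guarantee that the glyph exists.

```python
def unicode_glyph_name(ch):
    """Build a Unicode-derived glyph name for a single character."""
    cp = ord(ch)
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogate code points have no glyph name")
    if cp <= 0xFFFF:
        # Basic Multilingual Plane: "uni" + exactly four hex digits.
        return f"uni{cp:04X}"
    # Beyond the BMP: "u" + five or six hex digits.
    return f"u{cp:04X}"

for ch in "őű":
    print(ch, unicode_glyph_name(ch))
```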

Craig
