How are encodings and charset tables organized in a CFF font file?

Chapters 11 to 13 of the CFF specification give a rough description of how encoding and charset data are organized in a file. Here are some questions.

  1. Considering that a file may contain multiple fonts, and that charstrings are accessed on a per-font basis, glyph indices should be meaningful only within each font. However, is there at most one encoding and one charset table for the whole file? If so, how do their glyph indices correspond to the ones for each font's charstrings? If not, do they appear multiple times in the Top DICT from which they are accessed? (Resolved. See answer below.)

  2. It seems that charsets give a name to each glyph. What about encodings? What is the Card8 value stored in each array element? Given its limit of 256 values, wouldn't the encoding be very restricted? Why does the supplemental format identify glyphs by SID? What is the intended method for accessing glyphs through encodings (in a hybrid string/code way)? And why are these data given as strings for the predefined encodings?

Thanks

There are 2 answers

王凯越 Kaiyue Wang On

Here is an answer to question 1:

It is a mistake to think that there is only one Top DICT in a font file. The Top DICT data is stored in an INDEX structure, which may contain several Top DICTs, one for each font in the FontSet. The encoding and charset are therefore naturally defined per font. It is a little confusing that Name and Top DICT are not marked "per-font" in the specification's Data Layout. See Section 8.

The specification's description of the Top DICT INDEX reads: "This contains the top-level DICTs of all the fonts in the FontSet stored in an INDEX structure. Objects contained within this INDEX correspond to those in the Name INDEX in both order and number. Each object is a DICT structure that corresponds to the top-level dictionary of a PostScript font. A font is identified by an entry in the Name INDEX and its data is accessed via the corresponding Top DICT."
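To make the pairing between the Name INDEX and the Top DICT INDEX concrete, here is a minimal Python sketch that reads both INDEX structures from raw stand-alone CFF bytes and zips them together. The function names `read_index` and `fonts_in_fontset` are my own and purely illustrative. The sketch assumes the INDEX layout as I read it in the specification (a Card16 count, a Card8 offSize, count+1 offsets, then the object data) and does not decode the DICT contents.

```python
import struct


def read_index(data: bytes, pos: int):
    """Read one CFF INDEX at pos; return (objects, position just after the INDEX)."""
    (count,) = struct.unpack_from(">H", data, pos)  # Card16, big-endian
    pos += 2
    if count == 0:
        return [], pos  # an empty INDEX consists of the count field only
    off_size = data[pos]  # size in bytes of each offset
    pos += 1
    offsets = [
        int.from_bytes(data[pos + i * off_size : pos + (i + 1) * off_size], "big")
        for i in range(count + 1)
    ]
    pos += (count + 1) * off_size
    base = pos - 1  # offsets are relative to the byte preceding the object data
    objects = [data[base + offsets[i] : base + offsets[i + 1]] for i in range(count)]
    return objects, base + offsets[count]


def fonts_in_fontset(cff: bytes):
    """Pair each font name with the raw bytes of its own Top DICT."""
    hdr_size = cff[2]  # header: major, minor, hdrSize, offSize
    names, pos = read_index(cff, hdr_size)  # Name INDEX follows the header
    top_dicts, _ = read_index(cff, pos)     # Top DICT INDEX follows the Name INDEX
    if len(names) != len(top_dicts):
        raise ValueError("Name INDEX and Top DICT INDEX must have the same count")
    return list(zip(names, top_dicts))
```

Each Top DICT returned this way carries its own charset and Encoding operators, so the glyph indices in those tables refer to the CharStrings INDEX of that same font.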

The Ghost of Christmas Past On

I can respond from a practical point of view:

About #1, the question appears to be about multiple CFF FontSets, which is a non-starter in the context of OpenType/CFF fonts: only one CFF FontSet is allowed/recognized.

About #2, the encoding issue is also somewhat of a non-starter in the context of an OpenType/CFF font in that the encoding is imposed by the 'cmap' table.

In summary, stand-alone CFFs are effectively worthless, which nullifies any actual or perceived benefits of multiple CFF FontSets and built-in CFF encodings.
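For anyone who still needs to read the built-in encoding of a stand-alone CFF, the following Python sketch shows how I understand the encoding data described in the specification. In Format 0, each Card8 is the character code of the corresponding glyph, starting at glyph 1 because .notdef is omitted. Format 1 stores ranges of consecutive codes. If the high bit of the format byte is set, supplemental (code, SID) pairs follow, which give additional codes for glyphs identified by name. The function name `decode_encoding` is hypothetical, and the sketch assumes `pos` already points at the encoding data referenced from the font's Top DICT.

```python
def decode_encoding(data: bytes, pos: int = 0):
    """Decode a custom CFF encoding; return (code -> glyph index, [(code, SID), ...])."""
    fmt = data[pos]
    has_supplement = bool(fmt & 0x80)  # high bit flags supplemental data
    fmt &= 0x7F
    pos += 1
    code_to_gid = {}
    if fmt == 0:
        n_codes = data[pos]
        pos += 1
        for i in range(n_codes):
            code_to_gid[data[pos + i]] = i + 1  # glyph 0 (.notdef) is not listed
        pos += n_codes
    elif fmt == 1:
        n_ranges = data[pos]
        pos += 1
        gid = 1
        for _ in range(n_ranges):
            first, n_left = data[pos], data[pos + 1]  # first code, codes left in range
            pos += 2
            for code in range(first, first + n_left + 1):
                code_to_gid[code] = gid
                gid += 1
    else:
        raise ValueError(f"unsupported encoding format {fmt}")
    supplements = []
    if has_supplement:
        n_sups = data[pos]
        pos += 1
        for _ in range(n_sups):
            code = data[pos]
            sid = int.from_bytes(data[pos + 1 : pos + 3], "big")  # glyph name as SID
            supplements.append((code, sid))
            pos += 3
    return code_to_gid, supplements
```

A supplemental entry would then be resolved by looking up its SID in the font's charset to find the glyph index, which is why that part of the format uses names rather than indices. In an OpenType/CFF font none of this is consulted, since the 'cmap' table supplies the mapping.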