Text Rendering Notes

From Inkscape Wiki

A summary of observations about text rendering!

This page is the result of an ongoing process of planning a refactor of Inkscape's text handling.

General Comments

Our software stack relies on the FreeType, HarfBuzz, Pango, and Cairo libraries for rendering text.

  • FreeType: Access to font internals.
  • HarfBuzz: Converts characters to glyphs (i.e. shaping).
  • Pango: Determines best fonts to render text for given style and characters.
  • Cairo: Rendering of glyphs.

The boundaries between these libraries are somewhat murky, and one can often do the same thing using routines from different libraries.

HarfBuzz is under active development and is used by many major pieces of software. FreeType has some development. Pango and Cairo are poorly maintained at the moment. See https://blogs.gnome.org/mclasen/2019/05/25/pango-future-directions/.

Changes for Inkscape

There are a number of changes we should make to Inkscape's code even if it is not completely rewritten.

Font Hash

Currently Inkscape uses a custom hash of the PangoFontDescription to index fonts in a font map. Pango has a built-in hash, which we should probably use instead. Pango's hash includes "gravity" (glyph orientation, important for vertical text where, for example, the letter 'A' can be rendered upright or sideways), which is missing from our custom hash.
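A minimal sketch in Python (not Inkscape's code) of why the key matters. `FontKey` and its fields are hypothetical stand-ins for a font description; the only point is that a key omitting gravity makes two otherwise identical descriptions collide in the map.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FontKey:
    """Hypothetical stand-in for the fields of a font description."""
    family: str
    style: str
    weight: int
    gravity: str  # glyph orientation, e.g. "south" or "east"


def key_without_gravity(key: FontKey) -> tuple:
    # Mimics a custom hash that ignores gravity.
    return (key.family, key.style, key.weight)


gravity_south = FontKey("DejaVu Sans", "normal", 400, "south")
gravity_east = FontKey("DejaVu Sans", "normal", 400, "east")

# Keyed on the full description, the two variants get separate entries.
full_map = {gravity_south: "instance A", gravity_east: "instance B"}

# Keyed without gravity, the second insertion overwrites the first.
lossy_map = {}
for key, value in ((gravity_south, "instance A"), (gravity_east, "instance B")):
    lossy_map[key_without_gravity(key)] = value
```

With the lossy key, a lookup for either orientation returns whichever instance was stored last.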

Use Clusters

When adjusting glyphs, especially in vertical-upright text, we should move glyphs in the same cluster by the same amount.

A cluster is a sequence of characters that needs to be treated as a single, indivisible unit for text layout (see https://harfbuzz.github.io/clusters.html). For example, an 'a' with a 'grave' accent can (but need not) be composed of two separate glyphs, which would then belong to the same cluster. Complex scripts (such as South Asian scripts) can have a many-character to many-glyph mapping within one cluster.

Currently Inkscape does some limited guessing as to which glyphs belong together (basically just that non-spacing marks are considered to belong to the preceding glyphs).
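As a rough illustration of moving glyphs by cluster instead of guessing, the sketch below shifts every glyph that shares a cluster value by the same amount. The `Glyph` record and the numbers are hypothetical; a real implementation would take the cluster values from the shaper output.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    glyph_id: int
    cluster: int  # cluster value reported by the shaper
    x: float
    y: float


def shift_cluster(glyphs, cluster, dx, dy):
    """Move every glyph belonging to `cluster` by the same (dx, dy)."""
    for glyph in glyphs:
        if glyph.cluster == cluster:
            glyph.x += dx
            glyph.y += dy


glyphs = [
    Glyph(glyph_id=68, cluster=0, x=0.0, y=0.0),   # base letter
    Glyph(glyph_id=300, cluster=0, x=0.2, y=0.7),  # accent in the same cluster
    Glyph(glyph_id=69, cluster=1, x=1.0, y=0.0),   # next letter, other cluster
]

# The base letter and its accent move together; the next letter stays put.
shift_cluster(glyphs, cluster=0, dx=0.0, dy=-0.5)
```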

Clusters can also help with positioning the cursor so that one can indicate which character within a cluster is subject to editing. For example, if 'ffi' is represented by one glyph and thus one cluster, one can position the cursor before the cluster (before the first 'f'), one-third of the way into the cluster (between the two 'f's), two-thirds of the way into the cluster (between the second 'f' and the 'i'), or at the end of the cluster. In this case, the cursor position closely indicates which character is being edited. With complex scripts this might not be the case, but the cursor will still give a visual indication that one is "moving" through the cluster.
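The 'ffi' example can be sketched as a simple interpolation. The function name and arguments below are illustrative; the sketch assumes the cluster's start position and total advance are already known from shaping.

```python
def cursor_positions(cluster_start, cluster_advance, n_chars):
    """Return the n_chars + 1 cursor positions for one cluster.

    The cursor advances by an equal fraction of the cluster's advance
    for each character, regardless of how many glyphs the cluster has.
    """
    if n_chars < 1:
        raise ValueError("a cluster contains at least one character")
    return [cluster_start + cluster_advance * i / n_chars
            for i in range(n_chars + 1)]


# An 'ffi' ligature: one glyph, three characters, advance of 6 units.
positions = cursor_positions(cluster_start=10.0, cluster_advance=6.0, n_chars=3)
```

Here `positions` holds four values: before the first 'f', between the two 'f's, between the second 'f' and the 'i', and after the cluster.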

HarfBuzz includes some functions to reverse the order of glyphs/clusters. These might be useful, as SVG dictates that glyphs should be drawn in character order while the shaper returns glyphs in visual order (which is the opposite for right-to-left scripts).
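A sketch of what such a reversal has to do, written in plain Python for illustration (HarfBuzz's hb_buffer_reverse_clusters() appears to serve a similar purpose, but this is not its implementation). The glyph names are made up. The order of clusters is reversed while the glyphs inside each cluster keep their relative order.

```python
from itertools import groupby


def reverse_clusters(glyphs):
    """Reverse cluster order, keeping glyph order inside each cluster.

    `glyphs` is a list of (glyph_name, cluster) pairs in which glyphs of
    the same cluster are adjacent, as in shaper output.
    """
    groups = [list(group) for _, group in groupby(glyphs, key=lambda g: g[1])]
    return [glyph for group in reversed(groups) for glyph in group]


# Visual order for right-to-left text: the last character comes first.
visual = [("C", 2), ("B_base", 1), ("B_mark", 1), ("A", 0)]
logical = reverse_clusters(visual)
```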

Remove USE_PANGO_WIN32

The use of USE_PANGO_WIN32 (directly using Windows fonts) was disabled in 2011. Some functionality (e.g. variable fonts) isn't implemented for PANGO_WIN32.

Remove Support for Bitmap Fonts

Pango dropped support for bitmap fonts (and Type 1 fonts) in version 1.44. (Bitmap glyphs are still supported inside OpenType fonts.)

General Text Layout Algorithm

  1. Add text with the same style and SVG positioning attributes as spans.
  2. Use Pango to itemize spans to find appropriate fonts for each item in a span. (An item is a group of characters that share the same font face, which can differ within a span if glyphs are missing in the nominal font face.)
  3. Shape the Pango items to determine which glyphs with positions should be used to render the text. Either Pango or HarfBuzz can be used (Pango uses HarfBuzz under the hood).
  4. Create Character, Glyph, Cluster mappings. These will be needed for text editing. Use a specialized iterator to walk through maps.
  5. Lay out items in the allocated space, determining glyph positions. Must handle SVG kerning, text-length attributes, white-space (spaces, tabs, line-returns), etc.
  6. Apply SVG positioning attributes.
  7. Render glyphs. We use FreeType to extract out glyph paths. We could use Cairo directly but as we need access to the paths when converting text to path, it doesn't reduce code.
  8. Determine cursor position (with link via iterator into Character, Glyph, and Cluster maps). Cursor should use cluster position with fractional adjustment based on character number within cluster.

Our code gets the PangoFontDescription from PangoItem.PangoAnalysis->PangoFont and uses it as an argument for our FontFactory::Face(). We save a copy of the PangoItem in our own PangoItemInfo structure before the PangoItem GList is destroyed. It is not clear whether the PangoFont still exists afterwards (PangoItem contains a PangoAnalysis, but PangoAnalysis contains only a pointer to the PangoFont).

FontFactory::Face() takes the PangoFontDescription, sets a fixed size, and uses it to search for an already existing font_instance. If none is found, it calls pango_font_map_load_font() to get the PangoFont corresponding to the description and creates a new font_instance for the face. The new instance calls InstallFace(PangoFont); if this fails, it tries to load a generic sans-serif face. Once the font_instance is found or created, font_instance->InitTheFace() is called. InitTheFace() finds the underlying FreeType font and reads various OpenType tables. InitTheFace() is called from a variety of functions. Its functionality should probably be handled when constructing a font_instance.
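The lookup flow described above (find an existing instance, otherwise load, otherwise fall back to sans-serif) can be sketched as follows. This is a simplified Python model, not the FontFactory code: `FontCache`, `fake_loader`, and the string descriptions are all hypothetical, and the instance is fully initialized at construction time, as suggested above.

```python
class FontCache:
    """Toy model of a font-instance cache keyed by description."""

    def __init__(self, load_font):
        # load_font is a hypothetical callable: description -> font or None.
        self._load_font = load_font
        self._instances = {}

    def face(self, description):
        instance = self._instances.get(description)
        if instance is None:
            font = self._load_font(description)
            if font is None:
                # Loading failed: fall back to a generic sans-serif face.
                font = self._load_font("sans-serif")
            # All initialization happens here, once, at construction.
            instance = {"description": description, "font": font}
            self._instances[description] = instance
        return instance


def fake_loader(description):
    known = {"DejaVu Sans": "dejavu-font", "sans-serif": "generic-font"}
    return known.get(description)


cache = FontCache(fake_loader)
first = cache.face("DejaVu Sans")
second = cache.face("DejaVu Sans")   # served from the cache
fallback = cache.face("No Such Font")
```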

Shaping

Mapping characters into glyphs. This includes determining relative offsets between glyphs that belong to the same cluster.

HarfBuzz vs. Pango Coordinates

  • HarfBuzz: y points upward.
  • Pango: y points downward for horizontal text, to the right for vertical text. (Pango handles vertical text as if it was horizontal. The user must rotate the result.)

To translate between HarfBuzz and Pango shaping output:

  • Horizontal text (and vertical sideways text): invert y directions.
  • Vertical text: dy (HarfBuzz) = dx (Pango) - width (Pango); dx (HarfBuzz) = dy (Pango)
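The two rules above can be written as one small function. This is a sketch that only restates the formulas; the function name is illustrative, and all inputs are assumed to be in the same units.

```python
def pango_to_harfbuzz(dx, dy, width, vertical):
    """Translate a Pango glyph offset into HarfBuzz conventions.

    Horizontal (and vertical sideways) text: only the y direction flips.
    Vertical text: dx(HarfBuzz) = dy(Pango),
                   dy(HarfBuzz) = dx(Pango) - width(Pango).
    Returns (dx, dy) in HarfBuzz conventions.
    """
    if vertical:
        return (dy, dx - width)
    return (dx, -dy)


horizontal_offset = pango_to_harfbuzz(2.0, 3.0, 10.0, vertical=False)
vertical_offset = pango_to_harfbuzz(2.0, 3.0, 10.0, vertical=True)
```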

Shaping to rendering

  • East -> Vertical text with the glyphs oriented base down.
  • South -> Vertical text with the glyphs oriented base to the left (sideways).

Pango to Cairo glyph

Direction    offset_x     offset_y     advance
East         -dy          width - dx   y (width)
South        -dy          dx           y (width)
Horizontal   dx           dy           x (width)

HarfBuzz to Cairo glyph

Direction    offset_x     offset_y     advance
East         dx           -dy          y (y_advance)
South        dy           dx           y (x_advance)
Horizontal   dx           -dy          x (x_advance)

Pango to FreeType glyph

Direction    offset_x     offset_y     advance
East         width - dx   -dy          x (width)
South        dx           -dy          x (width)
Horizontal   dx           -dy          x (width)
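The three tables above can be encoded directly. The sketch below is a literal transcription into Python, with hypothetical function names. Each function returns (offset_x, offset_y, advance_axis, advance), where advance_axis says along which axis the advance is applied.

```python
def pango_to_cairo(direction, dx, dy, width):
    table = {
        "east": (-dy, width - dx, "y", width),
        "south": (-dy, dx, "y", width),
        "horizontal": (dx, dy, "x", width),
    }
    return table[direction]


def harfbuzz_to_cairo(direction, dx, dy, x_advance, y_advance):
    table = {
        "east": (dx, -dy, "y", y_advance),
        "south": (dy, dx, "y", x_advance),
        "horizontal": (dx, -dy, "x", x_advance),
    }
    return table[direction]


def pango_to_freetype(direction, dx, dy, width):
    table = {
        "east": (width - dx, -dy, "x", width),
        "south": (dx, -dy, "x", width),
        "horizontal": (dx, -dy, "x", width),
    }
    return table[direction]


# Example: an upright glyph in vertical text, shaped by Pango.
cairo_glyph = pango_to_cairo("east", dx=2.0, dy=3.0, width=10.0)
```

An unknown direction raises KeyError, which keeps a misspelled direction from silently producing a wrong offset.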

Notes:

  • Cannot directly use FreeType font metrics for the advance, as this fails for vertical text if the font does not contain vertical metrics (e.g. for non-spacing marks).
  • PangoLayout and pango-view get font-metrics from FreeType which doesn't handle vertical text. Avoid!
  • HarfBuzz visually centers glyphs in vertical layout.
  • Pango logical_rect and hb_font_extents use "Win Ascent" and "Win Descent" values (FontForge OS/2 Metrics Tab), which can be greater than the em-size height. (Windows will clip rendering to this region.) Super-confusing! Don't use Pango logical_rect!
  • For vertical text:
ink_rect.width = -hb_glyph_extents.height
ink_rect.height = hb_glyph_extents.width
  • hb-view can choose between using FreeType and reading font metrics directly (so-called OpenType functions). There are small differences. In particular:
    • With FreeType, vertical upright glyphs are vertically centered within the "em-box".
    • With OpenType, vertical upright glyphs are positioned so their ink-rect is at the alignment point (top of "em-box").
  • HarfBuzz has some baseline API but it's not clear how useful it is.
  • Shaping returns glyphs in visual order (left-to-right). For right-to-left text we must reverse them (SVG dictates glyphs are to be drawn in character order). HarfBuzz has some functions to reverse order that may be useful.
  • Proper shaping requires knowledge of characters before and after those being shaped. Both Pango and HarfBuzz have mechanisms for this (e.g. pango_shape_full()).

Glyph Selection

To do.

Font Features

To do.

Font Metrics

Ascent and Descent

Locating the ascent and descent of a font is important for laying out text. (See http://tavmjong.free.fr/blog/?p=1632.) Pango, HarfBuzz, and FreeType all have methods to extract out this data from a font.

OpenType has two tables that contain these values:

  • OS/2 (Windows)
    • sTypoAscender: Distance from baseline to top of em box (the OpenType spec no longer requires ascender plus descender to equal the em-box height).
    • sTypoDescender: Distance from baseline to bottom of em box, normally negative.
    • sTypoLineGap: Default spacing between lines (also called leading). Ignored by DTP software including Inkscape.
    • usWinAscent: Used for clipping in Windows.
    • usWinDescent: Used for clipping in Windows.
    • sxHeight: Nominally the distance between the baseline and the top of the letter 'x'.
    • sCapHeight: Nominally the distance between the baseline and the top of capital letters like 'X'.
  • HHEA (Mac)
    • ascent: Nominally, the same as usWinAscent.
    • descent: Nominally, the same as usWinDescent.
    • lineGap: Nominally, the default spacing between lines.

CSS dictates that one should use the OS/2 sTypoAscender and sTypoDescender values if available, falling back to the HHEA Ascent and Descent values if missing. See https://www.w3.org/TR/CSS2/visudet.html#sTypoAscender. See also the "The Webfont Strategy (2019)" section of https://glyphsapp.com/learn/vertical-metrics.

Fonts can request that the Typo values be given preference, but popular fonts like DejaVu Sans do not set the request flag. Pango::FontMetrics::get_ascent() and get_descent(), and hb_font_get_extents_for_direction() return the HHEA numbers if the flag is not set. To ensure that we use the Typo values we must resort to using FreeType to read the 'OS/2' table (with fallback to the HHEA table).
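The selection rule can be sketched as below. This assumes the raw table values have already been read from the font (for example via FreeType); the `VerticalMetrics` record and the sample numbers are illustrative only, not measured from any particular font.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VerticalMetrics:
    typo_ascender: Optional[int]   # OS/2 sTypoAscender, None if no OS/2 table
    typo_descender: Optional[int]  # OS/2 sTypoDescender, normally negative
    hhea_ascent: int
    hhea_descent: int              # normally negative
    units_per_em: int


def css_ascent_descent(metrics, font_size):
    """Prefer the OS/2 typo values, falling back to the hhea values.

    Returns (ascent, descent) as positive distances from the baseline,
    scaled from design units to `font_size`.
    """
    if metrics.typo_ascender is not None and metrics.typo_descender is not None:
        ascent, descent = metrics.typo_ascender, metrics.typo_descender
    else:
        ascent, descent = metrics.hhea_ascent, metrics.hhea_descent
    scale = font_size / metrics.units_per_em
    return (ascent * scale, -descent * scale)


with_os2 = VerticalMetrics(1556, -492, 1901, -483, 2048)
without_os2 = VerticalMetrics(None, None, 1901, -483, 2048)

typo_based = css_ascent_descent(with_os2, font_size=2048)
hhea_based = css_ascent_descent(without_os2, font_size=2048)
```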

x-height

The distance between the alphabetic baseline and the top of the 'x' glyph is known as the x-height and is one of the fundamental units of SVG. Fonts can set the sxHeight value in the OS/2 table (since version 2). If a font (e.g. DejaVu Sans) does not set the value, one can attempt to measure the value using glyph data. HarfBuzz and FreeType have methods to easily get a glyph's bounding box. Pango could do it but in a convoluted way (requiring itemizing).
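A sketch of the fallback described above. It assumes that an unset sxHeight shows up as zero or as a missing value (an assumption, not something a font guarantees), and that the top of the 'x' glyph's bounding box has already been measured in design units by HarfBuzz or FreeType. The function and argument names are hypothetical.

```python
def x_height(sx_height, x_glyph_bbox_top, units_per_em, font_size):
    """Return the x-height scaled to `font_size`.

    sx_height: OS/2 sxHeight in design units, or None/0 if the font does
        not set it (assumed convention for "unset").
    x_glyph_bbox_top: measured top of the 'x' glyph's bounding box,
        in design units, used as the fallback.
    """
    design_units = sx_height if sx_height else x_glyph_bbox_top
    return design_units * font_size / units_per_em


from_table = x_height(1120, 1118, units_per_em=2048, font_size=2048)
measured = x_height(0, 1118, units_per_em=2048, font_size=1024)
```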

Baselines

With Latin scripts, glyphs are normally aligned to the alphabetic baseline. South Asian scripts usually use a hanging baseline. Japanese and Chinese use a center baseline in vertical layouts. SVG allows one to specify which baseline should be used. Mathematical equations can use a mathematical baseline. OpenType has a BASE table for baseline values, which can be read by HarfBuzz; neither FreeType nor Pango can read it.

If the values are missing in a font, one can try to measure the values by looking at the ink rectangle (bounding box) of specific glyphs (for example the minus sign can be used to define the mathematical baseline). Both FreeType and HarfBuzz can relatively easily be used to find the ink rectangle of a specific glyph. Pango could but it's complicated (one would need to itemize some text first).

Underline and Strike-Through

Fonts can define the position and width of underlines and strike-throughs. Underline is defined in the post (PostScript) table while strike-through is defined in the OS/2 table. (FontForge handles Underline on the General page while Strikeout is under the OS/2 page's Sub/Super tab.)

FreeType has access to these values in the appropriate tables. Pango also gives access to them (Pango::FontMetrics). HarfBuzz does not give access to these values.

Slant

Inkscape uses the slant of italic fonts to match the text cursor to the text slope. The slant is stored in the hhea and vhea OpenType tables. FreeType can access these tables; Pango and HarfBuzz cannot. The post table also has an italicAngle value, accessible by FreeType.
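As a sketch of how a cursor angle could be derived from these values: in the OpenType specification the hhea table stores the slope as caretSlopeRise and caretSlopeRun, and post.italicAngle is given in degrees counter-clockwise from vertical (so it is negative for text leaning to the right). Those field names come from the specification, not from this page, and the function names are illustrative.

```python
import math


def slant_from_caret(caret_slope_rise, caret_slope_run):
    """Cursor slant in degrees clockwise from vertical (hhea values).

    An upright font has rise 1 and run 0, giving a slant of 0.
    """
    return math.degrees(math.atan2(caret_slope_run, caret_slope_rise))


def slant_from_italic_angle(italic_angle):
    """Same convention, derived from post.italicAngle.

    italicAngle is counter-clockwise from vertical, so the sign flips.
    """
    return -italic_angle


upright = slant_from_caret(1, 0)
leaning = slant_from_caret(1000, 213)        # illustrative values
from_post = slant_from_italic_angle(-12.0)   # leans 12 degrees to the right
```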

Terminology

Face or Font Face
A collection of glyphs with the same style (family, slant, weight, etc.) used to render text.
Font
A font face with a specified size.
Em Box
Nominally a box with the width and height equivalent to the width of the 'M' glyph. The 'em' box is scaled to the CSS 'font-size'. It is the basic unit in font design, usually 1000 (OpenType) or 1024/2048 (TrueType) designer units in width/height.
Baseline
The point at which glyphs are aligned along a line of text. For horizontal text, this is usually the Alphabetic baseline, which is at the bottom of most capital letters. Vertical text usually uses a Center baseline. Alternative baselines include the "Hanging" baseline, used by South Asian scripts, and the "Math" baseline. Baselines are important when mixing different sized fonts.
Ascent/Ascender
The distance from the alphabetic baseline to the top of the font. What is meant by the top of the font depends on context. In CSS, it is the top of the 'em' box. Fonts may contain ascent values that differ in meaning. In particular, the OS/2 table value usWinAscent is used on Windows to clip rendering. The OpenType specification used to require that the ascent (sTypoAscender) + descent (sTypoDescender) equal the em box height. It no longer does so. See: https://glyphsapp.com/learn/vertical-metrics. CSS recommends using the OpenType sTypoAscender value. See https://www.w3.org/TR/CSS2/visudet.html#sTypoAscender.
Descent/Descender
The distance from the alphabetic baseline to the bottom of the font. See Ascent.
Gap/Line spacing
The extra space between lines of text. Fonts may contain information on the recommended spacing (OS/2 sTypoLineGap). This is not used by CSS.
x-height
The distance from the baseline to the top of the lower-case 'x' glyph (or equivalent).

Future Section