The Structure and Interpretation of Fonts

Because we have chosen to generate a bitmap representation of the font, we could simply iterate over the neighborhood of each texel, performing a minimizing search as we encounter texels on the other side of the edge. To even approach tractability, we must restrict the search to a reasonably sized area. A reasonable heuristic for the search distance is half the average stroke width of the font. For example, if a typical stroke is 20 texels wide, the search distance is 10, since texels at a distance of more than 10 outside the glyph are unlikely candidates for influencing the way the glyph is rendered. Therefore, we would conduct a search on the neighborhood that extends 10 texels in every direction around each texel.
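The brute-force search described above can be sketched in a few lines of C++. This is a minimal illustration rather than the code from the sample project: the function name `bruteForceSDF`, the `radius` parameter, the one-byte-per-texel bitmap layout, and the sign convention (negative inside the glyph) are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Brute-force signed distance for a binary glyph bitmap (hypothetical helper).
// inside[y * width + x] is nonzero for texels covered by the glyph.
// radius is the search distance, e.g. half the average stroke width.
// Distances are clamped to [-radius, +radius]; negative means inside the
// glyph (this sign convention is an assumption of the sketch).
std::vector<float> bruteForceSDF(const std::vector<uint8_t>& inside,
                                 int width, int height, int radius)
{
    std::vector<float> field(inside.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const bool self = inside[y * width + x] != 0;
            float best = static_cast<float>(radius);
            // Examine every texel within `radius` of (x, y).
            for (int dy = -radius; dy <= radius; ++dy) {
                for (int dx = -radius; dx <= radius; ++dx) {
                    const int nx = x + dx;
                    const int ny = y + dy;
                    if (nx < 0 || ny < 0 || nx >= width || ny >= height) {
                        continue; // texels beyond the bitmap are ignored
                    }
                    const bool other = inside[ny * width + nx] != 0;
                    if (other != self) {
                        // A texel on the other side of the edge: keep the minimum.
                        best = std::min(best, std::hypot(float(dx), float(dy)));
                    }
                }
            }
            field[y * width + x] = self ? -best : best;
        }
    }
    return field;
}
```

The cost grows with the square of the search radius for every texel, which is why bounding the radius matters so much for this approach.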

According to Green, the brute-force approach was suitably fast for creating distance fields for text and vector artwork on the class of workstations used in the production of Team Fortress 2 (TF2). Signed-distance fields have broad applicability, however, and faster ways of generating them have been the subject of much research. One such algorithm sweeps across the image, propagating distances from texel to texel instead of searching around each one. At each step, the distance value is determined as the minimum distance value over some mask surrounding the center texel, plus the distance along the vector to the nearest previously-discovered edge. Without going into all the details of this algorithm, it is worth noting that it is drastically faster than the brute-force approach.
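To give a feel for how mask-based propagation works, here is a sketch of a plain two-pass chamfer distance transform in C++. It is a simpler relative of the algorithm described above, not that algorithm itself: it tracks only distances (with assumed step costs of 1 and the square root of 2) rather than the vector to the nearest edge, so it is less accurate. It also computes only the unsigned distance from each texel to the glyph. The name `chamferDistance` is hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Two-pass chamfer distance transform (illustrative sketch).
// Returns, for every texel, an approximate distance to the nearest texel
// that is inside the glyph; texels inside the glyph get 0.
std::vector<float> chamferDistance(const std::vector<uint8_t>& inside,
                                   int width, int height)
{
    const float kFar = 1e9f;                 // stands in for "not yet reached"
    const float kDiag = std::sqrt(2.0f);     // cost of a diagonal step
    std::vector<float> d(inside.size());
    for (std::size_t i = 0; i < inside.size(); ++i) {
        d[i] = inside[i] ? 0.0f : kFar;
    }

    // Lower d(x, y) if going through the neighbor at (x+dx, y+dy) is shorter.
    auto relax = [&](int x, int y, int dx, int dy, float cost) {
        const int nx = x + dx;
        const int ny = y + dy;
        if (nx < 0 || ny < 0 || nx >= width || ny >= height) return;
        float& cur = d[y * width + x];
        cur = std::min(cur, d[ny * width + nx] + cost);
    };

    // Forward pass: top-left to bottom-right, using already-visited neighbors.
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            relax(x, y, -1,  0, 1.0f);
            relax(x, y,  0, -1, 1.0f);
            relax(x, y, -1, -1, kDiag);
            relax(x, y,  1, -1, kDiag);
        }
    }
    // Backward pass: bottom-right to top-left, using the mirrored mask.
    for (int y = height - 1; y >= 0; --y) {
        for (int x = width - 1; x >= 0; --x) {
            relax(x, y,  1, 0, 1.0f);
            relax(x, y,  0, 1, 1.0f);
            relax(x, y,  1, 1, kDiag);
            relax(x, y, -1, 1, kDiag);
        }
    }
    return d;
}
```

Each texel is visited only twice with a constant-size mask, so the work is linear in the number of texels and independent of any search radius. A signed field could be obtained by running the transform a second time on the inverted bitmap and combining the two results.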

Although I have not implemented this algorithm with a compute shader, I strongly suspect it could be made faster still with a GPU-based implementation.

Once we have the signed-distance field representation of a font, we need a way to transform it into glyphs on the screen. The first part of this process is, again, using the Core Text layout engine to tell us which glyphs should be rendered, and how they should be positioned. We use a CTFramesetter object to lay out the text in a chosen rectangle.

The framesetting process produces an array of CTLine objects, each containing a series of glyphs. To construct a text mesh for rendering with Metal, we enumerate the glyphs provided by the Core Text framesetter, which gives us their screen-space coordinates and an index into the table of texture coordinates we constructed earlier when building the font atlas texture. These two pieces of data allow us to create an indexed triangle mesh representing the text string.
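The mesh construction step can be sketched independently of Core Text. The C++ below assumes the layout pass has already produced, for each glyph, a screen-space bounding rectangle and an index into the atlas texture-coordinate table. The types `Rect`, `Vertex`, `PlacedGlyph`, and `TextMesh` and the function `buildTextMesh` are hypothetical names for this example, and the mapping of rectangle corners to texture corners assumes both use the same vertical orientation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rect   { float minX, minY, maxX, maxY; };
struct Vertex { float x, y, s, t; };           // position and texture coordinate

struct PlacedGlyph {
    Rect bounds;              // screen-space rectangle from the layout pass
    std::size_t atlasIndex;   // index into the atlas texture-coordinate table
};

struct TextMesh {
    std::vector<Vertex> vertices;
    std::vector<uint32_t> indices;
};

// Emits one quad (four vertices, two triangles) per glyph.
TextMesh buildTextMesh(const std::vector<PlacedGlyph>& glyphs,
                       const std::vector<Rect>& atlasTexCoords)
{
    static const uint32_t kQuad[6] = {0, 1, 2, 0, 2, 3};
    TextMesh mesh;
    mesh.vertices.reserve(glyphs.size() * 4);
    mesh.indices.reserve(glyphs.size() * 6);
    for (const PlacedGlyph& g : glyphs) {
        const Rect& uv = atlasTexCoords[g.atlasIndex];
        const uint32_t base = static_cast<uint32_t>(mesh.vertices.size());
        // Corners are listed in a consistent order around the quad.
        mesh.vertices.push_back(Vertex{g.bounds.minX, g.bounds.maxY, uv.minX, uv.maxY});
        mesh.vertices.push_back(Vertex{g.bounds.minX, g.bounds.minY, uv.minX, uv.minY});
        mesh.vertices.push_back(Vertex{g.bounds.maxX, g.bounds.minY, uv.maxX, uv.minY});
        mesh.vertices.push_back(Vertex{g.bounds.maxX, g.bounds.maxY, uv.maxX, uv.maxY});
        for (uint32_t offset : kQuad) {
            mesh.indices.push_back(base + offset);
        }
    }
    return mesh;
}
```

Sharing the four corner vertices between the two triangles of each quad is what makes the mesh indexed: a string of N glyphs needs 4N vertices and 6N indices.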

This mesh can then be rendered in the usual fashion, with a fragment shader doing the heavy lifting of transforming from the signed-distance field texture to an appropriate color for each pixel. When drawing text onto the screen, we use an orthographic, or parallel, projection.

This kind of projection flattens the mesh to the near viewing plane without introducing the foreshortening inherent in a perspective projection. When rendering UI elements, it is convenient to select an orthographic projection whose extents match the native resolution of the screen. The projection is parameterized by l, r, t, b, n, and f, which are the left, right, top, bottom, near, and far clipping plane values. The near and far planes are assumed to sit at z = 0 and z = 1, respectively.

The vertex function is simple: each vertex of the text mesh is transformed by a model matrix (which can be used to position and scale the text) and a combined view-projection matrix, which is simply the orthographic projection matrix discussed above. The fragment function, on the other hand, is a little more involved. We apply antialiasing at the edges of the glyphs by interpolating from opaque to translucent in a narrow band around the edge. The width of this band is computed per-pixel by finding the length of the gradient of the distance field using the built-in dfdx and dfdy functions.

We then use the smoothstep function, which transitions from 0 to 1 across the width of this smoothed edge, using the sampled distance value itself as the final parameter. This produces an edge band that is roughly one pixel wide, regardless of how much the text is scaled up or down. Together, these steps make up the complete fragment function for rendering a glyph from a signed-distance field representation.
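The edge computation described above can be modeled on the CPU to see how the pieces fit together. The C++ below is a sketch of the math only, not the Metal fragment function: `ddx` and `ddy` stand in for the values that dfdx and dfdy would return for the sampled distance, and the `smoothstep` helper mirrors the shader built-in of the same name. The edge threshold of 0.5, the 0.75 scale on the gradient length, the convention that larger samples are further inside the glyph, and the name `glyphOpacity` are all assumptions of this example.

```cpp
#include <algorithm>
#include <cmath>

// Hermite interpolation from 0 to 1 as x moves from edge0 to edge1.
float smoothstep(float edge0, float edge1, float x)
{
    float t = (x - edge0) / (edge1 - edge0);
    t = std::min(std::max(t, 0.0f), 1.0f);
    return t * t * (3.0f - 2.0f * t);
}

// sampleDistance: value read from the distance field texture, normalized so
//                 that 0.5 lies on the glyph edge (assumed encoding).
// ddx, ddy:       screen-space derivatives of sampleDistance.
// Returns the opacity to use as the alpha component of the output color.
float glyphOpacity(float sampleDistance, float ddx, float ddy)
{
    const float edgeDistance = 0.5f;
    // The gradient length says how fast the field changes per pixel, so the
    // band stays about one pixel wide at any scale. The tiny floor avoids a
    // zero-width band where the field is locally flat.
    const float edgeWidth = std::max(0.75f * std::hypot(ddx, ddy), 1e-6f);
    return smoothstep(edgeDistance - edgeWidth,
                      edgeDistance + edgeWidth,
                      sampleDistance);
}
```

When the text is magnified, the field changes slowly from pixel to pixel, the derivatives shrink, and the band narrows in distance-field units while staying about the same width on screen.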

Note that we return a color that has an alpha component, so the pipeline state we use should have alpha blending enabled in order for the text to properly blend with the geometry behind it. This also implies that the text should be drawn after the rest of the scene geometry.

The sample project for this post renders a paragraph of text that can be zoomed and panned in real time. Interactivity is achieved with a pair of UIGestureRecognizers.

Notice how the edges of the glyphs remain quite sharp even under extreme magnification, in contrast to the way a pre-rasterized bitmap texture would become jagged or blurry under magnification.

In this post we have considered a handful of ways to draw text with the GPU. We took a deep look at the use of signed-distance fields to increase the fidelity of rendered text while keeping texture memory requirements to a minimum.

Rendering Text in Metal with Signed-Distance Fields

