Fonts and Typesetting

One of the seemingly simplest ways to convey information to a player is to use text. However, it turns out that fonts and text rendering are -- or, at least, can be -- incredibly complex.

Bitmap and fixed-width fonts: everything is simple.

In the middle, things are tricky.

Big fonts: everything is simple again.

Text Data

Before even talking about font rendering, let's talk about how you might store strings. There are a lot of characters out there -- hundreds from alphabetic scripts (e.g., Latin, Greek) and thousands from logographic scripts (e.g., hanzi). In the past, there have been many, many encoding standards (e.g., ASCII) for mapping the values of bytes to these letters.

These days, the encoding standard of choice is Unicode, which maps 136,755 characters (and has space for more, allowing up to 2^21). So, how do you store this data?

UTF-32

The (seemingly) most straightforward way to do this would be to use a 32-bit integer for every character. This allows each code point to be easily addressed, makes computing the lengths of strings easy, and -- admittedly -- wastes a lot of bits.

I say "seemingly" because UTF-32 doesn't seem to have much text-editor support. Also, care must be taken so that big-endian and little-endian systems do not interpret files differently.
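
As a sketch -- assuming the encoded string is just a std::vector<uint32_t> -- encoding, indexing, and length are all trivial:

#include <cstdint>
#include <vector>

std::vector<uint32_t> output; //UTF-32: one 32-bit unit per code point

void encode_utf32(uint32_t code_point) {
	output.push_back(code_point); //no special cases
}
//the i-th character is output[i]; the length in code points is output.size()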

UTF-16

A compromise option might instead be to use 16-bit values. In this encoding, values in the ranges 0x0000-0xD7FF and 0xE000-0xFFFF map directly to code points. The 0xD800-0xDFFF range is used to encode larger values using "surrogate pairs":

uint32_t code_point = /* some value */;
if (code_point < 0x10000) { //single value:
	assert(code_point < 0xD800 || code_point > 0xDFFF); //these code points are reserved
	output.push(code_point);
} else { //surrogate pair:
	code_point -= 0x10000;
	assert(code_point < 0x100000); //must be a 20-bit number
	output.push((code_point >> 10) + 0xD800); //high surrogate
	output.push((code_point & 0x3ff) + 0xDC00); //low surrogate
}

Note that, in addition to needing to worry about endianness (there is a special "byte order mark" character for this), our code can't figure out string lengths by counting bytes any more. A precursor to UTF-16 (UCS-2) is widely used as an encoding on Windows. One of the fun corner cases is that Windows filenames can include surrogate-pair fragments, i.e., can be invalid UTF-16 (this should worry you).
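
Going the other way, here is a decoding sketch (my own, not from any particular library) that pairs surrogates back up and flags the unpaired fragments mentioned above:

#include <cassert>
#include <cstdint>
#include <vector>

//turn UTF-16 code units back into code points:
std::vector<uint32_t> decode_utf16(std::vector<uint16_t> const &units) {
	std::vector<uint32_t> code_points;
	for (size_t i = 0; i < units.size(); ++i) {
		uint16_t u = units[i];
		if (u < 0xD800 || u > 0xDFFF) { //single unit maps directly to a code point
			code_points.push_back(u);
		} else if (u <= 0xDBFF && i + 1 < units.size()
		        && units[i+1] >= 0xDC00 && units[i+1] <= 0xDFFF) { //high surrogate followed by low surrogate
			uint32_t high = u - 0xD800;
			uint32_t low = units[i+1] - 0xDC00;
			code_points.push_back(0x10000 + (high << 10) + low);
			++i; //consumed two units
		} else {
			assert(0 && "unpaired surrogate -- invalid UTF-16 (hello, Windows filenames)");
		}
	}
	return code_points;
}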

UTF-8

UTF-8 is a byte-based encoding which takes between one and four bytes to encode each code point. The characters 0x00-0x7f (which happen to match "low ASCII", and contain the Roman alphabet) are encoded as-is, meaning that almost all ASCII documents are already UTF-8 encoded.

The UTF-8 encoding is variable-width, with the first byte giving the width in leading 1's, and the remaining bytes carrying the appropriate bits:

uint32_t cp = /* some code point */;
if (cp <= 0x7f) { //7 bits packed as 0b0vvvvvvv
	output.push(cp);
} else if (cp <= 0x7ff) { //11 bits packed as 0b110vvvvv 0b10vvvvvv
	output.push(0xC0 | (cp >> 6));
	output.push(0x80 | (cp & 0x3f));
} else if (cp <= 0xffff) { //16 bits packed as 0b1110vvvv 0b10vvvvvv 0b10vvvvvv
	output.push(0xe0 | (cp >> 12));
	output.push(0x80 | ((cp >> 6) & 0x3f));
	output.push(0x80 | (cp & 0x3f));
} else if (cp <= 0x10ffff) { //21 bits packed as 0b11110vvv 0b10vvvvvv 0b10vvvvvv 0b10vvvvvv
	output.push(0xf0 | ((cp >> 18) & 0x7));
	output.push(0x80 | ((cp >> 12) & 0x3f));
	output.push(0x80 | ((cp >> 6) & 0x3f));
	output.push(0x80 | (cp & 0x3f));
} else {
	assert(0 && "will never have a code point this big");
}

(NOTE: code adapted from http-tweak.)
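
A matching decoder sketch (again my own, and one that trusts its input -- real code should also reject overlong encodings, stray continuation bytes, and encoded surrogate values):

#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

//turn UTF-8 bytes back into code points:
std::vector<uint32_t> decode_utf8(std::string const &bytes) {
	std::vector<uint32_t> code_points;
	for (size_t i = 0; i < bytes.size(); /* advanced inside loop */) {
		uint8_t b = bytes[i];
		uint32_t cp; size_t len;
		if      ((b & 0x80) == 0x00) { cp = b;        len = 1; } //0b0vvvvvvv
		else if ((b & 0xE0) == 0xC0) { cp = b & 0x1f; len = 2; } //0b110vvvvv
		else if ((b & 0xF0) == 0xE0) { cp = b & 0x0f; len = 3; } //0b1110vvvv
		else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; len = 4; } //0b11110vvv
		else { assert(0 && "invalid leading byte"); cp = 0xFFFD; len = 1; }
		assert(i + len <= bytes.size() && "truncated sequence");
		for (size_t j = 1; j < len; ++j) { //continuation bytes carry six bits each: 0b10vvvvvv
			assert((uint8_t(bytes[i+j]) & 0xC0) == 0x80);
			cp = (cp << 6) | (uint8_t(bytes[i+j]) & 0x3f);
		}
		code_points.push_back(cp);
		i += len;
	}
	return code_points;
}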

UTF-8 is the most widely used character encoding on the internet, and is probably the one you want to use when thinking about encoding text. Notice that UTF-8 "just works" when treated as if it were ASCII. Linux and MacOS provide good UTF-8 support, as do many Windows programs -- though some Windows programs (Notepad) have historically done really silly things with UTF-8, like including a BOM (encoded as UTF-8) at the start of files.

Font Rendering Basics

Now you know how to encode character data; how do you display it? Let me grab a screenshot of the course web page to illustrate.

Data -> Glyphs

First, you need to figure out how to map your byte stream to glyphs. It turns out, however, that glyphs are not 1-1 with code points, thanks to multi-character glyphs ("ligatures") -- these arise from combinations of letters that sit so close together that it makes sense to combine them for easier readability. Common Latin ligatures include "ft", "fi", and "ff".

We can find a ligature in the screenshot!

Idea: ignore ligatures.

Idea: use a tree or other lookup structure (you can even build UTF-8 decoding into it, though you probably shouldn't) -- see the sketch below.
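
For instance, here is a sketch of the substitution step as a greedy longest-match against a small, made-up ligature table (a real table would come from the font's substitution data, and a real implementation would work on code points rather than chars):

#include <cstdint>
#include <map>
#include <string>
#include <vector>

typedef uint32_t GlyphID;

//hypothetical ligature table; the glyph ids are made up for illustration:
std::map<std::string, GlyphID> const ligatures = {
	{"ffi", 1001}, {"ff", 1002}, {"fi", 1003}, {"ft", 1004},
};

//placeholder single-character lookup; a real version would consult the font's character map:
GlyphID glyph_for_char(char c) { return GlyphID(c); }

std::vector<GlyphID> to_glyphs(std::string const &text) {
	std::vector<GlyphID> glyphs;
	for (size_t i = 0; i < text.size(); /* advanced inside loop */) {
		size_t best_len = 0; GlyphID best = 0;
		for (auto const &[seq, glyph] : ligatures) { //greedy: take the longest ligature starting here
			if (seq.size() > best_len && text.compare(i, seq.size(), seq) == 0) {
				best_len = seq.size(); best = glyph;
			}
		}
		if (best_len) { glyphs.push_back(best); i += best_len; }
		else { glyphs.push_back(glyph_for_char(text[i])); i += 1; }
	}
	return glyphs;
}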

Glyphs -> Positions

Once you have a list of glyphs, you need to figure out how to position them. This involves two tasks: spacing and line-breaking.

Spacing is also known as "kerning" -- some letters just look better when closer together or further apart. Now that you know the term kerning, you are officially typography nerds.

Generally, spacing and kerning information is stored in big tables in the font file.
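
A sketch of the spacing step, with made-up tables standing in for what you'd read out of the font file:

#include <cstdint>
#include <map>
#include <utility>
#include <vector>

//hypothetical metrics (in pixels, say):
std::map<uint32_t, float> advance;                      //glyph -> how far to move the pen
std::map<std::pair<uint32_t, uint32_t>, float> kerning; //(left glyph, right glyph) -> extra adjustment

//assign an x position to every glyph on a single line:
std::vector<float> position_glyphs(std::vector<uint32_t> const &glyphs) {
	std::vector<float> xs;
	float pen = 0.0f;
	for (size_t i = 0; i < glyphs.size(); ++i) {
		if (i > 0) {
			auto k = kerning.find(std::make_pair(glyphs[i-1], glyphs[i]));
			if (k != kerning.end()) pen += k->second; //nudge this particular pair closer or further apart
		}
		xs.push_back(pen);
		pen += advance[glyphs[i]]; //move to where the next glyph starts
	}
	return xs;
}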

Side note: for scripts with more complicated shaping than the Latin alphabet, the logic isn't necessarily in the font file, and you need a text shaping library (e.g., HarfBuzz) to do the Data -> Glyphs + Positions translation.

Line breaking can be arbitrarily complex. One of the reasons TeX is still a great text formatting engine is that it has really, really good line breaking. The simple algorithm "if the line is too long, move the last word to the next line" doesn't always work well, but maybe just stick with that for now and hope you don't need something better. (There are fun dynamic programming solutions, with remarkably large state spaces, depending on whether you want to deal with aligned margins, hyphenation, and rivers.)
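
A sketch of that simple greedy approach, assuming you already know the width of each word and of a space (in the same units as max_width):

#include <string>
#include <vector>

//greedy line breaking: keep adding words to the current line until the next one would not fit.
std::vector<std::vector<std::string>> break_lines(
		std::vector<std::string> const &words,
		std::vector<float> const &word_widths,
		float space_width, float max_width) {
	std::vector<std::vector<std::string>> lines(1);
	float used = 0.0f;
	for (size_t i = 0; i < words.size(); ++i) {
		float needed = (lines.back().empty() ? word_widths[i] : space_width + word_widths[i]);
		if (!lines.back().empty() && used + needed > max_width) { //word doesn't fit: start a new line
			lines.emplace_back();
			used = 0.0f;
			needed = word_widths[i];
		}
		lines.back().push_back(words[i]);
		used += needed;
	}
	return lines;
}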

Glyphs + Positions -> Pixels

Finally, there's the task of actually rendering glyphs on the screen. Glyphs are generally stored as vector outlines.

Side note: this makes fonts ~3x as large as if they were based on strokes. If I recall properly, one cell phone manufacturer went so far as to develop stroke-based fonts for the CJK code pages in order to save a startling number of megabytes.

Side note 2: what about color? Traditionally, color isn't supported by font formats, but the prevalence of colorful emoji has dramatically complicated things. (Especially because Apple [color bitmap], Google [a different color bitmap], Microsoft [layered vectors], and Adobe [SVG] all proposed different ways of storing them, all of which were added to OpenType.) (See Wikipedia.)

First cut: simple inclusion testing. (Possible implementation: outlines -> triangles, call OpenGL to render.) Makes text unreadable at small sizes.

Second cut: coverage-based AA. Makes text blurry. (Possible approximate implementation: render to a texture, use bilinear sampling + a mipmap. More on this later.)

Third cut: hinting. Nudge letters to lie on pixel boundaries. Sharpens things up, but can interact with kerning and perceived font weight. This used to be such a black art that the ".ttf" format includes support for embedding general-purpose bytecode in fonts, which is run to hint glyphs, so that hinting algorithms could be developed by font authors.

Fourth cut: subpixel AA. Use the fact that RGB parts of pixels are disjoint to give better precision. (You could do this, in general, in a video game if you really wanted; but might mess with color fidelity.)

Font Rendering In Games

Texture Font. Store pictures of all of the glyphs you need at reasonably high resolution (alpha-only), along with a kerning table, and maybe a table of ligatures. Render them on quads. Problem: fuzzy text up close. One solution: signed-distance-field text.
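
A sketch of the quad-emission step, with a made-up atlas-entry structure (pen positions come from the spacing step described earlier):

#include <vector>

//hypothetical per-glyph atlas data: where the glyph lives in the texture and how big its quad is:
struct AtlasGlyph {
	float u0, v0, u1, v1;     //texture coordinates of the glyph's rectangle in the atlas
	float width, height;      //size of the quad, in the same units as glyph positions
	float offset_x, offset_y; //offset of the quad's corner from the glyph's pen position
};

struct Vertex { float x, y, u, v; };

//append two triangles (six vertices) for one glyph; (x, y) is the glyph's pen position:
void emit_glyph_quad(std::vector<Vertex> &out, AtlasGlyph const &g, float x, float y) {
	float x0 = x + g.offset_x, y0 = y + g.offset_y;
	float x1 = x0 + g.width,   y1 = y0 + g.height;
	out.insert(out.end(), {
		{x0, y0, g.u0, g.v0}, {x1, y0, g.u1, g.v0}, {x1, y1, g.u1, g.v1},
		{x0, y0, g.u0, g.v0}, {x1, y1, g.u1, g.v1}, {x0, y1, g.u0, g.v1},
	});
}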

Vector Font. Good for up close, bad if you get far away.

Use FreeType. It's a solid library. You can also use it as part of your asset pipeline to pre-render font meshes (see: Rainbow).
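
For example, a minimal FreeType sketch that rasterizes one glyph to an alpha (coverage) bitmap -- error handling mostly omitted, and "font.ttf" is a placeholder path:

#include <ft2build.h>
#include FT_FREETYPE_H

int main() {
	FT_Library library;
	if (FT_Init_FreeType(&library) != 0) return 1;

	FT_Face face;
	if (FT_New_Face(library, "font.ttf", 0, &face) != 0) return 1; //placeholder font path

	FT_Set_Pixel_Sizes(face, 0, 32); //rasterize glyphs at a nominal height of 32 pixels

	if (FT_Load_Char(face, 'A', FT_LOAD_RENDER) != 0) return 1; //load + rasterize the glyph for 'A'

	FT_Bitmap const &bitmap = face->glyph->bitmap; //8-bit coverage values, one byte per pixel
	(void)bitmap; //bitmap.buffer holds bitmap.rows rows of bitmap.pitch bytes -- upload to an alpha texture

	FT_Done_Face(face);
	FT_Done_FreeType(library);
	return 0;
}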