Real-world text usually contains codepoints from a mixture of
different Unicode scripts (including punctuation, numbers, symbols,
white-space characters, and other codepoints that do not belong
to any script). Real-world text may also be marked up with
formatting that changes font properties (including the font,
font style, and font size).
For shaping purposes, every real-world text stream must first be
segmented into runs that have a uniform set of properties.
In particular, shaping models always assume that every codepoint
in a text run has the same direction, script tag, and language tag.
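
As a rough illustration of the segmentation step, the following Python
sketch splits a string into runs of uniform script. It is a minimal
sketch under stated assumptions, not how any particular shaping engine
does it: toy_script() is a hypothetical lookup that covers only a few
blocks by codepoint range (a real implementation would consult the
Unicode Script property data), and codepoints with no script of their
own are simply attached to the preceding run, or to the first real
script if they lead the text. Direction and language would be handled
the same way, as additional per-codepoint properties that end a run
whenever they change.

```python
def toy_script(ch):
    """Hypothetical, approximate script lookup by block range (illustration only)."""
    cp = ord(ch)
    if 0x0041 <= cp <= 0x005A or 0x0061 <= cp <= 0x007A:
        return "Latn"
    if 0x0400 <= cp <= 0x04FF:
        return "Cyrl"
    if 0x0590 <= cp <= 0x05FF:
        return "Hebr"
    if 0x0600 <= cp <= 0x06FF:
        return "Arab"
    return "Zyyy"  # punctuation, digits, white space, and everything else


def segment_by_script(text):
    """Return a list of (substring, script_tag) runs with a uniform script."""
    scripts = [toy_script(ch) for ch in text]

    # Simplified heuristic: a codepoint without its own script inherits
    # the script of the nearest preceding codepoint that has one.
    last = None
    for i, s in enumerate(scripts):
        if s == "Zyyy":
            if last is not None:
                scripts[i] = last
        else:
            last = s

    # Leading script-less codepoints take the first real script, if any.
    first = next((s for s in scripts if s != "Zyyy"), "Zyyy")
    scripts = [first if s == "Zyyy" else s for s in scripts]

    # Cut a new run wherever the resolved script changes.
    runs = []
    start = 0
    for i in range(1, len(text) + 1):
        if i == len(text) or scripts[i] != scripts[start]:
            runs.append((text[start:i], scripts[start]))
            start = i
    return runs


if __name__ == "__main__":
    for run_text, script in segment_by_script("Hello, мир!"):
        print(repr(run_text), script)
```

Each resulting run can then be handed to the shaper on its own, with
its script tag set alongside the run's direction and language.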