HarfBuzz's level 2 cluster behavior uses a significantly
different model than that of level 0 and level 1.
The level 2 behavior is easy to describe, but it may be
difficult to understand in practical terms. In brief, level 2
performs no merging of clusters whatsoever.
This means that there is no initial base-and-mark merging step
(as is done in level 0), and it means that reordering moves and
ligature substitutions do not trigger a cluster merge.
Only one shaping operation directly affects clusters when using
level 2:
When glyphs do form a ligature (or when some other feature
substitutes multiple glyphs with one glyph) the cluster value
of the first glyph is retained as the cluster value for the
resulting ligature.
This occurrence sounds similar to a cluster merge, but it is
different. In particular, no subsequent characters —
including marks and modifiers — are affected. They retain
their previous cluster values.
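The rule above can be sketched with a small toy model. This is only an illustration, not HarfBuzz code: the `(name, cluster)` tuples and the `ligate_level2` helper are hypothetical names invented for this example, and no HarfBuzz API is called.

```python
# Toy model only: glyphs are (name, cluster) tuples; this is not HarfBuzz code.

def ligate_level2(glyphs, component_indices, ligature_name):
    """Replace the glyphs at component_indices with one ligature glyph.

    Under the level 2 rule, the ligature keeps the cluster value of its
    first component; every other glyph keeps the cluster it already had.
    """
    first = component_indices[0]
    dropped = set(component_indices[1:])
    result = []
    for i, (name, cluster) in enumerate(glyphs):
        if i == first:
            result.append((ligature_name, cluster))
        elif i not in dropped:
            result.append((name, cluster))
    return result


glyphs = [("A", 0), ("B", 1), ("acute", 2), ("C", 3)]
shaped = ligate_level2(glyphs, [0, 1], "AB")
print(shaped)
```

In this sketch the ligature takes cluster 0 from its first component, while the following mark and base keep clusters 2 and 3. Cluster value 1 simply disappears from the output.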
Level 2 cluster behavior is ultimately less complex than level 0
or level 1, but there are several cases for which processing
cluster values produced at level 2 may be tricky.
Ligatures with combining marks in level 2
The first example of how HarfBuzz's level 2 cluster behavior
can be tricky is when the text to be shaped includes combining
marks attached to ligatures.
Let us start with an input sequence with the following
characters (top row) and initial cluster values (bottom row):
A,acute,B,breve,C,circumflex
0,1    ,2,3    ,4,5
If the sequence A,B,C forms a ligature, then these are the
cluster values HarfBuzz will return under the various cluster
levels:
Level 0:
ABC,acute,breve,circumflex
0  ,0    ,0    ,0
Level 1:
ABC,acute,breve,circumflex
0  ,0    ,0    ,5
Level 2:
ABC,acute,breve,circumflex
0  ,1    ,3    ,5
Making sense of the level 2 result is the hardest for a client
program, because there is nothing in the cluster values that
indicates that B and C formed a ligature with A.
In contrast, the "merged" cluster values of the mark glyphs
that are seen in the level 0 and level 1 output are evidence
that a ligature substitution took place.
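The three results above can be reproduced with a simplified model. The sketch below is an assumption-laden illustration rather than HarfBuzz's actual implementation: the `MARKS` set and the `merge_range` and `shape` helpers are hypothetical, and the merge rule is chosen only so that it matches the tables in this section.

```python
# Toy model of the three cluster levels for one ligature substitution.
# Nothing here calls HarfBuzz; all names are invented for illustration.

MARKS = {"acute", "breve", "circumflex"}


def merge_range(clusters, start, end):
    """Give clusters[start..end] their minimum value.

    The range first grows to the right to include glyphs that already
    share the cluster of the last glyph in the range.
    """
    while end + 1 < len(clusters) and clusters[end + 1] == clusters[end]:
        end += 1
    lowest = min(clusters[start:end + 1])
    for i in range(start, end + 1):
        clusters[i] = lowest


def shape(names, level, components, ligature_name):
    clusters = list(range(len(names)))
    if level == 0:
        # Level 0 only: each mark first joins the cluster of the glyph before it.
        for i, name in enumerate(names):
            if name in MARKS and i > 0:
                clusters[i] = clusters[i - 1]
    first, last = components[0], components[-1]
    if level in (0, 1):
        # Levels 0 and 1: forming the ligature merges everything from the
        # first component to the last component into one cluster.
        merge_range(clusters, first, last)
    # Level 2 performs no merge at all. At every level the ligature glyph
    # takes the (possibly merged) cluster of its first component.
    before = [(names[i], clusters[i]) for i in range(first)]
    after = [(names[i], clusters[i])
             for i in range(first + 1, len(names)) if i not in components]
    return before + [(ligature_name, clusters[first])] + after


names = ["A", "acute", "B", "breve", "C", "circumflex"]
results = {level: shape(names, level, [0, 2, 4], "ABC") for level in (0, 1, 2)}
for level, glyphs in results.items():
    print(level, glyphs)
```

In this model only the level 2 output leaves the mark clusters at 1, 3, and 5, which is why nothing in those values tells a client that the three bases were combined.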
Another example of how HarfBuzz's level 2 cluster behavior
can be tricky is when glyphs reorder. Consider an input sequence
with the following characters (top row) and initial cluster
values (bottom row):
A,B,C,D,E
0,1,2,3,4
Now imagine D moves before B in a reordering operation. The
cluster values will then be:
A,D,B,C,E
0,3,1,2,4
Next, if D forms a ligature with B, the output is:
A,DB,C,E
0,3 ,2,4
However, in a different scenario, in which the shaping rules
of the script instead caused A and B to form a ligature
before the D reordered, the result would be:
AB,D,C,E
0 ,3,2,4
There is no way for a client program to differentiate between
these two scenarios based on the cluster values
alone. Consequently, client programs that use level 2 might
need to undertake additional work in order to manage cursor
positioning, text attributes, or other desired features.
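The ambiguity between the two scenarios can be checked with a toy model. The sketch below does not call HarfBuzz: `move_before` and `ligate` are hypothetical helpers, and the second scenario assumes that D ends up just before C once A and B have already been combined.

```python
# Toy model of level 2: glyphs are (name, cluster) tuples; no HarfBuzz calls.

def move_before(glyphs, name, target):
    """Reorder: move glyph `name` so it sits just before glyph `target`."""
    moved = next(g for g in glyphs if g[0] == name)
    rest = [g for g in glyphs if g[0] != name]
    at = next(i for i, g in enumerate(rest) if g[0] == target)
    return rest[:at] + [moved] + rest[at:]


def ligate(glyphs, first, second):
    """Level 2 ligature of two adjacent glyphs: keep the first one's cluster."""
    i = next(k for k, g in enumerate(glyphs) if g[0] == first)
    assert glyphs[i + 1][0] == second, "components must be adjacent"
    return glyphs[:i] + [(first + second, glyphs[i][1])] + glyphs[i + 2:]


start = [("A", 0), ("B", 1), ("C", 2), ("D", 3), ("E", 4)]

# Scenario 1: D moves before B, then D and B form a ligature.
one = ligate(move_before(start, "D", "B"), "D", "B")

# Scenario 2: A and B form a ligature, then D moves before C (assumed position).
two = move_before(ligate(start, "A", "B"), "D", "C")

clusters_one = [cluster for _, cluster in one]
clusters_two = [cluster for _, cluster in two]
print(one)
print(two)
print(clusters_one == clusters_two)
```

The glyph names differ between the two runs, but the cluster sequences are the same, so a client that looks only at cluster values cannot tell which scenario occurred.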
Other considerations in level 2
Ligatures may cause other problems under level 2, for example
when the direction of the text is forced to the opposite of its
natural direction (such as Arabic text that is forced into
left-to-right directionality). Generally speaking, however,
these scenarios are minor corner cases that are too obscure for
most client programs to need to worry about.