Man page - mdbflt(5)
NAME
mdbFLT - Font Layout Table
DESCRIPTION
For simple scripts, the rendering engine converts character codes into glyph codes one by one by consulting the encoding of each selected font. However, to render text that requires complicated layout (e.g. Thai and Indic scripts), one-to-one conversion is not sufficient. A sequence of characters may have to be drawn as a single ligature, and some glyphs may have to be drawn at positions shifted in two dimensions.
To handle those complicated scripts, the m17n library uses Font Layout Tables (FLTs for short). The FLT driver interprets an FLT and converts a character sequence into a glyph sequence that is ready to be passed to the rendering engine.
An FLT can contain information to extract a grapheme cluster from a character sequence and to reorder the characters in the cluster, in addition to information found in OpenType Layout Tables (CMAP, GSUB, and GPOS).
An FLT is a cascade of one or more conversion stages. In each stage, a sequence is converted into another sequence to be read in the next stage. The length of sequences may differ from stage to stage. Each element in a sequence has the following integer attributes.
• code
    In the first conversion stage, this is the character code in the original character sequence. In the last stage, it is the glyph code passed to the rendering engine. In other cases, it is an intermediate glyph code.
• category
    The category code defined in the CATEGORY-TABLE of the current stage, or defined in one of the earlier stages and not overwritten by later stages.
• combining-spec
    If nonzero, it specifies how to combine this (intermediate) glyph with the previous one.
• left-padding-flag
    If nonzero, it instructs the rendering function to insert a padding space before this (intermediate) glyph so that the glyph does not overlap with the previous one.
• right-padding-flag
    If nonzero, it instructs the rendering function to insert a padding space after this (intermediate) glyph so that the glyph does not overlap with the next one.
When the layout engine draws text, it first determines a font and an FLT for each character in the text. For each subsequence of characters that use the same font and FLT, the layout engine generates a corresponding intermediate glyph sequence. The code attribute of each element in the intermediate glyph sequence is its character code, and all other attributes are zero. This sequence is processed in the first stage of the FLT as the current run (substring).
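The element attributes and the initial sequence described above could be modelled roughly as follows. This is a minimal Python sketch for illustration only, not the m17n library's C data structures; the names `Glyph` and `initial_run` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    """One element of an (intermediate) glyph sequence. All attributes are integers."""
    code: int
    category: int = 0
    combining_spec: int = 0
    left_padding_flag: int = 0
    right_padding_flag: int = 0


def initial_run(text):
    """Build the sequence handed to the first stage.

    The code attribute is the character code; every other attribute is zero.
    """
    return [Glyph(code=ord(ch)) for ch in text]


# Two Thai characters: KO KAI (U+0E01) followed by SARA I (U+0E34).
run = initial_run("\u0e01\u0e34")
```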
Each stage works as follows.
First, if the stage has a CATEGORY-TABLE, the category of each glyph in the current run is updated. If there is a glyph that has no category, the current run ends before that glyph.
Then, the default values of code-offset, combining-spec, and left-padding-flag of this stage are initialized to zero.
Next, the initial conversion rule of the stage is applied to the current run.
Lastly, the current run is replaced with the newly produced (intermediate) glyph sequence.
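The first step of a stage, updating categories and cutting the run at the first glyph without a category, can be sketched as below. This is a simplified illustration that assumes a first stage (so no categories are inherited from earlier stages) and represents the category table as a plain dictionary; `update_categories` and the sample category values are hypothetical.

```python
def update_categories(codes, category_table):
    """Return (categories, run_length) for a run of glyph codes.

    The run ends before the first glyph that has no category.
    """
    categories = []
    for code in codes:
        category = category_table.get(code)
        if category is None:
            break  # the current run ends before this glyph
        categories.append(category)
    return categories, len(categories)


# Illustrative table: one consonant and one vowel sign.
table = {0x0E01: ord("C"), 0x0E34: ord("V")}
categories, run_length = update_categories([0x0E01, 0x0E34, 0x41, 0x0E01], table)
```

Here the third code (0x41) has no category, so only the first two glyphs form the current run.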
SYNTAX and SEMANTICS
The m17n library loads an FLT from the m17n database using the tag <font, layouter, FLT-NAME>. The data format of an FLT is as follows:
    FONT-LAYOUT-TABLE ::= FLT-DECLARATION ? STAGE0 STAGE *

    FLT-DECLARATION ::= '(' 'font' 'layouter' FLT-NAME nil PROP * ')'
    FLT-NAME ::= SYMBOL
    PROP ::= VERSION | FONT
    VERSION ::= '(' 'version' MTEXT ')'
    FONT ::= '(' 'font' FONT-SPEC ')'
    FONT-SPEC ::=
         '(' [[ FOUNDRY FAMILY
               [ WEIGHT [ STYLE [ STRETCH [ ADSTYLE ]]]]]
             REGISTRY ]
             [ OTF-SPEC ] [ LANG-SPEC ] ')'

    STAGE0 ::= CATEGORY-TABLE GENERATOR

    STAGE ::= CATEGORY-TABLE ? GENERATOR

    CATEGORY-TABLE ::= '(' 'category' CATEGORY-SPEC + ')'

    CATEGORY-SPEC ::= '(' CODE CATEGORY ')'
                      | '(' CODE CODE CATEGORY ')'

    CODE ::= INTEGER

    CATEGORY ::= INTEGER
In the definition of CATEGORY-SPEC, CODE is a glyph code, and CATEGORY is the ASCII code of an uppercase or lowercase letter, i.e. one of 'A', ... 'Z', 'a', ... 'z'.
The first form of CATEGORY-SPEC assigns CATEGORY to a glyph whose code is CODE.
The second form assigns CATEGORY to glyphs whose code falls between the two CODEs.
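The two forms of CATEGORY-SPEC can be illustrated by expanding them into a lookup table. This is a sketch under assumptions: the range in the second form is treated as inclusive at both ends (the text only says "between"), the specs are given as Python tuples rather than parsed from an FLT file, and `build_category_table` and the sample codes are hypothetical.

```python
def build_category_table(specs):
    """Expand CATEGORY-SPECs into a dict mapping glyph code to category.

    Each spec is (CODE, CATEGORY) or (CODE, CODE, CATEGORY).
    Later specs overwrite earlier ones for the same code.
    """
    table = {}
    for spec in specs:
        if len(spec) == 2:
            first, category = spec
            last = first
        elif len(spec) == 3:
            first, last, category = spec
        else:
            raise ValueError(f"malformed CATEGORY-SPEC: {spec!r}")
        # Assumption: both end points of the range are included.
        for code in range(first, last + 1):
            table[code] = category
    return table


table = build_category_table([
    (0x0E01, 0x0E2E, ord("C")),  # second form: a range of codes
    (0x0E31, ord("V")),          # first form: a single code
])
```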
    GENERATOR ::= '(' 'generator' RULE MACRO-DEF * ')'

    RULE ::= REGEXP-BLOCK | MATCH-BLOCK | SUBST-BLOCK | COND-BLOCK
             | FONT-FACILITY-BLOCK | DIRECT-CODE | COMBINING-SPEC | OTF-SPEC
             | PREDEFINED-RULE | MACRO-NAME

    MACRO-DEF ::= '(' MACRO-NAME RULE + ')'
Each RULE specifies glyphs to be consumed and glyphs to be produced. When glyphs are consumed, they are taken away from the current run. A rule may fail under some conditions. Unless a rule is explicitly described as failing, it should be regarded as succeeding.
DIRECT-CODE ::= INTEGER
This rule consumes no glyph and produces a glyph which has the following attributes:
• code: INTEGER plus the default code-offset
• combining-spec: default value
• left-padding-flag: default value
• right-padding-flag: zero
After producing the glyph, the default code-offset, combining-spec, and left-padding-flag are all reset to zero.
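The behavior of DIRECT-CODE with respect to the stage defaults can be sketched as follows. This is an illustrative Python model, not the library's implementation; `StageDefaults` and `direct_code` are hypothetical names, and the glyph is represented as a plain dictionary.

```python
class StageDefaults:
    """Per-stage default values, all initialized to zero."""

    def __init__(self):
        self.code_offset = 0
        self.combining_spec = 0
        self.left_padding_flag = 0


def direct_code(integer, defaults):
    """Produce one glyph from DIRECT-CODE, then reset the defaults to zero."""
    glyph = {
        "code": integer + defaults.code_offset,
        "combining_spec": defaults.combining_spec,
        "left_padding_flag": defaults.left_padding_flag,
        "right_padding_flag": 0,
    }
    defaults.code_offset = 0
    defaults.combining_spec = 0
    defaults.left_padding_flag = 0
    return glyph


defaults = StageDefaults()
defaults.code_offset = 3        # e.g. set earlier by a range SOURCE-PATTERN
defaults.left_padding_flag = 1  # e.g. set earlier by the '[' rule
first = direct_code(0x100, defaults)
second = direct_code(0x100, defaults)  # the defaults were reset by the first call
```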
    PREDEFINED-RULE ::= '=' | '*' | '<' | '>' | '|' | '[' | ']'

They perform actions as follows.
• =
    This rule consumes the first glyph in the current run and produces the same glyph. It fails if the current run is empty.
• *
    This rule repeatedly executes the previous rule. If the previous rule fails, this rule does nothing and fails.
• <
    This rule specifies the start of a grapheme cluster.
• >
    This rule specifies the end of a grapheme cluster.
• [
    This rule sets the default left-padding-flag to 1. No glyph is consumed. No glyph is produced.
• ]
    This rule changes the right-padding-flag of the most recently generated glyph to 1. No glyph is consumed. No glyph is produced.
• |
    This rule consumes no glyph and produces a special glyph whose category is ' ' (a space) and whose other attributes are zero. This is the only rule that produces that special glyph.
    REGEXP-BLOCK ::= '(' REGEXP RULE * ')'

    REGEXP ::= MTEXT

MTEXT is a regular expression that should match the sequence of categories of the current run. If a match is found, this rule executes RULEs, temporarily limiting the current run to the matched part. The matched part is consumed by this rule.
Parenthesized subexpressions, if any, are recorded to be used in a MATCH-BLOCK that may appear in one of the RULEs.
If no match is found, this rule fails.
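The matching step of a REGEXP-BLOCK can be illustrated by turning the category codes into a string and applying a regular expression to it. This sketch makes two assumptions that are not stated in the text: it uses Python's `re` syntax rather than whatever regular expression library m17n uses, and it requires the match to start at the beginning of the current run. The function name `regexp_block` is hypothetical.

```python
import re


def regexp_block(pattern, categories):
    """Match a pattern against the categories of the current run.

    Returns (matched_length, spans) on success, where spans[i] is the
    (start, end) of subexpression i and spans[0] is the whole match.
    Returns None when the rule fails.
    """
    category_string = "".join(chr(c) for c in categories)
    # Assumption: the match must begin at the start of the current run.
    match = re.match(pattern, category_string)
    if match is None:
        return None
    spans = [match.span(i) for i in range(match.re.groups + 1)]
    return match.end(), spans


categories = [ord(c) for c in "CVTC"]  # illustrative category letters
result = regexp_block(r"(C)(V?)(T?)", categories)
```

The recorded spans are what a later MATCH-BLOCK would refer to by index.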
    MATCH-BLOCK ::= '(' MATCH-INDEX RULE * ')'

    MATCH-INDEX ::= INTEGER

MATCH-INDEX is an integer specifying a parenthesized subexpression recorded by the previous REGEXP-BLOCK. If such a subexpression was found by the previous regular expression matching, this rule executes RULEs, temporarily limiting the current run to the matched part of the subexpression. The matched part is consumed by this rule.
If no match was found, this rule fails.
If this is the first rule of the stage, MATCH-INDEX must be 0, and it matches the whole current run.
    SUBST-BLOCK ::= '(' SOURCE-PATTERN RULE * ')'

    SOURCE-PATTERN ::= '(' CODE + ')'
                       | '(' 'range' CODE CODE ')'

If the sequence of codes of the current run matches SOURCE-PATTERN, this rule executes RULEs, temporarily limiting the current run to the matched part. The matched part is consumed.
The first form of SOURCE-PATTERN specifies a sequence of glyph codes to be matched. In this case, this rule resets the default code-offset to zero.
The second form specifies a range of codes that should match the first glyph code of the code sequence. In this case, this rule sets the default code-offset to the first glyph code minus the first CODE specifying the range.
If no match is found, this rule fails.
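The code-offset arithmetic of the range form can be worked through with a small example. This sketch assumes the range is inclusive at both ends; `subst_range_offset` and all numeric values are hypothetical and chosen only for illustration.

```python
def subst_range_offset(first_glyph_code, range_from, range_to):
    """Return the default code-offset set by a matching range pattern.

    Returns None when the first glyph code is outside the range,
    i.e. when the rule fails.
    """
    # Assumption: both end points of the range are included.
    if not (range_from <= first_glyph_code <= range_to):
        return None
    return first_glyph_code - range_from


# A glyph with code 0x0E10 matched against the range 0x0E01..0x0E2E.
offset = subst_range_offset(0x0E10, 0x0E01, 0x0E2E)

# A following DIRECT-CODE of 0xF700 would then produce this code,
# because DIRECT-CODE adds the default code-offset.
produced_code = 0xF700 + offset
```

This is how a single rule can map a whole contiguous range of input codes onto a contiguous range of output codes.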
    FONT-FACILITY-BLOCK ::= '(' FONT-FACILITY RULE * ')'

    FONT-FACILITY ::= '(' 'font-facility' CODE * ')'
                      | '(' 'font-facility' FONT-SPEC ')'

If the current font has glyphs for CODEs or matches FONT-SPEC, this rule succeeds and RULEs are executed. Otherwise, this rule fails.
    COND-BLOCK ::= '(' 'cond' RULE + ')'

This rule sequentially executes RULEs until one succeeds. If no rule succeeds, this rule fails. Otherwise, it succeeds.
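The control flow of a COND-BLOCK is a first-success search, which can be sketched as follows. Rules are modelled here as Python callables that return True on success; this representation, and the names `cond_block` and `make_rule`, are assumptions made for illustration.

```python
def cond_block(rules, run):
    """Execute rules in order until one succeeds.

    Returns True if some rule succeeded, False if all of them failed.
    """
    for rule in rules:
        if rule(run):
            return True
    return False


tried = []


def make_rule(name, succeeds):
    """Build a dummy rule that records its name and reports a fixed result."""
    def rule(run):
        tried.append(name)
        return succeeds
    return rule


rules = [make_rule("a", False), make_rule("b", True), make_rule("c", True)]
result = cond_block(rules, run=[])
```

Rule "c" is never executed because "b" already succeeded.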
OTF-SPEC ::= SYMBOL
OTF-SPEC is a symbol whose name specifies an instruction to the OTF driver. The name has the following syntax.

    OTF-SPEC-NAME ::= ':otf=' SCRIPT LANGSYS ? GSUB-FEATURES ? GPOS-FEATURES ?

    SCRIPT ::= SYMBOL

    LANGSYS ::= '/' SYMBOL

    GSUB-FEATURES ::= '=' FEATURE-LIST ?

    GPOS-FEATURES ::= '+' FEATURE-LIST ?

    FEATURE-LIST ::= ( SYMBOL ',' ) * [ SYMBOL | '*' ]
Each SYMBOL specifies a tag name defined in the OpenType specification.
For SCRIPT, SYMBOL specifies a Script tag name (e.g. deva for Devanagari).
For LANGSYS, SYMBOL specifies a Language System tag name. If LANGSYS is omitted, the Default Language System table is used.
For GSUB-FEATURES, each SYMBOL in FEATURE-LIST specifies a GSUB Feature tag name to apply. '*' is allowed as the last item to specify all remaining features. If SYMBOL is preceded by '~' and the last item is '*', SYMBOL is excluded from the features to apply. If no SYMBOL is specified, no GSUB feature is applied. If GSUB-FEATURES itself is omitted, all GSUB features are applied.
When OTF-SPEC appears in a FONT-SPEC, FEATURE-LIST specifies features that the font must have (or must not have if preceded by '~'), and a trailing '*', if present, has no meaning.
The specification of GPOS-FEATURES is analogous to that of GSUB-FEATURES.
Please note that all the tags above must be 4 ASCII printable characters.
See the following page for the OpenType specification.
http://www.microsoft.com/typography/otspec/default.htm
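The OTF-SPEC-NAME syntax above can be illustrated with a small parser. This is a sketch written from the grammar alone, not code taken from the m17n library; the function names are hypothetical, and the parser does not validate that each tag is 4 characters long.

```python
import re

# ':otf=' SCRIPT [ '/' LANGSYS ] [ '=' gsub-list ] [ '+' gpos-list ]
_OTF_SPEC_RE = re.compile(
    r"^:otf=(?P<script>[^/=+]+)"
    r"(?:/(?P<langsys>[^=+]+))?"
    r"(?:=(?P<gsub>[^+]*))?"
    r"(?:\+(?P<gpos>.*))?$"
)


def _feature_list(text):
    """None: the part was omitted. []: the part is present but lists no feature."""
    if text is None:
        return None
    if text == "":
        return []
    return text.split(",")


def parse_otf_spec(name):
    """Split an OTF-SPEC symbol name into its four components."""
    match = _OTF_SPEC_RE.match(name)
    if match is None:
        raise ValueError(f"not an OTF-SPEC name: {name!r}")
    return {
        "script": match.group("script"),
        "langsys": match.group("langsys"),
        "gsub": _feature_list(match.group("gsub")),
        "gpos": _feature_list(match.group("gpos")),
    }


all_features = parse_otf_spec(":otf=deva")
selected = parse_otf_spec(":otf=thai=ccmp,~liga,*+mark")
no_features = parse_otf_spec(":otf=arab=+")
```

In the result, a value of None for "gsub" corresponds to the "all GSUB features are applied" case, and an empty list corresponds to "no GSUB feature is applied".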
COMBINING ::= SYMBOL
COMBINING is a symbol whose name specifies how to combine the next glyph with the previous one. This rule sets the default combining-spec to an integer code that is unique to the symbol name. The name has the following syntax.

    COMBINING-NAME ::= VPOS HPOS OFFSET VPOS HPOS

    VPOS ::= 't' | 'c' | 'b' | 'B'

    HPOS ::= 'l' | 'c' | 'r'

    OFFSET ::= '.' | XOFF | YOFF XOFF ?

    XOFF ::= ('<' | '>') INTEGER ?

    YOFF ::= ('+' | '-') INTEGER ?
VPOS and HPOS specify the vertical and horizontal positions as described below.

                                    POINT VPOS HPOS
                                    ----- ---- ----
        0----1----2 <---- top       0     t    l
        |         |                 1     t    c
        |         |                 2     t    r
        |         |                 3     B    l
        9   10   11 <---- center    4     B    c
        |         |                 5     B    r
      --3----4----5-- <-- baseline  6     b    l
        |         |                 7     b    c
        6----7----8 <---- bottom    8     b    r
                                    9     c    l
        |    |    |                10     c    c
      left center right            11     c    r

The left figure shows 12 reference points of a glyph by numbers 0 to 11. The rectangle 0-6-8-2 is the bounding box of the glyph, the positions 3, 4, and 5 are on the baseline, 9-11 are on the vertical center of the box, 0-2 and 6-8 are on the top and on the bottom respectively. 1, 10, 4, and 7 are on the horizontal center of the box.
The right table shows how those reference points are specified by a pair of VPOS and HPOS.
The first VPOS and HPOS in the definition of COMBINING-NAME specify the reference point of the previous glyph, and the second VPOS and HPOS specify that of the next glyph. The next glyph is drawn so that these two reference points align.
OFFSET specifies the alignment in detail. If it is '.', the two reference points are at the same position.
XOFF specifies how much the X position of the reference point of the next glyph should be shifted to the left ('<') or right ('>') from the previous reference point.
YOFF specifies how much the Y position of the reference point of the next glyph should be shifted upward ('+') or downward ('-') from the previous reference point.
In both cases, INTEGER is the amount of shift expressed as a percentage of the font size, i.e., if INTEGER is 10, it means 10% (1/10) of the font size. If INTEGER is omitted, 5 is assumed.
Once the next glyph is combined with the previous one, they are treated as a single combined glyph.
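The COMBINING-NAME syntax and the reference point numbering can be illustrated with a small parser. This sketch is derived from the grammar and the table above only; it does not reproduce the integer combining-spec code that the library computes, and the names `reference_point` and `parse_combining_name` are hypothetical. Shifts are returned as signed percentages, with right and upward taken as positive (a convention chosen for this example).

```python
import re

_VPOS = "tBbc"  # row order of the reference point table: top, Baseline, bottom, center
_HPOS = "lcr"   # left, center, right

_NAME_RE = re.compile(
    r"^([tcbB])([lcr])"                  # reference point of the previous glyph
    r"(\.|[<>]\d*|[+-]\d*(?:[<>]\d*)?)"  # OFFSET ::= '.' | XOFF | YOFF XOFF ?
    r"([tcbB])([lcr])$"                  # reference point of the next glyph
)
_SHIFT_RE = re.compile(r"([+\-<>])(\d*)")


def reference_point(vpos, hpos):
    """Map a VPOS/HPOS pair to its point number (0 to 11)."""
    return _VPOS.index(vpos) * 3 + _HPOS.index(hpos)


def parse_combining_name(name):
    """Return (previous_point, next_point, x_shift, y_shift)."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a COMBINING-NAME: {name!r}")
    v1, h1, offset, v2, h2 = match.groups()
    x_shift = y_shift = 0
    for sign, digits in _SHIFT_RE.findall(offset):
        amount = int(digits) if digits else 5  # omitted INTEGER means 5
        if sign == "+":
            y_shift = amount
        elif sign == "-":
            y_shift = -amount
        elif sign == ">":
            x_shift = amount
        else:  # "<"
            x_shift = -amount
    return reference_point(v1, h1), reference_point(v2, h2), x_shift, y_shift


stacked = parse_combining_name("tc.bc")     # bottom center of next glyph on top center of previous
shifted = parse_combining_name("tr-<20tl")  # 5% down, 20% to the left
```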
MACRO-NAME ::= SYMBOL
MACRO-NAME is a symbol that appears in one of the MACRO-DEFs. It is expanded to the sequence of the corresponding RULEs.
CONTEXT DEPENDENT BEHAVIOR
So far, it has been assumed that each sequence, which is drawn with a specific font, is context free, i.e. not affected by the glyphs preceding or following that sequence. This is true when sequence S1 is drawn with font F1 while the preceding sequence S0 unconditionally requires font F0.

    sequence                S0      S1
    currently used font     F0      F1
    usable font(s)          F0      F1

Sometimes, however, a clear separation of sequences is not possible. Suppose that the preceding sequence S0 can be drawn not only with F0 but also with F1.

    sequence                S0      S1
    currently used font     F0      F1
    usable font(s)          F0,F1   F1

In this case, glyphs used to draw the preceding S0 may affect glyph generation of S1. Therefore it is necessary to access information about S0, which has already been processed, when processing S1. Generation rules in the first stage (and only in the first stage) accept a special regular expression to access already processed parts.

    "RE0 RE1"

RE0 and RE1 are regular expressions that match the preceding sequence S0 and the following sequence S1, respectively.
Pay attention to the space between the two regular expressions. It represents the special category ' ' (see the description of the '|' rule above). Note that the regular expression above belongs to glyph generation rules using font F1, so not only RE1 but also RE0 must be expressed with the categories for F1. This means that when the preceding sequence S0 cannot be expressed with the categories for F1 (as in the first example above), generation rules having these patterns never match.
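The role of the space in an "RE0 RE1" pattern can be illustrated on plain strings. This sketch assumes that the categories of S0 and S1 are joined by the special category ' ' before matching, which is a simplification of what the FLT driver does internally; the category letters and the helper name `context_category_string` are hypothetical, and Python's `re` stands in for the library's regular expression engine.

```python
import re

SEPARATOR = " "  # the special category produced by the '|' rule


def context_category_string(preceding, following):
    """Join the categories of S0 and S1 with the special category between them."""
    return preceding + SEPARATOR + following


# "RE0 RE1": RE0 is (CV), RE1 is (T), both written with the categories for F1.
pattern = re.compile(r"(CV) (T)")

with_context = pattern.match(context_category_string("CV", "TC"))
without_separator = pattern.match("CVTC")
```

The second match fails because the pattern requires the separator category between the two parts.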
SEE ALSO
mdbGeneral(5), FLTs provided by the m17n database
COPYRIGHT
Copyright (C) 2001 Information-technology Promotion Agency (IPA)
Copyright (C) 2001-2011 National Institute of Advanced Industrial Science and Technology (AIST)
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License <http://www.gnu.org/licenses/fdl.html>.