
We refine the quasi-prefix form by adding the following subtypes. This makes recognizing and handling complex mathematical content cleaner.

We first introduce object *math subformula*, which is
used to capture subexpressions appearing within the
[tex2html_wrap5306] and [tex2html_wrap5308] of (La)TeX. Object
*math subformula* can be thought of as being the math
equivalent of object *text block* described in
s:high-level-models. It has the following structure:

**Attribute:** Visual attributes.

**Content:** The mathematical content represented as a *math object*.

We need object *math subformula* to represent
expressions of the form:

[displaymath5302]

[displaymath5303]

In representing each of the above examples, object *math
subformula* is essential in capturing the expression to
which the overbrace/underbrace applies.
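
The role of this grouping can be illustrated with a minimal Python sketch. It is not the representation used by the system described here: the class names, the `attributes` dictionary, the `"overbrace"` key and the sample expression are all hypothetical, chosen only to show how a subformula lets an attribute attach to a whole subexpression rather than to a single symbol.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Ordinary:
    """A single ordinary symbol, e.g. a letter or a digit."""
    symbol: str


@dataclass
class Juxtaposition:
    """Objects written next to one another."""
    children: List["MathObject"]


@dataclass
class MathSubformula:
    """Groups a subexpression so that attributes apply to the group as a whole."""
    content: "MathObject"
    # Hypothetical attribute store: attribute name -> attached math object.
    attributes: Dict[str, "MathObject"] = field(default_factory=dict)


MathObject = Union[Ordinary, Juxtaposition, MathSubformula]


def symbols(node: MathObject) -> List[str]:
    """Return the ordinary symbols covered by a node, left to right."""
    if isinstance(node, Ordinary):
        return [node.symbol]
    if isinstance(node, Juxtaposition):
        return [s for child in node.children for s in symbols(child)]
    return symbols(node.content)  # MathSubformula: descend into its content


# Hypothetical expression: x y z under an overbrace labelled n.
braced = MathSubformula(
    content=Juxtaposition([Ordinary("x"), Ordinary("y"), Ordinary("z")]),
    attributes={"overbrace": Ordinary("n")},
)
```

Here `symbols(braced)` yields the three symbols spanned by the brace, while the label is reached through `braced.attributes["overbrace"]`.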

To enable recognition of written mathematics, tokens have to be appropriately classified. Our classification of tokens when processing written mathematics is inspired by Appendix F of The TeXbook [Knu84].

The symbols divide naturally into groups based on their mathematical class (Ord, Op, Bin, Rel, Open, Close, or Punct), [tex2html_wrap5310]

We introduce subtypes of object *math object* to
correspond to each token type:

**Ordinary:** TeX Ord. Letters, numbers and some miscellaneous symbols.

**Big operator:** TeX Op. The *large* operators that typically appear as unary operators, *e.g.,* [tex2html_wrap5312], [tex2html_wrap5314], [tex2html_wrap5316].

**Binary operator:** TeX Bin. The binary operators, *e.g.,* +, [tex2html_wrap5320].

**Relational operator:** TeX Rel, *e.g.,* <, [tex2html_wrap5324]. We subdivide the TeX Rel class into relational and arrow operators.

**Arrow operators:** Arrows such as [tex2html_wrap5326], [tex2html_wrap5328].

**Mathematical function:** Plain TeX and LaTeX define [tex2html_wrap5330] etc. as macros. We introduce an object type, *mathematical function*, to represent these.

**Open delimiter:** TeX Open, *e.g.,* [tex2html_wrap5332], [tex2html_wrap5334].

**Close delimiter:** TeX Close, *e.g.,* [tex2html_wrap5336], [tex2html_wrap5338].

**Math punctuation:** TeX Punct. Punctuation marks.
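
The subtypes listed above can be pictured as an enumeration together with a lookup table from tokens to classes. The following Python sketch is illustrative only: the enumeration, the table `DEFAULT_CLASSES` and the function `classify` are hypothetical names, and the handful of tokens shown is a small sample rather than the full predefined classification.

```python
from enum import Enum


class MathClass(Enum):
    ORDINARY = "ordinary"
    BIG_OPERATOR = "big-operator"
    BINARY_OPERATOR = "binary-operator"
    RELATIONAL_OPERATOR = "relational-operator"
    ARROW_OPERATOR = "arrow-operator"
    MATHEMATICAL_FUNCTION = "mathematical-function"
    OPEN_DELIMITER = "open-delimiter"
    CLOSE_DELIMITER = "close-delimiter"
    MATH_PUNCTUATION = "math-punctuation"


# A small sample of tokens; anything not listed falls back to ORDINARY.
DEFAULT_CLASSES = {
    r"\sum": MathClass.BIG_OPERATOR,
    r"\int": MathClass.BIG_OPERATOR,
    "+": MathClass.BINARY_OPERATOR,
    r"\times": MathClass.BINARY_OPERATOR,
    "<": MathClass.RELATIONAL_OPERATOR,
    r"\leq": MathClass.RELATIONAL_OPERATOR,
    # TeX itself treats arrows as Rel; they are split out as described above.
    r"\rightarrow": MathClass.ARROW_OPERATOR,
    r"\sin": MathClass.MATHEMATICAL_FUNCTION,
    "(": MathClass.OPEN_DELIMITER,
    ")": MathClass.CLOSE_DELIMITER,
    ",": MathClass.MATH_PUNCTUATION,
}


def classify(token: str) -> MathClass:
    """Look up the class of a token, defaulting to ORDINARY."""
    return DEFAULT_CLASSES.get(token, MathClass.ORDINARY)
```

Defaulting to *ordinary* reflects the fact that letters and numbers form the bulk of the tokens and need no individual entry.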

Written mathematical notation uses *juxtaposition* as
an infix operator. Juxtaposition, as in [tex2html_wrap5340],
mostly denotes multiplication, but can mean function
application in certain contexts, as in [tex2html_wrap5342]. We
introduce a new operator to represent juxtaposition, and to
define it precisely, we also assert that all mathematical
variables are single letters. Thus, [tex2html_wrap5344] is
represented as the juxtaposition of three *ordinary*
objects. This assertion is not specific to our internal
representation; rather, it specifies the concrete syntax used
in the electronic markup and reflects the choice made in the
design of TeX. We do allow mathematical variables made up of
more than one character, but these should be clearly marked up
as such, *e.g.,* as [tex2html_wrap5346], by using
`\mbox`, as in `$\mbox{cab}=cab$`.
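
The single-letter convention can be made concrete with a short Python sketch. It is a simplification written for illustration, not the recognizer described here: the function name `variables` and the regular expression are our own, and the sketch only extracts variable names, ignoring operators, digits and nested braces.

```python
import re
from typing import List

# Three alternatives, tried in order:
#   1. \mbox{...}        -> one multi-letter variable (captured in group 1)
#   2. any other control sequence, e.g. \sin -> skipped, not split into letters
#   3. a bare letter     -> one single-letter variable (captured in group 2)
_TOKEN = re.compile(r"\\mbox\{([^{}]*)\}|\\[A-Za-z]+|([A-Za-z])")


def variables(source: str) -> List[str]:
    """List the variables in a math string under the single-letter rule."""
    names = []
    for match in _TOKEN.finditer(source):
        boxed, letter = match.groups()
        if boxed is not None:
            names.append(boxed)
        elif letter is not None:
            names.append(letter)
    return names
```

Under these assumptions, `variables("cab")` gives three separate variables, whereas `variables(r"\mbox{cab}")` gives a single variable named `cab`.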

The classification of a math object is defined using the
following command: (`define-math-classification`
*token* *classification*)

In certain special cases, the predefined classification
shown above can be modified. A good example of this is
recognizing a mathematical text that consistently uses the
letters [tex2html_wrap5348], [tex2html_wrap5350] and
[tex2html_wrap5352] to denote functions. Using the predefined
classification, the recognizer would treat [tex2html_wrap5354]
as object *ordinary*, leading to [tex2html_wrap5356]
being represented as the juxtaposition of two objects, namely,
[tex2html_wrap5358] and [tex2html_wrap5360]. Declaring
[tex2html_wrap5362] to be a mathematical function by executing
(`define-math-classification` f
`mathematical-function-name`)

results in occurrences of [tex2html_wrap5364] being treated as a function. Hence, [tex2html_wrap5366] is correctly recognized as a function application. Note that the correct interpretation of such notation is more important for browsing than for speaking the expression.
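
The effect of such a declaration can be sketched in Python. This is a toy model under stated assumptions, not the actual recognizer: the class `MathClassifier`, its method `interpret` and the tuple results are hypothetical, and only the method name `define_math_classification` and the classification `mathematical-function-name` echo the command shown above.

```python
from typing import Dict, Tuple


class MathClassifier:
    """Toy classification table whose entries can be overridden."""

    def __init__(self) -> None:
        # Tokens without an entry are treated as ordinary.
        self._classes: Dict[str, str] = {}

    def define_math_classification(self, token: str, classification: str) -> None:
        """Record or override the classification of a token."""
        self._classes[token] = classification

    def classify(self, token: str) -> str:
        return self._classes.get(token, "ordinary")

    def interpret(self, head: str, argument: str) -> Tuple[str, str, str]:
        """Interpret `head` written directly before a parenthesized `argument`."""
        if self.classify(head) == "mathematical-function-name":
            return ("function-application", head, argument)
        return ("juxtaposition", head, argument)


recognizer = MathClassifier()
before = recognizer.interpret("f", "x")  # f is still ordinary here
recognizer.define_math_classification("f", "mathematical-function-name")
after = recognizer.interpret("f", "x")   # f is now a function name
```

In this sketch `before` is a juxtaposition and `after` is a function application, mirroring the change in interpretation described above.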


TV Raman

Thu Mar 9 20:10:41 EST 1995