The technique that written mathematical notation uses to cue tree structure is insufficient for audio renderings. Using a wide array of delimiters works in writing because the eye can quickly traverse a formula and pair off matching delimiters. The situation is different in audio: merely announcing the delimiters as they appear is not enough, because a listener hearing a delimited expression has to remember the enclosing delimiters. This insight was gained from work in the summer of 1991, when we implemented a prototype audio formatter for mathematical expressions. Fleeting sound cues (with the pitch conveying nesting level) were used to ``display'' mathematical delimiters, but deeply nested expressions were difficult to understand.
AsTeR enables a listener to keep track of the nesting level by using a persistent speech cue, achieved by moving along the audio dimension dim-children when rendering the contents of a delimited expression. In combination with fleeting cues that signal the enclosing delimiters, this permits a listener to better comprehend deeply nested expressions, because the ``nesting level information'' is implicitly cued by the currently active voice (a persistent cue) used to render the parenthesized expression.
To give some intuition, we can think of different visual delimiters as introducing different ``functional colors'' at different subtrees of the expression. Using different AFL states to render the various subtrees introduces an equivalent ``audio coloring''. The structure imposed on the audio space by the AFL operators enables us to pick ``audio colors'' that introduce relative changes. This notion of relative change is vital in effectively conveying nested structures.
Mathematical expressions are spoken as infix or prefix depending on the operator and the currently active rendering style. Large operators such as [tex2html_wrap5698], as well as mathematical functions like [tex2html_wrap5700], are rendered as prefix. All other expressions are rendered as infix. A persistent speech cue indicates the nesting level: the AFL state is varied along audio dimension dim-children before rendering the children of an operator. The number of new states is minimized: the complexity of math objects and the precedence of mathematical operators determine whether a new state is used (see s:post_processing for details on the complexity measure used). Thus, while new AFL states are used when rendering the numerator and denominator of [tex2html_wrap5702], no new AFL state is introduced when rendering [tex2html_wrap5704]. Similarly, when rendering [tex2html_wrap5706], no new AFL state is used to speak [tex2html_wrap5708], but when rendering [tex2html_wrap5710], a new AFL state is used to render the argument to [tex2html_wrap5712].
In the context of rendering sub-expressions, introducing a new AFL state can be thought of as the audio counterpart of parenthesizing in the visual context. Seen this way, minimizing the number of AFL states corresponds to avoiding unnecessary parentheses in written notation. Thus, we write [tex2html_wrap5714], rather than [tex2html_wrap5716], but we use parentheses to write [tex2html_wrap5718]. Analogously, it is not necessary to introduce a new state for speaking the fraction when rendering [tex2html_wrap5720], whereas a new rendering state is introduced to speak the numerator and denominator of [tex2html_wrap5722].
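The parenthesization analogy can be illustrated with a small sketch. It is not AsTeR's implementation: the precedence table, the `Op` class, and the functions `needs_new_state` and `render` are hypothetical names introduced here, and the complexity measure mentioned above is left out entirely. Under those assumptions, a child is moved to a new nesting level only where written notation would need parentheses.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical precedence table (higher binds tighter); not taken from AsTeR.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2, "^": 3}
RIGHT_ASSOCIATIVE = {"^"}


@dataclass
class Op:
    """A binary operator node; leaves are plain strings."""
    symbol: str
    left: "Expr"
    right: "Expr"


Expr = Union[str, Op]


def needs_new_state(child: Expr, parent_symbol: str, is_right: bool) -> bool:
    """True where the child would need parentheses in written notation."""
    if isinstance(child, str):
        return False
    child_prec = PRECEDENCE[child.symbol]
    parent_prec = PRECEDENCE[parent_symbol]
    if child_prec != parent_prec:
        return child_prec < parent_prec
    # Equal precedence: group only on the side that does not associate.
    # This is conservative for "+" and "*", where a + (b + c) gets a new level.
    return is_right != (parent_symbol in RIGHT_ASSOCIATIVE)


def render(expr: Expr, level: int = 0) -> list:
    """Return (level, token) pairs; level stands in for steps along dim-children."""
    if isinstance(expr, str):
        return [(level, expr)]
    out = []
    for is_right, child in ((False, expr.left), (True, expr.right)):
        nested = needs_new_state(child, expr.symbol, is_right)
        if is_right:
            out.append((level, expr.symbol))  # operator is spoken in the parent state
        out.extend(render(child, level + 1 if nested else level))
    return out
```

With this sketch, `render(Op("+", "a", Op("*", "b", "c")))` keeps every token at level 0, while `render(Op("*", "a", Op("+", "b", "c")))` places `b + c` at level 1, mirroring the written forms a + b*c and a*(b + c).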
Dimension dim-children has been chosen to provide five to six unique points, so that deeply nested structures such as continued fractions can be rendered unambiguously.
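As a rough illustration of a dimension with a small number of distinguishable points, the sketch below models a state as a position along dim-children and moves it in relative steps. The class `AudioState`, the functions `move_children` and `states_for_depth`, the choice of six points, and the clamping at the last point are all assumptions made for this example; the text does not say how AFL behaves once the dimension is exhausted.

```python
from dataclasses import dataclass, replace

# Assumption: six distinguishable positions (0..5) along dim-children.
NUM_POINTS = 6


@dataclass(frozen=True)
class AudioState:
    """Hypothetical stand-in for an AFL state; only one dimension is modelled."""
    children: int = 0  # position along dim-children


def move_children(state: AudioState, steps: int = 1) -> AudioState:
    """Return a new state shifted relative to `state`.

    Assumption: positions past the last point are clamped to it.
    """
    position = min(state.children + steps, NUM_POINTS - 1)
    return replace(state, children=position)


def states_for_depth(depth: int) -> list:
    """List the states used at nesting levels 0..depth, one step per level."""
    states = [AudioState()]
    for _ in range(depth):
        states.append(move_children(states[-1]))
    return states
```

Because `AudioState` is immutable, the state of the enclosing expression is untouched when a child is rendered in a shifted state, so returning to the outer level needs no explicit undo step. Every level up to depth five gets its own position here; deeper levels share the last one under the clamping assumption.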
Consider the following example:
Here, the voice drops by one step as each level of the continued fraction is rendered. Since this effect is cumulative, the listener can perceive the deeply nested nature of the expression. The rendering rule for fractions is shown in fig:fraction-rule. Notice that this rendering rule handles simple fractions differently. When rendering fractions of the form [tex2html_wrap5726], no new AFL states are used. In addition, there is a subtle verbal cue: when rendering simple fractions, AsTeR speaks ``over'' instead of ``divided by''. This distinction seems to make the renderings more effective, and in some of the informal tests we have carried out, listeners distinguished between expressions using this cue without even being aware of it.
Figure fig:fraction-rule: Rendering rule for fractions.
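The behaviour described above can be sketched as follows. This is not the rule from the figure, which is written in AsTeR's own rule language; it is a Python approximation under stated assumptions. The classes `Frac` and `Sum`, the functions `is_simple` and `speak`, and in particular the definition of a simple fraction as one whose numerator and denominator are both atoms are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Frac:
    numerator: "Node"
    denominator: "Node"


@dataclass
class Sum:
    left: "Node"
    right: "Node"


Node = Union[str, Frac, Sum]  # leaves are plain strings


def is_simple(frac: Frac) -> bool:
    # Assumption: "simple" means both parts are atoms; AsTeR's actual
    # complexity measure is not reproduced here.
    return isinstance(frac.numerator, str) and isinstance(frac.denominator, str)


def speak(node: Node, level: int = 0) -> list:
    """Return (level, words) segments; level counts steps along dim-children."""
    if isinstance(node, str):
        return [(level, node)]
    if isinstance(node, Sum):
        return speak(node.left, level) + [(level, "plus")] + speak(node.right, level)
    if is_simple(node):
        # Simple fraction: no new state, and the verbal cue "over".
        return [(level, f"{node.numerator} over {node.denominator}")]
    # Complex fraction: numerator and denominator are spoken one step further
    # along dim-children, with "divided by" spoken in the enclosing state.
    inner = level + 1
    return (speak(node.numerator, inner)
            + [(level, "divided by")]
            + speak(node.denominator, inner))
```

For a continued fraction such as `Frac("1", Sum("a", Frac("1", Sum("b", Frac("1", "c")))))`, each enclosing fraction pushes its parts one level deeper, so the innermost `1 over c` is spoken at level 2 while remaining a plain ``over'' phrase with no further state change.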