Developing a written notation has little in common with developing its audio counterpart. Even so, the evolution of written notation shows that any notational system combines conventions with an intuitive use of the dimensions provided by the perceptual modality and by the means available to produce output for that modality.
We use this insight to develop a concise audio notation for spoken mathematics that exploits the available audio dimensions. It is conceivable that the number of audio dimensions will increase with the improvement in the relevant technology, enabling more sophisticated notational systems in the future.
We characterize all of written mathematical notation as follows:
The visual cues used to project the tree structure are independent of the cues used to produce the attributes. Hence, attributes may themselves contain arbitrarily complex tree structures. Thus, conventional mathematical notation uses a consistent set of visual layout primitives to construct complex displays.
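The independence of tree structure and attributes can be illustrated with a small data structure. This is only a minimal sketch: the class name `MathNode`, its fields, and the attribute key `"superscript"` are assumptions made for illustration and are not taken from AsTeR.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MathNode:
    """One node of an expression tree (hypothetical structure, for illustration)."""
    content: str  # operator or atom, e.g. "+" or "x"
    children: List["MathNode"] = field(default_factory=list)
    # Each attribute (e.g. a subscript or superscript) holds a full subtree.
    attributes: Dict[str, "MathNode"] = field(default_factory=dict)


def depth(node: MathNode) -> int:
    """Depth of the tree, following both children and attribute subtrees."""
    below = list(node.children) + list(node.attributes.values())
    return 1 + max((depth(n) for n in below), default=0)


# x^(n+1): the superscript attribute of x is itself a tree.
n_plus_1 = MathNode("+", children=[MathNode("n"), MathNode("1")])
x_power = MathNode("x", attributes={"superscript": n_plus_1})
```

Because `attributes` maps to nodes of the same type as `children`, an attribute can nest to any depth, which mirrors the claim that attributes may themselves contain arbitrarily complex tree structures.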
Written notation provides the ability to render mathematical objects without understanding their meaning. The underlying structure can be recreated by a reader familiar with the subject matter at hand and the notational system in use. Internalizing and browsing this structure is helped by the use of different types of visual delimiters, which help the author mark off "interesting" subtrees within an expression.
In contrast, plain spoken renderings of mathematical expressions are completely linear, thereby losing much of this expressive power. Spoken descriptions of complex mathematics (as found in talking books) compensate for this loss by using extra-textual phrases, which makes the renderings verbose.
To overcome these problems, we develop an equivalent audio notation. The first step is to identify dimensions in the audio space to parallel the functionality of the dimensions in the visual setting. The second step is to augment these audio dimensions with the use of pauses, intonational cues such as voice inflection, and descriptive phrases.
AsTeR implements this notational system by using fleeting and persistent cues, especially by exploiting the computer's ability to vary the characteristics of a synthetic voice. Renderings produced are therefore much more concise.
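To make the idea of varying voice characteristics concrete, here is a minimal sketch that turns an expression into a sequence of speech events, each carrying a pitch step relative to the base voice. Everything in it is an assumption for illustration: the tuple encoding of expressions, the `audio_events` function, and the choice of raising pitch for superscripts and lowering it for subscripts. It does not reproduce how AsTeR itself is implemented.

```python
from typing import List, Tuple, Union

# Assumed encoding for this sketch: an atom is a str;
# ("sup", base, script) and ("sub", base, script) attach a script to a base.
Expr = Union[str, Tuple[str, "Expr", "Expr"]]
# A speech event: (text to speak, pitch step relative to the base voice).
Event = Tuple[str, int]

# Hypothetical mapping from attribute kind to a change in pitch.
PITCH_STEP = {"sup": 1, "sub": -1}


def audio_events(expr: Expr, pitch: int = 0) -> List[Event]:
    """Flatten an expression into speech events, encoding structure as pitch."""
    if isinstance(expr, str):
        return [(expr, pitch)]
    kind, base, script = expr
    # The base keeps the current pitch; the script is spoken at a shifted pitch.
    return audio_events(base, pitch) + audio_events(script, pitch + PITCH_STEP[kind])


events = audio_events(("sup", "x", ("sub", "n", "i")))
```

In this sketch no extra words are spoken to announce the superscript or the subscript; the structure is carried entirely by the pitch step attached to each event, which is what keeps the rendering concise.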
Our audio notation minimizes the verbiage in math renderings. Concise renderings serve to convey the concepts involved succinctly, leaving the listener time to reason about the expression. More descriptive renderings (with explanatory phrases to cue structure) can be used when listening to unfamiliar material. Thus, there is a wide range of possible renderings of a math expression varying between fully descriptive and completely notational. The choice of how much to rely on the audio notation, and how descriptive renderings should be, is entirely subjective.
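The two ends of this range can be sketched for a single fraction. The `render` function, its `descriptive` flag, the tuple encoding, and the spoken phrases are all hypothetical and chosen only for illustration.

```python
from typing import Tuple, Union

# Assumed encoding for this sketch: an atom is a str;
# ("frac", numerator, denominator) and ("+", left, right) are compound forms.
Expr = Union[str, Tuple[str, "Expr", "Expr"]]


def render(expr: Expr, descriptive: bool) -> str:
    """Return the words to speak, in a descriptive or a notational style."""
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    l, r = render(left, descriptive), render(right, descriptive)
    if op == "+":
        return f"{l} plus {r}"
    if op == "frac":
        if descriptive:
            # Explanatory phrases cue the structure explicitly.
            return f"the fraction with numerator {l} and denominator {r}, end fraction"
        # Notational style: only the essential words are spoken.
        return f"{l} over {r}"
    raise ValueError(f"unknown operator: {op!r}")


expr = ("frac", ("+", "a", "b"), "c")
verbose = render(expr, descriptive=True)
concise = render(expr, descriptive=False)
```

Here `verbose` is "the fraction with numerator a plus b and denominator c, end fraction" and `concise` is "a plus b over c". As plain words the concise form is ambiguous about where the numerator ends; a notational rendering would rely on audio cues such as pauses or voice changes, rather than extra phrases, to convey that grouping.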
Here are the features we require of our audio notation for mathematics: