Eyes-Free Internet Access
T. V. Raman | IBM Research | raman@cs.cornell.edu
Screen Contents Spoken Aloud.
User explores visual display to:
- Construct a mental model of the interface.
- Interpret the intent of the UI.
Aural output is derived from the visual display.
- Encapsulate display in an off-screen model.
- Present this model aurally.
- Enable navigation of this model.
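A minimal Python sketch of the screen-reading approach listed above, under assumed names: `OffScreenModel`, `speak`, and the sample screen contents are hypothetical and are not taken from any real screen reader. The visual display is captured as rows of text, the user moves through that copy line by line, and each line is rendered as a spoken string.

```python
class OffScreenModel:
    """Hypothetical off-screen copy of a character display."""

    def __init__(self, rows):
        self.rows = list(rows)   # one string per screen line
        self.cursor = 0          # line the user is currently reviewing

    def current(self):
        return self.rows[self.cursor]

    def next_line(self):
        # Move down one line, clamping at the bottom of the screen.
        self.cursor = min(self.cursor + 1, len(self.rows) - 1)
        return self.current()

    def previous_line(self):
        # Move up one line, clamping at the top of the screen.
        self.cursor = max(self.cursor - 1, 0)
        return self.current()


def speak(text):
    """Stand-in for a synthesizer: returns the string that would be spoken."""
    return " ".join(text.split()) or "blank"


screen = OffScreenModel(["File  Edit  Help", "", "Hello, world"])
spoken = [
    speak(screen.current()),
    speak(screen.next_line()),
    speak(screen.next_line()),
]
```

In this sketch the model holds only characters and positions. Nothing in it records that the first row is a menu bar, so the spoken output cannot say so.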
Aural feedback lacks application context.
Aural feedback from reading the screen:
- Degrades as visual interface gets richer.
- Misses information implicit in visual layout.
All computing applications:
1. Obtain user input.
2. Compute on the information.
3. Display the results.
Steps 1 and 3 constitute the user interface.
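The separation between computation and user interface can be sketched as follows. This is an illustrative Python example, not code from Emacspeak: the function names and the sample prices are assumptions. The point is that the spoken rendering is produced from the computed result itself rather than from the visual rendering.

```python
def compute_total(prices_in_cents):
    """Computation step: no user interface involved."""
    return sum(prices_in_cents)


def visual_output(total_in_cents):
    """Display step, visual: a column-aligned figure for the screen."""
    return f"{'TOTAL':<10}{total_in_cents / 100:>8.2f}"


def spoken_output(total_in_cents):
    """Display step, spoken: a sentence built from the same result."""
    dollars, cents = divmod(total_in_cents, 100)
    return f"The total is {dollars} dollars and {cents} cents."


total = compute_total([1999, 501, 1050])
on_screen = visual_output(total)
aloud = spoken_output(total)
```

Both renderings consume `total` directly, so either one can change without affecting the other or the computation.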
- Treat speech as a first class medium.
- Application produces its own feedback.
- Exploit features of the spoken medium.
Audio output is independent of the visual display.
- Produce intuitive feedback.
- Provide a simpler user model.
- Reduce the user's cognitive load.
User works with one application, not two.
Easy to perceive relevant information.
January 2000
Sun | Mon | Tue | Wed | Thu | Fri | Sat
    |     |     |     |     |     |   1
  2 |   3 |   4 |   5 |   6 |   7 |   8
  9 |  10 |  11 |  12 |  13 |  14 |  15
 16 |  17 |  18 |  19 |  20 |  21 |  22
 23 |  24 |  25 |  26 |  27 |  28 |  29
 30 |  31 |     |     |     |     |
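The calendar illustrates the difference between reading the screen and having the application speak for itself. The Python sketch below is a hedged illustration with hypothetical function names: a screen reader placed on a calendar cell has only the digits in that cell, whereas a speech-aware calendar knows the date and can say it in full.

```python
from datetime import date

DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")
MONTH_NAMES = ("January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November",
               "December")


def screen_reader_output(year, month, day):
    """Reading the screen: only the characters in the cell are available."""
    return str(day)


def speech_enabled_output(year, month, day):
    """Application-produced feedback: the full date, derived from the data."""
    d = date(year, month, day)
    # date.weekday() numbers Monday as 0, matching DAY_NAMES above.
    return f"{DAY_NAMES[d.weekday()]}, {MONTH_NAMES[month - 1]} {day}, {year}"


from_screen = screen_reader_output(2000, 1, 1)
from_application = speech_enabled_output(2000, 1, 1)
```

The weekday is implicit in the column position on screen. The application can state it outright because it holds the date, not just the rendered digit.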
- Speech server abstracts the device interface.
- Core modules provide speech services.
- Application extensions provide rich spoken feedback.
Does not modify the Emacs code base.
Speech server provides core speech services.
- Clients connect to the speech server.
- Server provides speech services:
  - Speak text.
  - Set speech parameters.
  - Stop, pause, or resume speech.
- Clients are protected from device dependencies.
Speech servers are currently implemented in Tcl.
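The services listed above can be sketched in Python as a small device-independent interface. This is an assumption-laden illustration, not the Emacspeak server protocol: `SpeechServer`, `LoggingDevice`, the method names, and the default rate are all hypothetical. The client calls only the server, and the device driver can be swapped without touching client code.

```python
class LoggingDevice:
    """Stand-in for a synthesizer driver; records what it is asked to do."""

    def __init__(self):
        self.log = []

    def render(self, text, rate):
        self.log.append((text, rate))

    def silence(self):
        self.log.append(("<silence>", None))


class SpeechServer:
    """Hypothetical speech server exposing device-independent services."""

    def __init__(self, device):
        self.device = device
        self.rate = 180      # arbitrary default speech rate
        self.pending = []    # text queued while paused
        self.paused = False

    def set_rate(self, rate):
        self.rate = rate

    def speak(self, text):
        if self.paused:
            self.pending.append(text)
        else:
            self.device.render(text, self.rate)

    def pause(self):
        self.paused = True

    def resume(self):
        self.paused = False
        for text in self.pending:
            self.device.render(text, self.rate)
        self.pending.clear()

    def stop(self):
        # Discard anything queued and silence the device.
        self.pending.clear()
        self.device.silence()


device = LoggingDevice()
server = SpeechServer(device)
server.speak("hello")
server.set_rate(220)
server.pause()
server.speak("queued")
server.resume()
server.stop()
```

Replacing `LoggingDevice` with a driver for a different synthesizer would leave every call on `server` unchanged, which is the protection from device dependencies that the list describes.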
Core speech services provided by the Emacspeak platform:
- Speak a region of text.
- Configure context-specific pronunciation and prosody.
- Annotate text to produce audio formatted output.
- Enhance auditory output with auditory icons.
Emacspeak core encourages code re-use throughout Emacspeak.
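As a rough Python illustration of two of these core services, the sketch below applies a context-specific pronunciation table to a region of text and prefixes the result with an auditory icon. The function names, the event tuples, the icon name, and the pronunciation table are assumptions made for the example and do not reproduce Emacspeak's own Lisp interfaces.

```python
import re


def apply_pronunciations(text, pronunciations):
    """Replace whole words using a context-specific pronunciation table."""
    def substitute(match):
        word = match.group(0)
        return pronunciations.get(word, word)
    return re.sub(r"[A-Za-z0-9_]+", substitute, text)


def speak_region(lines, start, end, pronunciations, icon=None):
    """Return the audio events for lines[start:end]; nothing is played."""
    events = []
    if icon is not None:
        events.append(("icon", icon))
    for line in lines[start:end]:
        events.append(("speech", apply_pronunciations(line, pronunciations)))
    return events


# Hypothetical table for a mail-reading context.
MAIL_PRONUNCIATIONS = {"BBDB": "b b d b", "VM": "v m"}

events = speak_region(
    ["Open VM now", "BBDB has 3 records", "not in region"],
    0, 2, MAIL_PRONUNCIATIONS, icon="open-object",
)
```

Because every extension would call the same `speak_region` helper, pronunciation rules and icons are defined once and reused, which mirrors the code re-use point above.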
Lisp advice facility:
- Extend code functionality without modifying the original source.
- Advice types: before, after, and around.
- Advice fragments enhance and modify the original behavior.
Speech-enables Emacs without modifying the code base.
(defadvice next-line (after emacspeak pre act)
  "Speak line that you just moved to."
  ;; Speak only when the user invoked the command, not when Lisp code called it.
  (when (interactive-p)
    (emacspeak-speak-line)))
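For readers unfamiliar with Lisp advice, the same idea can be approximated in Python. This is an analogy only, not how Emacs implements advice: `Editor`, `add_after_advice`, and the `spoken` list are hypothetical. The original `next_line` method is left untouched in its source, and the spoken feedback is attached from outside and runs after it.

```python
import functools


class Editor:
    """Stand-in for an existing application that knows nothing about speech."""

    def __init__(self, lines):
        self.lines = lines
        self.row = 0

    def next_line(self):
        self.row = min(self.row + 1, len(self.lines) - 1)


spoken = []  # records what would be sent to the speech server


def add_after_advice(cls, method_name, advice):
    """Wrap cls.method_name so that advice(self) runs after the original."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def advised(self, *args, **kwargs):
        result = original(self, *args, **kwargs)
        advice(self)
        return result

    setattr(cls, method_name, advised)


# Speech-enable next_line without editing the Editor class definition.
add_after_advice(Editor, "next_line",
                 lambda editor: spoken.append(editor.lines[editor.row]))

editor = Editor(["first line", "second line"])
editor.next_line()
```

The wrapper plays the role of an `after` advice fragment: the original movement happens first, and the line that was moved to is then spoken.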
- Latest version: over 40,000 lines.
  - Core: 7,000 lines.
  - Speech-enables over 80 Emacs packages.
- Speech-enables all of Emacs 20.4.
- Speech-enables popular non-bundled extensions like VM, BBDB, and W3.
Speech-enabling extensions are a fraction of the size of the application being speech-enabled.
- Succinct contextual speech feedback.
- Auditory icons augment interaction.
User focuses on task at hand.
Making speech interaction a first-class citizen on Linux
- Continue the UNIX tradition of keeping the UI separable from the underlying computation engine.
- Exploit the modular architecture of Gnome and KDE.
- Introduce a speech services layer for both input and output.
Make speech-enabling Gnome and KDE clients a breeze.
Integrate speech services into the ORBs used by Gnome and KDE.
Standardized speech services to provide:
- Customizable speech synthesis
- Customizable auditory displays
- Context-sensitive speech input
Speech-enabling Linux is crucial for the embedded appliance space.
Emacspeak would not be possible in the closed source world.