
Audio as a data-type

The practical problem of how audio data should be managed has been addressed in order to deal with the following issues:

The work done at DEC CRL on the AudioFile [LPT+93] project is particularly significant in using audio resources effectively. AudioFile, which uses the same conceptual model as the X Window System, provides a client/server model for accessing audio devices on a variety of platforms. Applications such as answering machines can be built easily on top of AudioFile, which is publicly available from FTP://crl.dec.com/pub/DEC/AF. The speech skimmer project at the MIT Media Lab allows a listener to interactively skim recorded speech and listen to it at several levels of detail. See [Aro92c][Aro91a][ASea88][ABLS89][SA89][Aro93a][Aro91b][SASH93][Aro92a][Aro92b][Aro93b] for work carried out in the Speech Group at the MIT Media Lab on manipulating digitized speech and audio.

CSOUND, a music synthesis system developed at MIT by Barry Vercoe, can be used for real-time synthesis of complex audio. Researchers at NASA Ames have developed the convolvotron [WWK91][WF90], a system for real-time synthesis of directional audio. The convolvotron is computationally intensive, but the computing power now available on desktop machines has led to scaled-down versions of this technology, such as QSOUND for the Apple and Intel-based platforms.
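
To give a feel for why directional audio synthesis is computationally intensive, the following is a minimal sketch of the general idea: a mono signal is convolved with a separate impulse response for each ear. It is not the convolvotron's implementation. The function names and the two short impulse responses are hypothetical and chosen only for illustration; a real system would use measured head-related impulse responses with far more taps.

```python
def convolve(signal, impulse_response):
    """Direct-form discrete convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def spatialize(mono, left_ir, right_ir):
    """Return (left, right) channels for a mono source."""
    return convolve(mono, left_ir), convolve(mono, right_ir)


# Hypothetical impulse responses for a source on the listener's left:
# the right ear receives the sound two samples later and at half amplitude.
LEFT_IR = [1.0, 0.0, 0.0]
RIGHT_IR = [0.0, 0.0, 0.5]

left, right = spatialize([1.0, 0.5, 0.25], LEFT_IR, RIGHT_IR)
print(left)
print(right)
```

Each output sample costs one multiply-add per impulse-response tap for each ear, so the work grows with both the sample rate and the length of the responses.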


TV Raman
Thu Mar 9 20:10:41 EST 1995