Creator:
Date:
Abstract:
Mobile radio bandwidth can be efficiently utilized by not transmitting a talker's voice signal during pauses, which on average account for 60% of a party's time in a conversation. A speech detector (SD) is needed to identify the speech-plus-noise and noise-only intervals within the talker's audio signal. In this way, both voice and data users can share the radio frequencies on demand.
The SD should be simple, as each voice terminal must possess one. Deployed in a background of highly variable acoustic noise, the SD should be adaptive and robust. Also, the detector should produce acceptable speech quality and yet realize a bandwidth-compression gain close to the reciprocal of the intrinsic speech activity. Lastly, the detector's talkspurt/pause statistics should be such that the speech transmission is tolerant to variable delay induced by a dynamic-resource-sharing network.
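As a rough worked figure for the gain target, assume the 60% pause share quoted above and an ideal detector that adds no hangover overhead. The symbols \alpha (speech activity) and G_max (upper bound on the compression gain) are introduced here for illustration only and are not notation taken from the thesis.

```latex
\[
\alpha = 1 - 0.60 = 0.40,
\qquad
G_{\max} = \frac{1}{\alpha} = \frac{1}{0.40} = 2.5
\]
```

Any hangover or false detection raises the transmitted fraction above \alpha, so a practical detector can be expected to realize a gain somewhat below this bound.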
This thesis describes the design and development of an SD to meet the above requirements. Two signal features are used to make the binary decision: magnitude-energy and zero crossings. The SD also uses a "hangover" to admit weak utterances as speech and to produce continuous speech flow. The novel part of the SD is its use of a signal-variability measure to adapt the parameters of the discrimination devices to the background noise characteristics.
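The decision logic described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the thesis's design: the frame length, the thresholds, the hangover length, the use of a running mean absolute deviation as the variability measure, and all function and parameter names (`frame_features`, `detect_speech`, `synthetic_frames`, `k`, `rel_floor`, `alpha`, `zc_margin`) are hypothetical choices made for the example.

```python
import math


def frame_features(frame):
    """Return (mean magnitude, zero-crossing count) for one frame of samples."""
    magnitude = sum(abs(x) for x in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return magnitude, crossings


def detect_speech(frames, hangover=4, k=3.0, rel_floor=0.05, alpha=0.1, zc_margin=10):
    """Label each frame True (speech-plus-noise) or False (noise-only).

    All parameter values are illustrative assumptions, not the thesis's settings.
    The first frame is assumed to contain noise only.
    """
    noise_mag, noise_zc = frame_features(frames[0])
    noise_dev = 0.0   # running mean absolute deviation of the noise magnitude
    countdown = 0     # hangover frames still to be admitted as speech
    decisions = []
    for frame in frames:
        mag, zc = frame_features(frame)
        # Variability-driven margin, with a floor so it never collapses to zero.
        spread = max(noise_dev, rel_floor * noise_mag)
        energy_hit = mag > noise_mag + k * spread
        # Zero crossings admit weaker frames whose crossing count departs from the noise.
        zc_hit = abs(zc - noise_zc) > zc_margin and mag > noise_mag + spread
        if energy_hit or zc_hit:
            countdown = hangover
            decisions.append(True)
        elif countdown > 0:
            countdown -= 1          # hangover: keep the talkspurt open
            decisions.append(True)
        else:
            decisions.append(False)
            # Adapt the noise statistics only on frames judged noise-only.
            noise_dev += alpha * (abs(mag - noise_mag) - noise_dev)
            noise_mag += alpha * (mag - noise_mag)
            noise_zc += alpha * (zc - noise_zc)
    return decisions


def synthetic_frames(frame_len=80):
    """Ten noise frames, five 'speech' frames, fifteen noise frames (toy signal)."""
    frames = []
    for i in range(30):
        frame = []
        for j in range(frame_len):
            n = i * frame_len + j
            x = 0.01 * math.sin(2 * math.pi * 0.37 * n)      # stand-in for background noise
            if 10 <= i < 15:
                x += 0.5 * math.sin(2 * math.pi * 0.02 * n)  # stand-in for a talkspurt
            frame.append(x)
        frames.append(frame)
    return frames
```

Calling `detect_speech(synthetic_frames())` flags the five talkspurt frames plus the four hangover frames that follow them, and leaves the remaining frames as noise-only. Freezing the noise estimates during detected speech is one common way to keep the talker's own signal from inflating the thresholds; whether the thesis does the same is not stated in the abstract.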
The SD has been implemented and developed on the 2920 single-chip digital signal processor. Observations, subjective evaluation, and objective measurements show that the SD meets these requirements.