

Published: 2013-2-4

2.2 Codec
The main characteristics of Speex can be summarized as follows:
• Free software/open-source, patent and royalty-free
• Integration of narrowband and wideband using an embedded bit-stream
• Wide range of bit-rates available (from 2.15 kbps to 44 kbps)
• Dynamic bit-rate switching (AMR) and Variable Bit-Rate (VBR) operation
• Voice Activity Detection (VAD, integrated with VBR) and discontinuous transmission (DTX)
• Variable complexity
• Embedded wideband structure (scalable sampling rate)
• Ultra-wideband sampling rate at 32 kHz
• Intensity stereo encoding option
• Fixed-point implementation
2.3 Preprocessor
This part refers to the preprocessor module introduced in the 1.1.x branch. The preprocessor is designed to be used on the
audio before running the encoder. The preprocessor provides three main functionalities:
• noise suppression
• automatic gain control (AGC)
• voice activity detection (VAD)
2 Codec description
Figure 2.1: Acoustic echo model
The denoiser can be used to reduce the amount of background noise present in the input signal. This provides higher quality
speech whether or not the denoised signal is encoded with Speex (or at all). However, when using the denoised signal with the
codec, there is an additional benefit. Speech codecs in general (Speex included) tend to perform poorly on noisy input and
tend to amplify the noise. The denoiser greatly reduces this effect.
Automatic gain control (AGC) is a feature that deals with the fact that the recording volume may vary by a large amount
between different setups. The AGC provides a way to adjust a signal to a reference volume. This is useful for voice over
IP because it removes the need for manual adjustment of the microphone gain. A secondary advantage is that by setting the
microphone gain to a conservative (low) level, it is easier to avoid clipping.
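The idea of adjusting a signal to a reference volume can be sketched in a few lines. The following is a minimal illustration only, not the algorithm used by the Speex preprocessor: the class name SimpleAGC, the target level, the gain limit and the smoothing factor are all assumptions chosen for the example, and frames are assumed to hold 16-bit sample values.

```python
import math


def rms(frame):
    """Root-mean-square level of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


class SimpleAGC:
    """Hypothetical frame-based AGC: steer the RMS level toward a target."""

    def __init__(self, target_rms=8000.0, max_gain=20.0, smoothing=0.9):
        self.target_rms = target_rms  # assumed reference volume
        self.max_gain = max_gain      # cap so silence is not boosted without limit
        self.smoothing = smoothing    # closer to 1.0 means slower gain changes
        self.gain = 1.0

    def process(self, frame):
        level = rms(frame)
        if level > 1.0:  # leave the gain untouched on near-silent frames
            desired = min(self.target_rms / level, self.max_gain)
            # Move the gain gradually to avoid audible jumps between frames.
            self.gain = (self.smoothing * self.gain
                         + (1.0 - self.smoothing) * desired)
        # Apply the gain and clamp to the 16-bit range to avoid wrap-around.
        return [max(-32768, min(32767, int(round(s * self.gain))))
                for s in frame]
```

Because the gain is smoothed, a quiet input takes a number of frames to reach the reference level. A conservative microphone gain leaves headroom, so the clamp at the end should rarely be hit.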
The voice activity detector (VAD) provided by the preprocessor is more advanced than the one directly provided in the codec.
2.4 Adaptive Jitter Buffer
When transmitting voice (or any content for that matter) over UDP or RTP, packets may be lost, arrive with different delays,
or even out of order. The purpose of a jitter buffer is to reorder packets and buffer them long enough (but no longer than
necessary) so they can be sent to be decoded.
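The reordering and buffering described above can be sketched as follows. This is a simplified, fixed-depth buffer, not the adaptive one shipped with Speex: the class name JitterBuffer, the depth parameter and the use of integer sequence numbers are assumptions made for the example.

```python
class JitterBuffer:
    """Hypothetical fixed-depth jitter buffer keyed by sequence number."""

    def __init__(self, depth=3):
        self.depth = depth      # packets to accumulate before playback starts
        self.packets = {}       # sequence number -> payload
        self.next_seq = None    # next sequence number due for decoding
        self.buffering = True

    def put(self, seq, payload):
        """Store an incoming packet; drop it if its slot has already played."""
        if self.next_seq is not None and seq < self.next_seq:
            return
        self.packets[seq] = payload

    def get(self):
        """Return the next payload in order, or None if it is not available.

        None is returned both while the buffer is still filling and when the
        expected packet was lost or late; a decoder would conceal the gap.
        """
        if self.buffering:
            if len(self.packets) < self.depth:
                return None
            self.buffering = False
            self.next_seq = min(self.packets)
        seq = self.next_seq
        self.next_seq += 1
        return self.packets.pop(seq, None)
```

A larger depth tolerates more delay variation at the cost of added latency. An adaptive buffer adjusts that trade-off at run time instead of fixing it up front.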
2.5 Acoustic Echo Canceller
In any hands-free communication system (Fig. 2.1), speech from the remote end is played in the local loudspeaker, propagates
in the room and is captured by the microphone. If the audio captured from the microphone is sent directly to the remote end,
then the remote user hears an echo of their own voice. An acoustic echo canceller is designed to remove the acoustic echo before it
is sent to the remote end. It is important to understand that the echo canceller is meant to improve the quality on the remote end.
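To make the principle concrete, here is a minimal sketch of echo cancellation with a time-domain NLMS (normalized least mean squares) adaptive filter. NLMS is used here only because it is short to write down; it is a stand-in for illustration and should not be read as the algorithm implemented in Speex. The function name and the parameters taps, mu and eps are assumptions for the example.

```python
def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-6):
    """Subtract an adaptive estimate of the echo from the microphone signal.

    far_end: samples played through the loudspeaker
    mic:     samples captured by the microphone (echo plus any local speech)
    Returns the residual signal that would be sent to the remote end.
    """
    weights = [0.0] * taps   # running estimate of the room's echo path
    history = [0.0] * taps   # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate
        # Normalising by the input energy keeps the step size stable.
        norm = sum(h * h for h in history) + eps
        step = mu * e / norm
        weights = [w + step * h for w, h in zip(weights, history)]
        residual.append(e)
    return residual
```

In this sketch the filter only models echo paths shorter than taps samples. A real room needs a much longer filter, which is one reason practical echo cancellers use more efficient block or frequency-domain structures.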
2.6 Resampler
In some cases, it may be useful to convert audio from one sampling rate to another. There are many reasons for that. It can
be for mixing streams that have different sampling rates, for supporting sampling rates that the soundcard doesn’t support, for
transcoding, etc. That’s why there is now a resampler that is part of the Speex project. This resampler can be used to convert
between any two arbitrary rates (the only requirement is that the ratio be a rational number) and there is control over the quality/complexity tradeoff.
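The rational-ratio requirement can be illustrated with a short sketch. The code below reduces the two rates to a ratio in lowest terms and resamples by linear interpolation. Linear interpolation is chosen only for brevity and gives much lower quality than a proper resampling filter; it is not the method used by the Speex resampler. The function names are assumptions for the example.

```python
from math import gcd


def rational_ratio(in_rate, out_rate):
    """Return (num, den) such that out_rate / in_rate == num / den, reduced."""
    g = gcd(in_rate, out_rate)
    return out_rate // g, in_rate // g


def resample_linear(samples, in_rate, out_rate):
    """Resample by linear interpolation, stepping with exact integer math."""
    if not samples:
        return []
    num, den = rational_ratio(in_rate, out_rate)
    # Last output sample must not lie beyond the last input sample.
    n_out = ((len(samples) - 1) * num) // den + 1
    out = []
    for k in range(n_out):
        # Output sample k sits at input position k * den / num.
        idx, rem = divmod(k * den, num)
        if rem == 0:
            out.append(float(samples[idx]))
        else:
            frac = rem / num
            out.append(samples[idx] * (1.0 - frac) + samples[idx + 1] * frac)
    return out
```

For example, converting from 44100 Hz to 48000 Hz reduces to the ratio 160/147, so every 147 input samples produce 160 output samples.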


