amr

ADAPTIVE MULTIRATE SPEECH CODING

Presented By: Irfan Yaqoob

RADIO PARAMETERS

FRAME STRUCTURE

SOURCE CODING

Reduces the amount of information that has to be sent over the given channel. This improves the spectral efficiency of the radio interface, so it is related to network capacity. The bits saved allow the system to introduce more powerful channel encoding techniques to counter propagation and interference effects.

CHANNEL CODING

Conditions the output of the source encoder for transmission over the channel. The transcoded speech is error protected by passing it through the channel encoder. Related to network quality. Includes coding for forward error correction and detection, bit interleaving, and modulation.

SPEECH ENCODING & MODULATION

Digitized speech enters the speech coder at 64 kbps, and the coder compresses the 64 kbps PCM speech to a 13 kbps data rate. The transcoded speech is then error protected by the channel encoder, which uses parity bits and a convolutional code, increasing the bit rate to 22.8 kbps (for GSM-FR) or 11.4 kbps (for GSM-HR). For GSM-FR this gives 456 bits per 20 ms frame, which are interleaved to combat burst errors. Each 456-bit block is split into eight 57-bit sub-blocks that are transmitted over eight consecutive bursts, so each 114-bit burst carries 57 bits from each of two adjacent 20 ms frames.
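The rate figures above can be checked with a little arithmetic. The sketch below is illustrative only: the split of the 260 speech bits into importance classes (50, 132 and 78 bits), the 3 parity bits and the 4 tail bits are the usual full-rate channel coding figures and are not stated on the slide, and the function names are made up for this example. The sub-block split is a simplified version of the real interleaver, which also reorders bits inside each sub-block.

```python
FRAME_MS = 20  # one speech frame covers 20 ms


def rate_kbps(bits_per_frame, frame_ms=FRAME_MS):
    """Bits per millisecond is numerically the same as kbit/s."""
    return bits_per_frame / frame_ms


def fr_channel_coded_bits():
    """Bit budget of one channel-coded full-rate speech frame (assumed figures)."""
    class_1a, class_1b, class_2 = 50, 132, 78   # 260 speech bits, by importance
    parity, tail = 3, 4                          # parity on class Ia, encoder tail bits
    protected = class_1a + parity + class_1b + tail   # 189 bits enter the encoder
    return 2 * protected + class_2               # rate-1/2 code; class II sent uncoded


def split_into_subblocks(bits, n_blocks=8):
    """Simplified interleaver step: bit k goes to sub-block k mod n_blocks."""
    return [bits[j::n_blocks] for j in range(n_blocks)]


coded = fr_channel_coded_bits()
print(coded, rate_kbps(260), rate_kbps(coded))

blocks = split_into_subblocks(list(range(coded)))
print(len(blocks), len(blocks[0]), blocks[0][:3])
```

Because neighbouring coded bits land in different sub-blocks, losing one whole burst removes bits that are eight positions apart in the coded frame, which the convolutional decoder handles far better than a run of consecutive errors.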

The interleaved data is then modulated at 270.8 kbps using GMSK and passed through the duplexer, which isolates the transmit and receive signals. The reverse process applies to received signals.
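The 270.8 kbps figure follows from the GSM reference clock. The short calculation below is a sketch that assumes the standard GSM numbers (a 13 MHz clock divided by 48, 156.25 bit periods per timeslot, 8 timeslots per TDMA frame), none of which appear on the slide.

```python
from fractions import Fraction

BIT_RATE_BPS = Fraction(13_000_000, 48)   # assumed: 13 MHz clock / 48
BITS_PER_SLOT = Fraction(625, 4)          # assumed: 156.25 bit periods per timeslot
SLOTS_PER_FRAME = 8                       # assumed: 8 timeslots per TDMA frame

bit_period_us = 1_000_000 / BIT_RATE_BPS  # duration of one bit in microseconds
slot_us = BITS_PER_SLOT * bit_period_us   # duration of one timeslot
frame_ms = SLOTS_PER_FRAME * slot_us / 1000

print(round(float(BIT_RATE_BPS) / 1000, 3), "kbps")
print(round(float(bit_period_us), 3), "us per bit")
print(round(float(slot_us), 1), "us per timeslot")
print(round(float(frame_ms), 3), "ms per TDMA frame")
```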

CODEC

A digital algorithm used to encode speech signals. Codecs used in GSM: Half Rate (HR), Full Rate (FR), Enhanced Full Rate (EFR), and Adaptive Multirate (AMR).

FULL RATE (GSM - FR)

Specified in ETSI GSM 06.10. Based on RPE-LTP (Regular Pulse Excitation, Long Term Prediction). Uses linear prediction, as many other speech codecs do. Provides a bit rate of 13 kbps (260 bits per 20 ms). Gradually replaced by EFR and AMR due to its poor quality.

REGULAR PULSE EXCITATION LONG TERM PREDICTION (RPE-LTP)

Used to reduce the data sent between the MS and the BTS. When the voltage level of a speech sample is quantized, the Mobile Station's internal logic predicts the voltage level of the next sample. When the next sample is quantized, the packet sent by the MS to the BTS contains only the prediction error.
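The "send only the error" idea can be illustrated with a very small predictive coder. This is a minimal sketch, not RPE-LTP itself: it assumes the simplest possible predictor (the next sample equals the previous one), whereas the real codec uses short-term and long-term linear prediction and then codes the residual as regular pulses. The function names are hypothetical.

```python
def encode_residuals(samples):
    """Send only the prediction error; the prediction is the previous sample."""
    prediction = 0
    residuals = []
    for s in samples:
        residuals.append(s - prediction)
        prediction = s            # the decoder can form the same prediction
    return residuals


def decode_residuals(residuals):
    """Rebuild the samples by adding each error to the running prediction."""
    prediction = 0
    samples = []
    for e in residuals:
        s = prediction + e
        samples.append(s)
        prediction = s
    return samples


speech = [100, 104, 109, 111, 110, 106]
errors = encode_residuals(speech)
print(errors)                      # [100, 4, 5, 2, -1, -4]
print(decode_residuals(errors) == speech)
```

After the first sample the errors are much smaller than the samples themselves, so they need fewer bits, which is where the saving comes from.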

HALF RATE (GSM - HR)

Specified in ETSI GSM 06.20. Based on the VSELP (Vector Sum Excited Linear Prediction) algorithm. Provides a bit rate of 5.6 kbps (112 bits per 20 ms). Requires half the bandwidth, so network capacity is doubled at the expense of audio quality. Consumes 30% less energy.

CODE EXCITED LINEAR PREDICTOR (CELP)

The coder and decoder share a predetermined codebook of stochastic excitation signals. For each speech segment, the transmitter searches its codebook and transmits the index of the entry that gives the best match. The receiver uses this index to pick the same excitation signal from its own codebook for its synthesis filter. CELP coders are computationally very complex, but can achieve bit rates as low as 4.8 kbps.
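The codebook search can be sketched in a few lines. This is a simplified illustration under stated assumptions: the target is matched directly against the codebook entries, whereas a real CELP coder first passes each entry through the synthesis filter and a perceptual weighting filter. The codebook size (128), the vector length (40) and all function names are choices made for this example.

```python
import random


def make_codebook(size, length, seed=0):
    """Both ends build the same stochastic codebook from a shared seed."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(length)] for _ in range(size)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def search_codebook(target, codebook):
    """Return (index, gain) minimising ||target - gain * codevector||^2."""
    best_index, best_gain, best_score = 0, 0.0, -1.0
    for i, c in enumerate(codebook):
        energy = dot(c, c)
        if energy == 0.0:
            continue
        corr = dot(target, c)
        score = corr * corr / energy      # error reduction achieved by entry i
        if score > best_score:
            best_index, best_gain, best_score = i, corr / energy, score
    return best_index, best_gain


codebook = make_codebook(size=128, length=40)
target = [2.5 * x for x in codebook[37]]   # a target that entry 37 matches exactly
index, gain = search_codebook(target, codebook)
print(index, round(gain, 3))

# Receiver side: only `index` and `gain` were sent; the excitation is rebuilt locally.
excitation = [gain * x for x in codebook[index]]
```

Only the 7-bit index and a quantized gain would cross the channel, not the 40 samples, which is why the approach reaches such low bit rates.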

VECTOR SUM EXCITED LINEAR PREDICTOR (VSELP)

Utilizes three excitation sources, or codebooks. Each of these contains the equivalent of 128 vectors. The three excitation sequences are multiplied by their corresponding gain terms and summed. The combined excitation sequence drives the synthesis filter. Provides high speech quality, low computational complexity, and robustness to channel errors.
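A rough sketch of the two "sums" in VSELP follows. It assumes that each 128-entry codebook is generated from 7 basis vectors combined with plus or minus signs (2^7 = 128, which matches the codebook size on the slide); the basis values, the vector length and the function names are invented for illustration.

```python
import random


def vselp_codevector(index, basis):
    """Codevector = sum of +/- basis vectors; bit m of index sets the sign of basis m."""
    length = len(basis[0])
    out = [0.0] * length
    for m, vec in enumerate(basis):
        sign = 1.0 if (index >> m) & 1 else -1.0
        for n in range(length):
            out[n] += sign * vec[n]
    return out


def combine_excitations(vectors, gains):
    """Scale each excitation sequence by its gain and add them sample by sample."""
    length = len(vectors[0])
    return [sum(g * v[n] for g, v in zip(gains, vectors)) for n in range(length)]


rng = random.Random(1)
basis = [[rng.gauss(0.0, 1.0) for _ in range(40)] for _ in range(7)]  # assumed: 7 basis vectors
codebook = [vselp_codevector(i, basis) for i in range(2 ** len(basis))]
print(len(codebook))                                   # 128

print(combine_excitations([[1, 0], [0, 1], [1, 1]], [0.5, 2.0, 1.0]))   # [1.5, 3.0]
```

Storing 7 basis vectors instead of 128 full vectors is what keeps the search cheap: flipping one index bit changes a candidate by a known amount rather than requiring a fresh vector.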

ALGEBRAIC CODE EXCITED LINEAR PREDICTOR (ACELP)

ACELP codebooks have a specific algebraic structure. A 16-bit algebraic codebook is used in the innovative codebook search, the aim of which is to find the best innovation and gain parameters. The innovation vector contains, at most, four non-zero pulses.
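One way to picture an algebraic codebook is that the index itself describes the vector, so nothing has to be stored. The sketch below uses a hypothetical layout chosen only so that the numbers fit 16 bits: a 40-sample subframe, four tracks, one pulse per track, 3 bits for the pulse position and 1 bit for its sign (4 x 4 = 16 bits). It is not the pulse layout of any particular standard.

```python
SUBFRAME = 40      # assumed subframe length in samples
N_TRACKS = 4       # one pulse per track
POS_BITS = 3       # 8 candidate positions per track


def pack_pulses(pulses):
    """pulses: four (slot, sign) pairs, slot in 0..7 and sign +1 or -1, track 0 first."""
    index = 0
    for slot, sign in pulses:
        index = (index << (POS_BITS + 1)) | (slot << 1) | (1 if sign > 0 else 0)
    return index


def build_innovation(index):
    """Expand a 16-bit index into a sparse vector with four +/-1 pulses."""
    vec = [0] * SUBFRAME
    for track in reversed(range(N_TRACKS)):   # the last-packed track sits in the low bits
        field = index & 0xF
        index >>= POS_BITS + 1
        slot, sign_bit = field >> 1, field & 1
        position = track + 5 * slot           # track t uses positions t, t+5, ..., t+35
        vec[position] = 1 if sign_bit else -1
    return vec


index = pack_pulses([(0, +1), (3, -1), (7, +1), (2, -1)])
innovation = build_innovation(index)
print(hex(index))                                          # 0x16f4
print([(n, v) for n, v in enumerate(innovation) if v])     # pulses at 0, 13, 16, 37
```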

ENHANCED FULL RATE (GSM - EFR)

Developed to improve on the poor quality of FR. Provides a bit rate of 12.2 kbps (244 bits per 20 ms). Compatible with the highest AMR mode (12.2 kbps). Consumes 5% more energy. Recommended to use only in poor reception areas.

ADAPTIVE MULTIRATE (AMR)

An audio data compression technique for speech encoding. The AMR codec uses eight source codec modes with bit rates of 12.2, 10.2, 7.95, 7.40, 6.70, 5.90, 5.15 and 4.75 kbps. It uses link adaptation to select one of these eight bit rates based on link conditions. Link adaptation is the selection of the best codec mode to meet the local radio channel and capacity requirements.

ADAPTIVE MULTIRATE (AMR)

If the radio conditions are bad, the source coding rate is reduced and the channel coding is increased. This improves the quality and robustness of the network connection while sacrificing some speech clarity. AMR utilizes Discontinuous Transmission (DTX), with Voice Activity Detection (VAD) and Comfort Noise Generation (CNG), to reduce bandwidth usage during silence periods.
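The trade-off between source coding and channel coding can be made concrete. The sketch below assumes a full-rate channel with the 22.8 kbps gross rate quoted earlier, so whatever the speech coder does not use is left for error protection. The switching thresholds are purely hypothetical values, and real link adaptation also applies hysteresis, which is ignored here.

```python
AMR_MODES_KBPS = [12.2, 10.2, 7.95, 7.40, 6.70, 5.90, 5.15, 4.75]
GROSS_FR_KBPS = 22.8          # gross rate of a full-rate channel
FRAME_MS = 20

# Hypothetical thresholds in dB of carrier-to-interference ratio, best mode first.
THRESHOLDS_DB = [13.0, 11.0, 9.0, 8.0, 7.0, 6.0, 5.0]


def select_mode(ci_db):
    """Pick the highest-rate mode whose threshold the link quality still meets."""
    for threshold, mode in zip(THRESHOLDS_DB, AMR_MODES_KBPS):
        if ci_db >= threshold:
            return mode
    return AMR_MODES_KBPS[-1]     # worst conditions: most robust mode


def describe(mode_kbps):
    """Return (speech bits per 20 ms frame, kbps left for channel coding)."""
    speech_bits = round(mode_kbps * FRAME_MS)
    protection_kbps = round(GROSS_FR_KBPS - mode_kbps, 2)
    return speech_bits, protection_kbps


for ci in (15.0, 8.5, 4.0):
    mode = select_mode(ci)
    print(ci, mode, describe(mode))
```

At 12.2 kbps the speech frame has 244 bits and 10.6 kbps remain for protection; at 4.75 kbps the frame shrinks to 95 bits and 18.05 kbps are available for protection.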

VOICE ACTIVITY DETECTION

A speech processing technique that detects the presence or absence of human speech in an audio signal. Its main uses are in speech coding and speech recognition. It deactivates some processes during non-speech segments to avoid unnecessary coding and transmission of silence packets. Done at the transmitter's (MS) end.
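The simplest possible detector compares the energy of each frame with a threshold. The sketch below shows only that idea and is not the VAD algorithm used by AMR, which also looks at spectral and tonal features and adapts to the background noise. The threshold value and function names are assumptions made for the example.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)


def detect_voice(frames, threshold):
    """Return one flag per frame: True where the frame is judged to contain speech."""
    return [frame_energy(f) > threshold for f in frames]


frames = [
    [0, 1, -1, 0],            # near silence
    [100, -120, 90, -80],     # loud frame, treated as speech
    [2, -1, 0, 1],            # near silence
]
print(detect_voice(frames, threshold=100.0))   # [False, True, False]
```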

COMFORT NOISE

Artificial background noise used to fill the silence in a transmission that results from voice activity detection. Receiving total silence, especially for a prolonged period, has a number of unwanted effects on the listener. To counteract these effects, comfort noise is added. Done at the receiver's (BTS) end.
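A minimal sketch of comfort noise generation is shown below. It assumes the receiver only knows a target noise level and fills a 20 ms frame (160 samples at an assumed 8 kHz sampling rate) with Gaussian noise at that level; a real generator would also shape the spectrum of the noise to resemble the background at the far end. The function name and parameters are hypothetical.

```python
import math
import random

SAMPLES_PER_FRAME = 160   # assumed: 20 ms at 8 kHz


def comfort_noise(level_rms, n_samples=SAMPLES_PER_FRAME, seed=None):
    """Generate one frame of white Gaussian noise with the requested RMS level."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, level_rms) for _ in range(n_samples)]


def rms(frame):
    return math.sqrt(sum(s * s for s in frame) / len(frame))


noise = comfort_noise(level_rms=50.0, seed=1)
print(len(noise), round(rms(noise), 1))
```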

DISCONTINUOUS TRANSMISSION (DTX)

A method of momentarily powering down, or muting, a mobile set when there is no voice input to the set. This conserves battery power, eases the workload of the components in the transmitter amplifiers, and reduces interference. Resources freed up while one user is silent can be used to serve another user, thus increasing the capacity of the network. Operates using VAD and CNG.
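How VAD and comfort noise fit together in DTX can be sketched as a per-frame transmit decision. The example assumes that during silence the transmitter sends an occasional silence descriptor so the receiver can keep its comfort noise up to date, and otherwise stays off the air. The frame labels and the update interval of 8 frames are illustrative assumptions.

```python
def dtx_schedule(vad_flags, sid_interval=8):
    """Map per-frame VAD decisions to what the transmitter does in each frame."""
    schedule = []
    silent_run = 0
    for active in vad_flags:
        if active:
            silent_run = 0
            schedule.append("SPEECH")
        else:
            # First silent frame and every sid_interval-th one after it carry
            # a silence descriptor; the rest are not transmitted at all.
            schedule.append("SID" if silent_run % sid_interval == 0 else "NO_TX")
            silent_run += 1
    return schedule


flags = [True] * 3 + [False] * 10 + [True] * 2
plan = dtx_schedule(flags)
print(plan)
print(plan.count("SPEECH"), plan.count("SID"), plan.count("NO_TX"))   # 5 2 8
```

In this run 8 of the 15 frames are not transmitted, which is where the battery saving and the reduced interference come from.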

The End
