SPEECH COMPRESSION IN GSM
Introduction

My thesis, which was my final year project at University, was on speech compression. I studied the compression algorithms used in the European and North American cellular standards: GSM 6.10 and Qualcomm's QCELP (Qualcomm Code Excited Linear Predictive). GSM 6.10 is a lossy compression algorithm that runs at a fixed rate of 13 kbps. QCELP is a variable rate speech compression algorithm: the data rate for each frame is chosen by comparing the amount of energy in that frame with a set of threshold levels.

For those who are keen to learn the GSM 6.10 algorithm, the standard can be obtained from ETSI Specifications at a cost. The QCELP algorithm can be obtained from Global Info Centre.

Obtaining the GSM 6.10 Source Code

Everything you want to know about GSM compression can be found on Jutta Degener's GSM Page. I've made additional changes to Jutta Degener's patchlevel 10 release, and the source code and executable are freely available. More information on the RPE-LTP (Regular Pulse Excited Long Term Predictor) algorithm can be obtained here. The links below are useful for learning more about GSM.
Evaluating the Performance of the RPE-LTP

I ran my own tests to evaluate the performance of the RPE-LTP speech codec, using both objective and subjective methods.

Objective Testing

Objective evaluation of a speech codec normally provides quantitative measures of how well the reconstructed voice signal matches the original, using the mean square error (MSE) distortion, the classical signal to noise ratio (SNR), and the segmented signal to noise ratio (SSNR) [1]. I performed the objective tests by computing the mean square error and the SNR, and by comparing plots of the original and reconstructed signals. To do these tests I wrote a small Matlab script. The test data were a male voice, a female voice, and a sample from a CD player. The portion plotted is the range 8000:8250 in Matlab, i.e. 251 samples. The script is given below.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%File rpe.m                                                       %
%Copyright Louis Selvon                                           %
%This script file compares the plots of the original and          %
%reconstructed speech signal for the audio formats a-law, and     %
%u-law.                                                           %
%The mean square error, and SNR is also worked out.               %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
clear all;
str = input('Enter original audio file (a-law, or u-law) : ', 's');
ptr = input('Enter reconstructed audio file (a-law, or u-law) : ', 's');
disp('');

% Read encoded audio files.
mu1 = auread(str);
mu2 = auread(ptr);
len1 = length(mu1);
len2 = length(mu2);

% Make sure the lengths are the same.
if (len1 ~= len2)
    disp('Error : Length of the original and reconstructed must be of same length.');
    return;
end

% Find the energy of each sample of the original signal.
sq = mu1.^2;

% Take the difference (named err so the built-in diff is not shadowed).
err = mu1 - mu2;

% Find the mean square error.
sp = err.^2;
total_energy_error = sum(sp);
mean_square_error = total_energy_error/len1

% Find the SNR over the segment 8000:8500.
a = 8000:8500;
m1 = mu1(a);
maxi = max(m1); % Find the maximum value.
mini = min(m1); % Find the minimum value.
threshold_max = (80/100)*maxi;
threshold_min = (75/100)*mini;
for a = 8000:8500
    % Only samples outside the thresholds are counted as energy.
    if ((mu1(a) > threshold_max) || (mu1(a) < threshold_min))
        total(8501-a) = sq(a); % Find total energy within original signal.
        total_energy_error_frame(8501-a) = sp(a);
    end
end
SNR = 20*log10(sum(total)/sum(total_energy_error_frame))

% Copy only 251 samples for comparison (stored in reverse order).
for i = 8000:8250
    z1(8251-i) = mu1(i);
    z2(8251-i) = mu2(i);
    z3(8251-i) = err(i);
end
% Flip back into the original sample order.
z1 = fliplr(z1);
z2 = fliplr(z2);
z3 = fliplr(z3);

% Plot the data.
subplot(3,1,1);
plot(z1);
title('Plot of original audio file.');
xlabel('No. of samples');
ylabel('Amplitude');
subplot(3,1,2);
plot(z2);
title('Plot of reconstructed audio file.');
xlabel('No. of samples');
ylabel('Amplitude');
subplot(3,1,3);
plot(z3);
title('Plot of the difference.');
xlabel('No. of samples');
ylabel('Amplitude');
%End Matlab Coding

Note: The original signal for each test sample was sampled at 8 kHz, 8 bits, mono, and lasts 10 seconds. The table below shows the MSE results.
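For readers without Matlab, the mean square error step can be sketched in Python. This is a minimal illustration and not the script used for the results on this page: the function name `mean_square_error` and the sample values are made up for the example, and the signals are assumed to be already decoded into equal-length lists of numbers.

```python
def mean_square_error(original, reconstructed):
    """Average of the squared sample-by-sample differences."""
    if len(original) != len(reconstructed):
        raise ValueError("signals must have the same length")
    if not original:
        raise ValueError("signals must not be empty")
    total_error_energy = sum(
        (x - y) ** 2 for x, y in zip(original, reconstructed)
    )
    return total_error_energy / len(original)


# Hypothetical sample values, for illustration only (not measured data).
example_original = [0.0, 0.5, -0.5, 0.25]
example_reconstructed = [0.0, 0.25, -0.5, 0.5]
print(mean_square_error(example_original, example_reconstructed))
```

A smaller value means the reconstructed signal follows the original more closely, sample by sample.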
Table 1. Mean Square Error Results

The SNR is calculated from the formula below.

$$\mathrm{SNR} = 20 \log_{10}\left(\frac{\sum x^2}{\sum (x - x')^2}\right)$$

i.e. the SNR is the total energy within the original signal ÷ the total energy of the error, expressed in decibels, where x = original signal and x' = reconstructed signal.

Before the formula is used, the first step is to select a threshold value which represents the maximum unvoiced value. Each sample is then compared with the threshold and is counted as energy only if its value is above the threshold. While doing the testing in Matlab, I had to adjust the threshold repeatedly until I got the maximum SNR for each set of test data.

The SNR can be calculated over the whole signal (classical SNR), or over small segments of the signal, say 160 samples each, whose SNRs are then averaged to give the SSNR. I performed my analysis on only one segment of the signal, 8000:8500 in Matlab, instead of using either of these two techniques, as they would require a "for" loop in Matlab which would take a while to evaluate. The results are shown in Table 2.
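The classical and segmented SNR described above, which were not run for this page, can be sketched in Python as follows. This is only a sketch under stated assumptions: the function names, the `scale` parameter, and the sample values are invented for the illustration, the signals are assumed to be decoded into equal-length lists, and the threshold step is left out.

```python
import math


def snr_db(original, reconstructed, scale=20.0):
    """SNR in dB: scale * log10(signal energy / error energy).

    scale=20 follows the formula used on this page; the conventional
    definition for an energy ratio uses scale=10.
    """
    signal_energy = sum(x * x for x in original)
    error_energy = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    if signal_energy == 0 or error_energy == 0:
        raise ValueError("SNR is undefined for zero signal or error energy")
    return scale * math.log10(signal_energy / error_energy)


def segmental_snr_db(original, reconstructed, frame_len=160, scale=20.0):
    """Average of the per-frame SNR over consecutive full frames."""
    values = []
    for start in range(0, len(original) - frame_len + 1, frame_len):
        x = original[start:start + frame_len]
        y = reconstructed[start:start + frame_len]
        try:
            values.append(snr_db(x, y, scale))
        except ValueError:
            continue  # skip silent or perfectly reconstructed frames
    if not values:
        raise ValueError("no frame with a defined SNR")
    return sum(values) / len(values)


# Hypothetical sample values, for illustration only (not measured data).
example_original = [2.0, 2.0, 4.0, 4.0]
example_reconstructed = [1.0, 1.0, 2.0, 2.0]
print(snr_db(example_original, example_reconstructed))
print(segmental_snr_db(example_original, example_reconstructed, frame_len=2))
```

Averaging per frame stops loud passages from dominating the result, which is why the segmented figure is often preferred for speech.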
Table 2. SNR of the RPE-LTP

The actual codec has an SNR of 35 dB, compared to 15.5391 dB from my results; under the formula above, 15.5391 dB corresponds to a ratio of approximately 6:1.

Subjective Testing

Subjective testing methods normally include the diagnostic rhyme test (DRT) and the mean opinion score (MOS) [1]. I performed MOS testing by gathering groups of listeners of different ages to rate the perceived quality of the reconstructed speech on the scale defined in [1]. The results are shown in Table 3.
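A mean opinion score is the arithmetic mean of the listeners' individual ratings. The Python sketch below shows that calculation, assuming the common five-point scale (1 = bad to 5 = excellent). The function name and the ratings are invented placeholders, not the scores collected for Table 3.

```python
def mean_opinion_score(ratings, low=1, high=5):
    """Arithmetic mean of listener ratings on a low..high scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for rating in ratings:
        if not low <= rating <= high:
            raise ValueError("rating %r is outside the scale" % (rating,))
    return sum(ratings) / len(ratings)


# Placeholder ratings, for illustration only (not collected data).
placeholder_ratings = [4, 3, 4, 5, 3]
print(mean_opinion_score(placeholder_ratings))
```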
Table 3. MOS Rating of the RPE-LTP

A codec that achieves a MOS rating of 4 or higher is generally considered acceptable for providing toll quality speech [1]. From my analysis, the RPE-LTP speech codec can therefore be considered to provide a quality between toll and near toll. The actual RPE-LTP speech codec has a MOS of 3.54 [1].

Note: The listeners I gathered were members of my family and neighbours.

Basics of Speech Coding

Below is a set of links you can follow to learn about speech coding.
Audio Formats

The links below will tell you everything you need to know about audio file formats.

Audio Players

Below are audio players that you can download to play the audio format of your choice.
References
© 1999-2007 Webmaster of www.lselvon.com - All Rights Reserved Worldwide...