Bilal Soylu - b090100037 - (Design Project: Speaker Identifier)
7/29/2019
SAKARYA UNIVERSITY
FACULTY OF ENGINEERING, ELECTRICAL-ELECTRONICS ENGINEERING
DESIGN PROJECT
TOPIC
SPEAKER RECOGNITION (SPEAKER IDENTIFIER)
STUDENT
BİLAL SOYLU - B090100037
ADVISOR
Asst. Prof. Dr. Gökçen Çetinel
AUTOMATIC SPEECH RECOGNITION
1-) WHAT ARE THE BENEFITS OF SPEECH RECOGNITION SYSTEMS, AND WHERE ARE THEY USED?
a) Benefits
One of the greatest benefits of speech recognition systems is ease of use. They take their input through a microphone or a telephone, and the data source is speech itself, a habit that people exercise without any effort.
A related benefit is the speed of data collection. Data sent by speaking arrives faster than data entered with a keyboard or through any other task that requires manual action.
Speech recognition systems also give the user freedom of movement. An operator whose hands are occupied by the job cannot enter data with a keyboard or mouse, but can do so through a lapel or headset microphone without interrupting the work.
Because these systems can also accept data remotely, they make it possible to control devices from a distance.
b) Where Are They Used?
The main application areas of speech recognition systems were listed by Stephan Cook, a professor at the University of Toronto, as dictation, command and control, telephone services, medical disabilities, and embedded applications.
DICTATION
Dictation is one of the most widespread uses of speech recognition systems. In sessions, meetings, interviews, court cases, and similar settings, capturing everything that is said as a document by keyboard or by hand is difficult and slow. A dictation system converts the speech directly into text, so the document is produced as the words are spoken. For now the application is considered successful mainly in English, but it is expected to be deployed for other languages as well. The most successful and usable examples come from vendors and products such as Microsoft Diction, DragonDictate, and IBM ViaVoice.
COMMAND AND CONTROL
Many devices can be controlled through speech recognition. Words or letters matched from speech are mapped to specific commands. Robots that take commands by voice are one example. Another is smart-home systems in which doors, lamps, air conditioners, windows, and other devices with open/close behavior act on voice commands.
TELEPHONE SERVICES
Telephone services are another very important present-day application of speech recognition systems. They are increasingly used in areas such as telephone banking and voice signatures. Using speech recognition in telephone services both simplifies things for the user and saves time. For example, in a telephone banking service the user normally has to key in many numbers for security, whereas with a voice signature the desired transaction can be carried out comfortably and much faster just by speaking.
MEDICAL DISABILITIES
Speech recognition is also very helpful for disabled people who cannot use their hands. Switching certain devices on and off, or controlling them, can be done by speech instead.
EMBEDDED APPLICATIONS
The most familiar example is voice dialing on mobile phones. Voice dialing works by associating a recorded voice tag with the phone number of a contact in the phone book; the user then calls that contact simply by speaking.
2-) OVERVIEW OF SPEAKER IDENTIFICATION SYSTEMS
The general operating principle of a speaker identification system is as follows: various voice samples are recorded with a recording device (a microphone, etc.); the speaker speaks again; the system processes the speech, compares it with the earlier samples, and checks whether it matches; finally, the intended function is carried out depending on the result.
3-) HOW SPEAKER IDENTIFICATION SYSTEMS WORK (Speech Processing)
1. Extraction of Speech Feature Vectors
The sound signals we produce while speaking take different shapes as they pass through structures such as the jaw, tongue, and lips. These shapes are classified as voiced (a, e, ...) and unvoiced (k, l, m, ...) sounds. We use this property of the sounds to extract the feature vectors of the speech.
The following steps are carried out to extract the speech feature vectors:
[Block diagram: speech signal -> pre-processing block -> spectral shaping -> spectral analysis -> feature vectors]
2. Pre-processing Block
The pre-processing block removes from the speech signal the noise and DC offset introduced by the A/D converter used to bring the signal into the system, along with any other unwanted noise.
3. Spectral Shaping
The human vocal system resembles a low-pass filter, so the voiced portions of speech (vowels) have a negative spectral slope. To remove this effect, the speech signal is filtered with a first-order FIR filter called the pre-emphasis filter. Its transfer function is:
H(z) = 1 - a z^{-1},  0.95 \le a \le 0.97
[Figure: frequency-domain curve of a sample voiced "a" sound]
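In the time domain the filter H(z) = 1 - a z^{-1} corresponds to y(n) = x(n) - a x(n-1). The following is a minimal sketch of that difference equation in C. The function name pre_emphasis is hypothetical and is not part of the program listed later in this report, and the sample before x(0) is assumed to be zero.

```c
#include <assert.h>
#include <stddef.h>

/* Pre-emphasis sketch: y[i] = x[i] - a * x[i-1].
 * pre_emphasis is an illustrative name, not taken from the report's program.
 * Assumption: the sample preceding x[0] is zero. */
void pre_emphasis(const float *x, float *y, size_t n, float a)
{
    size_t i;
    float prev = 0.0f;          /* holds x[i-1]; zero before the first sample */

    assert(x != NULL && y != NULL);
    for (i = 0; i < n; i++) {
        float cur = x[i];       /* read first so that in-place use (y == x) works */
        y[i] = cur - a * prev;
        prev = cur;
    }
}
```

With a constant input, every output after the first sample shrinks to (1 - a) times the input, which shows how strongly the filter attenuates low frequencies for a between 0.95 and 0.97.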
4. Spectral Analysis
The speech signal is made up of voiced sounds, unvoiced sounds, and combinations of the two. These sounds have specific properties, from which a speech production model resembling the human speech system can be built.
[Figure: speech signal production model]
In this model, the speech signal is produced by filtering an excitation signal through a vocal-tract filter whose coefficients vary with time. As the model implies, the spectrum of the speech signal is time-varying.
[Figure: frequency response for various values of a]
5. Windowing
Windowing multiplies the speech signal in the time domain by a window function, usually the Hamming window. It keeps the portion of the signal that is to be processed and sets the rest to zero. The Hamming window is defined as:
w(n) = 0.54 - 0.46 \cos\left(\frac{2\pi n}{N-1}\right),  0 \le n \le N-1
Here N is the window length; for the best results in the system it is chosen so that the window spans 15-30 ms.
[Figure: windowing of a sample speech signal]
The output of the Hamming windowing operation is y(n) = w(n) x(n).
In a sense the Hamming window acts as a filter. The figures show that it makes the speech signal smoother and more distinct.
6. LPC (Linear Predictive Coding)
This method takes into account the characteristics of the human larynx and mouth as well as the properties of the sound itself. Linear prediction rests on the principle that speech can be modeled as the output of a linear, time-varying system excited by periodic pulses or random noise. The model is expressed as a linear filter of order p, where p is referred to as the order of the LPC coder.
[Figure: signal of the word "sıfır" (zero) after Hamming windowing]
[Figure: signal of the word "sıfır" (zero) without Hamming windowing]
Applying the inverse z-transform to this transfer function gives the time-domain form of the model.
LPC works on the principle that the sample being processed can be predicted from the samples that precede it.
A set of parameters is computed to minimize the sum of squared errors, where the error is the difference between the predicted sample and the actual sample.
Solving this minimization yields p LPC parameters, where p is the order of the LPC coder and a_1, a_2, ..., a_p are the LPC parameters. With s[n] denoting the windowed speech sample, the filter coefficients are computed so that the mean squared error is minimal:
E = \sum_{n=0}^{N-1+p} e^2[n] = \sum_{n=0}^{N-1+p} \left( s[n] - \sum_{k=1}^{p} a_k s[n-k] \right)^2
Because the signal is windowed over 0 \le n \le N-1, the error e[n] is nonzero over 0 \le n \le N-1+p, which is why the limits on n in the equation above are taken as 0 \le n \le N-1+p. Below are plots of a voiced speech signal sampled at 11025 Hz and the error signal obtained from LPC analysis.
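One common way to carry out this minimization is the autocorrelation method solved with the Levinson-Durbin recursion. The report does not state which solver was used, so the following C sketch is only an illustration under that assumption. The names lpc_analyze, autocorrelation and LPC_MAX_ORDER are hypothetical, and the sign convention assumed is that a sample is predicted as the sum of a[k] times the k-th previous sample.

```c
#include <assert.h>
#include <stddef.h>

#define LPC_MAX_ORDER 32    /* illustrative upper bound on p (assumption) */

/* Autocorrelation r[0..p] of an already windowed frame s[0..n-1]. */
static void autocorrelation(const double *s, size_t n, int p, double *r)
{
    int k;
    size_t i;

    for (k = 0; k <= p; k++) {
        double sum = 0.0;
        for (i = (size_t)k; i < n; i++)
            sum += s[i] * s[i - (size_t)k];
        r[k] = sum;
    }
}

/* Levinson-Durbin recursion (autocorrelation method).
 * Fills a[1..p] so that s[n] is predicted by sum over k of a[k]*s[n-k];
 * a[0] is unused and set to zero, so a must hold p+1 elements.
 * Returns the remaining prediction error energy, or -1.0 for a silent frame. */
double lpc_analyze(const double *s, size_t n, int p, double *a)
{
    double r[LPC_MAX_ORDER + 1];
    double tmp[LPC_MAX_ORDER + 1];
    double err;
    int i, j;

    assert(s != NULL && a != NULL);
    assert(p >= 1 && p <= LPC_MAX_ORDER && (size_t)p < n);

    autocorrelation(s, n, p, r);
    for (i = 0; i <= p; i++)
        a[i] = 0.0;
    if (r[0] <= 0.0)
        return -1.0;

    err = r[0];
    for (i = 1; i <= p; i++) {
        double acc = r[i];
        double k;

        if (err <= 0.0)
            break;                      /* frame is already predicted exactly */
        for (j = 1; j < i; j++)
            acc -= a[j] * r[i - j];
        k = acc / err;                  /* reflection coefficient of stage i */

        for (j = 1; j < i; j++)
            tmp[j] = a[j] - k * a[i - j];
        for (j = 1; j < i; j++)
            a[j] = tmp[j];
        a[i] = k;
        err *= (1.0 - k * k);
    }
    return err;
}
```

For a decaying exponential such as s[n] = 0.9^n, a first-order analysis returns a[1] very close to 0.9, which is the coefficient that generated the signal.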
[Figure: voiced speech signal and the error obtained from LPC]
[Figure: short-time spectrum of the voiced speech signal and the frequency response of the H(z) filter]
The plots above show the short-time spectrum of the voiced speech signal and the frequency response of the H(z) filter. The relationship between the spectrum of the original speech signal and the frequency response of the H(z) filter obtained by LPC analysis is clearly visible: the frequency response of H(z) is a smoothed version of the envelope of the original signal's spectrum. LPC analysis can therefore also be regarded as a short-time spectrum estimate.
After LPC analysis, the DTW (dynamic time warping) method is applied to the speech signal for comparison and matching.
7. DTW (Dynamic Time Warping)
The same word is never uttered in exactly the same way, even by the same person, and it may also be spoken more slowly or more quickly. DTW stretches or compresses these different utterances in time so that they line up with one another.
DTW aligns the spoken word in time with the words stored in the system and compares them. The algorithm is used as follows: each of the several parameters stored as a template is prepared for comparison separately with the DTW algorithm, and the results are then evaluated together.
[Figure: DTW algorithm applied to LPC parameters]
The algorithm is applied as follows.
The LPC parameter values of the words stored as templates are aligned in time, with the help of the LPC analyzer, with the LPC parameter values computed from the word captured at run time. Through this alignment the input is compared against every stored template and a similarity is computed for each. The closeness to the nearest template is expressed as a percentage, and if it exceeds the defined threshold, a match is declared.
The LPC coder returns p LPC parameters for each frame. The utterance queue analyzer pulls utterances from the utterance queue, splits them into frames, and sends them to the LPC coder for encoding. If an utterance pulled from the queue consists of m frames, the encoded utterance is a two-dimensional array of size p by m. Suppose the LPC encoding of an utterance of n frames is stored in the system as a template; the stored template is then a two-dimensional array of size p by n. These two arrays are placed perpendicular to each other in three-dimensional space, as in the figure above, and differences are computed in each LPC parameter plane from 1 to p. The p planes are then reduced to a single plane by averaging the differences cell by cell. DTW is then applied exactly as it would be applied directly to two speech signals: the transition costs from one state to the next are computed and the comparison is performed.
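The comparison described above can be sketched in C as follows. This is only an illustration under stated assumptions: the local cost of a cell is taken as the mean absolute difference over the p LPC parameters, the allowed steps are the three classic ones (horizontal, vertical, diagonal), and the frames are stored row by row. The names dtw_distance, frame_cost and min3 are hypothetical.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdlib.h>

/* Mean absolute difference between two frames of p parameters
 * (the "average over the p parameter planes" of one cell). */
static double frame_cost(const double *u, const double *v, size_t p)
{
    double sum = 0.0;
    size_t k;

    for (k = 0; k < p; k++)
        sum += fabs(u[k] - v[k]);
    return sum / (double)p;
}

static double min3(double x, double y, double z)
{
    double m = (x < y) ? x : y;
    return (m < z) ? m : z;
}

/* DTW between utterance a (m frames) and template b (n frames), each frame
 * holding p parameters stored contiguously (frame i starts at a + i*p).
 * Returns the accumulated cost of the best alignment, or -1.0 if memory
 * could not be allocated. */
double dtw_distance(const double *a, size_t m,
                    const double *b, size_t n, size_t p)
{
    double *prev, *curr, *swap, result;
    size_t i, j;

    assert(a != NULL && b != NULL && m > 0 && n > 0 && p > 0);
    prev = malloc((n + 1) * sizeof *prev);
    curr = malloc((n + 1) * sizeof *curr);
    if (prev == NULL || curr == NULL) {
        free(prev);
        free(curr);
        return -1.0;
    }

    prev[0] = 0.0;                       /* empty-vs-empty alignment costs nothing */
    for (j = 1; j <= n; j++)
        prev[j] = DBL_MAX;               /* unreachable border cells */

    for (i = 1; i <= m; i++) {
        curr[0] = DBL_MAX;
        for (j = 1; j <= n; j++) {
            double cost = frame_cost(a + (i - 1) * p, b + (j - 1) * p, p);
            curr[j] = cost + min3(prev[j], curr[j - 1], prev[j - 1]);
        }
        swap = prev;
        prev = curr;
        curr = swap;
    }
    result = prev[n];
    free(prev);
    free(curr);
    return result;
}
```

A word spoken at half speed, for example frames 1, 1, 2, 2, 3, 3 against a template 1, 2, 3, aligns with zero accumulated cost, which is the behavior the text describes. Turning the accumulated cost into the percentage closeness mentioned above would need a normalization that the report does not specify.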
Speaker Recognition C Program
/*****************************************************************************
*
* speaker_recognition.c
*
* Main Program to Identify a Speaker.
*
* The aim of this project is to determine the identity of the speaker
* from the speech sample of the speaker and the trained vectors.
*
* Trained vectors are derived from the speech sample of the speaker at
* a different time.
*
* First the input analog speech signal is digitized at an 8 kHz sampling
* frequency using the on-board ADC (Analog to Digital Converter).
* The speech samples are stored in a one-dimensional array.
* Speech signals are quasi-stationary: over a very short time frame
* the speech signal can be considered stationary. The speech signal is
* therefore split into frames. Each frame consists of 256 samples, and
* each subsequent frame starts from the 100th sample of the previous
* frame, so each frame overlaps with the two frames that follow it.
* This technique is called framing. The speech samples within one frame
* are considered to be stationary.
*
* After framing, windowing is applied to prevent spectral leakage.
* Here a Hamming window with 256 coefficients is used.
*
* Third step is to convert the Time domain speech Signal into Frequency
* Domain using Discrete Fourier Transform. Here Fast Fourier Transform
* is used.
*
* The transform yields a signal that is complex in nature. Speech is a
* real signal, but its Fourier Transform is complex (it has both real
* and imaginary parts).
*
* The power of the signal in Frequency domain is calculated by summing
* the square of Real and Imaginary part of the signal in Frequency Domain.
* The power signal will be a real one. Since second half of the samples
* in the frame will be symmetric to the first half (because the speech signal
* is a real one) we ignore the second half (second 128 samples in each frame)
*
* Triangular filters are designed using the Mel frequency scale. This bank
* of filters approximates the response of the human ear. The power signal
* is applied to the filter bank to determine the frequency content across
* each filter. In our implementation the total number of filters is 20.
* These 20 filters are uniformly spaced on the Mel frequency scale between
* 0 and 4 kHz.
*
* After computing the Mel-Frequency Spectrum, log of Mel-Frequency Spectrum
* is computed.
*
* The Discrete Cosine Transform of the resulting signal yields the
* Mel-Frequency Cepstral Coefficients.
*
* The Euclidean distance between the Mel-Frequency Cepstral Coefficients
* and each trained vector is computed. The trained vector that produces
* the smallest Euclidean distance identifies the speaker.
*
*
* Written by Vasanthan Rangan and Sowmya Narayanan
*
******************************************************************************/
/*****************************************************************************
* Include Header Files
******************************************************************************/
#include "DSK6713_aic23.h"
Uint32 fs=DSK6713_AIC23_FREQ_8KHZ;
#include <stdio.h>
#include <math.h>
#include "block_dc.h" // Header file for identifying the start of speech signal
#include "detect_envelope.h" // Header file for identifying the start of speech signal
#include "training1.h" // Header file containing the trained vectors.
/*****************************************************************************
* Definition of Variables
*****************************************************************************/
#define Number_Of_Filters 20 // Number of Mel-Frequency Filters
#define column_length 256 // Number of samples in one frame of the speech signal
#define row_length 100 // Total number of Frames in the given speech signal
#define PI 3.14159
/*****************************************************************************
* Custom Structure Definition
*****************************************************************************/
struct complex {
float real;
float imag;
}; // Generic Structure to represent real and imaginary part of a signal
struct buffer {
struct complex data[row_length][column_length];
}; // Structure to store the input speech sample
struct mfcc {
float data[row_length][Number_Of_Filters];
}; // Structure to store the Mel-Frequency Co-efficients
/*****************************************************************************
* Assigning the data structures to external memory
*****************************************************************************/
#pragma DATA_SECTION(real_buffer,".EXTRAM")
struct buffer real_buffer; //real_buffer is used to store the input speech.
#pragma DATA_SECTION(coeff,".EXTRAM")
struct mfcc coeff; //coeff is used to store the Mel-Frequency Spectrum.
#pragma DATA_SECTION(mfcc_ct,".EXTRAM")
struct mfcc mfcc_ct; //mfcc_ct is used to store the Mel-Frequency Cepstral Co-efficients.
/*****************************************************************************
* Variable Declaration
*****************************************************************************/
int gain; /* output gain (used during play-back) */
int signal_status; /* Variable to detect speech signal */
int count; /* Variable to count */
int column; /* Variable used for incrementing column (Samples inside Frame)*/
int row; /* Variable used for incrementing row(Number of Frames)*/
/* Variable to identify where the program is. Example:
 * program_control=0 means the program is capturing the input speech signal.
 * program_control=1 means the program has finished capturing input and is
 * ready for processing; at this time the input speech signal is played back.
 * program_control=2 means the program is ready for identification. */
int program_control;
float mfcc_vector[20]; /* Variable to store the vector of the speech signal */
/*****************************************************************************
* Function Declaration
*****************************************************************************/
void fft (struct buffer *, int , int ); /* Function to compute Fast Fourier Transform */
short playback(); /* Function for play back */
void log_energy(struct mfcc *); /* Function to compute Log of Power Signal */
void mfcc_coeff(struct mfcc * , struct mfcc *); /* Function to compute MFCC */
void mfcc_vect(struct mfcc * , float *); /* Function to compute MFCC Vector */
/******************************************************************************
* Function Definition Starts
******************************************************************************/
interrupt void c_int11() { /* interrupt service routine */
short sample_data;
short out_sample;
if ( program_control == 0 ) { /* Beginning of Capturing input speech */
sample_data = input_sample(); /* input data */
signal_status = framing_windowing(sample_data, &real_buffer); /* Signal Identification
* and Framing and Windowing */
out_sample = 0; /* Output Data */
if (signal_status > 0) {
program_control = 1; /* Capturing input signal is done */
}
output_sample(out_sample); /* play nothing */
}
if ( program_control == 1 ) { /* Beginning of the Play back */
out_sample = playback(); /* call the playback function to get the
* stored speech sample */
output_sample(out_sample); /* play the output speech sample */
}
return;
}
void main() { /* Main Function of the program */
/****************************************************************************
* Declaring Local Variables
*****************************************************************************/
int i; /* Variable used for counters */
int j; /* Variable used for Counters */
int stages,speaker; /* Variable to identify total number of stages
* and the speaker */
float distance,ref_distance; /* Variable for storing Euclidean Distance
* and the reference Distance for comparison
*/
/*****************************************************************************
* Execution of functions start
******************************************************************************/
comm_intr(); /* init DSK, codec, McBSP */
/******************************************************************************
* Initializing Variables
*****************************************************************************/
gain = 1;
column = 0;
row = 0;
program_control = 0;
signal_status = 0;
count = 0;
stages=8; /* Total Number of stages in FFT = 8 */
ref_distance = 9999999999999999.9999999; /* Variable for storing reference Distance */
for ( i=0; i < row_length ; i++ ) { /* Total Number of Frames */
for ( j = 0; j < column_length ; j++) { /* Total Number of Samples in a Frame */
real_buffer.data[i][j].real = 0.0; /* Initializing real part to be zero */
real_buffer.data[i][j].imag = 0.0; /* Initializing imaginary part to be zero*/
}
}
for ( i=0; i
/* Compute FFT of the input speech signal after Framing and Windowing */
fft(&real_buffer,column_length,stages);
/* Compute Power Spectrum of the speech signal in Frequency Domain Representation */
power_spectrum(&real_buffer);
/* Compute Mel-Frequency Spectrum of the speech signal in Power Spectrum Form */
mel_freq_spectrum(&real_buffer,&coeff);
/* Computation of Log of the Power Spectrum */
log_energy(&coeff);
/* Computation of Discrete Cosine Transform */
mfcc_coeff(&mfcc_ct,&coeff);
/* Compute Vector */
mfcc_vect(&mfcc_ct,mfcc_vector);
/* Identifying the Speaker */
for ( i=0; i
speaker = i; /* Identify the speaker as the corresponding sample */
ref_distance = distance; /* Store the new Distance */
}
}
/* Print the identified Speaker */
printf("Input Speaker is identified to be %d Speaker from the Training Set\n",++speaker);
}
/* Function to Compute Fast Fourier Transform */
void fft (struct buffer *input_data, int n, int m) { /* Input speech data, n = 2^m, m = total number of stages */
    int n1,n2,i,j,k,l,row_index; /* n1 is the butterfly group size of a stage,
                                  * n2 is the distance between the upper and
                                  * lower inputs of a butterfly,
                                  * i,j,k,l are counters,
                                  * row_index is used to index every frame */
    float xt,yt,c,s,e,a;         /* xt,yt hold temporary real and imaginary values,
                                  * c for cosine, s for sine,
                                  * e and a for computing the input to cosine and sine */
    for ( row_index = 0; row_index < row_length; row_index++) { /* For every frame */
        /* Loop through all the stages */
        n2 = n;
        for ( k=0; k<m; k++ ) {
            n1 = n2;
            n2 = n2/2;
            e = (float) (2.0*PI/n1);
            for ( j=0; j<n2; j++ ) {
                /* Compute the twiddle factor for this butterfly position */
                a = j*e;
                c = (float) cos(a);
                s = (float) sin(a);
                /* Do the butterflies for all 256 samples */
                for ( i=j; i<n; i+=n1 ) {
                    l = i+n2;
                    xt = input_data->data[row_index][i].real - input_data->data[row_index][l].real;
                    input_data->data[row_index][i].real = input_data->data[row_index][i].real+input_data->data[row_index][l].real;
                    yt = input_data->data[row_index][i].imag - input_data->data[row_index][l].imag;
                    input_data->data[row_index][i].imag = input_data->data[row_index][i].imag+input_data->data[row_index][l].imag;
                    input_data->data[row_index][l].real = c*xt + s*yt;
                    input_data->data[row_index][l].imag = c*yt - s*xt; /* imaginary part uses xt, not yt */
                }
            }
        }
        /* Bit Reversal */
        j = 0;
        for ( i=0; i<n-1; i++ ) {
            if ( i < j ) { /* Swap samples i and j */
                xt = input_data->data[row_index][j].real;
                input_data->data[row_index][j].real = input_data->data[row_index][i].real;
                input_data->data[row_index][i].real = xt;
                yt = input_data->data[row_index][j].imag;
                input_data->data[row_index][j].imag = input_data->data[row_index][i].imag;
                input_data->data[row_index][i].imag = yt;
            }
            k = n/2; /* Advance j to the bit-reversed value of i+1 */
            while ( k <= j ) {
                j = j-k;
                k = k/2;
            }
            j = j+k;
        }
    }
    return;
}
/* Function to compute log of Mel-Frequency spectrum */
void log_energy(struct mfcc *co_eff) {
int i,j; /* Variables declared to act as counters */
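    for ( i=0; i<row_length; i++ ) { /* For all the frames */
        for ( j=0; j<Number_Of_Filters; j++ ) { /* For all the filters */
            co_eff->data[i][j] = (float) log(co_eff->data[i][j]); /* Compute log of co-efficients */
        }
    }
}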
/* Function to compute Discrete Cosine Transform */
void mfcc_coeff(struct mfcc *mfccct, struct mfcc *co_eff) {
int i,j,k; /* Variable declared to act as counters */
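    for ( i=0; i<row_length; i++ ) { /* For all the frames */
        for ( j=0; j<Number_Of_Filters; j++ ) { /* For every cepstral co-efficient */
            for ( k=0; k<Number_Of_Filters; k++ ) { /* Sum over all the filters */
                /* k is zero-based, so the filter centre is k+0.5. The integer
                 * expression 1/2 evaluates to 0 in C and must not be used here.
                 * Trained vectors have to be computed with the same formula. */
                mfccct->data[i][j] = mfccct->data[i][j] + co_eff->data[i][k]*(float)cos((PI*j*(k+0.5))/Number_Of_Filters);
            }
        }
    }
}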
/* Function to compute Euclidean distance and conversion to Vector */
void mfcc_vect(struct mfcc *mfccct, float *mfccvector) {
int i,j; /* variables declared to act as counters */
for ( i=0; i< Number_Of_Filters; i++ ) { /* Total Number of Filters */
mfccvector[i] = 0; /* Initialize the Vector to Zero */
for (j=0; j< row_length; j++) { /* For all the Frames Compute the distance */
mfccvector[i] = mfccvector[i] + ((mfccct->data[j][i])*(mfccct->data[j][i]));
}
}
}
/* Function to play back the speech signal */
short playback() {
    column++; /* Advance to the next speech sample in the frame */
    if ( column >= column_length ) { /* If column >= 256, reset it to zero
                                      * and increment the frame number */
        column = 0; /* Initialize the sample number back to zero */
        row++; /* Increment the frame number */
    }
    if ( row >= row_length ) { /* If the frame number reaches 100, reset row
                                * to zero and change the program control
                                * to indicate the end of playback */
        program_control = 2; /* End of playback */
        row = 0; /* Initialize the frame number back to zero */
    }
    return ((short)real_buffer.data[row][column].real); /* Return the stored speech sample */
}
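The header comment of the program states that the speaker is chosen as the trained vector with the smallest Euclidean distance to the computed MFCC vector. A minimal, self-contained sketch of such a search is given below. It is not the loop from the original program: the names nearest_speaker and VECTOR_LENGTH and the layout of the trained vectors are assumptions, since the contents of training1.h are not shown in this report. The squared distance is compared because it selects the same minimum as the Euclidean distance.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

#define VECTOR_LENGTH 20    /* assumed equal to Number_Of_Filters */

/* Return the index of the trained vector closest to `input`.
 * Hypothetical helper: the layout trained[speaker][coefficient] is an
 * assumption about how the training data might be stored. */
int nearest_speaker(const float *input,
                    float trained[][VECTOR_LENGTH],
                    int num_speakers)
{
    int best = -1;
    float best_distance = FLT_MAX;
    int i, j;

    assert(input != NULL && trained != NULL && num_speakers > 0);
    for (i = 0; i < num_speakers; i++) {
        float distance = 0.0f;

        for (j = 0; j < VECTOR_LENGTH; j++) {
            float diff = input[j] - trained[i][j];
            distance += diff * diff;     /* squared Euclidean distance */
        }
        if (distance < best_distance) {  /* keep the closest vector so far */
            best_distance = distance;
            best = i;
        }
    }
    return best;
}
```

The returned index is zero-based, so a caller that numbers speakers from one, as the printf in main does, would add one before printing.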