ENEE408G: Capstone Design Project: Multimedia Signal Processing. Design Project 1: Digital Speech Processing


Upload: jace

Post on 04-Jan-2016



TRANSCRIPT

Page 1: ENEE408G: Capstone Design Project: Multimedia Signal Processing Design Project 1:

09/09/2005

ENEE408G Fall 2005 Multimedia Signal Processing 1

ENEE408G: Capstone Design Project:

Multimedia Signal Processing

Design Project 1: Digital Speech Processing

Page 2: ENEE408G: Capstone Design Project: Multimedia Signal Processing Design Project 1:


Outline of Design Project 1

• Part I: Speech Analysis
• Part II: Speech Coding: Linear Predictive Vocoder
• Part III: Speech Recognition by IBM ViaVoice
• Part IV: Speech Synthesis
• Part V: Human Computer Interface
• Part VI: Mobile Computing and Pocket PC Programming

Page 3: ENEE408G: Capstone Design Project: Multimedia Signal Processing Design Project 1:


Adjust the Microphone Device

• Use Sound Recorder: Accessories → Entertainment → Sound Recorder
• Select Line-In 2/Mic 2: Edit → Audio Properties → Sound Recording → Volume

Page 4: ENEE408G: Capstone Design Project: Multimedia Signal Processing Design Project 1:


Part I. Speech Analysis (1)

• Human Vocal Apparatus

Page 5: ENEE408G: Capstone Design Project: Multimedia Signal Processing Design Project 1:


Part I. Speech Analysis (2)

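• Vocal Tract Model

[Figure: vocal tract model block diagram. An impulse train generator, controlled by the pitch period, supplies the voiced source; a white noise generator supplies the unvoiced source. A voiced/unvoiced switch selects one source, the selected signal is multiplied (X) by the gain G, and the result drives the vocal tract model, which is set by the vocal tract parameters and outputs speech.]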
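The source-filter idea in the diagram can be sketched in a few lines of Python. This is a minimal illustration, not the course's MATLAB code: the function names, the 8 kHz sampling rate, the 80-sample pitch period, and the single 500 Hz formant are all assumptions chosen for the example, and the vocal tract is reduced to one two-pole resonator.

```python
import math
import random

def excitation(n_samples, voiced, pitch_period=80, seed=0):
    """Impulse train (voiced) or white Gaussian noise (unvoiced)."""
    if voiced:
        return [1.0 if n % pitch_period == 0 else 0.0 for n in range(n_samples)]
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

def vocal_tract(source, a, gain=1.0):
    """All-pole filter: s[n] = gain * e[n] - sum_k a[k] * s[n - k], k = 1..p."""
    out = []
    for n, e in enumerate(source):
        acc = gain * e
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc -= ak * out[n - k]
        out.append(acc)
    return out

def resonator(freq_hz, bandwidth_hz, fs):
    """Coefficients [a1, a2] of a two-pole resonator modelling one formant."""
    r = math.exp(-math.pi * bandwidth_hz / fs)   # pole radius from bandwidth
    theta = 2.0 * math.pi * freq_hz / fs         # pole angle from frequency
    return [-2.0 * r * math.cos(theta), r * r]

fs = 8000                                  # assumed sampling rate in Hz
a = resonator(500.0, 100.0, fs)            # hypothetical single formant
voiced = vocal_tract(excitation(400, True, pitch_period=80), a, gain=1.0)
unvoiced = vocal_tract(excitation(400, False), a, gain=0.1)
```

Switching the `voiced` flag swaps the source while the filter stays the same, which is the role of the voiced/unvoiced switch in the diagram.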

Page 6: ENEE408G: Capstone Design Project: Multimedia Signal Processing Design Project 1:


Part I. Speech Analysis (3)

COLEA toolbox:
• Time-domain waveform
• Spectrogram
• Pitch and formant tracking
• LPC spectra

Record your own voice and analyze pitch and formants.
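To get a feel for what a pitch tracker does with one frame of a recording, here is a small Python sketch of autocorrelation-based pitch estimation. It is not COLEA's algorithm; the function name, the 60-400 Hz search range, and the synthetic 200 Hz test tone are assumptions made for the example.

```python
import math

def estimate_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Return the pitch in Hz taken from the strongest autocorrelation peak."""
    mean = sum(frame) / len(frame)
    x = [v - mean for v in frame]              # remove any DC offset
    lag_min = int(fs / fmax)                   # shortest period searched
    lag_max = min(int(fs / fmin), len(x) - 1)  # longest period searched
    best_lag, best_r = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        r = sum(x[n] * x[n - lag] for n in range(lag, len(x)))
        if r > best_r:
            best_lag, best_r = lag, r
    return fs / best_lag

fs = 8000                                      # assumed sampling rate in Hz
frame = [math.sin(2 * math.pi * 200.0 * n / fs) for n in range(400)]
pitch_hz = estimate_pitch(frame, fs)           # 200 Hz tone has a 40-sample period
```

On real speech the frame would come from a voiced segment of the recording, and the estimate is usually smoothed across neighboring frames.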

Page 7: ENEE408G: Capstone Design Project: Multimedia Signal Processing Design Project 1:


Part I. Speech Analysis (4)

Page 8: ENEE408G: Capstone Design Project: Multimedia Signal Processing Design Project 1:


Part I. Speech Analysis (5)

[Figure: gender identification block diagram. Blocks: Training Set; LPC Analysis by proclpc.m; Feature Extraction for Training Set; Gender Identification. Input: wave files of unknown gender. Output: Male / Female.]

Gender Identification: Use the Auditory Toolbox to obtain linear predictive coefficients. Design your algorithm to identify the gender of the samples in the training set. Your algorithm will be tested on 9/26 with new samples.
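For reference, linear predictive coefficients can be computed from one frame with the autocorrelation method and the Levinson-Durbin recursion. The Python sketch below illustrates that general technique under assumed names (`autocorr`, `levinson_durbin`) and an assumed second-order test signal; it is not a port of proclpc.m, and it does not perform any gender classification.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of one frame."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Return (a, err) with A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order."""
    a = [1.0] + [0.0] * order
    err = r[0]                                 # zeroth-order prediction error
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                         # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)                   # updated prediction error
    return a, err

# Test signal: impulse response of a known all-pole filter (assumed values).
true_a = [-0.9, 0.5]
x = []
for n in range(200):
    s = 1.0 if n == 0 else 0.0
    for k, ak in enumerate(true_a, start=1):
        if n - k >= 0:
            s -= ak * x[n - k]
    x.append(s)

a, err = levinson_durbin(autocorr(x, 2), 2)    # a[1:] should be close to true_a
```

Features for a classifier could then be derived from `a` per frame; which features separate male and female speakers is the design question of this part.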

Page 9: ENEE408G: Capstone Design Project: Multimedia Signal Processing Design Project 1:


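Part II. Linear Predictive Vocoder: Encoder

[Figure: encoder block diagram. Original speech → frame segmentation & LPC analysis (proclpc.m) → {a_k}, k = 1~10 → LPC to LSP (lpcar2ls.m) → {w_k}, k = 1~10 → quantizer Q. Gain, unvoiced/voiced flag (UV/V), and pitch period T → quantizer Q. The quantizer outputs form the 2.4 kbps compressed speech.]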
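The Q blocks in the encoder (and the matching iQ blocks in the decoder) can be illustrated with a uniform scalar quantizer. The Python sketch below is only one possible choice: the function names, the 4-bit resolution, and the sample LSP values are hypothetical, and the slides do not say which quantizer or bit allocation the project uses. The bit-rate line uses the figures of the LPC-10 standard (54 bits per 180-sample frame at 8 kHz) as an assumed example of how a 2.4 kbps rate can arise.

```python
import math

def quantize(value, lo, hi, bits):
    """Q: map a value in [lo, hi] to an integer index in [0, 2**bits - 1]."""
    levels = 2 ** bits
    step = (hi - lo) / levels
    clipped = min(max(value, lo), hi)
    return min(int((clipped - lo) / step), levels - 1)

def dequantize(index, lo, hi, bits):
    """iQ: map an index back to the midpoint of its quantization cell."""
    step = (hi - lo) / 2 ** bits
    return lo + (index + 0.5) * step

# Hypothetical LSP frequencies for one frame, in radians within (0, pi).
lsp = [0.25, 0.48, 0.81, 1.10, 1.42, 1.75, 2.03, 2.36, 2.64, 2.95]
bits = 4
indices = [quantize(w, 0.0, math.pi, bits) for w in lsp]
lsp_hat = [dequantize(i, 0.0, math.pi, bits) for i in indices]

# Assumed LPC-10 style framing: 54 bits per 180-sample frame at 8 kHz.
bits_per_frame, frame_len, fs = 54, 180, 8000
bit_rate = bits_per_frame * fs / frame_len     # bits per second
```

The reconstruction error of each parameter is at most half a quantization step, so spending more bits on a parameter halves its error per extra bit at the cost of a higher bit rate.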

Page 10: ENEE408G: Capstone Design Project: Multimedia Signal Processing Design Project 1:


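Part II. Linear Predictive Vocoder: Decoder

[Figure: decoder block diagram. 2.4 kbps compressed speech → inverse quantizers iQ. One iQ output gives {w'_k}, k = 1~10 → LSP to LPC (lpcls2ar.m) → {a'_k}, k = 1~10. The other iQ output gives Gain', UV/V, and T', which control the source (impulse train generator or white noise). Source and {a'_k} → LPC synthesis & frame combination (synlpc.m) → reconstructed speech.]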

Page 11: ENEE408G: Capstone Design Project: Multimedia Signal Processing Design Project 1:


Part III. Speech Recognition

IBM ViaVoice
• ViaVoice Training
• Operate PC by ViaVoice

Page 12: ENEE408G: Capstone Design Project: Multimedia Signal Processing Design Project 1:


Part III. IBM ViaVoice Training

Start from the BLUE word.

Keep speaking; the recognized words become GRAY.

If you hear a sound or the BLUE marker stops at a specific word, return to the blue word and read the BLACK sentence again.

Page 13: ENEE408G: Capstone Design Project: Multimedia Signal Processing Design Project 1:


Part III. IBM ViaVoice Dictation

Speak Pad

Menu Bar:
1. Menu Button
2. Microphone State
3. Status Area
4. ViaCenter Help
5. Current User

Page 14: ENEE408G: Capstone Design Project: Multimedia Signal Processing Design Project 1:


Part IV. Speech Synthesis

Text-To-Speech and Talking Head

Vowel Synthesis

Demo

Page 15: ENEE408G: Capstone Design Project: Multimedia Signal Processing Design Project 1:


Part V. Human Computer Interface

CSLU Human Computer Interface
• Rapid Application Developer (RAD)
• Start → Speech Toolkit → RAD

MIT Galaxy System
• JUPITER: Weather Information System
  http://www.sls.lcs.mit.edu/sls/applications/jupiter.shtml
  TEL: 1-888-573-8255
• PEGASUS: Airline Flight Planning System
  http://www.sls.lcs.mit.edu/sls/applications/pegasus.shtml
  TEL: 1-877-527-8255

Page 16: ENEE408G: Capstone Design Project: Multimedia Signal Processing Design Project 1:


Part VI. Pocket PC Programming

Apply what you learned in the previous parts to design a simple digital speech processing application using Microsoft eMbedded Tools for Pocket PC.

Page 17: ENEE408G: Capstone Design Project: Multimedia Signal Processing Design Project 1:


Announcement

• Matlab task: Part II
• C++ task: Part VI
• Check out Pocket PC