applications
Post on 31-Dec-2015
April 19, 2023
Familiar Speaker Recognition
• Two motivations
  – Finish MS Neuroscience degree
    • Needed 700-level NEU course, Ind Study only option
  – Speech Power versus Speech Intelligibility
    • Gerber 1974
  – What about SID?
Frequency Range (Hz)   % Power   % Intelligibility
0-500                  60        5
500-1000               35        35
1000-2000              3         35
2000-4000              1         13
4000-8000              1         12
Audio Data
• In-House Database
  – Longitudinal study (20 sessions over 3 years)
  – 65 subjects
    • 25 (20 males, 5 females) connected to the Audio Group
  – Read, Digits, Short Sentences, Conversations
    • 10 Short Sentences
      – Two intonations
        • Let’s go skiing today.
      – Visual and audible cue
        • Natural elicitation
• Shortfalls (hindsight)
  – Unequal Sentences
  – Different degrees of familiarity between listeners/speakers
Listening Experiments
• Session 1 – Pure Tone Test
• Session 2 – Familiarization with Test Set-up
• Session 3 – Clean
• Session 4 – 0-1K Hz, -20 dB, Speech shaped, add WGN
• Session 5 – 1-2K Hz, -20 dB, Speech shaped, add WGN
• Session 6 – 2-3K Hz, -20 dB, Speech shaped, add WGN
• Session 7 – 3-4K Hz, -20 dB, Speech shaped, add WGN
• Session 8 – 0-4K Hz, 0 dB, Speech shaped, add WGN
• Session 9 – Clean
• Session 10 – Whispered
• Session 11 – Time-reversed
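The noise sessions above each state a level (-20 dB or 0 dB). A minimal Python sketch of how a noise signal could be scaled to a target level before being added to speech, assuming the level is a speech-to-noise power ratio (the slides do not define it). The function names are hypothetical, and band-limiting and speech shaping of the noise are not shown.

```python
import math


def mean_power(samples):
    """Average power of a signal: the mean of its squared samples."""
    return sum(x * x for x in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power equals `snr_db`, then add.

    Assumption: the dB level is read as a speech-to-noise power ratio.
    Returns (mixed_signal, scaled_noise).
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must both have non-zero power")
    # Noise power needed to reach the requested ratio.
    target_noise_power = p_speech / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / p_noise)
    scaled = [gain * n for n in noise]
    mixed = [s + n for s, n in zip(speech, scaled)]
    return mixed, scaled


def measured_snr_db(speech, noise):
    """Speech-to-noise power ratio in dB, used to check the scaling."""
    return 10 * math.log10(mean_power(speech) / mean_power(noise))
```

Under this reading, -20 dB means the added noise carries 100 times the power of the speech, and 0 dB means equal power.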
Listening Experiments
• Results reported in 2 groups
  – Normal Hearing
  – Hearing Deficit
• Hard to draw conclusions from 2nd group
  – Don’t know severity of hearing loss
• Experiments are a rough 1st pass
  – 10 SID Listening Sessions
  – Analyze data
  – Learn from mistakes
Listening Experiments
Group    Clean  0-1K  1K-2K  2K-3K  3K-4K  Clean
Normal   90.0   82.2  80.9   76.0   79.1   94.9
HD       73.3   62.0  49.3   50.0   58.0   73.0
[Figure: results by condition (Clean, 0-1K, 1-2K, 2-3K, 3-4K, Clean); y-axis from 45 to 95; series: Normal Listeners, Listeners with Hearing Deficit]
Current Research
• Data Analysis
  – Difficult to compare between sessions
    • Is the performance statistically different?
      – Between group, within group?
  – Current data analysis is focused on individual sentences
    • Let’s go skiing today.
      – Same phonetic content
      – Same noise (or lack of)
      – Same intonation
      – Same session
      – Main variable is the speaker
        • Formants, shimmer, jitter, energy, etc.
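One possible way to approach the "statistically different?" question for a between-group comparison is a permutation test on per-listener scores. The Python sketch below is an illustration only: the slides do not name a test, the function names are hypothetical, and per-listener scores are assumed to be available (the slides report only group averages).

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_resamples=5000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Repeatedly shuffles the pooled scores, re-splits them into groups of the
    original sizes, and counts how often the shuffled difference is at least
    as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_resamples + 1)
```

A within-group comparison across sessions involves the same listeners twice, so it would need a paired variant (for example, randomly flipping the sign of each listener's difference) rather than this unpaired shuffle.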
“Male 8”
• Most easily recognized voice
  • Except for Session 6
    – 2K-3K noise
• Currently, we build models the same
  – Good or bad?
• Can we figure out what is unique or not unique about an individual’s voice?
Session    2    Clean  0-1K  1K-2K  2K-3K  3K-4K  0-4K  Clean  Whis  Rev
“Male 8”   35   36     36    33     16     33     34    35     34    33
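The Session 6 exception can be checked directly from the scores above. The Python sketch below flags sessions whose score lies far from the median, using the median absolute deviation as the yardstick. The function name and the threshold `k=3` are illustrative choices, and the mapping of table columns to session numbers 2 to 11 is assumed from the order of the session list.

```python
from statistics import median

# Scores for "Male 8" as listed on the slide, keyed by session number.
# Assumed column order: 2, Clean (3), 0-1K (4), 1K-2K (5), 2K-3K (6),
# 3K-4K (7), 0-4K (8), Clean (9), Whis (10), Rev (11).
MALE_8_SCORES = {2: 35, 3: 36, 4: 36, 5: 33, 6: 16,
                 7: 33, 8: 34, 9: 35, 10: 34, 11: 33}


def flag_outlier_sessions(scores, k=3.0):
    """Return session numbers whose score is more than k MADs from the median."""
    center = median(scores.values())
    mad = median(abs(v - center) for v in scores.values())
    if mad == 0:
        # Degenerate case: most scores are identical, so flag any that differ.
        return sorted(s for s, v in scores.items() if v != center)
    return sorted(s for s, v in scores.items() if abs(v - center) > k * mad)
```

With the values from the slide, `flag_outlier_sessions(MALE_8_SCORES)` returns `[6]`, the 2K-3K noise session.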