wired for speech: how voice activates and advances the human-computer relationship clifford nass
Post on 06-Jan-2016
1/59
Wired for Speech:
How Voice Activates and Advances the Human-Computer Relationship
Clifford Nass
Stanford University
2/59
Speaking is Fundamental
Fundamental means of human communication
Everyone speaks
  IQs as low as 50
  Brains as small as 400 grams
Humans are built for words
  Learn a new word every two hours for 11 years
3/59
Listening to Speech is Fundamental
Womb: Mother’s voice differentiation
One day old: Differentiate speech vs. other sounds
  Responses
  Brain hemispheres
Four day olds: Differentiate native language vs. other languages
Adults:
  Phoneme differentiation at 40-50 phonemes per second
  Cope with cocktail parties
4/59
Listening Beyond Speech is Fundamental
Humans are acutely aware of para-linguistic cues
  Gender
  Personality
  Accent
  Emotion
  Identity
5/59
Humans are Wired for Speech
Special parts of the brain devoted to
  Speech recognition
  Speech production
  Para-linguistic processing
  Voice recognition and discrimination
6/59
Therefore …
Voice interface should be the most
Enjoyable,
Efficient, &
Memorable
method for providing and acquiring information
7/59
Are They? No! Why Not?
Machines are different than humans
Technology is insufficient
But are these good reasons?
8/59
It’s Easy to Create Rich Interactions
9/59
Critical Insights
Voice = Human
Technology Voice = Human Voice
Human-Technology Interaction =
Human-Human Interaction
10/59
Where’s the Leverage?
Social sciences can give us
  What’s important
  What’s unimportant
  Understanding
  Methods
  Unanswered questions
11/59
Examples of the Power of Social Science
12/59
Male or Female Voice?
Is gender important?
Can technology have gender?
13/59
The Case of BMW
14/59
Brains are Built to Detect Voice Gender
First human category
  Infants at six months
  Self-identification by 2-3 years old
  Within seconds for adults
Multiple ways to recognize gender in voice
  Pitch
  Pitch range
  Variety of other spectral characteristics
15/59
Once Person Identifies Gender by Voice
Guides every interaction
Same-gender favoritism
  Trust
  Comfort
Gender stereotyping
16/59
Gender and Products
Gender should match product
  More appropriate
  More credible
Mutual influence of voice and product gender
  Female voices feminize products (and conversely)
  Female products feminize voices (and conversely)
  “Match principle”
17/59
Research Context
“Gender” of voice (synthetic)
Gender of user
“Gender” of product
E-Commerce website
18/59
Examples of Advertisements
“Female” voice; female product
“Male” voice; female product
“Male” voice; male product
19/59
Appropriateness of the Voice
[Chart: appropriateness ratings for Female Product and Male Product; series: Female Voice, Male Voice]
20/59
Voice/Product Gender Influences
Female voices feminize products; male voices masculinize products
  Strongest for opposite gender products
Female products feminize voices; male products masculinize voices
Strong preference when voice matches product
21/59
Results for User Gender
People trust voices that match themselves
  Females conform more with “female” voices
  Males conform more with “male” voices
People like voices that match themselves
  Females like the “female” voice more
  Males like the “male” voice more
22/59
Other Results
Participants denied stereotyping technology
Participants denied harboring stereotypes!
23/59
People stereotype voices by gender
Voice “gender” should match content “gender”
  Product descriptions
  Teaching
  Praise
  Jokes
24/59
Gender is Marked by Word Choice
Female speech
  More “I,” “you,” “she,” “her,” “their,” “myself”
  Less “the,” “that,” “these,” “one,” “two,” “some more”
  More compliments
  More apologies
  More relationships between things
  Less description of particular things
  “They” for living things only
Voices should speak consistently with their “gender”
25/59
Selecting Voices
Voices manifest many traits
  Gender
  Personality
  Age
  Ethnicity
Voice traits should match content traits
  Content
  Language style
  Appearance (e.g., accent and race)
  Context
Voice traits should match user traits
26/59
If Only One Voice
Consider stereotypes
Masculine vs. feminine (same voice)
  Boost high frequencies (feminine)
  Boost low frequencies (masculine)
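A minimal sketch of the frequency-boost idea on this slide, assuming the voice is available as a list of audio samples. The function names, the one-pole filter, and the `gain` and `alpha` values are illustrative assumptions, not the method used in the research; a production system would more likely use a proper shelving equalizer.

```python
def one_pole_lowpass(samples, alpha=0.1):
    """Exponential smoothing: passes slow changes, attenuates fast ones."""
    smoothed, previous = [], 0.0
    for x in samples:
        previous += alpha * (x - previous)
        smoothed.append(previous)
    return smoothed


def tilt_voice(samples, boost="high", gain=0.5, alpha=0.1):
    """Return a copy of the signal with its high or low band emphasized.

    boost, gain, and alpha are hypothetical parameters chosen for illustration.
    """
    low = one_pole_lowpass(samples, alpha)
    if boost == "high":
        # The high band is the signal minus its smoothed (low-band) version.
        return [x + gain * (x - lo) for x, lo in zip(samples, low)]
    if boost == "low":
        return [x + gain * lo for x, lo in zip(samples, low)]
    raise ValueError("boost must be 'high' or 'low'")
```

Calling `tilt_voice(samples, "high")` on the same recording would give the variant the slide labels feminine, and `tilt_voice(samples, "low")` the one it labels masculine.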
27/59
Emotions
28/59
Emotion and Voice
Voice is the first indicator of emotion
Voice emotion has many markers
  Pitch: value, range, change rate
  Amplitude: value, range, change rate
  Words per minute
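The markers listed on this slide can be summarized from frame-level measurements. The sketch below assumes a pitch or amplitude track has already been extracted as one number per frame. The function names and the choice of mean absolute frame-to-frame change as the "change rate" are assumptions made for illustration.

```python
from statistics import mean


def track_markers(values, frame_seconds):
    """Summarize a pitch or amplitude track as value, range, and change rate."""
    steps = [abs(b - a) for a, b in zip(values, values[1:])]
    return {
        "value": mean(values),                 # typical level
        "range": max(values) - min(values),    # spread between extremes
        # Average absolute change per second (assumed definition).
        "change_rate": mean(steps) / frame_seconds if steps else 0.0,
    }


def words_per_minute(word_count, duration_seconds):
    """Speaking rate for an utterance of known length."""
    return 60.0 * word_count / duration_seconds
```

For example, `track_markers(pitch_hz, 0.01)` would describe a pitch track sampled every 10 ms, and the same function applies unchanged to an amplitude track.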
29/59
Emotion is always relevant
User has initial emotion
Interactions create emotions
Voice is particularly powerful
Frustration is particularly powerful
30/59
Emotion and Technology
Could technology-based voices exhibit emotion?
Could technology-based voice emotion influence people?
31/59
Research Context
Create upset or happy drivers
Have them “drive” for 25 minutes
Female voice gives information and makes suggestions
  Upbeat
  Subdued
32/59
Number of Accidents
[Chart: number of accidents for Happy Driver and Upset Driver; series: Upbeat Voice, Subdued Voice]
33/59
Results
People speak to the car much more when the voice’s emotion is consistent with their own
People like the car much more when the voice’s emotion is consistent with their own
34/59
Implications
User emotion is a critical part of any interaction
Emotion must match content
  Perception of voice
    Trust
    Intelligence
  User
    Performance
    Comfort
    Enjoyment
35/59
One Voice Emotion: Select for Goal
Overall liking: slightly happy voice
Attention-getting: anger, sadness
Trust and vulnerability: sadness (mild)
36/59
If You Can’t Manipulate Voice Emotion
Manipulate content
Manipulate music
37/59
Using the First Person: Should IT say “I”?
38/59
Should Voice Interfaces say “I”?
When should a voice interface say “I”?
Does synthetic vs. recorded speech affect the answer to the previous question?
39/59
The Importance of “I”
“I” is the most basic claim to humanity
  “I think, therefore I am”
  “I, Robot”
  Dobby and monsters don’t say “I”
“I” is the marker of responsibility
  “I made a mistake” vs. “Mistakes were made”
40/59
Research Context
Auction site
Telephone interface with speech recognition
Recorded bidding behavior
Online questionnaire
41/59
Average Bidding Price
[Chart: average bidding price for the “I” and no-“I” conditions; series: Recorded Voice, Synthetic Voice]
42/59
Results
When “I” + recorded or “no I” + synthetic
  System is higher quality
  Users were much more relaxed
“No I” is more objective
“I” is more “present”
43/59
Results
“I” is right for embodiments
  Robots
  Characters
  Autonomous intelligence (“KITT”)
“I” is wrong when voice is second fiddle to technology
  Traditional car
  Heavily-branded products
44/59
Design
Text-to-Speech is a machine voice
Recorded speech is a human voice
Design questions are
  Not philosophical questions
  Not judgment questions
  Experimentally verifiable
45/59
Mistakes are Tough to Talk About
46/59
Who is Responsible for Errors?
Recognition is not perfect
When system fails, who should be assigned responsibility?
  System
  User
  No one
47/59
Responding to Errors
Modesty
  Likable
  Unintelligent (people believe modesty!)
Criticism
  Isn’t really constructive
  Unpleasant
  Intelligent
Scapegoating
  Effective
  Safe
48/59
System Responses to Errors
System blame (most common)
No blame
User blame
49/59
Research context
Amazon-by-phone
Numerous planned interaction errors
50/59
[Chart: likelihood of purchase (book buying); series: No blame, System blame, User blame]
51/59
Results
Neutral and system blame
  Sell much better than user blame
  Easier to use than system blame
  Nicer than system blame
User blame is most intelligent!
System blame is least intelligent
52/59
Results for Errors
Take responsibility when unavoidable
  Increases trust
  Increases liking
  Weak negative effect on intelligence
Ignore errors whenever possible
Duck responsibility to third party if needed
  Blame the phone line
  Blame the road
  Making the Microsoft paperclip likable!
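The ordering of these recommendations can be read as a simple decision rule. The sketch below is one possible reading: `error_response`, its parameters, and the prompt wording are hypothetical and do not come from the study.

```python
def error_response(can_ignore, third_party=None):
    """Pick a reply after a recognition error, following the slide's ordering.

    can_ignore: True when the dialogue can continue without mentioning the error.
    third_party: optional scapegoat, e.g. "the phone line" or "the road".
    """
    if can_ignore:
        # Ignore errors whenever possible: say nothing about the failure.
        return None
    if third_party is not None:
        # Duck responsibility to a third party if needed.
        return f"Sorry, {third_party} made that hard to hear. Could you say it again?"
    # Take responsibility when unavoidable (system blame, never user blame).
    return "Sorry, I misheard that. Could you say it again?"
```

A telephone system might call `error_response(False, "the phone line")`, while an in-car system might pass `"the road"`.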
53/59
Results for Errors
Show commitment to the interaction
  Make guesses
  Show concern
Gricean maxims
  Quality
  Quantity
  Relevance
  Clarity
54/59
Design
Error recovery is critically important
  Negative experiences are more memorable
Adaptation is crucially important
Flattery is effective
  Note times when interaction is successful
Design to avoid errors
  Alignment (good repetition)
  Air quotes
Scripting is important at all stages of the interaction
55/59
Other Areas of Importance/Research
56/59
Other Key Findings
Personality
Accents
Multiple voices and mixing voices
Input vs. output modality
Microphone type
57/59
Tying it All Together
Voice interfaces can be the most enjoyable, efficient, and memorable method for acquiring and providing information
Voice interfaces turn up the volume knob in user responses
The key is leveraging social aspects of speech
58/59
Summary – Part 1
Humans are wired for speech
Interactions with voice interfaces are fundamentally social
  Same social rules
  Same social expectations
59/59
Summary – Part 2
Social aspects of voice interfaces can be beneficial
  Users perform better
  Users feel better
  Users understand better
Social aspects of voice interfaces cannot be ignored
  Social audit is critical
  Social design is critical
Design psychology can be leveraged
  Less expensive than technology
  More effective than technology
  Broader impact than technology