Wired for Speech: How Voice Activates and Advances the Human-Computer Relationship, Clifford Nass

Post on 06-Jan-2016

TRANSCRIPT

1/59

Wired for Speech:

How Voice Activates and Advances the Human-Computer Relationship

Clifford Nass

Stanford University

2/59

Speaking is Fundamental

Fundamental means of human communication

Everyone speaks: IQs as low as 50, brains as small as 400 grams

Humans are built for words: learn a new word every two hours for 11 years

3/59

Listening to Speech is Fundamental

Womb: Mother’s voice differentiation

One day old: Differentiate speech vs. other sounds (responses, brain hemispheres)

Four day olds: Differentiate native language vs. other languages

Adults: Phoneme differentiation at 40-50 phonemes per second; cope with cocktail parties

4/59

Listening Beyond Speech is Fundamental

Humans are acutely aware of para-linguistic cues: gender, personality, accent, emotion, identity

5/59

Humans are Wired for Speech

Special parts of the brain devoted to: speech recognition, speech production, para-linguistic processing, voice recognition and discrimination

6/59

Therefore …

Voice interface should be the most

Enjoyable,

Efficient, &

Memorable

method for providing and acquiring information

7/59

Are They? No! Why Not?

Machines are different from humans

Technology is insufficient

But are these good reasons?

8/59

It’s Easy to Create Rich Interactions

9/59

Critical Insights

Voice = Human

Technology Voice = Human Voice

Human-Technology Interaction =

Human-Human Interaction

10/59

Where’s the Leverage?

Social sciences can give us: what’s important, what’s unimportant, understanding, methods, unanswered questions

11/59

Examples of the Power of Social Science

12/59

Male or Female Voice?

Is gender important? Can technology have gender?

13/59

The Case of BMW

14/59

Brains are Built to Detect Voice Gender

First human category: infants at six months, self-identification by 2-3 years old, within seconds for adults

Multiple ways to recognize gender in voice: pitch, pitch range, variety of other spectral characteristics

15/59

Once Person Identifies Gender by Voice

Guides every interaction

Same-gender favoritism: trust, comfort

Gender stereotyping

16/59

Gender and Products

Gender should match product: more appropriate, more credible

Mutual influence of voice and product gender: female voices feminize products (and conversely); female products feminize voices (and conversely); “match principle”

17/59

Research Context

“Gender” of voice (synthetic)

Gender of user

“Gender” of product

E-Commerce website

18/59

Examples of Advertisements

“Female” voice; female product

“Male” voice; female product

“Male” voice; male product

19/59

Appropriateness of the Voice

[Chart: appropriateness of the voice (axis ticks 2, 3, 4) for female and male products, by female voice vs. male voice]

20/59

Voice/Product Gender Influences

Female voices feminize products; male voices masculinize products (strongest for opposite-gender products)

Female products feminize voices; male products masculinize voices

Strong preference when voice matches product

21/59

Results for User Gender

People trust voices that match themselves: females conform more with “female” voices; males conform more with “male” voices

People like voices that match themselves: females like the “female” voice more; males like the “male” voice more

22/59

Other Results

Participants denied stereotyping technology

Participants denied harboring stereotypes!

23/59

People stereotype voices by gender

Voice “gender” should match content “gender”: product descriptions, teaching, praise, jokes

24/59

Gender is Marked by Word Choice

Female speech

More “I,” “you,” “she,” “her,” “their,” “myself”

Less “the,” “that,” “these,” “one,” “two,” “some more”

More compliments

More apologies

More relationships between things

Less description of particular things

“They” for living things only

Voices should speak consistently with their “gender”

25/59

Selecting Voices

Voices manifest many traits: gender, personality, age, ethnicity

Voice traits should match content traits: content, language style, appearance (e.g., accent and race), context

Voice traits should match user traits

26/59

If Only One Voice

Consider stereotypes

Masculine vs. feminine (same voice)

Boost high frequencies (feminine)

Boost low frequencies (masculine)
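The frequency boost on this slide can be sketched with a very simple two-band tone control. This is a minimal illustration, not the method used in the talk: the function name `tilt_spectrum`, the one-pole filter, and the `alpha` crossover value are all assumptions chosen for brevity. Whether a given boost makes a voice sound more feminine or more masculine is the slide's claim and would need to be checked with listeners.

```python
def tilt_spectrum(samples, low_gain=1.0, high_gain=1.0, alpha=0.1):
    """Split a signal into low and high bands with a one-pole filter,
    then rescale each band independently.

    alpha (between 0 and 1) sets the crossover; it is an illustrative
    value, not one taken from the talk.
    """
    if not samples:
        return []
    low = samples[0]  # start at the first sample to avoid a startup jump
    out = []
    for x in samples:
        low += alpha * (x - low)  # one-pole low-pass estimate
        high = x - low            # whatever remains is the high band
        out.append(low_gain * low + high_gain * high)
    return out


# A constant signal is pure low-frequency content; a sample-by-sample
# alternating signal is the highest frequency the sampling can carry.
steady = [1.0] * 50
buzz = [1.0, -1.0] * 100

low_boosted = tilt_spectrum(steady, low_gain=2.0)   # hypothetical "masculine" setting
high_boosted = tilt_spectrum(buzz, high_gain=2.0)   # hypothetical "feminine" setting
```

With both gains left at 1.0 the two bands sum back to the input, so the same voice recording can be pushed in either direction from a single source.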

27/59

Emotions

28/59

Emotion and Voice

Voice is the first indicator of emotion

Voice emotion has many markers

Pitch: value, range, change rate

Amplitude: value, range, change rate

Words per minute
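The markers listed on this slide can be computed from a pitch or amplitude track. The sketch below is only one possible reading: it assumes "value" means the mean, "range" means maximum minus minimum, and "change rate" means average absolute change per second. The function names and the fixed frame spacing are hypothetical.

```python
def summarize_track(values, frame_seconds):
    """Summarize a pitch or amplitude track sampled at a fixed interval.

    Returns the mean ("value"), max minus min ("range"), and the average
    absolute change per second ("change_rate"). These definitions are
    assumptions; the slide does not define the markers.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two frames")
    total_change = sum(abs(b - a) for a, b in zip(values, values[1:]))
    return {
        "value": sum(values) / n,
        "range": max(values) - min(values),
        "change_rate": total_change / ((n - 1) * frame_seconds),
    }


def words_per_minute(word_count, duration_seconds):
    """Speaking rate for an utterance of known length."""
    return word_count * 60.0 / duration_seconds


# Made-up pitch values in Hz, one frame every half second.
pitch = summarize_track([100.0, 120.0, 100.0, 120.0, 110.0], frame_seconds=0.5)
rate = words_per_minute(word_count=30, duration_seconds=12.0)
```

The same `summarize_track` call works for an amplitude track, which mirrors the slide's parallel structure for pitch and amplitude.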

29/59

Emotion is always relevant

User has initial emotion

Interactions create emotions

Voice is particularly powerful

Frustration is particularly powerful

30/59

Emotion and Technology

Could technology-based voices exhibit emotion?

Could technology-based voice emotion influence people?

31/59

Research Context

Create upset or happy drivers

Have them “drive” for 25 minutes

Female voice gives information and makes suggestions: upbeat or subdued

32/59

Number of Accidents

[Chart: number of accidents (axis ticks 1, 5, 9) for happy and upset drivers, by upbeat voice vs. subdued voice]

33/59

Results

People speak to car much more when emotion is consistent

People like car much more when emotion is consistent

34/59

Implications

User emotion is a critical part of any interaction

Emotion must match content

Perception of voice: trust, intelligence

User: performance, comfort, enjoyment

35/59

One Voice Emotion: Select for Goal

Overall liking: slightly happy voice

Attention-getting: anger, sadness

Trust and vulnerability: sadness (mild)

36/59

If You Can’t Manipulate Voice Emotion

Manipulate content

Manipulate music

37/59

Using the First Person: Should IT say “I”?

38/59

Should Voice Interfaces say “I”?

When should a voice interface say “I”?

Does synthetic vs. recorded speech affect the answer to the previous question?

39/59

The Importance of “I”

“I” is the most basic claim to humanity: “I think, therefore I am”; “I, Robot”; Dobby and monsters don’t say “I”

“I” is the marker of responsibility: “I made a mistake” vs. “Mistakes were made”

40/59

Research Context

Auction site

Telephone interface with speech recognition

Recorded bidding behavior

Online questionnaire

41/59

Average Bidding Price

[Chart: average bidding price (axis ticks 20, 22, 24, 26) for “I” vs. no “I”, by recorded voice vs. synthetic voice]

42/59

Results

When “I” + recorded or “No I” + synthetic: system is higher quality, users were much more relaxed

“No I” is more objective

“I” is more “present”
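The pairing reported on this slide (first person with recorded speech, impersonal wording with synthetic speech) could be encoded as a small wording rule. This is a hedged sketch only: the function name `choose_wording` and the sample prompts are invented for illustration and are not taken from the study.

```python
def choose_wording(voice_type, first_person, impersonal):
    """Pick the prompt variant that matches the voice type.

    Follows the pairing reported on the slide: "I" with recorded speech,
    no "I" with synthetic speech. The two prompt strings are supplied
    by the caller.
    """
    if voice_type == "recorded":
        return first_person
    if voice_type == "synthetic":
        return impersonal
    raise ValueError(f"unknown voice type: {voice_type!r}")


# Hypothetical auction prompts, written in both styles.
prompt_i = "I will now read you the first item."
prompt_no_i = "The first item will now be read."

recorded_prompt = choose_wording("recorded", prompt_i, prompt_no_i)
synthetic_prompt = choose_wording("synthetic", prompt_i, prompt_no_i)
```

Keeping both variants of every prompt side by side also makes it easy to run the kind of comparison the study describes.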

43/59

Results

“I” is right for embodiments: robots, characters, autonomous intelligence (“KITT”)

“I” is wrong when voice is second fiddle to technology: traditional car, heavily-branded products

44/59

Design

Text-to-Speech is a machine voice

Recorded speech is a human voice

Design questions are: not philosophical questions, not judgment questions, experimentally verifiable

45/59

Mistakes are Tough to Talk About

46/59

Who is Responsible for Errors?

Recognition is not perfect

When system fails, who should be assigned responsibility? System, user, no one

47/59

Responding to Errors

Modesty: likable, unintelligent (people believe modesty!)

Criticism: isn’t really constructive, unpleasant, intelligent

Scapegoating: effective, safe

48/59

System Responses to Errors

System blame (most common)

No blame

User blame

49/59

Research context

Amazon-by-phone

Numerous planned interaction errors

50/59

[Chart: likelihood of purchase, book buying (axis ticks 2, 3, 4), by no blame, system blame, and user blame]

51/59

Results

Neutral and system blame

Sell much better than user blame

Easier to use than system blame

Nicer than system blame

User blame is most intelligent!

System blame is least intelligent

52/59

Results for Errors

Take responsibility when unavoidable: increases trust, increases liking, weak negative effect on intelligence

Ignore errors whenever possible

Duck responsibility to third party if needed: blame the phone line, blame the road

Making the Microsoft paperclip likable!
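The error guidance on this slide reads naturally as an ordered rule: ignore the error if possible, otherwise point to a third party, and take responsibility only when nothing else is available. The sketch below assumes that ordering. The function name `error_response` and the prompt wording are hypothetical, not taken from the talk.

```python
def error_response(can_ignore, third_party=None):
    """Choose what the system says after a recognition error.

    Order of preference, as read from the slide:
      1. ignore the error when the dialog can continue without comment
      2. attribute it to a third party (e.g. "the phone line")
      3. take responsibility when neither option is available
    Returns None when nothing should be said.
    """
    if can_ignore:
        return None
    if third_party:
        return f"It sounds like {third_party} is causing trouble. Could you repeat that?"
    return "Sorry, I did not catch that. Could you repeat that?"


silent = error_response(can_ignore=True)
scapegoat = error_response(can_ignore=False, third_party="the phone line")
apology = error_response(can_ignore=False)
```

None of the branches blames the user, which matches the finding that user blame sold worst.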

53/59

Results for Errors

Show commitment to the interaction

Make guesses

Show concern

Gricean maxims: quality, quantity, relevance, clarity

54/59

Design

Error recovery is critically important

Negative experiences are more memorable

Adaptation is crucially important

Flattery is effective: note times when interaction is successful

Design to avoid errors: alignment (good repetition), air quotes

Scripting is important at all stages of the interaction

55/59

Other Areas of Importance/Research

56/59

Other Key Findings

Personality

Accents

Multiple voices and mixing voices

Input vs. output modality

Microphone type

57/59

Tying it All Together

Voice interfaces can be the most enjoyable, efficient, and memorable method for acquiring and providing information

Voice interfaces turn up the volume knob in user responses

The key is leveraging social aspects of speech

58/59

Summary – Part 1

Humans are wired for speech

Interactions with voice interfaces are fundamentally social: same social rules, same social expectations

59/59

Summary – Part 2

Social aspects of voice interfaces can be beneficial: users perform better, users feel better, users understand better

Social aspects of voice interfaces cannot be ignored: social audit is critical, social design is critical

Design psychology can be leveraged: less expensive than technology, more effective than technology, broader impact than technology
