Teaching Tactics and
Simulating Human Tutor Dialog Moves in
AutoTutor
Rupali Mukherjee and Marvin Schiller
8 July 2003
Intelligent Tutoring Systems, Dr. Ivana Kruijff-Kurbayová, Dr. Helmut Horacek, SS 2003
Overview
Introduction: What is AutoTutor?; The need for AutoTutor; Teaching Tactics; Analysis of unskilled tutoring
Architecture: Curriculum script; Dialog Move Generation; Dialog Management; Language Analysis
Evaluation: "Virtual Students"; Human Students
Conclusion/Discussion
An Introduction to AutoTutor
Simulates discourse patterns & pedagogical strategies of human tutors
Developed by the "Tutoring Research Group" (University of Memphis)
Domain: computer literacy (college)
Aim: conversational partner
A Snapshot of AutoTutor
Sample Dialog
AutoTutor: When you turn on the computer, how is the operating system
first activated and loaded into RAM?
Student: When the computer is turned on a file is automatically booted up.
AutoTutor: Anything else?
Student: This file contains the operating system.
AutoTutor: What else?
Student: The operating system is then read into RAM and activated.
AutoTutor: Right. The CPU executes instructions in ROM, which helps the
operating system to start.
Student: After being read from ROM the OS is activated.
AutoTutor: Right. Let's review what we've just gone over. The operating
system [...]
The need for AutoTutor
Classroom teaching: information delivery, acquisition of shallow knowledge
One-on-one tutoring: construction of knowledge via interaction (constructivism), deep comprehension
AutoTutor: simulates one-on-one tutoring
Teaching Tactics in AutoTutor
Constructivism: the student actively constructs knowledge
each person forms their own representation of knowledge
learning: matching one's current representations with one's own experience
interaction is necessary for the learning process
AutoTutor-1: models unskilled tutors
AutoTutor-2: sophisticated tutoring
An anatomy of unskilled one-on-one Tutoring
One-on-one unskilled tutoring is effective
(effect size of 0.5 to 2.3 standard deviation units (SDU) over classroom teaching; Bloom, 1984; Cohen, Kulik & Kulik, 1982)
(1 SDU ≈ 1 letter grade)
But:
usually no expert domain knowledge
no sophisticated tutoring strategies
Analysis of Unskilled Tutoring - The Setting
Analysis of 100 hours of naturalistic one-on-one tutoring
graduate students teaching undergraduate students basic research methods
middle school students teaching younger students basic algebra
Result: the tutors rarely use sophisticated strategies.
But two methods are prevalent: a 5-step dialog frame and tutor-initiated dialog moves
5 Step Dialog Frame in one-on-one Tutoring
5 Step Dialog Frame
Step 1: Tutor asks question (or presents problem)
Step 2: Learner answers question
Step 3: Tutor gives short immediate feedback
Step 4: Tutor and Learner collaboratively improve the answer
Step 5: Tutor assesses learner's understanding
3 Step Dialog Frame in Classroom Teaching
Classroom Dialog Pattern
Step 1: Tutor asks question (Initiation)
Step 2: Learner answers question (Response)
Step 3: Tutor gives short immediate feedback (Evaluation)
Step 4: Tutor and Learner collaboratively improve the answer (not part of the classroom pattern)
Step 5: Tutor assesses learner's understanding (not part of the classroom pattern)
Step 4 makes the difference!
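The contrast between the two dialog frames can be written down as plain data. This is only an illustrative sketch: the step texts come from the slides, while the names `FIVE_STEP_FRAME`, `CLASSROOM_FRAME` and `missing_steps` are made up for this example.

```python
# Sketch only: step texts are taken from the slides, all names are illustrative.
FIVE_STEP_FRAME = [
    "Tutor asks question (or presents problem)",
    "Learner answers question",
    "Tutor gives short immediate feedback",
    "Tutor and learner collaboratively improve the answer",
    "Tutor assesses learner's understanding",
]

# The classroom pattern (Initiation, Response, Evaluation) covers steps 1-3 only.
CLASSROOM_FRAME = FIVE_STEP_FRAME[:3]


def missing_steps(frame):
    """Return (step number, text) for each five-step-frame step absent from `frame`."""
    return [
        (number, step)
        for number, step in enumerate(FIVE_STEP_FRAME, start=1)
        if step not in frame
    ]


for number, step in missing_steps(CLASSROOM_FRAME):
    print(f"Step {number}: {step}")
```

The printed steps are the ones that one-on-one tutoring adds to the classroom pattern.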
Dialog Move Categories
Dialog moves are sensitive to the quality and quantity of the student's preceding contribution.
1. Positive Immediate Feedback - "That's right" "Yeah"
2. Neutral Immediate Feedback - "Okay" "Uh-huh"
3. Negative Immediate Feedback - "Not quite" "No"
4. Pumping for more information - "Uh-huh" "What else"
5. Prompting (for specific information) - "If you add RAM, the CPU can store more data and larger ______?"
6. Hinting - "What about the size of programs you need to run?"
7. Elaboration - "With additional RAM, you can handle larger programs"
8. Splicing in/correcting content after a student error - "Storing the program on a floppy disk will not help you to run the program."
9. Summarizing - "So to recap,..."
Curriculum Script
Loosely structured lesson plans (organise topics & content)
3 macrotopics: hardware, operating systems, internet
12 topics each
Each topic contains: basic concepts; a focal question; ideal answers and answer aspects; hints and prompts; anticipated bad answers; corrections for bad answers; a summary
Curriculum Script - Example Topic
Basic concepts:
\info-8 Large, multi-user computers often work on several jobs simultaneously. This is known as concurrent processing. (...) So here's your question.
Focal question:
\question-8 How does the operating system of a typical computer process several jobs with one CPU?
Curriculum Script - Example Topic (II)
Good answer aspect (GAA):
\pgood-8-1 The OS helps the computer to work on several jobs simultaneously by rapidly switching back and forth between jobs.
Hint:
\phint-8-1-1 How can the OS take advantage of idle time on the job?
\phintc-8-1-1 The operating system switches between jobs.
Curriculum Script - Example Topic (III)
Prompt:
\pprompt-8-1-1 The operating system switches rapidly between _
\ppromptk-8-1-1 jobs
Bad answer:
\bad-8-1 The operating system completes one job at a time and then works on another.
Correction:
\splice-8-1 The operating system can work on several jobs at once.
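To make the structure of the example topic concrete, here is a minimal parsing sketch. It assumes that every entry has the form `\tag-ids text`, which is inferred only from the excerpts on these slides; the real curriculum script format may differ. The regular expression and `parse_script` are hypothetical helpers, not part of AutoTutor, and the prompt tag is written `pprompt` on the assumption that the slide's spelling is a typo.

```python
import re
from collections import defaultdict

# Assumed entry format (inferred from the slide excerpts): \tag-ids text
ENTRY = re.compile(r"^\\([a-z]+)-([\d-]+)\s+(.*)$")

SCRIPT = r"""
\question-8 How does the operating system of a typical computer process several jobs with one CPU?
\pgood-8-1 The OS helps the computer to work on several jobs simultaneously by rapidly switching back and forth between jobs.
\phint-8-1-1 How can the OS take advantage of idle time on the job?
\pprompt-8-1-1 The operating system switches rapidly between _
\ppromptk-8-1-1 jobs
\bad-8-1 The operating system completes one job at a time and then works on another.
\splice-8-1 The operating system can work on several jobs at once.
"""


def parse_script(text):
    """Group script entries by topic number (the first number after the tag)."""
    topics = defaultdict(list)
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match is None:
            continue  # skip blank or unrecognised lines
        tag, ids, body = match.groups()
        topic = int(ids.split("-")[0])
        topics[topic].append({"tag": tag, "ids": ids, "text": body})
    return dict(topics)


topics = parse_script(SCRIPT)
for entry in topics[8]:
    print(entry["tag"], entry["ids"], "->", entry["text"][:40])
```

Because the script is plain tagged text, a lesson author can extend it without programming.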
The Dialog Advancer Network (DAN)
Mechanism for enhancing AutoTutor's conversational skills
Enables AutoTutor to:
adapt each dialog move to the learner's previous turn
indicate when the learner has the floor for contributions
Role of the DAN - Turn-adaptation
adapt each dialog move to the learner's previous turn
Coherence emerges in human conversations
Reason: participants generally adapt their turns so that they are relevant to the preceding turn
"Turn-adaptation" is problematic for AutoTutor: the content of dialog moves is predetermined
DAN: makes quasi-adapted dialog moves relevant to the learner's previous turn.
Role of the DAN - Turn-taking
indicate when the learner has the floor for contributions
Turn-taking: integral feature of the conversational process
Speakers signal to listeners that they are relinquishing the floor (facilitates turn-taking in human-to-human conversation)
If AutoTutor lacks this, users often do not know when or if to respond (in early versions, there was often confusion after Hint, Elaboration and Prompt Response dialog moves)
Current versions: use of linguistic discourse markers to disambiguate the conversation
Next versions of AutoTutor: also gestures and paralinguistic signals (e.g. eye gaze)
DAN
[Diagram: Dialog Advancer Network, leading from Student Turn N to Student Turn N+1]
The tutor classifies the student's turn:
Frozen expression -> Repeat Advancer State: select discourse marker "Once again" + previous turn's dialog move
Frozen expression -> Comprehension Advancer State: select discourse marker "Well" or "I see" + pump or hint
WH or yes/no question -> tutor answers the question, then selects discourse marker "Okay"
Otherwise the tutor adapts (select short feedback) and selects a dialog move: pump, hint, elaboration or summary
Each selected dialog move leads to an Advancer State
Summary: select discourse marker "Okay" or "Moving on", select summary, tutor asks next topic question
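The pathways in the DAN diagram can be approximated by a small function. This is a rough sketch based only on the box labels of the diagram: the exact edges of the network are not recoverable from the slide, so the branching below is an assumption, and `dan_turn` with its arguments is a hypothetical name rather than AutoTutor's actual interface.

```python
def dan_turn(frozen_expression, previous_move, selected_move):
    """Return the tutor's turn as a list of (kind, content) pairs.

    frozen_expression: "repeat", "comprehension" or None (assumed labels)
    previous_move:     the dialog move of the previous tutor turn
    selected_move:     the dialog move chosen for a regular contribution
    """
    if frozen_expression == "repeat":
        # Repeat Advancer State: "Once again" + previous turn's dialog move.
        return [("marker", "Once again"), ("move", previous_move)]
    if frozen_expression == "comprehension":
        # Comprehension Advancer State: "Well" or "I see" + pump or hint.
        return [("marker", "Well"), ("move", "hint")]

    # Regular contribution: short feedback first, then the selected move.
    turn = [("move", "short feedback")]
    if selected_move == "summary":
        # Summary pathway: marker, summary, then the next topic question.
        turn += [
            ("marker", "Okay"),
            ("move", "summary"),
            ("move", "next topic question"),
        ]
    else:
        turn.append(("move", selected_move))
    return turn


print(dan_turn(None, "prompt", "summary"))
```

The call at the end follows the summary pathway: short feedback, the marker "Okay", the summary, and the next topic question.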
DAN - example pathway
AutoTutor: Well, where is most of the information you type in temporarily stored?
Student: RAM [Student Turn N]
AutoTutor: Right! In RAM. [Adaptation: select short feedback]
AutoTutor: Okay. [Advancer State; tutor selects dialog move]
AutoTutor: Let's review, after you enter information, it is sent to the CPU. The CPU carries out the instructions on the data [select summary]
AutoTutor: How does the OS of a typical computer process several jobs simultaneously with only one CPU? [tutor asks next topic question]
[Student Turn N+1]
Effect of the DAN
Development of the DAN: interaction with students improved considerably
Numerous pathways: refine micro-adaptation skills
Eradication of turn-taking confusion by Advancer States
Enhances overall effectiveness as tutor and conversational partner
Analysis of DAN Pathway Frequency Distribution
64 computer literacy students interacted with AutoTutor (for course credits)
24 topics covered in each tutoring session
written transcripts generated for each session
3 of the 24 topics were randomly selected -> analysis of 192 mini-conversations (64 students x 3 topics)
Analysis of DAN Pathway Frequency Distribution - Results
Result: most frequently travelled pathways (together 35% of all paths):
Prompt Response - Advancer - Prompt
Positive Feedback - Prompt Response - Advancer - Prompt
Conclusion: Too many prompts! Leads to short answers (but goal of AutoTutor: longer, conversational contributions)
Remedy: modification of triggering conditions for prompts
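A frequency analysis of this kind can be sketched with a counter over pathways. The sample pathways below are invented purely for illustration and are not the study's data; only the figure of 64 x 3 = 192 mini-conversations comes from the slides, and `pathway_shares` is a hypothetical helper.

```python
from collections import Counter


def pathway_shares(pathways):
    """Return each distinct pathway's share of all observed pathways."""
    counts = Counter(tuple(path) for path in pathways)
    total = sum(counts.values())
    return {path: count / total for path, count in counts.most_common()}


# 64 students x 3 randomly selected topics (figures from the slides).
mini_conversations = 64 * 3

# Invented sample data, for illustration only.
sample = [
    ("Prompt Response", "Advancer", "Prompt"),
    ("Positive Feedback", "Prompt Response", "Advancer", "Prompt"),
    ("Prompt Response", "Advancer", "Prompt"),
    ("Hint", "Advancer"),
]

shares = pathway_shares(sample)
for path, share in shares.items():
    print(f"{share:.0%}  {' - '.join(path)}")
```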
Dialog Move Selection
[Diagram: the DAN shown earlier, repeated]
Language Analysis
Student's contribution
-> Word Segmenter
-> Syntactic Class Identifier
-> Speech Act Classification: Assertion, WH-question, Yes/No question, Directive, Short Response
-> Latent Semantic Analysis
Language Analysis
Latent Semantic Analysis
Computation of a relatedness score between two sets of words
Compression of a corpus of texts (here: curriculum script, textbooks, articles) into a k-dimensional LSA-space
Purely statistical method (no deep understanding)
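The idea of a relatedness score between two sets of words can be illustrated with a cosine measure over raw word counts. This is a deliberately simplified stand-in, not LSA: real LSA first compresses a corpus into a k-dimensional space and compares vectors there, a step this sketch omits. The function name `relatedness` is made up for the example.

```python
import math
from collections import Counter


def relatedness(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts.

    Simplified stand-in for an LSA match: no corpus, no dimensionality
    reduction, only the shared-word overlap of the two inputs.
    """
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


student = "the operating system switches between jobs"
aspect = "the operating system switches rapidly between jobs"
print(round(relatedness(student, aspect), 2))
```

Like LSA, this measure is purely statistical: it rewards shared vocabulary and has no understanding of what the student meant.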
Dialog Move Selection
via 15 production rules, sensitive to:
assertion quality of the preceding turn
dialog history (global variables: ability, verbosity, initiative of learner)
extent of coverage of GAAs
Examples:
IF [student Assertion match with GAA = HIGH or VERY HIGH] THEN [select POSITIVE FEEDBACK]
IF [student ability = MEDIUM or HIGH & Assertion match with GAA = LOW] THEN [select HINT]
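The two example rules can be written as ordinary conditionals. This sketch covers only the two rules quoted on the slide; the other production rules are not shown in the source and are therefore not modelled. `select_moves` and its string labels are illustrative assumptions.

```python
def select_moves(assertion_match, student_ability):
    """Apply the two example production rules and return the triggered moves.

    assertion_match: match of the student's assertion with a good answer
                     aspect, e.g. "LOW", "HIGH", "VERY HIGH"
    student_ability: e.g. "LOW", "MEDIUM", "HIGH"
    """
    moves = []
    # Rule 1: a strong match with the good answer aspect earns positive feedback.
    if assertion_match in {"HIGH", "VERY HIGH"}:
        moves.append("POSITIVE FEEDBACK")
    # Rule 2: an able student with a weak match gets a hint.
    if student_ability in {"MEDIUM", "HIGH"} and assertion_match == "LOW":
        moves.append("HINT")
    return moves


print(select_moves("VERY HIGH", "LOW"))
print(select_moves("LOW", "MEDIUM"))
```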
Dialog Move Selection - Selection of next Good Answer Aspect
focal question
good answer aspects A1, A2, A3, ..., An: all need to be covered
each Ai has a coverage metric between 0 and 1 (computed by LSA, updated with each assertion)
each Ai is covered if its coverage metric is above a threshold
Dialog Move Selection - Selection of next Good Answer Aspect (II)
[Figure: bar chart of the coverage values of A1 to A5 against a threshold]
A2 is covered (above threshold)
A5 has the highest subthreshold value - selected as the next GAA to be covered
AutoTutor-1: all contributions count
AutoTutor-2: only student contributions are considered
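The selection rule for the next good answer aspect can be sketched in a few lines. The coverage values and the threshold below are hypothetical numbers chosen to mirror the figure (A2 above the threshold, A5 the highest value below it); `next_aspect` is an illustrative name.

```python
def next_aspect(coverage, threshold):
    """Return the uncovered aspect with the highest coverage value, or None.

    An aspect counts as covered once its coverage value is above the threshold.
    """
    uncovered = {name: value for name, value in coverage.items() if value <= threshold}
    if not uncovered:
        return None  # every aspect is covered
    return max(uncovered, key=uncovered.get)


# Hypothetical coverage values, chosen only to mirror the figure.
coverage = {"A1": 0.2, "A2": 0.8, "A3": 0.1, "A4": 0.3, "A5": 0.6}
print(next_aspect(coverage, threshold=0.7))
```

Choosing the highest subthreshold value means the tutor works next on the aspect the student is closest to completing.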
Evaluation with Virtual Students
Creation of virtual students
Tutoring sessions with virtual students
Evaluation by experts in language and pedagogy (ratings between 1 [very poor] and 6 [very good])
Revision and adjustment of AutoTutor
Evaluation criteria: discrimination of learner ability; choice of appropriate dialog moves
Pedagogical effectiveness (2 judges): pedagogical aspects; is the dialog reasonable for a normal human tutor?
Conversational appropriateness (2 judges): politeness norms; quality, quantity, relevance, manner (Gricean maxims)
Creation of Virtual Students
36 topics in the curriculum script answered by ~100 human computer literacy students
Quality of each answer rated by judges
Creation of 7 virtual student "prototypes":
Good verbose student: contributions taken from "good" answer samples, 2-3 assertions each turn
Good succinct student: contributions taken from "good" answer samples, 1 assertion each turn
Vague student: contributions contain "vague" assertions
Erroneous student: contributions contain assertions with misconceptions
Creation of Virtual Students (II)
Mute student: contributions are "semantically depleted": "Well", "Okay", ...
Good coherent student: first 5 turns contain 1 good assertion; contributions taken from the same human student
Monte Carlo student: all classes of assertions
Pedagogical Effectiveness (1st and 2nd evaluation cycle)
[Figure: bar chart "Pedagogical Effectiveness Scores for AutoTutor". y-axis: PE rating (1 to 6); x-axis: virtual student type (Good Verbose, Good Succinct, Vague, Erroneous, Mute, Good Coherent, Monte Carlo); series: PE means (1st/2nd cycle, n=36)]
2 judges gave scores between 1 and 6
PE scores for the good verbose and good succinct students are lower than average
Conversational Appropriateness (1st and 2nd evaluation cycle)
[Figure: bar chart "Conversational Appropriateness Scores". y-axis: CA rating (1 to 6); x-axis: virtual student type (Good Verbose, Good Succinct, Vague, Erroneous, Mute, Good Coherent, Monte Carlo); series: CA means (1st/2nd cycle, n=36)]
2 judges gave scores between 1 and 6
asymmetry in scores for good and bad students
Consequences of the Evaluation Results
Measures taken:
Revision of curriculum script (shorter, more conversational sentences)
Dialog moves were given discourse markers
Changes to production rules
Adjustments to LSA values
Evaluation Results (before/after revisions) (I)
[Figure: bar chart "Pedagogical Effectiveness Scores". y-axis: PE rating (1 to 6); x-axis: virtual student type (Good Verbose, Good Succinct, Vague, Erroneous, Mute, Good Coherent, Monte Carlo); series: PE means for the 1st/2nd cycle (n=36) and for the 3rd cycle (210 <= n <= 592)]
Evaluation Results (before/after revisions) (II)
[Figure: bar chart "Conversational Appropriateness". y-axis: CA rating (1 to 6); x-axis: virtual student type (Good Verbose, Good Succinct, Vague, Erroneous, Mute, Good Coherent, Monte Carlo); series: CA means for the 1st/2nd cycle (n=36) and for the 3rd cycle (210 <= n <= 592)]
Outcome: the asymmetry has disappeared!
Evaluation Results
Some results are "promising"
Major problem: not AutoTutor, but the virtual students (redundancies, incoherence)
Effect of AutoTutor on Learning Gains
Assessment of learning gains - 3 conditions: AutoTutor, Reread, Control
Significant differences in the students' scores among the 3 conditions, with means:
- AutoTutor 0.43
- Reread 0.38
- Control 0.36
Gains in learning and memory: effect size increment of 0.5 to 0.6 SD units over the control condition.
„Bystander“ Turing Test
144 tutor moves from dialogs between students and AutoTutor-1 (transcripts of AutoTutor-1's dialog moves)
6 human tutors were asked what they would say at these 144 points
36 computer literacy students discriminated: AutoTutor or human tutor?
„Bystander“ Turing Test
Outcome: discrimination score of -0.08
Students were unable to discriminate whether a particular dialog move had been generated by a computer or by a human!
The TRG’s View on the Results
"Impressive" outcome supports the claim that AutoTutor is a good simulation of human tutors.
AutoTutor attempts to comprehend the student input.
"Almost as good as an expert in computer literacy."
Students' Emotional Response to the Talking Head
Students are initially amused by the talking head, but the amusement wears off within a few minutes.
Some students have trouble understanding the synthesized speech.
Inappropriate speech acts irritate a minority of students.
Sufficiently engaging for students to complete the tutorial sessions.
Conclusion/Discussion
Identification and implementation of an important class of teaching tactics and discourse patterns
3 major aspects:
1. Analysis of Human Tutoring
only for unskilled tutors. How about well-trained tutors?
2. Language Analysis via LSA
what about combining a semantic parser and LSA?
3. Dialog Management (DAN)
Pros and Cons
Strengths
- not purely domain-specific
- easy creation of curriculum script (no programming skills needed)
- robust behaviour
Weaknesses
- shallow understanding only
- performance largely depends on the curriculum script
Thank you!
Any questions?