Improving Lecture Speech Summarization using Rhetorical Information

Presenter: Shih-Hsiang Lin 02/21/2008

Introduction

• Unlike conversational speech, lectures and presentations are planned speech

– Lecture speakers follow a relatively rigid rhetorical structure

• overview (introduction) → more detailed description (content) → conclusions

• Lecture speech differs stylistically from broadcast news (BNs)

– Lecture speech covers a wide range of speaking styles

• Unlike BNs, which have a nearly fixed set of anchors and reporters

– A typical lecture speaker often sounds dull and monotonic

• Unlike BN speakers, who use prosody to emphasize important points

• It remains an open question whether systems trained to summarize BNs are directly applicable to lecture speech

Rhetorical Structure Characteristics in Lecture Speech

• How to extract the rhetorical structure?

– Lexical Evidence

• Using term distribution

– When a writer writes from subtopic to subtopic in a linear text, s/he generates sentences that are tightly linked together within a subtopic

– When s/he proceeds to the next subtopic, the sentences that are generated are less related to the previous sentences, but they themselves are tightly linked again

• Using sentence cohesiveness

– The cohesiveness is measured by a cosine value between content word-frequency vectors consisting of more than a fixed number of content words

– Acoustic/Discourse Evidence
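
The lexical sentence-cohesiveness measure on this slide (a cosine between content word-frequency vectors) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the stop-word list `STOP_WORDS`, the function names, and the use of a fixed number of sentences per window (rather than a fixed number of content words) are assumptions made to keep the example short.

```python
import math
from collections import Counter

# Hypothetical stop-word list; a real system would use a proper content-word filter.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "are", "to", "in", "we", "this"}

def content_vector(sentences):
    """Count content-word frequencies over a block of sentences."""
    counts = Counter()
    for sentence in sentences:
        for word in sentence.lower().split():
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts

def cosine(u, v):
    """Cosine similarity between two sparse frequency vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def cohesion_scores(sentences, window=2):
    """Cosine between the `window` sentences before and after each gap."""
    scores = []
    for gap in range(window, len(sentences) - window + 1):
        left = content_vector(sentences[gap - window:gap])
        right = content_vector(sentences[gap:gap + window])
        scores.append((gap, cosine(left, right)))
    return scores
```

A low score at a gap suggests that the sentences on either side share few content words, which is the kind of evidence the slide associates with a move to a new subtopic.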

Rhetorical Structure Characteristics in Lecture Speech (cont.)

• A PCA projection of all acoustic/phonetic, lexical, and discourse features of lecture speech reveals the underlying rhetorical structure
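
As a rough illustration of what a PCA projection of per-sentence feature vectors involves, here is a dependency-free Python sketch. It assumes each sentence is already represented as a list of numeric features; the function names and the use of power iteration with deflation are choices made for this example, and in practice a linear-algebra library would normally be used instead.

```python
import math

def center(rows):
    """Subtract the per-feature mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(rows):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[a] * r[b] for r in rows) / (n - 1) for b in range(d)]
            for a in range(d)]

def top_eigenvector(matrix, iterations=200):
    """Power iteration: dominant eigenvector of a symmetric PSD matrix."""
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def pca_project(rows, components=2):
    """Project feature vectors onto their leading principal components."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    axes = []
    for _ in range(components):
        v = top_eigenvector(cov)
        eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d))
                         for a in range(d))
        axes.append(v)
        # Deflate: remove the component just found so the next pass finds the next one.
        cov = [[cov[a][b] - eigenvalue * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    return [[sum(r[j] * axis[j] for j in range(d)) for axis in axes]
            for r in centered]
```

Plotting the first two projected coordinates per sentence, colored by position in the lecture, is one way such structure could be inspected visually.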

Extractive Summarization of Lecture Speech

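• Extractive summarization is cast as a binary classification problem: each sentence is labeled either as a summary sentence or as a non-summary sentence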
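
One simple way to turn per-sentence classifier scores into an extractive summary is to keep the highest-scoring fraction of sentences, in their original order. The sketch below assumes the classifier outputs one confidence score per sentence and uses the 30% summarization ratio reported with the results as the default; the function name `select_summary` is hypothetical, and the slides do not say how the ratio was actually enforced.

```python
def select_summary(scores, ratio=0.3):
    """Return indices of the highest-scoring sentences, in document order."""
    if not scores:
        return []
    # Number of sentences to keep; always keep at least one.
    budget = max(1, round(len(scores) * ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:budget])

# Ten sentences with made-up scores: the three best are kept.
example_scores = [0.1, 0.9, 0.4, 0.8, 0.2, 0.3, 0.7, 0.05, 0.6, 0.5]
print(select_summary(example_scores))  # [1, 3, 6]
```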

Acoustic/Phonetic/Lexical Features

Discourse Feature

• The probability distributions of words in texts can be adequately estimated by a Poisson mixture

• The Poisson Noun feature is based on the following assumptions

– First, if a sentence contains new noun words, it probably contains new information

• The noun word's Poisson score varies according to its position

– Second, if a noun word occurs frequently, it is likely to be more important than other noun words, and sentences with these high-frequency noun words should be included in a summary

$$\mathrm{PoissonNoun}_i = \frac{1}{N_i} \sum_{k=1}^{N_i} \mathrm{pois}(p_{k,j}) \cdot TF_k$$

• $N_i$: number of noun words in sentence $i$, which belongs to section $j$

• $p_{k,j}$: noun word $k$ appears for the $p$-th time within section $j$

Extraction of Rhetorical Structure

Experiments and Evaluation

• 40 of the 60 well-organized presentations, together with PowerPoint files and manual transcriptions

– 34 presentations containing 6049 sentences as the training set

– The remaining 6 presentations, containing 1116 (Auto) or 1033 (Manual) sentences, as the held-out test set

• Sentence boundary detection

– Using an HMM segmenter

• 3 to 7 HMM states; each GMM contains 256 components

• Silence, noise, Mandarin initial speech, Mandarin final speech, and non-English word speech events

• Multi-pass ASR system

– Word accuracy: 69.7% and 70.3% for manually and automatically segmented sentences, respectively

• ROUGE-L (longest common subsequence) as the evaluation metric

ROUGE-L

• Longest Common Subsequence (LCS)

– Given two sequences X and Y, a longest common subsequence of X and Y is a common subsequence with maximum length

• Example

S1 (reference): police killed the gunman

S2: police kill the gunman

S3: the gunman kill police

– ROUGE-N: S2=S3 (“police”, “the gunman”)

– ROUGE-L

• S2=3/4 (“police the gunman”)

• S3=2/4 (“the gunman”)

• S2>S3
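
The ROUGE-L scores in this example can be reproduced with a short Python sketch. It computes the LCS length by dynamic programming and divides by the reference length (the recall form of ROUGE-L); since all three sentences here have four words, this matches the 3/4 and 2/4 values above. Whitespace tokenization and the function names are assumptions for illustration.

```python
def lcs_length(x, y):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(y) + 1)
    for xi in x:
        curr = [0]
        for j, yj in enumerate(y, start=1):
            if xi == yj:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l_recall(reference, candidate):
    """LCS length divided by the number of reference tokens."""
    ref = reference.split()
    cand = candidate.split()
    return lcs_length(ref, cand) / len(ref) if ref else 0.0

reference = "police killed the gunman"
print(rouge_l_recall(reference, "police kill the gunman"))  # 0.75
print(rouge_l_recall(reference, "the gunman kill police"))  # 0.5
```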

Results

• Using lexical features, the segmental summarizer yields the best performance

– This shows the contribution of rhetorical structure in lecture speech

• Lexical features rank higher than acoustic features in all experiments

– What is said is more important than how it is said

• The discourse feature is even less important in the segmental summarizer than in the whole summarizer

– This clearly shows that discourse features from BNs are not applicable to lecture speech, as they are based on sentence position

30% summarization ratio
