Post on 11-Feb-2016

Page 1: Finding Structure in Time

Finding Structure in Time

Jeffrey L. Elman

Presented by: Kaushik Choudhary


Outline

• Introduction
• The Problem with Time
• Networks with Memory
• Experiments with Exclusive-OR
• Structure in Letter Sequences
• Discovering the Notion “Word”
• Simple Sentences
• Conclusion


Introduction

• How might one represent temporal events in PDP models?

• We utter words in a sequence and not all together!

• This paper accounts for time implicitly, through the “effect it has on processing”, rather than representing it explicitly in the input.


The Problem with Time

• Possible approach: represent temporal events by successive elements in a pattern vector.

• Problems with this approach:
  • It would require an interface to buffer the input, with no obvious way to determine when the buffer should be examined.
  • A buffer would impose a limit on the input size and demand that it be fixed.
  • The vectors 011100000 and 000111000 are the same pattern displaced in time, but they are different locations in vector space, so PDP models do not detect their similarity.
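The last problem can be made concrete with a small sketch in Python. It compares the two vectors from the slide position by position; the helper names (`overlap`, `hamming`, `shift_left`) are illustrative assumptions and do not come from the paper.

```python
# Two encodings of the same pattern (three consecutive 1s), displaced in time.
a = [int(c) for c in "011100000"]
b = [int(c) for c in "000111000"]

def overlap(u, v):
    """Dot product: number of positions where both vectors are 1."""
    return sum(x * y for x, y in zip(u, v))

def hamming(u, v):
    """Number of positions where the two vectors differ."""
    return sum(x != y for x, y in zip(u, v))

def shift_left(v, k):
    """Shift v left by k positions, padding with zeros on the right."""
    return v[k:] + [0] * k

print(overlap(a, b))          # 1: only one active position is shared
print(hamming(a, b))          # 4: four positions disagree
print(shift_left(b, 2) == a)  # True: b is just a, delayed by two steps
```

Compared position by position, the two vectors look mostly unrelated, even though one is simply a delayed copy of the other.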


Networks with Memory

• Jordan (1986) proposed a network with recurrent connections.

• In such networks, recurrent connections let the hidden units see the network’s previous outputs when determining later outputs. This gives the network memory.


• In this paper, Elman proposes a similar network with additional units at the input layer.

• These units are referred to as “Context Units” and are also hidden.

• The input and context units activate the hidden units, which in turn activate the output units and feed back to the context units.


[Figure: Elman’s proposed recurrent network. The input units and context units feed the hidden units, which feed the output units.]


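• In the above architecture, the context units remember the previous internal state, that is, the hidden-unit activations from the preceding time step.

• The hidden units must map the current input together with that previous state to the desired output, so they develop representations that encode temporal properties of the input.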

• This lends the network temporal sensitivity.
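A minimal sketch of one forward step through such a network, in plain Python. Several details are assumptions made for illustration: the weights are random and untrained, there are no bias terms, the activation is a logistic sigmoid, the context units start at 0.5, and the class and method names are hypothetical. Training by backpropagation is not shown.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class SimpleRecurrentNetwork:
    """Forward pass only; weights are random and untrained (illustration)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.w_in = matrix(n_hidden, n_in)       # input   -> hidden
        self.w_ctx = matrix(n_hidden, n_hidden)  # context -> hidden
        self.w_out = matrix(n_out, n_hidden)     # hidden  -> output
        self.context = [0.5] * n_hidden          # assumed initial state

    def step(self, x):
        # Hidden units see the current input AND the previous hidden state.
        hidden = [
            sigmoid(sum(w * xi for w, xi in zip(self.w_in[j], x))
                    + sum(w * ci for w, ci in zip(self.w_ctx[j], self.context)))
            for j in range(len(self.w_in))
        ]
        output = [sigmoid(sum(w * h for w, h in zip(row, hidden)))
                  for row in self.w_out]
        # Copy the hidden activations back into the context units.
        self.context = list(hidden)
        return output

net = SimpleRecurrentNetwork(n_in=1, n_hidden=2, n_out=1)
first = net.step([1.0])
second = net.step([1.0])  # same input, but the context has changed
print(first, second)
```

Presenting the same input twice gives two different outputs, because the second step also sees the hidden state left behind by the first. That dependence on history is the temporal sensitivity described above.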


Experiments with Exclusive-OR

• Sample input: 1 0 1 0 0 0 0 1 1 1 1 0

• Sample output: 0 1 0 0 0 0 1 1 1 1 0 ?

• Every third bit is the XOR of the two bits before it.

• The objective of the network is to predict the next bit.

• Only every third bit can be predicted accurately, because the first two bits of each triple are random.
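The sample sequence can be reproduced with a short Python sketch. The function name `xor_stream` and the length of the random stream are assumptions made for illustration.

```python
import random

def xor_stream(pairs):
    """Concatenate (a, b, a XOR b) triples into one bit stream."""
    bits = []
    for a, b in pairs:
        bits.extend([a, b, a ^ b])
    return bits

# The four pairs behind the sample input on the slide.
inputs = xor_stream([(1, 0), (0, 0), (0, 1), (1, 1)])
targets = inputs[1:]  # the target at each step is simply the next bit

print(inputs)   # [1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0]
print(targets)  # [0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0]

# A longer stream built from random pairs (the length here is arbitrary).
rng = random.Random(0)
long_stream = xor_stream(
    [(rng.randint(0, 1), rng.randint(0, 1)) for _ in range(1000)]
)
```

The target sequence is one element shorter than the input, which corresponds to the “?” on the slide: the bit after the last input is not known.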


Structure in Letter Sequences

• Sample input: the consonants b, d, and g were combined in random order, and each consonant was then replaced by a consonant-vowel sequence: b -> ba, d -> dii, g -> guuu.

• Each letter was assigned a unique 6-bit vector.


• Objective of the network was to predict the next letter in the input sequence.

• Network structure: 6 input units, 6 output units, 20 hidden units and 20 context units.

• The network was trained through 200 passes over the sequence diibaguuubadiidiiguuu…
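The construction of the letter sequence can be sketched in Python as follows. The replacement rules are taken from the slide; the names `EXPANSION` and `expand`, the random seed, and the length of the random consonant string are assumptions made for illustration.

```python
import random

# Replacement rules from the slide: each consonant fixes the vowels after it.
EXPANSION = {"b": "ba", "d": "dii", "g": "guuu"}

def expand(consonants):
    """Replace every consonant with its consonant-vowel sequence."""
    return "".join(EXPANSION[c] for c in consonants)

# The consonant order that yields the sequence quoted on the slide.
print(expand("dbgbddg"))  # diibaguuubadiidiiguuu

# A longer sequence from randomly ordered consonants (length is arbitrary).
rng = random.Random(0)
consonants = "".join(rng.choice("bdg") for _ in range(1000))
letters = expand(consonants)
```

Because the consonants arrive in random order while each consonant determines the vowels that follow it, the next letter is predictable only inside a consonant-vowel group, not at the start of a new one.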


Discovering the Notion “Word”

• Input to the network: 200 sentences with no breaks between them (1270 words, 4963 letters)

• Each letter was represented by a 5-bit vector.

• Network structure: 5 input units, 5 output units, 20 hidden units and 20 context units.

• The objective of the network was to predict the next letter in the sequence.
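A rough sketch of how such an input stream could be prepared, in Python. The slides do not say how the 5-bit codes were chosen, so this sketch assumes the binary index of each letter; the two sentences are made up, and the function names are hypothetical.

```python
import string

def letter_code(ch):
    """5-bit code for a lowercase letter (assumed scheme: binary index)."""
    index = string.ascii_lowercase.index(ch)
    return [(index >> bit) & 1 for bit in range(4, -1, -1)]

def training_pairs(sentences):
    """Strip all word and sentence breaks, then pair each letter with the next."""
    stream = "".join(word for s in sentences for word in s.split())
    codes = [letter_code(ch) for ch in stream]
    return stream, list(zip(codes[:-1], codes[1:]))

# Made-up sentences, for illustration only.
stream, pairs = training_pairs(["the dog ran", "a cat sat"])

print(stream)            # thedogranacatsat
print(letter_code("c"))  # [0, 0, 0, 1, 0]
print(len(pairs))        # 15 (input letter, next letter) pairs
```

Five bits are enough for 26 letters, since 2^5 = 32. Nothing in the stream marks where one word ends and the next begins, so any boundary information has to come from the letter statistics themselves.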


• The author acknowledges the ambiguity in the results, noting that the experiment set out only to show that there is predictability in the boundaries of words in the sequence.

• The recurrent network is able to extract this information!


Simple Sentences

• 10,000 random sentences were created.

• Each word was assigned a 31-bit vector in which a single bit is set, with each bit representing a different word.

• The sentences were concatenated with no breaks between them, giving a stream of 27,534 words.

• The network made six passes over this stream.


• The objective of the network was to predict the next word.
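The word encoding can be sketched in Python as below. The 31-bit width comes from the slide; the mini-vocabulary, the sample word stream, and the function names are hypothetical and serve only as illustration.

```python
VECTOR_BITS = 31  # vector width from the slide

def one_hot(index, size=VECTOR_BITS):
    """Vector with a single 1 at the position reserved for one word."""
    vec = [0] * size
    vec[index] = 1
    return vec

# Hypothetical mini-vocabulary, for illustration only.
vocabulary = ["woman", "see", "cat", "dog", "chase"]
word_index = {word: i for i, word in enumerate(vocabulary)}

def encode_stream(words):
    """Encode a break-free word stream as a list of one-hot vectors."""
    return [one_hot(word_index[w]) for w in words]

stream = "woman see cat dog chase cat".split()
vectors = encode_stream(stream)

# Next-word prediction: each vector is paired with the one that follows it.
pairs = list(zip(vectors[:-1], vectors[1:]))
print(len(pairs))  # 5
```

With this encoding, any two different words are orthogonal vectors, so the input itself says nothing about which words are similar. Whatever similarity structure the network ends up with has to be learned from word order.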


• Measured against the word that actually came next, the RMS error was about 0.88.

• Measured against the probability of occurrence of each word in the given context, the RMS error was about 0.053.

• Impressive!
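A toy Python sketch of why the two error measures can differ so much. It assumes RMS error means the square root of the mean squared difference over the vector elements, which may not match the normalisation used in the paper, and all the numbers are invented for illustration.

```python
import math

def rms_error(output, target):
    """Root of the mean squared difference between two vectors (assumed form)."""
    return math.sqrt(
        sum((o - t) ** 2 for o, t in zip(output, target)) / len(output)
    )

# Toy case: in some context, words 0 and 1 are equally likely to come next.
likelihood = [0.5, 0.5, 0.0, 0.0]   # probability of each word in this context
actual_next = [1, 0, 0, 0]          # the word that happened to occur
output = [0.5, 0.5, 0.0, 0.0]       # the best prediction available

print(rms_error(output, actual_next))  # non-zero, although the output is ideal
print(rms_error(output, likelihood))   # 0.0
```

When the next word is not fully determined, even an ideal predictor shows a large error against the word that actually occurred. Comparing the output with the probability of each word in context is therefore the fairer measure.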


Conclusion

• Some problems change their nature when expressed in terms of temporal events.

• The error signal, tracked over time, can be used as a clue to temporal structure.

• More sequential dependencies do not necessarily translate into worse performance.

• The representation of time, and hence of memory, depends on the task at hand.

• The representations need not be flat; they may be structured.


Thank you!