Integrating Greek and English Digital Resources

Sean Boisen (sean@logos.com)
Computer Assisted Research Section, S19-108

Slides at: http://semanticbible.org/other/presentations/2007-sbl-integrating/

Outline

• Motivation and thesis
• Overview of cross-lingual resources
• Cross-lingual semantic mapping and applications
• Conclusions

Motivation

• Expand the utility of existing Greek resources
• Open new possibilities for English-oriented students
• Thesis: integrating English resources can provide additional tools, resources, and insights

Overview of Resources

• ESV English-Greek Reverse Interlinear New Testament

• OpenText.Org Syntactically Analyzed Greek New Testament (“OpenText”)

• Greek-English Lexicon of the New Testament Based on Semantic Domains (“Louw-Nida”)

• WordNet

ESV Reverse Interlinear

• Designed to aid English readers in accessing the Greek NT

• Careful attention to word-level correspondence

Example: Acts 21:1

Reverse Interlinear Information Structure

• Bi-directional lexical mapping
• Preserves word order in both languages

And when we had parted from them and set sail

Ὡς δὲ ἐγένετο ἀναχθῆναι ἡμᾶς ἀποσπασθέντας ἀπ’ αὐτῶν
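One way to picture this structure is as two word lists kept in their own order, plus a set of links between positions. The sketch below is a minimal illustration in Python. The `Link` class, the helper names, and above all the specific links are assumptions made for illustration; they are hand-picked and are not taken from the actual ESV Reverse Interlinear data.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Link:
    """One alignment between English and Greek word positions (hypothetical)."""
    english: Tuple[int, ...]  # indexes into the English word list
    greek: Tuple[int, ...]    # indexes into the Greek word list


# Both word lists keep their own natural order.
english = "And when we had parted from them and set sail".split()
greek = "Ὡς δὲ ἐγένετο ἀναχθῆναι ἡμᾶς ἀποσπασθέντας ἀπ’ αὐτῶν".split()

# Illustrative links only: NOT the published alignment.
links: List[Link] = [
    Link((0,), (1,)),     # And        <-> δὲ
    Link((1,), (0,)),     # when       <-> Ὡς
    Link((2,), (4,)),     # we         <-> ἡμᾶς
    Link((3, 4), (5,)),   # had parted <-> ἀποσπασθέντας
    Link((5,), (6,)),     # from       <-> ἀπ’
    Link((6,), (7,)),     # them       <-> αὐτῶν
    Link((8, 9), (3,)),   # set sail   <-> ἀναχθῆναι
]


def greek_for(english_index: int) -> List[str]:
    """Greek words linked to the English word at this position."""
    return [greek[g] for link in links if english_index in link.english
            for g in link.greek]


def english_for(greek_index: int) -> List[str]:
    """English words linked to the Greek word at this position."""
    return [english[e] for link in links if greek_index in link.greek
            for e in link.english]


print(greek_for(9))    # Greek behind "sail"
print(english_for(5))  # English for ἀποσπασθέντας
print(english_for(2))  # ἐγένετο is left unlinked in this sketch
```

Storing positions rather than word strings is what lets the same links be read in either direction without disturbing either word order.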

Reverse Interlinear Applications

• Lexicographic distribution
  – γίνομαι and English translational equivalents
  – ἔρχομαι: coming vs. going
• Part-of-speech
  – εὐθυδρομήσαντες (VPNPMAA) vs. “by a straight course” (Adj)
  – Distributional analysis
• Integration of other English resources
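The lexicographic distribution idea can be sketched as a simple count of English renderings per Greek lemma, read off the word-level alignment. The pairs below are invented sample data, and `translation_distribution` is a hypothetical helper name; real counts would come from the reverse interlinear itself.

```python
from collections import Counter, defaultdict

# Hypothetical (Greek lemma, English rendering) pairs, as they might be
# read off a word-level alignment. The renderings and counts are made up.
aligned = [
    ("γίνομαι", "become"),
    ("γίνομαι", "happen"),
    ("γίνομαι", "become"),
    ("ἔρχομαι", "come"),
    ("ἔρχομαι", "go"),
    ("ἔρχομαι", "come"),
]


def translation_distribution(pairs):
    """Count how often each lemma is rendered by each English word."""
    dist = defaultdict(Counter)
    for lemma, rendering in pairs:
        dist[lemma][rendering] += 1
    return dist


dist = translation_distribution(aligned)
for lemma, counts in dist.items():
    print(lemma, counts.most_common())
```

The same pattern applies to part-of-speech comparison: replace the English rendering with an English part-of-speech tag and the lemma with the Greek morphological code.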

Overview: OpenText.org

• Syntactic annotation of the Greek New Testament
  – Syntactic groups up to the clause level

Example: Acts 21:1

OpenText Applications

• Word-level alignment to ESV Reverse Interlinear enables integration with English tools and resources
  – Numerous automated English analytical tools are available
  – Enables cross-lingual comparison: part-of-speech distribution, syntactic analysis, etc.

Overview: Louw-Nida

• Domain grouping:
  – “Object referents” (entities): 1-12
  – Events: 13-57
  – “Abstracts”: 58-91
  – Discourse referentials: 92
  – Names of persons and places: 93

• Sub-domains with hypernyms (“is-a”) and hyponyms
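Because the grouping is defined by domain number ranges, the class of any meaning entry can be read off its identifier. A minimal sketch, using the ranges from this slide; the function name and the "domain.entry" identifier format (as in "6.23") are assumptions for illustration.

```python
# Domain number ranges as listed on the slide.
DOMAIN_CLASSES = [
    (range(1, 13), "Object referents (entities)"),
    (range(13, 58), "Events"),
    (range(58, 92), "Abstracts"),
    (range(92, 93), "Discourse referentials"),
    (range(93, 94), "Names of persons and places"),
]


def domain_class(entry_id: str) -> str:
    """Return the broad class for an identifier such as '6.23'."""
    domain = int(entry_id.split(".")[0])
    for numbers, label in DOMAIN_CLASSES:
        if domain in numbers:
            return label
    raise ValueError(f"unknown Louw-Nida domain: {domain}")


print(domain_class("6.23"))
print(domain_class("25.15"))
```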

Overview: Louw-Nida (2)

• Organized semantically into “meaning entries”
  – Groups of terms with a shared sense that are semantically distinguishable from others
  – Meanings within a sub-domain are ordered:
    • “those meanings which are treated first tend to be of a more generic nature, while more specific meanings follow” (LN Introduction, p. vi)
• Index of Greek terms to meaning entries
• Partial index of English terms to sense groups

Louw-Nida Information Structure

6: Artifacts
  6.A: General Artifacts
  6.B: Agriculture & Husbandry
  6.C: Fishing
  6.D: Binding and Fastening
  6.E: Traps, Snares
    6.23 παγίς: “an object used for trapping or snaring, principally of birds”
      • ‘trap’
      • ‘snare’
    6.24 θήρα: “an instrument used for trapping, especially of animals other than birds”
      • ‘trap’
      • ‘snare’
    6.25 σκάνδαλον: “a trap, probably of the type which has a stick which when touched by an animal causes the trap to shut”
      • ‘trap’
  Also: Weapons, Boats, Vehicles, For Writing, Money, For Music, Images, Lights, Furniture, and others …
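The three entries above can be held in a small record type, from which an English gloss index falls out directly. This is a sketch under assumptions: the `MeaningEntry` class and its field names are hypothetical, and only the entry numbers, Greek terms, definitions, and glosses come from the slide.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MeaningEntry:
    """One Louw-Nida meaning entry (field names are hypothetical)."""
    entry_id: str
    subdomain: str
    greek: str
    definition: str
    glosses: Tuple[str, ...]


entries = [
    MeaningEntry("6.23", "6.E", "παγίς",
                 "an object used for trapping or snaring, principally of birds",
                 ("trap", "snare")),
    MeaningEntry("6.24", "6.E", "θήρα",
                 "an instrument used for trapping, especially of animals "
                 "other than birds",
                 ("trap", "snare")),
    MeaningEntry("6.25", "6.E", "σκάνδαλον",
                 "a trap, probably of the type which has a stick which when "
                 "touched by an animal causes the trap to shut",
                 ("trap",)),
]


def gloss_index(items):
    """Map each English gloss to the entry ids that use it."""
    index = defaultdict(list)
    for entry in items:
        for gloss in entry.glosses:
            index[gloss].append(entry.entry_id)
    return dict(index)


index = gloss_index(entries)
print(index)
```

Even in this tiny sample the English side is more ambiguous than the Greek: each Greek term has one entry, while ‘trap’ points to three.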

Polysemy in Louw-Nida

• 6978 meanings
• Greek index
  – 6805 terms
  – 8428 term-meaning pairs
• English index
  – 4622 terms
  – 9586 term-meaning pairs

[Chart: term count (log scale, 1 to 10000) against number of senses (0 to 25), for Greek and English]
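The totals on this slide already give the average polysemy on each side: about 1.24 meanings per Greek term against about 2.07 per English term. The sketch below works that out and shows how a sense-count histogram like the one charted could be computed from term-meaning pairs. The `sense_histogram` helper and the sample pairs are assumptions for illustration, not the actual index data.

```python
from collections import Counter

# Totals from the slide.
greek_avg = 8428 / 6805    # term-meaning pairs per Greek term
english_avg = 9586 / 4622  # term-meaning pairs per English term
print(f"Greek: {greek_avg:.2f}  English: {english_avg:.2f}")


def sense_histogram(pairs):
    """Map 'number of senses' to 'number of terms with that many senses'."""
    senses_per_term = Counter(term for term, _meaning in set(pairs))
    return Counter(senses_per_term.values())


# Tiny illustrative sample, not the real index.
sample = [
    ("trap", "6.23"), ("trap", "6.24"), ("trap", "6.25"),
    ("snare", "6.23"), ("snare", "6.24"),
]
hist = sense_histogram(sample)
print(sorted(hist.items()))
```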

Applications of Louw-Nida

• Semantic concordance
• Identifying semantic coherence and lexical chains
• Text similarity assessment
• Collocation analysis (O’Donnell 2005)
• Challenges
  – Shallow hierarchy
  – Coverage limited to the NT

Overview: WordNet

• Rich hierarchy of English meanings
• Organized into synonym sets (synsets)
• Additional relationships beyond hypernyms
  – Part/whole (holonym/meronym)
  – Derivational relationships
• On-line browser at http://wordnet.princeton.edu
• Using version 3.0
  – Python and Natural Language Toolkit (NLTK, http://nltk.org)

WordNet Information Structure

artifact
  instrumentality, instrument
    device
      trap
        net
          fishnet, fishing net
        snare, gin, noose
        Related-to: trap (verb)
        Has-part: bait, decoy, lure
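The hypernym part of this diagram can be walked with a few lines of Python. The sketch below uses a hand-built table that copies only the "is-a" links shown on the slide, so it runs without any data download; with NLTK installed, the corresponding real calls would be `nltk.corpus.wordnet.synsets()` and `Synset.hypernyms()`. The short keys (one word per synset) and the `hypernym_path` name are simplifications assumed for illustration.

```python
# Hand-built stand-in for the WordNet hypernym links in the diagram.
# Each synset is abbreviated to its first term for brevity.
HYPERNYM = {
    "fishnet": "net",
    "net": "trap",
    "snare": "trap",
    "trap": "device",
    "device": "instrumentality",
    "instrumentality": "artifact",
}


def hypernym_path(term):
    """Follow 'is-a' links upward until no further hypernym is recorded."""
    path = [term]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path


print(" > ".join(hypernym_path("fishnet")))
print(" > ".join(hypernym_path("snare")))
```

In full WordNet a synset can have more than one hypernym, so a real traversal returns several paths rather than one.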

Mapping Louw-Nida to WordNet

• Extract meaning entries and their hierarchy
• Extract the term-to-meaning indexes
• Invert the English index to map LN meanings to English terms
• Use the English terms to identify a WordNet synset (or cluster)
  – More refined approach: use Logos’ disambiguated annotation of the Greek NT with Louw-Nida data (forthcoming)
  – Refine this with mappings from ESV Reverse Interlinear
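The inversion and lookup steps might look roughly like this. It is a sketch, not the procedure actually used: the index fragment covers only the ‘trap’ and ‘snare’ entries, the synset identifiers are made-up placeholders rather than real WordNet names, and `invert` and `candidate_synsets` are hypothetical helpers. No disambiguation is attempted, so every synset of every English term is returned as a candidate.

```python
from collections import defaultdict

# Fragment of an English term -> LN meaning index (illustrative).
english_index = {
    "trap": ["6.23", "6.24", "6.25"],
    "snare": ["6.23", "6.24"],
}

# Placeholder synset lookup; the identifiers are invented for this sketch.
# With NLTK, nltk.corpus.wordnet.synsets(term) would supply real synsets.
SYNSETS = {
    "trap": ["synset:trap"],
    "snare": ["synset:snare"],
}


def invert(index):
    """Turn term -> meanings into meaning -> terms."""
    inverted = defaultdict(list)
    for term, meanings in index.items():
        for meaning in meanings:
            inverted[meaning].append(term)
    return dict(inverted)


def candidate_synsets(meaning_id, meaning_to_terms, synsets):
    """Collect the synsets of every English term listed for a meaning."""
    found = []
    for term in meaning_to_terms.get(meaning_id, []):
        for synset in synsets.get(term, []):
            if synset not in found:
                found.append(synset)
    return found


meaning_to_terms = invert(english_index)
print(meaning_to_terms)
print(candidate_synsets("6.23", meaning_to_terms, SYNSETS))
```

The two refinements listed on the slide would narrow these candidates, since an ambiguous English term such as ‘trap’ otherwise contributes all of its senses.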

Application: Lexical Chains

“The saying is trustworthy: If anyone aspires to the office of overseer, he desires a noble task.”

• The: ὁ (R)
• saying: λόγος (N)
  – Louw-Nida: 33: Communication, 33.F: Speak, Talk
  – WordNet: Saying, expression, locution → Speech, speech communication → Auditory communication → Communication → (others)
• is
• trustworthy: 31.87 πιστός (J)
  – Louw-Nida: 31: Hold a View, 31.I: Trust, Rely
  – WordNet: Trustworthy, trusty
• If: εἰ (T)
• anyone: τὶς (P)
• aspires: 25.15 ὀρέγομαι (V)
  – Louw-Nida: 25: Attitudes and Emotions, 25.B: Desire Strongly
  – WordNet: Aspire, aim, shoot for → Plan, be after → Intend, mean, think → Will, wish → (others)
• to the office of
• overseer: 53.69 ἐπισκοπή (N)
  – Louw-Nida: 53: Religious Activities, 53.I: Roles and Functions
  – WordNet: Overseer, superintendent → Supervisor → Superior, higher-up, superordinate → Leader → (others)
• he
• desires: 25.12 ἐπιθυμέω (V)
  – Louw-Nida: 25: Attitudes and Emotions, 25.B: Desire Strongly
  – WordNet: Desire, want
• a
• noble: 65.22 καλός (J)
  – Louw-Nida: 65: Value, 65.C: Good, Bad
  – WordNet: Noble (vs. ignoble) → Nobility, nobleness
• task: 42.42 ἔργον (N)
  – Louw-Nida: 42: Work, Do, 42.D: Work, Toil
  – WordNet: Undertaking, project, task, labor → Work → Activity → Act, human action, human activity → (others)
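In this example the chain is visible in the annotations: ὀρέγομαι and ἐπιθυμέω both fall under 25.B (Desire Strongly). A very simple way to surface such chains is to group the annotated words by sub-domain and keep the groups with more than one member. The sketch below does only that; it is an assumed simplification, and a fuller approach would also use the WordNet hierarchy to link words in nearby rather than identical groups. The `lexical_chains` name and its `min_length` parameter are hypothetical.

```python
from collections import defaultdict

# Content words from the example with their Louw-Nida sub-domains,
# in text order of the English sentence as shown on the slide.
tokens = [
    ("λόγος", "33.F"),
    ("πιστός", "31.I"),
    ("ὀρέγομαι", "25.B"),
    ("ἐπισκοπή", "53.I"),
    ("ἐπιθυμέω", "25.B"),
    ("καλός", "65.C"),
    ("ἔργον", "42.D"),
]


def lexical_chains(annotated, min_length=2):
    """Group lemmas by sub-domain; keep groups with at least min_length words."""
    groups = defaultdict(list)
    for lemma, subdomain in annotated:
        groups[subdomain].append(lemma)
    return {sub: lemmas for sub, lemmas in groups.items()
            if len(lemmas) >= min_length}


chains = lexical_chains(tokens)
print(chains)
```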

Application: Semantic Indexing for Search

• Use the more refined WordNet hierarchy to provide a richer search interface

• Example: “addiction”
  – ESV uses the related adjective in 1Tim.3.8, “not addicted to much wine”
  – Relevant LN meaning for προσέχω is LN.68.19
    • Domain Aspect, Subdomain Continue
    • Gloss: ‘to continue to give oneself to, to continue to apply oneself to.’
  – “addiction”, “addict”, “addicted” not in the LN English index
  – Nothing leads directly back to LN.25.A (Desire, Want, Wish) or LN.25.B (Desire Strongly)

English Applications: Search (2)

• WordNet hierarchy:
  – Addiction (an abnormally strong craving)
    • Craving (an intense desire for some particular thing)
      – Desire (the feeling that accompanies an unsatisfied state)
  – “crave”, “craving” also not in LN English index
    • Though two related terms, νοσέω and ὀρέγομαι, also occur in 1Tim and are translated by ESV as “craving”
• Richer semantic hierarchy “fills in the gaps”
  – Connects with user interest
  – Leads back to relevant semantic groups
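The "fills in the gaps" step can be sketched as a walk up the hypernym chain until a term turns up in the LN English index. The hypernym links below follow the hierarchy on this slide. The index entry for "desire" is an assumption made for illustration (the slides say only that "addiction" and "craving" are absent), and `find_ln_groups` is a hypothetical helper.

```python
# Hypernym links as given on the slide.
HYPERNYM = {"addiction": "craving", "craving": "desire"}

# ASSUMED index fragment: that "desire" leads to these groups is an
# illustration, not a verified entry of the LN English index.
LN_ENGLISH_INDEX = {"desire": ["25.A", "25.B"]}


def find_ln_groups(query, hypernym, index):
    """Climb hypernyms from the query until a term is found in the index.

    Returns the path climbed and the LN groups found (empty if none).
    """
    term = query
    path = [term]
    while term not in index:
        if term not in hypernym:
            return path, []
        term = hypernym[term]
        path.append(term)
    return path, index[term]


path, groups = find_ln_groups("addiction", HYPERNYM, LN_ENGLISH_INDEX)
print(" > ".join(path), "=>", groups)
```

A search for the adjective "addicted" would first need WordNet's derivational links to reach the noun "addiction"; that step is left out here.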

Integration as a Research Strategy

• General benefits of Greek and English resource integration
  – Evaluate Greek results against a larger background
  – Provide the benefits of Greek scholarship to a wider audience
  – Extend narrow resources to a broader corpus

Conclusions

• Cross-lingual integration opens up new possibilities

• Valuable data resources for empirically-based analysis

References

• Fellbaum, C., editor (1998). WordNet: An Electronic Lexical Database.

• Louw, J. P. and Nida, E. A., editors (1989). Greek-English Lexicon of the New Testament: Based on Semantic Domains.

• O'Donnell, M. B. (2005). Corpus Linguistics and the Greek of the New Testament.