Introduction to Humanities Computing


Post on 20-Feb-2016



TRANSCRIPT

  • Introduction to Humanities Computing, Spring 1999, Lecture Six

  • Passport to Tour

    What is the important concept from Computer Confluence, Chapter 6 (devoted to spreadsheets), that is transferable to the world of text analysis?

    Malleable Matrix

  • A tour

    The Dartmouth Dante Database Project (DDP) is still best accessed via Telnet. The address remains:

    library.Dartmouth.EDU

    At the prompt, type:

    connect dante

  • What is an electronic text?

    Can you provide examples?

    What type of electronic text will survive?

  • What is an electronic text?

    Any string of characters

    Any file or document that can be read

    A word processing file

    A text file

  • Types of Electronic Texts

    Literary text

    Linguistic corpus

    Hypermedia work

  • A variety of forms

    WWW site (Hypermedia)

    Myst, Macbeth (Software, Text, and Media)

    MS Word formatted file (Word processing)

    ASCII Text file (aka Flat File)

  • ASCII

    American Standard Code for Information Interchange

    0: NUL (null)

    3: ETX (end of text)

    13: CR (carriage return)

    32: SP (space)

    48: 0

    49: 1

    65: A

    97: a, 98: b, 99: c ...
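The codes on this slide can be checked directly. The following is a minimal sketch using Python's built-in `ord()` and `chr()`; the helper name `ascii_table` is an assumption introduced only for illustration.

```python
def ascii_table(chars):
    """Return (code, character) pairs for the given characters."""
    return [(ord(c), c) for c in chars]

# Printable characters mentioned on the slide.
codes = ascii_table("01Aabc")
for code, char in codes:
    print(code, char)

# Control characters have no visible glyph, so print their repr() instead.
for code in (0, 3, 13, 32):
    print(code, repr(chr(code)))
```

Upper-case and lower-case letters sit exactly 32 codes apart (65 for A, 97 for a), which is why simple case conversion in ASCII is a matter of adding or subtracting 32.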

  • Coding Standards

    ASCII = 7 bits per character, 128 characters

    32 reserved for control codes (printing and transmission information)

    Rest for printing characters

    Extended ASCII = 8 bits (1 byte) per character, 256 characters; the upper ASCII characters are used for special characters, characters with diacritical marks, and ligatures

    UNICODE = 16-bit character set, about 65,000 characters, covering most known languages
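The character counts follow from the number of bits: n bits give 2^n distinct codes. The sketch below works that arithmetic out and shows what happens when a character with a diacritical mark meets 7-bit ASCII. Latin-1 is used as one example of an 8-bit extension; that choice, and the helper names, are assumptions made for illustration.

```python
def code_space(bits):
    """Number of distinct codes that fit in the given number of bits."""
    return 2 ** bits

sizes = {
    "ASCII": code_space(7),
    "Extended ASCII": code_space(8),
    "16-bit Unicode": code_space(16),
}
print(sizes)

def fits_in_ascii(text):
    """True if every character in text has a code below 128."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False

word = "Fran\u00e7ais"  # "Français", written with an escape for the c-cedilla
print(fits_in_ascii("Francais"))
print(fits_in_ascii(word))

# Latin-1 (an assumed example of an 8-bit set) stores each character in one byte.
print(len(word.encode("latin-1")))
```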

  • Why?

    Cross-platform

    Long-term survival of data

    You can use it to encode more complex documents using markup (SGML)

    ASCII Text + Markup = Electronic Representation of Literary Text

  • Encoding

    Welcome

    Welcome to 3F03

    This is the home page for 3F03 Quantitative Methods in the Humanities

    Français

    In HTML, all formatting is provided by codes using ASCII characters
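The tags on the Encoding slide did not survive transcription, so the sketch below uses a hypothetical reconstruction: the element names (`title`, `h1`, `p`) are assumptions, not the original markup. It shows that the whole page, codes included, can be written in plain ASCII, with the non-ASCII letter expressed as the entity `&ccedil;`, and that a parser can separate the codes from the text.

```python
from html.parser import HTMLParser

# Hypothetical reconstruction: tag names are assumed, not taken from the slide.
PAGE = (
    "<html><head><title>Welcome</title></head>"
    "<body><h1>Welcome to 3F03</h1>"
    "<p>This is the home page for 3F03 Quantitative Methods in the Humanities</p>"
    "<p>Fran&ccedil;ais</p></body></html>"
)

class TagCollector(HTMLParser):
    """Record each start tag and each run of text as the parser meets it."""

    def __init__(self):
        super().__init__()  # character references are decoded by default
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.text.append(data)

collector = TagCollector()
collector.feed(PAGE)
collector.close()

print(PAGE.isascii())    # the markup itself contains only ASCII characters
print(collector.tags)    # the formatting codes
print(collector.text)    # the text the codes wrap
```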

  • Content Model

    Text

    Head

    Body

    Title

    Heading

    Paragraph

  • Limits of HTML

    No codes for many of the features: Character, Author, Text type, Sonnet, Lines

    Text analysis software can't handle it

    Languages other than English

  • COCOA Markup

    Continuous tags

    Do not require closing; a new tag changes the value

    Format: (angled brackets < > are delimiters)

    Example

  • COCOA example

    THE MISUNDERSTANDING

    A PLAY IN THREE ACTS

    To my friends of the THEATRE DE L'EQUIPE

    CHARACTERS IN THE PLAY:

    THE OLD MANSERVANT

    ...

    MARIA

    LE MALENTENDU (THE MISUNDERSTANDING) was presented for the first time at the Theatre des Mathurins, Paris, in 1944

  • Example continued

    Noon. The clean, brightly lit public room of an inn. Everything is very spick and span.

    He'll come back.

    Did he tell you so?
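The COCOA tags in this example were lost in transcription, so the sketch below is only an illustration of how continuous tags behave. It assumes the tag shape `<CATEGORY value>`; the categories `T` and `S` and the speaker values `SPEAKER-A` and `SPEAKER-B` are hypothetical placeholders, not the original encoding. Each tag sets a value that stays in force until another tag of the same category replaces it.

```python
import re

# Assumed tag shape: <CATEGORY value>. Categories and speaker values are
# hypothetical; only the two spoken lines come from the slide.
SAMPLE = """<T THE MISUNDERSTANDING>
<S SPEAKER-A>
He'll come back.
<S SPEAKER-B>
Did he tell you so?
"""

TAG = re.compile(r"<(\w+)\s+([^>]+)>")

def parse_cocoa(text):
    """Return (state, line) pairs; this sketch expects each tag on its own line."""
    state = {}
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = TAG.fullmatch(line)
        if match:
            # Continuous tag: no closing tag, the new value simply takes over.
            state[match.group(1)] = match.group(2)
        else:
            rows.append((dict(state), line))
    return rows

rows = parse_cocoa(SAMPLE)
for state, line in rows:
    print(state["S"], "|", line)
```

Because every text line carries a snapshot of the current tag values, a concordance program can report, for each hit, which text and which speaker it belongs to.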

  • Brief History: Text Analysis Tools

    Text-analysis tools grew out of concordances:

    1247, Concordance to the Vulgate Bible, Paris

    1949, Father Busa, Index Thomisticus

    1970s, batch concordancers like OCP

    1989, TACT: interactive concordancers

    1990s, textual visualization

  • What can be done...

    Text-analysis tools provide:

    Speed

    Complex searches

    Reconfigured views

    Statistics

    Researchers can generate custom concordances interactively

  • Concordances and Interpretation

    Concordances provide an alternative arrangement of the text that brings passages together into a concordantia.

    This is an interpretative strategy in which answers are drawn from the text by assembling passages on the subject in question and reading this rearranged text as a meaningful whole.

    A concordance facilitates this rearrangement by providing alternative views.
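The rearrangement described here can be sketched as a keyword-in-context (KWIC) listing. This is a minimal illustration, not how OCP or TACT work internally; the function name `kwic`, the word pattern, and the context width are assumptions, and the sample text is condensed from the slide itself.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left, hit, right) tuples with `width` words either side of each hit."""
    words = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, word in enumerate(words):
        if word.lower() == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            lines.append((left, word, right))
    return lines

# Sample condensed from the slide text above.
TEXT = (
    "Concordances provide an alternative arrangement of the text that brings "
    "passages together. Answers are drawn from the text by assembling passages "
    "on the subject in question and reading this rearranged text as a meaningful whole."
)

hits = kwic(TEXT, "text")
for left, hit, right in hits:
    # Right-align the left context so the keyword lines up in a column.
    print(f"{left:>25} [{hit}] {right}")
```

The fixed `width` makes one of the slide's later questions concrete: where a passage starts and ends around a word is a choice made by whoever builds the concordance.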

  • Types of Text-Analysis

    Stylistic: describing an author's style and comparing it; authorship studies

    Linguistic: creating a representative corpus; describing linguistic use (diachronic or synchronic)

    Thematic: finding patterns (words and phrases) in a text; following themes through a work; comparing themes

    Demands a reiterative reading

  • Problematic equations

    That a theme is the passages where a set of words appears: Can themes be identified by key words? What about ambiguous words?

    That concording passages into a new text is an acceptable interpretative strategy: Where does the passage start and end around a word? Is reading a rearranged text appropriate?

    That the distribution of words indicates the progress of a theme: Does the number of hits indicate the intensity of a theme?
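The third equation, that the distribution of words tracks a theme, is what a distribution count measures. The sketch below divides a text into equal slices and counts keyword hits in each; the function name, the number of segments, and the toy text are all assumptions made up for illustration. Whether those counts say anything about the intensity of a theme is exactly the question the slide raises.

```python
import re

def distribution(text, keywords, segments=4):
    """Count keyword hits in each of `segments` equal slices of the text."""
    words = [w.lower() for w in re.findall(r"[A-Za-z']+", text)]
    targets = {k.lower() for k in keywords}
    counts = [0] * segments
    for i, word in enumerate(words):
        if word in targets:
            # Map word position i onto a segment index from 0 to segments - 1.
            counts[i * segments // len(words)] += 1
    return counts

# Made-up toy text, not drawn from any real work.
TOY = "sea sea land land land sea land land"
counts = distribution(TOY, ["sea"], segments=4)
print(counts)
```

The count treats every occurrence alike, so an ambiguous word contributes a hit whichever sense is meant.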

  • What's the connection?

    Interpretation (Understanding)

    Surface Measurement (Quantification)

  • Two Views

    Text-analysis is about proving things about texts

    Stylistic analysis provides reproducible descriptions of an author's style

    Measurement of surface features allows us to prove more interesting points

    Reaction to impressionistic, reader-oriented literary theory

    Text-analysis is the rereading of a text in ways that help one better understand it

    Text-analysis is only one of many strategies

    Text-analysis reveals anomalies to be researched

    Text-analysis is useful precisely because the computer can't do well what human readers do well, and can do other things well

  • E-Text Research Project

    (Diagram labels) Planning, Prototyping, Scanning or Buying, Proofing, Traditional Research, Markup, Interactive Study, Planning Phase, Implementation Phase, Research Phase, Publication

  • Obtaining an E-text

    Acquire one from someone else: Oxford Text Archive; search the Internet using WWW; commercial vendors

    Create it yourself: scan it using OCR (Optical Character Recognition) software; type it in or hire services for input

    Markup

    Validate

    Quick Q&A

    Have they done evening classes before?

    Spring Session?

    Group work?
