documentation unstructured

Upload: meddou

Post on 07-Jul-2018


  • 8/18/2019 Documentation Unstructured

    Data Management:

    Documentation & Metadata

    Types of Documentation

    [Figure: Data Life Cycle diagram. Stage labels: Re-Purpose, Re-Use, Deposit, Data Collection, Data Analysis, Data Sharing, Proposal Planning Writing, Data Discovery, End of Project, Data Archive, Project Start Up]


    Data Documentation (Metadata)

    • Informal or formal methods to describe your data

    • Important if you want to reuse your own data in the future

    • Also necessary when sharing your data


    You're already documenting your data

    • Notebook

     – Paper

     – Digital

     – Lab

    • Folders with notes, text files

    • Sources, experiments or surveys, procedures, etc.


    Documentation in Research

    Project Documentation

    • Context of data collection

    • Data collection methods

    • Structure, organization of data files

    • Data sources used

    • Data validation, quality assurance

    • Transformations of data from the raw data through analysis

    • Information on confidentiality, access and use conditions

    Dataset Documentation

    • Variable names and descriptions

    • Explanation of codes and schemas used

    • Algorithms used to transform data

    • File format and software (including version) used
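One of the dataset-level items above, the file format and software version used, can be captured automatically instead of typed by hand. The following is a minimal Python sketch under stated assumptions: the function name `software_record` and the key names in the record are hypothetical, chosen only for illustration.

```python
import json
import platform


def software_record(file_format: str) -> dict:
    """Collect file format and software version details for dataset notes."""
    return {
        "file_format": file_format,  # supplied by the person documenting the data
        "software": "Python",
        "software_version": platform.python_version(),  # version of the running interpreter
        "operating_system": platform.system(),
    }


# Hypothetical usage: the data files are assumed to be CSV.
record = software_record("CSV")
print(json.dumps(record, indent=2))
```

The printed record can be pasted into a readme or stored next to the data files.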


    Types of Documentation

    Documentation for understanding & re-use

    • Readme File

    • Data Dictionary

    • Codebook


    ReadMe

    • Describes the core documentation about an investigation and its data files

    • Typically a simple text file

    • Can describe the individual file(s) and/or data package as a whole
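As a rough illustration of a readme as a simple text file that describes both the package and its individual files, here is a minimal Python sketch. The dataset title, description, and file names are hypothetical, and `build_readme` is an illustrative helper, not a standard tool.

```python
import tempfile
from pathlib import Path


def build_readme(title: str, description: str, files: dict) -> str:
    """Return plain-text readme content for a data package."""
    lines = [title, "=" * len(title), "", description, "", "Files:"]
    # One line per file: what it is and what it contains.
    for name, purpose in files.items():
        lines.append(f"  {name}: {purpose}")
    return "\n".join(lines) + "\n"


# Hypothetical data package used only for illustration.
readme_text = build_readme(
    "Example survey dataset",
    "Responses collected for an illustrative study.",
    {
        "responses.csv": "one row per respondent",
        "codes.txt": "explanation of response codes",
    },
)

# Write the readme into a temporary folder standing in for the data package.
readme_path = Path(tempfile.mkdtemp()) / "README.txt"
readme_path.write_text(readme_text, encoding="utf-8")
print(readme_text)
```

Keeping the readme as plain text means it can be opened without any special software.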


    ReadMe Example - Dataset


    Data Dictionary

    • Provides definitions of the data fields in a data file

    • More details on the variables, observations of a file

    • Used to understand the data and the databases that contain it

    • Identifies data elements and their attributes including names, definitions and units of measure and other information

    • Often they are organized as a table
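A data dictionary organized as a table, with one row per data element giving its name, definition, and unit of measure, could be sketched in Python as follows. The field names (`site_id`, `temp_c`, `depth_m`) and both helper functions are hypothetical and serve only as an illustration.

```python
import csv
import io

# Hypothetical data elements with their attributes.
DATA_DICTIONARY = [
    {"name": "site_id", "definition": "Identifier of the sampling site", "unit": ""},
    {"name": "temp_c", "definition": "Water temperature at sampling time", "unit": "degrees Celsius"},
    {"name": "depth_m", "definition": "Depth at which the sample was taken", "unit": "metres"},
]


def dictionary_to_csv(entries: list) -> str:
    """Render the data dictionary as a CSV table, one row per data element."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["name", "definition", "unit"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


def undocumented_fields(data_header: list, entries: list) -> list:
    """List columns of a data file that have no entry in the dictionary."""
    documented = {entry["name"] for entry in entries}
    return [field for field in data_header if field not in documented]


table = dictionary_to_csv(DATA_DICTIONARY)
print(table)

# Check a (hypothetical) data file header against the dictionary.
missing = undocumented_fields(["site_id", "temp_c", "depth_m", "notes"], DATA_DICTIONARY)
print(missing)
```

The second helper shows one practical use of the table: spotting data fields that still lack a definition.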


    Data Dictionary Example


    What is a Codebook?

    • Typical in social sciences research

    • Includes elements similar to readme and dictionary

     – Project level information (e.g. survey design and methodology)

     – Response codes for each variable

     – Codes used to indicate nonresponse and missing data

    http://www.icpsr.umich.edu/icpsrweb/ICPSR/support/faqs/2006/0/what-is-codebook


    What is a Codebook?

    • Additionally, codebooks may also contain:

     – A copy of the survey questionnaire (if applicable)

     – Exact questions and skip patterns used in a survey

     – Frequencies of response

    • Quite long

    http://www.icpsr.umich.edu/icpsrweb/ICPSR/support/faqs/2006/0/what-is-codebook
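To make the codebook elements concrete (response codes for a variable, codes for nonresponse and missing data, and frequencies of response), here is a minimal Python sketch. The variable `q1_satisfaction`, its question text, and all code values are hypothetical; real codebooks define their own conventions.

```python
from collections import Counter

# Hypothetical codebook entry for one survey variable.
CODEBOOK = {
    "q1_satisfaction": {
        "question": "How satisfied are you with the service?",
        "response_codes": {1: "Dissatisfied", 2: "Neutral", 3: "Satisfied"},
        "missing_codes": {-8: "Skipped by design", -9: "No response"},
    }
}


def response_frequencies(variable: str, values: list) -> dict:
    """Count how often each documented code occurs, keyed by its label.

    Values that are not listed in the codebook are ignored here.
    """
    entry = CODEBOOK[variable]
    labels = {**entry["response_codes"], **entry["missing_codes"]}
    counts = Counter(values)
    return {label: counts.get(code, 0) for code, label in labels.items()}


def valid_values(variable: str, values: list) -> list:
    """Drop values that the codebook marks as nonresponse or missing."""
    missing = CODEBOOK[variable]["missing_codes"]
    return [value for value in values if value not in missing]


raw = [3, 1, -9, 3, 2, -8, 3]  # hypothetical recorded answers
freq = response_frequencies("q1_satisfaction", raw)
valid = valid_values("q1_satisfaction", raw)
print(freq)
print(valid)
```

Without the codebook, a value such as -9 could be mistaken for a real answer, which is why the missing-data codes are documented alongside the response codes.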


    Other Examples of Data Documentation

    • Lab notebooks

    • Software syntax

    • Programming code

    • Instrument settings and/or calibration

    • Provenance of sources of data

    • Embedded metadata (e.g. EXIF, FITS)