How and why to document data for long-term storage; and What's special about Geographical data? Allan Reese Cefas Weymouth

Page 1: How and why to document data for  long-term storage; and What's special about Geographical data?

How and why to document data for long-term storage;

and

What's special about Geographical data?

Allan Reese, Cefas Weymouth

Page 2

Cefas buildings

Weymouth

Burnham-on-Crouch

Page 3

Fish disease

• Causes and progress of infection
• Diagnostics (viruses, bacteria, fungi, parasites)
• Vaccines and therapeutics (safety and efficacy)
• Epidemiology & risk assessment
• Surveillance and control – Fish Health Inspectors
• Emerging and exotic diseases
• Policy advice

Spring Viremia of Carp Virus

Fish farms in E&W

Page 4

Who wants a database?

• I’ve got some data so I need a database
• Our demo will show you how easy it is to simultaneously search, share and retrieve information from thousands of library databases
• Project… plans to build, through networking, a database on best practices in the field
• Rapid growth in the quantity of omic data means bio-informaticians need to manage data in an efficient and reliable manner. The main focus of this course is on designing, creating and querying relational databases

Page 5

Why a (relational) database?

1. large volume of data (typically gigabytes)
2. complex data structure (not matching standard application)
3. long-term use / continued accumulation or incremental update
4. total accuracy & consistency needed on micro-scale
5. frequent accesses to small subsets, ad hoc queries
6. data shared by more than one person

(University Computing 1991; Significance Dec 2007)

Page 6

Extract for analysis

• Fields (variables) = columns
• Units (level of analysis) = rows
• Columns x Rows = Data table

Query -> view -> table of data -> summary or analysis
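
The chain query -> view -> table of data -> summary can be sketched with Python's built-in sqlite3 module. This is only an illustration: the tables `site` and `sample`, the view `sample_flat` and all values are hypothetical, invented for the example.

```python
import sqlite3

# Hypothetical schema and values, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE site (site_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sample (
        sample_id INTEGER PRIMARY KEY,
        site_id   INTEGER REFERENCES site(site_id),
        length_mm REAL
    );
    INSERT INTO site VALUES (1, 'Site A'), (2, 'Site B');
    INSERT INTO sample VALUES (1, 1, 120.0), (2, 1, 130.0), (3, 2, 200.0);

    -- Query -> view: one row per sample, the chosen level of analysis
    CREATE VIEW sample_flat AS
        SELECT s.sample_id, t.name AS site_name, s.length_mm
        FROM sample AS s JOIN site AS t ON t.site_id = s.site_id;
""")

# View -> table of data: columns are fields, rows are units
cursor = conn.execute("SELECT * FROM sample_flat ORDER BY sample_id")
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()

# Table of data -> summary: mean length per site
lengths_by_site = {}
for _sample_id, site_name, length_mm in rows:
    lengths_by_site.setdefault(site_name, []).append(length_mm)
summary = {site: sum(v) / len(v) for site, v in lengths_by_site.items()}

print(columns)
print(summary)
```

Once the view has been flattened into `rows`, the analysis step no longer depends on the database; the same table could equally be written to a file and read by other software.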

Page 7

Mystery meat

• What tables form the raw data?
• What fields are in each table?
• Data dictionary?
• Documenting meanings or DB structure?
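
For a SQLite file, the first two questions can be answered from the database itself, as in the sketch below. The helper `describe_database` and the table `catch` are hypothetical names used only for this example; other database systems expose their catalogues differently.

```python
import sqlite3

def describe_database(conn):
    """Return {table_name: [(column_name, declared_type), ...]}."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    structure = {}
    for table in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk)
        info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        structure[table] = [(col[1], col[2]) for col in info]
    return structure

# Hypothetical table, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE catch (catch_id INTEGER, species TEXT, weight_kg REAL)")

structure = describe_database(conn)
for table, fields in structure.items():
    print(table, fields)
```

This recovers the DB structure only. What `weight_kg` means, how it was measured and who recorded it still has to be written down by a person, which is the job of a data dictionary.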

Page 8

Table preferred when

• Scientific data probably SHOULD NOT be changed
  – or data added in batches (incremental)
• Structure NOT complex
  – replication across units allowed, but not excessive
• Levels of analysis are few (or few dominant)
• Analyses summarize whole data or samples
  – often one-offs (bespoke or user-written)
• Sorting or indexing allows very rapid access

Page 9

Data table needs metadata

• Metadata standards (Dublin Core)
  – emphasis on discovery
  – list many fields
  – codebook not mentioned
• A modest suggestion
  – data table of rows and columns, with column headers
  – codebook: another table to explain headers
  – metadata: describe background, ownership etc
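
One possible reading of the modest suggestion is three small files stored side by side: the data table, a codebook table and a metadata record. The sketch below assumes CSV and JSON as the file formats; the file names, column names and values are all hypothetical, and the slides do not prescribe any of them.

```python
import csv
import json
import tempfile
from pathlib import Path

# Hypothetical content, invented for illustration.
data_rows = [
    {"site_id": "S01", "temp_c": 11.5},
    {"site_id": "S02", "temp_c": 12.1},
]
codebook_rows = [
    {"column": "site_id", "meaning": "Sampling site identifier", "units": ""},
    {"column": "temp_c", "meaning": "Water temperature", "units": "degrees Celsius"},
]
metadata = {
    "title": "Example survey",
    "owner": "(owner name)",
    "background": "(why and how the data were collected)",
}

def write_table(path, rows):
    """Write a list of dicts as a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)

def undocumented_columns(data_rows, codebook_rows):
    """Return data columns that have no entry in the codebook."""
    documented = {row["column"] for row in codebook_rows}
    return [col for col in data_rows[0] if col not in documented]

out_dir = Path(tempfile.mkdtemp())
write_table(out_dir / "data.csv", data_rows)
write_table(out_dir / "codebook.csv", codebook_rows)
(out_dir / "metadata.json").write_text(json.dumps(metadata, indent=2), encoding="utf-8")

missing = undocumented_columns(data_rows, codebook_rows)
print("Undocumented columns:", missing)
```

Because the codebook is itself a table, a check such as `undocumented_columns` can be run whenever a new batch of data is added.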

Page 10

Geographical Databases

Page 11

ESRI (ArcInfo) assumes

• The purpose of a GIS is to provide a spatial framework to support decisions …
• Most often, a GIS presents information in the form of maps and symbols …
• A map user is the end consumer of a GIS. This person looks at maps …
• When the Cassini spacecraft was launched, GIS was used to evaluate the risk of an accident with the plutonium generators on board

Page 12

Nearer to me

Page 13

GISs contain

• Data as points, lines, areas
• Location data
  – lat/long, grid refs, postcodes, toids
• Representation instructions
  – scaling, icons, label position, shading

Page 14

Can you get data out?

• Point and click works for pop-up labels
  – not to output a table
• Limited to the precision of the input device, including the user’s eyesight
• I want, probably, a whole layer of data, including the positions as named fields

How do my needs map into the database?

Page 15

Lacking / hidden / difficult in GIS

• List fields associated with physical object
• Choose many objects and output data
  – eg to make proximity matrix
• Distinguish raw from constructed data
  – point-heights versus interpolated contour
• Output data values for an area
  – eg sea surface temperatures
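
Once a layer of points has been extracted with its positions as named fields, a proximity matrix takes only a few lines outside the GIS. The sketch below assumes lat/long in decimal degrees and uses the haversine great-circle formula on a spherical Earth of radius 6371 km. The point names and coordinates are hypothetical, and a real analysis might need a projected grid or a different distance measure.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two lat/long points (spherical Earth)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against rounding error before taking asin
    return 2 * radius_km * math.asin(math.sqrt(min(1.0, a)))

def proximity_matrix(points):
    """Return (names, matrix) where matrix[i][j] is the distance between points i and j."""
    names = [p["name"] for p in points]
    matrix = [
        [haversine_km(a["lat"], a["lon"], b["lat"], b["lon"]) for b in points]
        for a in points
    ]
    return names, matrix

# Hypothetical extracted layer: positions held as named fields.
points = [
    {"name": "P1", "lat": 50.0, "lon": -2.0},
    {"name": "P2", "lat": 51.0, "lon": -2.0},
    {"name": "P3", "lat": 50.0, "lon": 0.0},
]

names, matrix = proximity_matrix(points)
for name, row in zip(names, matrix):
    print(name, [round(d, 1) for d in row])
```

The calculation is trivial once the data are in a table; the difficulty described above lies in getting the table out of the GIS in the first place.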

Page 16

Request

GIS suppliers may prefer to address users’ needs by adding yet more features to the interface, or pointing to the SQL interface

I would rather they re-consider the role of the GIS as a data warehouse, from which it should be easier to select and extract data that can be analysed in other software