GEOG 495 GIS Database Design: Midterm Review
Outline
1. Database Concepts
2. Relational Database
3. Object-oriented Database
4. Entity-Relationship Diagram
5. Unified Modeling Language
6. Normalization
1. Database concepts
• Data vs. Information
• Data vs. Database
• DBMS vs. Database system
• Level of abstraction
• Data independence
• Different database models
• Architecture of database system
Data vs. Information
• Data: raw fact
• Information: data processed to reveal meaning
• Database system transforms data into information through queries
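A minimal sketch of the last bullet, using Python's built-in sqlite3 module. The `parcel` table, its columns, and its values are invented for illustration. The stored rows are raw facts (data); the aggregate query processes them into something meaningful (information).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parcel (id INTEGER PRIMARY KEY, land_use TEXT, area_ha REAL)"
)

# Data: raw facts, one row per parcel (hypothetical values)
conn.executemany(
    "INSERT INTO parcel (land_use, area_ha) VALUES (?, ?)",
    [("residential", 1.5), ("residential", 2.5), ("commercial", 4.0)],
)

# Information: the query summarizes total area per land use
summary = dict(
    conn.execute(
        "SELECT land_use, SUM(area_ha) FROM parcel GROUP BY land_use"
    ).fetchall()
)
print(summary)
```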
Data vs. Database
• Database is a collection of related data
• Database stores data in an organized manner (i.e. minimal data redundancy)
• Database exists, first and foremost, to serve users’ requirements
DBMS vs. DB system
• DBMS allows users to access data (i.e. interface between users and database)
• DB system is composed of DBMS, people, database, and procedures
Different types of DB system
• Number of users
– Single-user
– Multi-user
– Workgroup
– Enterprise
• Database site location
– Centralized
– Distributed
• Database use
– Transactional (production)
– Data warehouse
Level of abstraction
• Conceptual: independent of s/w and h/w
– How humans see the world
• Logical: s/w dependent
– How programs see the world
• Physical: h/w dependent
– How machines see the world
[Figure: spectrum from real-world view (abstract) to machine code (concrete); axis labels: representation, detail]
Data independence
• Separation of program from data
• Program’s ability to retrieve data without changing the structure of code
• Logical data independence
– Program’s ability to retrieve data without changing the structure of s/w-specific code
– When a system uses a 4GL (or non-procedural language)
• Physical data independence
– Program’s ability to retrieve data without changing the structure of h/w-specific code
– When a system uses a 3GL or lower (or procedural language)
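One way to read these bullets, sketched with Python's sqlite3 module: a program that asks for data declaratively, by column name, keeps working unchanged when the structure of the stored data changes. The `city` table and the `city_names` function are hypothetical, and this shows only one narrow aspect of data independence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO city (name) VALUES (?)", [("Tacoma",), ("Seattle",)])


def city_names(connection):
    """The 'program': it names the columns it needs and nothing else."""
    rows = connection.execute("SELECT name FROM city ORDER BY name")
    return [name for (name,) in rows]


before = city_names(conn)

# The structure of the data changes: a new column is added to the table.
conn.execute("ALTER TABLE city ADD COLUMN population INTEGER")

# The program's code is untouched and still retrieves the same data.
after = city_names(conn)
```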
Hierarchical DB model
• The world is represented with tree-like structure
• Only one-to-many relationships (i.e. parent-child relationship) are allowed
• Relationships between entities are built through reference point (e.g. pointer)
• Logical data independence (yes)
• Physical data independence (no)
Network DB model
• The world is represented with web-like structure
• Many-to-many relationships are allowed
• Relationships between entities are built through reference point (e.g. pointer)
• Logical data independence (yes)
• Physical data independence (no)
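The pointer-based structure of the hierarchical and network models can be sketched in plain Python. The `Record` class, the `link` function, and the sample records are hypothetical; real hierarchical and network DBMSs manage such pointers internally. The only difference shown here is the rule on how many parents a record may have.

```python
class Record:
    def __init__(self, name):
        self.name = name
        self.parents = []   # pointers to owner (parent) records
        self.children = []  # pointers to member (child) records


def link(parent, child, model):
    """Connect two records by pointer, enforcing the model's rule."""
    if model == "hierarchical" and child.parents:
        raise ValueError(f"{child.name} already has a parent")
    parent.children.append(child)
    child.parents.append(parent)


# Hierarchical: a tree, every child has exactly one parent (1:M only)
state, county, city = Record("state"), Record("county"), Record("city")
link(state, county, "hierarchical")
link(county, city, "hierarchical")
try:
    link(state, city, "hierarchical")
    hierarchical_rejects_second_parent = False
except ValueError:
    hierarchical_rejects_second_parent = True

# Network: a web, a record may have several owners (M:N allowed)
course_x, course_y = Record("course X"), Record("course Y")
student_a, student_b = Record("student A"), Record("student B")
link(course_x, student_a, "network")
link(course_y, student_a, "network")
link(course_x, student_b, "network")
```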
Relational DB model
• The world is represented with entities and relationships
• M:N relationships are conceptually allowed, but are implemented by transforming them into 1:M relationships (e.g. through composite entities, a.k.a. bridge entities)
• Relationships are built through keys
• Logical data independence (yes)
• Physical data independence (yes)
Object-oriented DB model
• The world is represented with a collection of objects
• Object embeds attributes, operations, and relationships
• Complex objects can be represented through abstract data type
• Embody OO concepts such as encapsulation, inheritance and polymorphism
Architecture of DB system
1. External view: user’s view, local (incomplete)
2. Conceptual/logical view: designer’s view, global
3. Internal view: programmer’s view
4. Physical view: implementation view, contains the most detail
• Organized by level of abstraction
• Data independence is embodied by the proper separation between the four layers
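As a rough illustration of the external view being local and incomplete, the sketch below defines an SQL view over a base table using Python's sqlite3 module. The `employee` table and the `staff_directory` view are hypothetical names. The user of the view sees only part of what the designer's global schema holds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual/logical level: the designer's global table
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary REAL);
    INSERT INTO employee VALUES (1, 'Ana', 52000), (2, 'Ben', 48000);

    -- External level: one user's local, incomplete view (no salary)
    CREATE VIEW staff_directory AS SELECT id, name FROM employee;
""")

cursor = conn.execute("SELECT * FROM staff_directory ORDER BY id")
view_columns = [col[0] for col in cursor.description]
view_rows = cursor.fetchall()
```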
2. Relational database
Representation
• The world is viewed as entities and relationships
• Entities are modeled as tables
• Relationships are built through common attributes between entities
Table
• Row represents a single entity
• Column represents attribute
• Cell represents a single value in the intersection between row and column
Key
• One or more attributes (columns) that determine other attributes
• Primary key
– Uniquely identifies an entity (should be unique)
– Should be Not Null (non-empty)
• Foreign key
– Common attributes that link one table to another table
– Placed in the M-side table, referencing the primary key of the 1-side table
Integrity rules
• Entity integrity
– Each entity (record) must be uniquely identified (by primary key: PK)
– PK should be Not Null for entity integrity to be enforced
• Referential integrity
– One table must reference another table properly (by foreign key: FK)
– Each FK value should match an existing PK value in the referenced table (or be null where the relationship is optional) for referential integrity to be enforced
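The two rules can be watched in action with Python's sqlite3 module. This is a sketch under assumptions: the `county` and `city` tables and the `rejected` helper are invented, and SQLite only checks foreign keys after `PRAGMA foreign_keys = ON`. Each statement that would break a rule is refused by the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.execute("CREATE TABLE county (code TEXT PRIMARY KEY NOT NULL, name TEXT)")
conn.execute("""
    CREATE TABLE city (
        id          INTEGER PRIMARY KEY,
        name        TEXT,
        county_code TEXT REFERENCES county(code)  -- FK on the M side
    )
""")
conn.execute("INSERT INTO county VALUES ('K01', 'King')")


def rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Entity integrity: the PK must be unique and Not Null
duplicate_pk = rejected("INSERT INTO county VALUES ('K01', 'Duplicate')")
null_pk = rejected("INSERT INTO county VALUES (NULL, 'Nameless')")

# Referential integrity: the FK must point at an existing PK value
dangling_fk = rejected(
    "INSERT INTO city (name, county_code) VALUES ('Nowhere', 'ZZZ')"
)
valid_fk = not rejected(
    "INSERT INTO city (name, county_code) VALUES ('Seattle', 'K01')"
)
```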
Relationships
• M:N
– Yields data redundancy
– Composite (or bridge) entities are needed to transform it into 1:M
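A sketch of the M:N transformation using Python's sqlite3 module. The student/course/enrollment schema and the two query functions are illustrative assumptions. The bridge table `enrollment` sits on the M side of two 1:M relationships, and its composite primary key is built from the two foreign keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course  (course_id  TEXT PRIMARY KEY, title TEXT);

    -- Bridge entity: each row pairs one student with one course
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL REFERENCES student(student_id),
        course_id  TEXT    NOT NULL REFERENCES course(course_id),
        PRIMARY KEY (student_id, course_id)
    );

    INSERT INTO student VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO course VALUES ('G1', 'Intro GIS'), ('G2', 'GIS Databases');
    INSERT INTO enrollment VALUES (1, 'G1'), (1, 'G2'), (2, 'G2');
""")


def courses_of(student_name):
    """Follow student -> enrollment -> course."""
    rows = conn.execute(
        """SELECT c.title
           FROM student s
           JOIN enrollment e ON e.student_id = s.student_id
           JOIN course c ON c.course_id = e.course_id
           WHERE s.name = ?
           ORDER BY c.title""",
        (student_name,),
    )
    return [title for (title,) in rows]


def students_in(course_id):
    """Follow course -> enrollment -> student."""
    rows = conn.execute(
        """SELECT s.name
           FROM enrollment e
           JOIN student s ON s.student_id = e.student_id
           WHERE e.course_id = ?
           ORDER BY s.name""",
        (course_id,),
    )
    return [name for (name,) in rows]
```

Student and course names are each stored once; only the pairing is repeated in the bridge table.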
3. Object-oriented DB
• Object
• Difference between object in OODB and entity in RDB
• Object and class
• OO concepts
Object
• Object has
– OID (identity): system-generated
– Instance variables (attributes): ADT allowed, thus the representation of complex entities is possible
– Methods (operations): let objects act on their own data, thus entities become autonomous
Objects (OODB) vs. Entities (RDB)
Objects, unlike entities:
• Identity is not state-dependent
– Because OID is system-generated
• Relationship is embedded in the object
– Because objects store the reference to other objects in themselves
• Autonomous
– Because objects can use methods stored in class
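Python objects can serve as a loose analogy for these three points. In this sketch the built-in `id()` stands in for a system-generated OID (an assumption for illustration; an OODB assigns and persists its own OIDs), and the `Parcel` class is hypothetical.

```python
class Parcel:
    def __init__(self, owner, area_ha):
        self.owner = owner
        self.area_ha = area_ha
        self.neighbors = []  # relationship embedded as object references

    def transfer(self, new_owner):
        """Method stored in the class; every object can use it."""
        self.owner = new_owner


a = Parcel("Lee", 2.0)
b = Parcel("Lee", 2.0)

# Two objects with identical state are still two distinct objects.
same_state = (a.owner, a.area_ha) == (b.owner, b.area_ha)
same_object = a is b

# Changing the state does not change the identity.
oid_before = id(a)
a.transfer("Kim")
oid_after = id(a)

# The relationship lives inside the object, not in a key column.
a.neighbors.append(b)
```

In a relational table, by contrast, two rows with the same key value would be the same entity, and changing the key value would change which entity the row denotes.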
Object and Class
• Class is a collection of objects with similar attributes and behaviors
• Object is an instance of class from conceptual point of view
• Class is the template for objects from implementation point of view (i.e. an object is implemented through its class: e.g. an object uses the methods stored in its class)
• Objects are organized by class hierarchy
OO Concepts
• Encapsulation
– information can be selectively hidden
– enhances integrity
• Inheritance
– subclass can inherit common properties from superclass
– enhances modularity
• Polymorphism
– operation can take many forms depending on characteristics (through method overriding)
– enhances flexibility
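The three concepts in a short Python sketch. The `Feature`, `Road`, and `River` classes are hypothetical. Note that Python hides attributes by convention (a leading underscore plus a read-only property) rather than by strict enforcement.

```python
class Feature:
    def __init__(self, name):
        self._name = name  # encapsulation: internal by convention

    @property
    def name(self):
        """Read-only access; there is no setter, so assignment fails."""
        return self._name

    def describe(self):
        return f"{self.name}: generic feature"


class Road(Feature):  # inheritance: Road reuses Feature's name handling
    def __init__(self, name, lanes):
        super().__init__(name)
        self.lanes = lanes

    def describe(self):  # polymorphism: the method is overridden
        return f"{self.name}: road with {self.lanes} lanes"


class River(Feature):  # inherits describe() unchanged
    pass


features = [Road("I-5", 4), River("Cedar")]
descriptions = [f.describe() for f in features]  # same call, different forms

try:
    features[0].name = "renamed"
    name_is_protected = False
except AttributeError:
    name_is_protected = True
```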
4. Entity-Relationship Diagram
• Attributes
• Relationships
– Connectivity
– Cardinality
– Participation (optionality)
– Strength
– Degree
• Entities
– Composite
– Weak
– Subtype/supertype
Attribute
• Simple/Composite
– Simple: cannot be subdivided (e.g. Sex)
– Composite: can be subdivided (e.g. Name = First Name + Last Name; Address = street + city + state + zipcode)
• Single-valued/Multi-valued
– Single-valued: can have only a single value (e.g. age)
– Multi-valued: can have multiple values (e.g. educational attainment: the number of degrees differs from person to person; address: you can live in many different places, such as a permanent address, a local address, a vacation home, and so on)
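The four attribute kinds can be pictured with Python dataclasses. The `Name` and `Person` classes and the sample values are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Name:  # composite attribute: subdivided into parts
    first: str
    last: str


@dataclass
class Person:
    sex: str    # simple: cannot be subdivided
    name: Name  # composite: first + last
    age: int    # single-valued: exactly one value
    addresses: List[str] = field(default_factory=list)  # multi-valued


p = Person(
    sex="F",
    name=Name("Ana", "Lee"),
    age=30,
    addresses=["permanent address", "local address"],
)
```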
Relationships
• Connectivity: 1:1, 1:M, M:N
• Cardinalities: the number of entity occurrences associated with one occurrence of the related entity
• Participation: optional/mandatory, determined by cardinalities
• Strength: existence-dependency + PK derived from other table
• Degree: the number of entities associated with the relationship
Entity
• Composite (bridge)
– entity that represents relationship between entities (e.g. enrollment)
• Weak
– when the relationship is strong (e.g. dependent)
• Supertype/subtype
– characteristics shared by subtype entities are generalized in the supertype entity (e.g. employee/secretary)
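A sketch of a weak entity in Python's sqlite3 module, following the dependent example. The schema is hypothetical, and `ON DELETE CASCADE` is one possible way to express existence-dependency, not the only one. The dependent's primary key includes the employee's primary key, and a dependent row cannot outlive its employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for FK actions in SQLite
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);

    -- Weak entity: PK is partly derived from the parent's PK
    CREATE TABLE dependent (
        emp_id INTEGER NOT NULL
               REFERENCES employee(emp_id) ON DELETE CASCADE,
        dep_no INTEGER NOT NULL,
        name   TEXT,
        PRIMARY KEY (emp_id, dep_no)
    );

    INSERT INTO employee VALUES (1, 'Ana');
    INSERT INTO dependent VALUES (1, 1, 'Child A'), (1, 2, 'Child B');
""")

count_before = conn.execute("SELECT COUNT(*) FROM dependent").fetchone()[0]

# Removing the employee removes the existence-dependent rows too.
conn.execute("DELETE FROM employee WHERE emp_id = 1")
count_after = conn.execute("SELECT COUNT(*) FROM dependent").fetchone()[0]
```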
5. Unified Modeling Language
What is UML?
• Standardized modeling language for OO system design & analysis
• UML notation 1.0 was formed in 1996, version 2.0 as of 2005
• Graphic notations: in between natural language (too imprecise) and programming language (too precise, thus too much detail)
• Use different diagrams depending on different perspectives (conceptual, logical, physical)
Why UML?
• Let’s make OO system design unified
• Let’s make OO system design visual and easy-to-learn
• Let’s make OO system design independent of different programming languages
• Let’s promote good things about OO principles
– Modularity/code reusability
• Let’s make system extensible
– Stereotype, tagged value, constraint
• Let’s make model interchange easier
– XMI (XML Metadata Interchange)
UML Diagrams
• Behavior diagram
– Describe behavioral aspect of system
– Use case diagram, activity diagram
• Structure diagram
– Show the static structure of the model
– Class diagram, package diagram
– Component diagram, deployment diagram
• Interaction diagram
– Represents different aspects of interaction
– Sequence diagram, collaboration diagram
Class Diagram (overview)
• Shows the static structure of object-oriented database or database that is implemented in OO system
• Equivalent to ERD with some differences such as operation and more semantics on relationships
• Can be seen from three different perspectives (conceptual, specification, implementation)
Class Diagram: class
• Class is drawn as a box with three compartments (name, attribute, operation)
• Naming notation of attributes
– [visibility] name: data type [= initial-value]
• Naming notation of operations
– [visibility] name (parameter-list: data type): [return value type]
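To connect the notation to code, the sketch below takes a made-up class whose compartments would read `Parcel` / `+ owner: str`, `- area_ha: float = 0.0` / `+ resize(delta: float): float` and writes it as a Python class. The mapping of UML visibility to Python naming (`-` to a leading underscore) is a convention assumed here, not something UML prescribes.

```python
class Parcel:                                # name compartment
    def __init__(self, owner: str) -> None:
        # attribute compartment
        self.owner: str = owner              # + owner: str
        self._area_ha: float = 0.0           # - area_ha: float = 0.0

    # operation compartment
    def resize(self, delta: float) -> float:  # + resize(delta: float): float
        """Change the area by delta and return the new area."""
        self._area_ha += delta
        return self._area_ha


p = Parcel("Lee")
new_area = p.resize(1.5)
```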
Class Diagram: relationship
• Association
• Aggregation: part-whole relation
• Composition: strong form of aggregation
• Generalization: general/unique properties
• Dependency: implementation of one class is dependent on another class
• Multiplicities: # participants associated with relationship
• Navigability: shows the direction of navigation between classes
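The difference between aggregation and composition is mostly about who owns the parts. Python has no language-level distinction, so the sketch below shows it only by convention, with hypothetical classes: a `Building` creates and owns its `Room` objects (composition), while a `Map` merely refers to `Layer` objects that exist independently (aggregation).

```python
class Room:
    def __init__(self, label):
        self.label = label


class Building:
    """Composition: the whole creates its parts and owns them."""

    def __init__(self, room_labels):
        self.rooms = [Room(label) for label in room_labels]


class Layer:
    def __init__(self, name):
        self.name = name


class Map:
    """Aggregation: the whole refers to parts created elsewhere."""

    def __init__(self, layers):
        self.layers = list(layers)


building = Building(["101", "102"])  # rooms exist only inside the building

roads = Layer("roads")
city_map = Map([roads])
shared = city_map.layers[0] is roads  # the map holds the same object
del city_map                          # the layer outlives the map
layer_name_after = roads.name
```

Navigability shows up here too: `Map` can reach its layers, but a `Layer` holds no reference back to any map.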
6. Normalization
What is normalization?
• Process for correcting table structure to minimize data redundancies
• Usually follows a three-step procedure: conversion to 1NF, then 2NF, then 3NF
• Driven by functional dependencies between attributes
• Two types of functional dependencies
– Partial dependency: nonkey attributes are dependent on only a part of a composite PK
– Transitive dependency: nonkey attributes are dependent on another nonkey attribute
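The two dependency types can be checked on sample rows with a few lines of Python. This is a sketch: the `fd_holds` helper and both sample tables are invented, and a check on sample data only shows that the rows are consistent with a dependency, not that the dependency is a rule of the schema.

```python
def fd_holds(rows, lhs, rhs):
    """True if equal values on `lhs` always come with equal values on `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True


# Composite PK: (student_id, course_id)
enrollment = [
    {"student_id": 1, "course_id": "G1", "student_name": "Ana", "grade": "A"},
    {"student_id": 1, "course_id": "G2", "student_name": "Ana", "grade": "B"},
    {"student_id": 2, "course_id": "G2", "student_name": "Ben", "grade": "A"},
]

# Partial dependency: student_name depends on only part of the PK
partial = fd_holds(enrollment, ["student_id"], ["student_name"])
# grade needs the whole key, so it is not a partial dependency
grade_on_part = fd_holds(enrollment, ["student_id"], ["grade"])
grade_on_full_key = fd_holds(enrollment, ["student_id", "course_id"], ["grade"])

# PK: emp_id
employee = [
    {"emp_id": 1, "dept_id": "D1", "dept_name": "Planning"},
    {"emp_id": 2, "dept_id": "D1", "dept_name": "Planning"},
    {"emp_id": 3, "dept_id": "D2", "dept_name": "Survey"},
]

# Transitive dependency: dept_name depends on the nonkey attribute dept_id
transitive = fd_holds(employee, ["dept_id"], ["dept_name"])

# Removing each dependency means moving the dependent attributes
# into their own table, where each fact is stored once.
students = {(r["student_id"], r["student_name"]) for r in enrollment}  # 2NF
departments = {(r["dept_id"], r["dept_name"]) for r in employee}       # 3NF
```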