Emerging Database Technologies
TRANSCRIPT
-
8/7/2019 Emerging Database Technologies
1/44
Dinesh Rao
-
1. Mobile Database.
2. Multimedia Database.
3. GIS (Geographic Information Systems).
4. Genome Data Management.
-
1. Mobile Database
-
* Portable devices and wireless technology led to mobile computing.
* Portable computing devices and wireless communication allow clients to access data from anywhere, at any time.
* Some hardware and software problems must be solved to take full advantage of mobile computing, for example database recovery.
* The hardware problems are the more difficult ones:
  * Wireless coverage.
  * Battery life.
  * Changes in network topology.
  * Wireless transmission speed.
-
Mobile Computing Architecture
-
Mobile Ad-Hoc Network (MANET)
* In a MANET, co-located mobile units do not need to communicate via a fixed network; instead, they form their own network using cost-effective technologies such as Bluetooth.
* In a MANET, mobile units are responsible for routing their own data, effectively acting as base stations as well as clients.
* A MANET must be robust enough to handle changes in network topology, such as the arrival or departure of mobile units.
* A MANET can be classified under the peer-to-peer (P2P) architecture.
-
Characteristics of Mobile Environments - 1
* Communication latency
* Intermittent connectivity
* Limited battery life
* Changing client location
All of these characteristics affect data management in mobile computing.
-
Characteristics of Mobile Environments - 2
* The server may not be able to reach the client, or vice versa.
* Proxies can be added to the client and the server to cache updates while a connection is not available.
* Once the connection is available again, the proxy automatically forwards these updates to their destination.
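The proxy idea on this slide can be sketched as a store-and-forward queue. This is only an illustration under stated assumptions: the class name UpdateProxy, its methods, and the dictionary-shaped updates are hypothetical and do not come from any particular mobile DBMS.

```python
from collections import deque


class UpdateProxy:
    """Hypothetical proxy that caches updates while the link is down."""

    def __init__(self, send):
        self._send = send          # callable that delivers one update
        self._pending = deque()    # updates cached while disconnected
        self.connected = False     # assume the link starts out down

    def submit(self, update):
        """Deliver immediately if connected, otherwise cache the update."""
        if self.connected:
            self._send(update)
        else:
            self._pending.append(update)

    def on_reconnect(self):
        """Mark the link as up and forward cached updates in arrival order."""
        self.connected = True
        while self._pending:
            self._send(self._pending.popleft())

    def on_disconnect(self):
        """Mark the link as down so later updates are cached again."""
        self.connected = False


delivered = []                        # stands in for the remote destination
proxy = UpdateProxy(delivered.append)
proxy.submit({"row": 1, "qty": 5})    # link is down, so this is cached
proxy.submit({"row": 2, "qty": 7})
proxy.on_reconnect()                  # cached updates are forwarded now
proxy.submit({"row": 3, "qty": 9})    # link is up, delivered directly
```

A real proxy would also need to persist the queue and resolve conflicting updates; both are left out of this sketch.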
-
Characteristics of Mobile Environments - 3
* The latency involved in wireless communication makes scalability a problem.
  * Latency increases the time needed to service each client request, so the server can handle fewer clients.
* Servers can use broadcasting to solve this problem.
* Broadcasting reduces the load on the server, as clients do not have to maintain active connections to it.
  * For example, weather broadcasting.
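A toy model of the broadcasting point: the server publishes one item per cycle and any number of clients read it without holding a connection. The names BroadcastChannel and Listener, and the sample forecast, are hypothetical and chosen only for illustration.

```python
class BroadcastChannel:
    """Hypothetical one-to-many channel: the server writes once per cycle."""

    def __init__(self):
        self.latest = None        # most recently broadcast item
        self.server_sends = 0     # how many times the server transmitted

    def publish(self, item):
        self.latest = item
        self.server_sends += 1


class Listener:
    """Hypothetical client that tunes in instead of opening a connection."""

    def __init__(self, channel):
        self.channel = channel

    def tune_in(self):
        # Reading the broadcast costs the server nothing extra.
        return self.channel.latest


channel = BroadcastChannel()
listeners = [Listener(channel) for _ in range(1000)]

channel.publish({"region": "north", "forecast": "rain"})  # one server send
forecasts = [listener.tune_in() for listener in listeners]
```

With request/response, the same 1000 clients would cost the server 1000 replies per cycle; here the server's work stays at one transmission regardless of the audience size.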
-
Characteristics of Mobile Environments - 4
* Client mobility also poses many data management challenges:
  * Servers must keep track of client locations in order to route messages to them efficiently.
  * Client data should be stored in the network location that minimizes the traffic necessary to access it.
  * The act of moving between cells must be transparent to the client.
* Client mobility also allows new applications that are location-based.
-
Data Management Issues
Mobile databases can be distributed under two possible scenarios:
1. The entire database is distributed mainly among the wired components, possibly with full or partial replication.
   * Management is done in fixed hosts, with additional functionality.
2. The database is distributed among wired and wireless components.
   * Management is done in both fixed hosts and mobile units.
-
Data Management Issues
* Data distribution and replication (caching)
* Transaction models
* Query processing (where is the data located?)
* Recovery and fault tolerance
* Mobile database design
* Location-based services
* Division of labor
* Security
-
Application: Intermittently Synchronized Databases
Insert/Update Data
-
2. Multimedia Database
-
* In the years ahead, multimedia information systems are expected to dominate our daily lives.
-
* DBMSs have been constantly adding to the types of data they support.
* Today many types of multimedia data are available in current systems:
  * Text.
  * Graphics.
  * Images.
  * Animation.
  * Video.
  * Audio.
-
Nature of Multimedia Applications
* Multimedia data may be stored, delivered, and used in many different ways.
* Applications may be categorized by their data management characteristics:
  * Repository applications.
    * A large amount of multimedia data, as well as metadata, is stored for retrieval purposes.
  * Presentation applications.
    * Simple multimedia viewing of video or audio data.
  * Collaborative work using multimedia information.
    * Engineers may carry out a complex design task by merging drawings, fitting subjects to design constraints, and generating new documentation, change notifications, and so forth.
-
Data Management Issues - 1
* Multimedia applications dealing with thousands of images, documents, audio and video segments, and free text data depend critically on:
  * Appropriate modeling of the structure and content of data.
  * Designing appropriate database schemas for storing and retrieving multimedia information.
-
Data Management Issues - 2
* Multimedia information systems are very complex and embrace a large set of issues:
  * Modeling:
    * Complex objects; dealing with a large number of data types (e.g. graphics).
  * Design:
    * Conceptual, logical, and physical design of multimedia databases has not been fully addressed and remains an area of active research.
  * Storage:
    * Storing multimedia data on standard disk devices presents problems of representation, compression, mapping to device hierarchies, archiving, and buffering during input/output operations.
    * DBMSs provide the BLOB (Binary Large Object) type for such data.
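As a minimal sketch of the BLOB type mentioned above, the following uses Python's built-in sqlite3 module with an in-memory database. The table name media, its columns, and the synthetic payload are assumptions made for the example; a real multimedia system would also store richer metadata alongside the bytes.

```python
import sqlite3

# In-memory database, so the example leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE media (
        id        INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        mime_type TEXT NOT NULL,
        content   BLOB NOT NULL
    )
    """
)

# Synthetic binary payload standing in for an image or audio clip.
payload = bytes(range(256)) * 4

conn.execute(
    "INSERT INTO media (name, mime_type, content) VALUES (?, ?, ?)",
    ("sample.bin", "application/octet-stream", payload),
)
conn.commit()

# Read the object back together with its size in bytes.
stored_name, stored_size, stored_bytes = conn.execute(
    "SELECT name, length(content), content FROM media WHERE name = ?",
    ("sample.bin",),
).fetchone()
```

The DBMS treats the BLOB as opaque bytes: it can store and return the object, but it cannot query what the image or audio actually contains, which is why modeling and retrieval remain separate issues.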
-
Data Management Issues - 3
* Multimedia information systems are very complex and embrace a large set of issues (cont.):
  * Queries and retrieval:
    * The database approach to retrieving information is based on query languages and internal index structures.
  * Performance:
    * For multimedia applications involving only documents and text, performance constraints are subjectively determined by the user.
    * For applications involving video playback or audio-video synchronization, physical limitations dominate.
-
Multimedia Database Applications
* Documents and records management
* Knowledge dissemination
* Education and training
* Marketing, advertising, retailing, entertainment, and travel
* Real-time control and monitoring
-
3. Geographic Information Systems (GIS)
-
* Geographic information systems (GIS): a systematic integration of hardware and software for capturing, storing, displaying, updating, manipulating, and analyzing spatial data.
-
* GIS data can be divided into two formats:
  * Vector data represents geometric objects such as points, lines, and polygons.
  * Raster data is characterized as an array of points, where each point represents the value of an attribute for a real-world location.
    * Informally, raster images are n-dimensional arrays where each entry is a unit of the image and represents an attribute.
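The two formats can be contrasted with a small sketch. Everything here is illustrative: the parcel coordinates, the elevation grid, and the names CELL_SIZE, ORIGIN, and cell_value are assumptions, and the grid is assumed to start at its origin with row 0 covering the lowest y values.

```python
# Vector data: a polygon stored as an ordered list of (x, y) vertices.
parcel = [(1.0, 1.0), (4.0, 1.0), (4.0, 3.0), (1.0, 3.0)]

# Raster data: a 2-D array where each cell holds one attribute value
# (here, a made-up elevation) for the real-world location it covers.
elevation = [
    [10, 12, 13, 15],   # row 0 covers 0 <= y < 1
    [11, 14, 16, 18],   # row 1 covers 1 <= y < 2
    [12, 15, 19, 21],   # row 2 covers 2 <= y < 3
]
CELL_SIZE = 1.0          # assumed width and height of one cell
ORIGIN = (0.0, 0.0)      # assumed coordinates of the grid's first corner


def cell_value(grid, x, y):
    """Return the attribute stored in the raster cell that contains (x, y)."""
    col = int((x - ORIGIN[0]) // CELL_SIZE)
    row = int((y - ORIGIN[1]) // CELL_SIZE)
    return grid[row][col]


sampled = cell_value(elevation, 2.5, 1.2)   # falls in row 1, column 2
```

The vector form keeps exact boundaries with few numbers, while the raster form answers "what is the value here?" with a simple index lookup.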
-
Characteristics of Data in GIS
* Several aspects of geographic objects need to be considered:
  * Location.
  * Temporality.
  * Complex spatial features.
  * Object ID.
  * Data quality.
-
Characteristics of Data in GIS
* The geographic context, topological relations, and other spatial relationships are fundamentally important for defining spatial integrity rules.
-
Constraints in GIS
* Topology integrity.
  * Deals with the behavior of features and the spatial relationships between them.
* Semantic integrity.
  * Deals with the meaning.
* User-defined integrity.
  * Business rules.
* Temporal.
  * Punctual and durable.
-
Conceptual Data Models for GIS
* Briefly discuss the common conceptual models for storing spatial data in GIS.
* Some conceptual data models:
  * Raster data model:
    * Used for analytical applications.
  * Vector data model:
    * Analysis is done using a well-defined set of tools.
-
Conceptual Data Models for GIS
* Some conceptual data models (cont.):
  * Network model:
    * Defines how lines connect to each other at a point.
    * Rules are stored in a connectivity table.
    * An everyday application: optimizing a school bus route.
  * TIN data model (Triangular Irregular Network):
    * A vector-based approach.
    * Models surfaces by connecting sample points as vertices of triangles.
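To make the network model concrete, here is a sketch of a connectivity table and a shortest-route search over it using Dijkstra's algorithm. The junction names, segment lengths, and the function shortest_route are hypothetical; real bus-route optimization involves visiting many stops and is a harder problem than the single start-to-goal search shown.

```python
import heapq

# Hypothetical connectivity table: junction -> {connected junction: length in km}.
connectivity = {
    "Depot":  {"A": 2.0, "B": 5.0},
    "A":      {"Depot": 2.0, "B": 1.0, "School": 6.0},
    "B":      {"Depot": 5.0, "A": 1.0, "School": 2.0},
    "School": {"A": 6.0, "B": 2.0},
}


def shortest_route(table, start, goal):
    """Dijkstra's algorithm: return (total length, list of junctions)."""
    queue = [(0.0, start, [start])]   # (distance so far, junction, path)
    visited = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return dist, path
        if node in visited:
            continue                  # a shorter path to node was already used
        visited.add(node)
        for neighbor, length in table[node].items():
            if neighbor not in visited:
                heapq.heappush(queue, (dist + length, neighbor, path + [neighbor]))
    return float("inf"), []           # goal is not reachable from start


length_km, route = shortest_route(connectivity, "Depot", "School")
```

In this toy table the direct-looking legs are not the cheapest: going Depot, A, B, School totals 5.0 km, shorter than Depot, B, School at 7.0 km.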
-
DBMS Enhancements for GIS
* Until the mid-1990s, GIS systems were based mainly on file-based systems.
* No transfer standards were defined, which limited vendors in terms of sharing.
* In the geo-structure that followed, the attributes were stored in a DBMS.
* The spatial features were kept in a file and linked to the attributes.
* These systems could not take full advantage of commercial RDBMSs.
* Vendors have since released database extensions, such as DB2 Spatial Extender, Oracle Spatial, and Oracle Locator, to support GIS data.
* These extensions allow the user to store, manage, and retrieve geo-objects.
-
GIS Standards and Operations
* Spatial analysis standard operations:
  * Distance.
    * Returns the shortest distance between any two points in two geometries.
  * Buffer.
    * Returns a geometry that represents all points whose distance from the given geometry is less than or equal to a given distance.
  * Convex hull.
  * Union.
  * And more.
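The first three operations can be approximated in plain Python when geometries are simplified to finite sets of points. That simplification is an assumption of this sketch: real spatial extensions also measure distances to edges and interiors and return an actual buffer geometry, whereas within_buffer below only tests membership. The function names are illustrative, and the hull uses Andrew's monotone chain method.

```python
from math import hypot


def distance(geom_a, geom_b):
    """Shortest distance between any point of geom_a and any point of geom_b."""
    return min(hypot(ax - bx, ay - by) for ax, ay in geom_a for bx, by in geom_b)


def within_buffer(point, geom, radius):
    """True if point lies inside the buffer of the given radius around geom."""
    return distance([point], geom) <= radius


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when o -> a -> b makes a left (counter-clockwise) turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]   # drop duplicated end points


sites = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1)]
hull = convex_hull(sites)                          # interior point (2, 1) is dropped
gap = distance([(0, 0), (1, 0)], [(4, 4), (1, 3)])
near = within_buffer((2, 1), [(0, 0)], 2.5)
```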
-
GIS Standards and Operations

CREATE TABLE STATES (
    Sname       VARCHAR(50) NOT NULL,
    State_shape POLYGON     NOT NULL,
    Country     VARCHAR(50) NOT NULL,
    PRIMARY KEY (Sname),
    FOREIGN KEY (Country) REFERENCES COUNTRIES (Cname)
);

SELECT Sname
FROM STATES
WHERE (AREA (State_shape) > 50000);
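What the AREA predicate in the query computes can be imitated outside a spatial DBMS with the shoelace formula. This is a sketch under assumptions: the state names and rectangular shapes are made up, coordinates are treated as planar, and polygon_area is a hypothetical helper, not a function of any spatial extension.

```python
def polygon_area(vertices):
    """Area of a simple polygon from its ordered vertices (shoelace formula)."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]   # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical rows of a STATES-like table: name -> polygon vertices.
states = {
    "Alpha": [(0, 0), (300, 0), (300, 200), (0, 200)],   # 300 x 200
    "Beta":  [(0, 0), (100, 0), (100, 100), (0, 100)],   # 100 x 100
}

# Python analogue of: SELECT Sname ... WHERE AREA(State_shape) > 50000
large_states = [name for name, shape in states.items()
                if polygon_area(shape) > 50000]
```

A spatial DBMS evaluates the same kind of predicate inside the engine, where it can also use spatial indexes and proper geographic coordinate systems.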
-
Future of GIS
* There are some challenges in developing GIS applications:
  * Data source.
  * Data model.
  * Standards.
  * Mobile GIS.
  * Specialized DBMS for GIS.
-
4. Genome Data Management
-
Biological Sciences and Genetics (1):
The biological sciences encompass an enormous variety of information. Environmental science gives us a view of how species live and interact in a world filled with natural phenomena. Histology and cell biology delve into the tissue and cellular levels and provide knowledge about the inner structure and function of the cell. This wealth of information, which has been generated, classified, and stored for centuries, has only recently become a major application of database technology.
Genetics has emerged as an ideal field for the application of information technology. In a broad sense, it can be thought of as the construction of models based on information about genes (which can be defined as units of heredity) and populations, and the seeking out of relationships in that information.
-
Biological Sciences and Genetics (2):
The study of genetics can be divided into three branches:
1. Mendelian genetics is the study of the transmission of traits between generations.
2. Molecular genetics is the study of the chemical structure and function of genes at the molecular level.
3. Population genetics is the study of how genetic information varies across populations of organisms.
-
Biological data exhibits many special characteristics that make the management of biological information a particularly challenging problem. The characteristics listed here relate to biological information, with a focus on the multidisciplinary field called bioinformatics that has emerged. Bioinformatics addresses the management of genetic information, with special emphasis on DNA sequence analysis.
Applications of bioinformatics span the design of targets for drugs, the study of mutations and related diseases, anthropological investigations into the migration patterns of tribes, and therapeutic treatments.
Characteristic 1: Biological data is highly complex when compared with most other domains or applications.
Characteristic 2: The amount and range of variability in data is high.
Characteristic 3: Schemas in biological databases change at a rapid pace.
-
Characteristic 4: Representations of the same data by different biologists will likely be different (even when using the same system).
Characteristic 5: Most users of biological data do not require write access to the database; read-only access is adequate.
Characteristic 6: Most biologists are not likely to have knowledge of the internal structure of the database or of schema design.
Characteristic 7: The context of data gives added meaning for its use in biological applications.
Characteristic 8: Defining and representing complex queries is extremely important to the biologist.
Characteristic 9: Users of biological information often require access to old values of the data, particularly when verifying previously reported results.
-
GenBank
* As of release 135.0 in April 2003, GenBank contained over 31 billion nucleotide bases in more than 24 million sequences from over 100,000 species, with roughly 1,400 new organisms being added each month.
* The database size in flat-file format is over 100 GB uncompressed and has been doubling every 15 months.
* Data is exchanged on a daily basis through an international collaboration with the European Molecular Biology Laboratory (EMBL) in the U.K. and the DNA Data Bank of Japan (DDBJ).
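The doubling claim implies exponential growth, size(t) = size_0 * 2^(t / 15) with t in months. The sketch below only works through that arithmetic; the function name projected_size_gb is hypothetical, and the projected values are illustrations of the formula, not reported GenBank sizes.

```python
def projected_size_gb(initial_gb, months_elapsed, doubling_months=15.0):
    """Size after months_elapsed if it doubles every doubling_months."""
    return initial_gb * 2 ** (months_elapsed / doubling_months)


# Starting from the 100 GB figure on the slide:
after_15 = projected_size_gb(100, 15)   # one doubling period
after_30 = projected_size_gb(100, 30)   # two doubling periods
```

Under this assumption the size quadruples in 30 months, which illustrates why storage and schema evolution appear among the problem areas for such databases.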
-
* Other limited data sources (e.g. three-dimensional structure and Online Mendelian Inheritance in Man (OMIM)) have been added recently by reformatting the existing OMIM and PDB databases and redesigning the structure of the GenBank system to accommodate these new data sets.
* The system is maintained as a combination of flat files, relational databases, and files containing Abstract Syntax Notation One (ASN.1).
* The average user of the database is not able to access the structure of the data directly for querying or other functions, although complete snapshots of the database are available for export in a number of formats, including ASN.1.
-
DATABASE NAME | MAJOR CONTENT | INITIAL TECHNOLOGY | CURRENT TECHNOLOGY | DB PROBLEM AREAS | PRIMARY DATA TYPES
--------------|---------------|--------------------|--------------------|------------------|-------------------
GenBank | DNA/RNA sequence, protein | Text files | Flat-file/ASN.1 | Schema browsing, schema evolution, linking to other dbs | Text, numeric, some complex types
OMIM | Disease phenotypes and genotypes, etc. | Index cards/text files | Flat-file/ASN.1 | Unstructured, free text entries, linking to other dbs | Text
GDB | Genetic map linkage data | Flat file | Relational | Schema expansion/evolution, complex objects, linking to other dbs | Text, numeric
ACEDB | Genetic map linkage data, sequence data (non-human) | OO | OO | Schema expansion/evolution, linking to other dbs | Text, numeric
HGMDB | Sequence and sequence variants | Flat file, application specific | Flat file, application specific | Schema expansion/evolution, linking to other dbs | Text
EcoCyc | Biochemical reactions and pathways | OO | OO | Locked into class hierarchy, schema evolution | Complex types, text, numeric
-
Thanks