Relational Databases
Database
• Large collection of data in an organised format to allow access and control
• DBMS (Database Management System): special software that helps to maintain the database and control its use
• DBA (Database Administrator, or team): maintains the database(s)
Relational Database (RDB)
• Database based on the mathematical theory of relations (Ted Codd, IBM)
• Not to be confused with relationships, which are logical connections between data
A definition
• Logical: the way things appear to be (to the user)
• Physical: the way things really are (on the computer)
What makes RDB special?
• Data is tabular, we can view it in columns and rows
• There are no pointers or physical connections to represent relationships
• Instead, the database engine matches values “on the fly” (when a query runs) to connect related rows
• The order of columns and rows should not matter
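The idea of matching values instead of following pointers can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module; the `department` and `employee` tables and their columns are assumptions made up for the example, not part of the original slides.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER)"
)
conn.executemany(
    "INSERT INTO department VALUES (?, ?)", [(1, "Sales"), (2, "Research")]
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(10, "Ann", 2), (11, "Bob", 1), (12, "Cy", 2)],
)

# No pointer links an employee to a department: the engine matches
# employee.dept_id against department.dept_id when the query runs.
rows = conn.execute(
    """
    SELECT employee.name, department.name
    FROM employee
    JOIN department ON employee.dept_id = department.dept_id
    ORDER BY employee.name
    """
).fetchall()
print(rows)  # [('Ann', 'Research'), ('Bob', 'Sales'), ('Cy', 'Research')]
```

The result of the query is itself a set of rows and columns, and the explicit ORDER BY is needed because the stored order of rows carries no meaning.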
What are the benefits?
• You don’t get into a mess of broken pointers
• You can use tabular data to generate more tabular data
• You can more easily add or delete rows and you can split and merge data more easily
• You can store tables in different locations
• You can control row and column access
• You can sort data
• Data management can be separated into functions
• You only need one language to do everything
What is SQL
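• SQL is Structured Query Language
• It is not perfect but it is the standard
• It varies a little between products
• It is composed of
  – Query: retrieving data (SELECT)
  – DCL (Data Control Language): controlling access (GRANT, REVOKE)
  – DDL (Data Definition Language): defining structures (CREATE, ALTER, DROP)
  – DML (Data Manipulation Language): changing data (INSERT, UPDATE, DELETE)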
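The parts of SQL listed above can be seen side by side in a short sketch. It uses Python's built-in sqlite3 module, and the `book` table is a made-up example. SQLite has no user accounts, so the control (DCL) part appears only as a comment, with `some_user` as a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL (Definition): describe the structure of a table.
conn.execute(
    "CREATE TABLE book (book_id INTEGER PRIMARY KEY, title TEXT, price REAL)"
)

# DML (Manipulation): add, change, and remove rows.
conn.executemany(
    "INSERT INTO book (book_id, title, price) VALUES (?, ?, ?)",
    [(1, "SQL Basics", 20.0), (2, "Data Models", 35.0), (3, "Old Notes", 5.0)],
)
conn.execute("UPDATE book SET price = 25.0 WHERE book_id = 1")
conn.execute("DELETE FROM book WHERE book_id = 3")

# Query: read data back as rows and columns.
books = conn.execute("SELECT title, price FROM book ORDER BY book_id").fetchall()
print(books)  # [('SQL Basics', 25.0), ('Data Models', 35.0)]

# DCL (Control) is not run here because SQLite has no user accounts.
# In a multi-user product it typically looks like:
#   GRANT SELECT ON book TO some_user;
```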
DBA Role
• Recovery (the most important priority)
  – Is the database in a consistent state at a point in time?
• Access
  – Are we protecting the data properly?
  – What are the users and roles and what access do they have?
  – What are the privacy requirements?
• Integrity
  – Can we trust the data?
  – How do we make structural changes?
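One common way to help answer "can we trust the data?" is to declare constraints so the database itself rejects bad rows. The sketch below is only an illustration under assumptions: the `customer` and `orders` tables are invented for the example, and it relies on SQLite behaviour (foreign keys must be switched on per connection).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
conn.execute(
    "CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        cust_id INTEGER NOT NULL REFERENCES customer (cust_id),
        quantity INTEGER NOT NULL CHECK (quantity > 0)
    )
    """
)
conn.execute("INSERT INTO customer VALUES (1, 'Ann')")
conn.execute("INSERT INTO orders VALUES (100, 1, 3)")  # valid row

rejected = []
bad_rows = [(101, 1, 0), (102, 99, 1)]  # zero quantity; unknown customer
for row in bad_rows:
    try:
        conn.execute("INSERT INTO orders VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError:
        rejected.append(row[0])  # the engine refused the row

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(rejected, order_count)  # [101, 102] 1
```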
Advanced Topics
• Concurrency
  – Multiple users accessing the same data at the same time
  – Locking
  – Versioning
• Transactions
  – All or nothing unit of work
• Distribution
  – Distributed data (Replication)
  – Distributed access
• Tuning
  – Optimising the database performance
  – What queries are taking the longest time?
  – What strategies can we use to improve performance?
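The "all or nothing" behaviour of a transaction can be sketched with a transfer between two accounts. This is a minimal illustration with Python's built-in sqlite3 module; the `account` table and the `transfer` helper are hypothetical names invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("ann", 100), ("bob", 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move amount between accounts as a single unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "ann", "bob", 30)       # both updates are kept
failed = transfer(conn, "ann", "bob", 500)  # would overdraw ann, so both are undone
balances = conn.execute("SELECT name, balance FROM account ORDER BY name").fetchall()
print(ok, failed, balances)  # True False [('ann', 70), ('bob', 80)]
```

In the failed transfer the credit to bob has already been applied when the debit is rejected; the rollback removes it, so no partial result is left behind.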