databases unplugged industry consolidation & evolution cheryl stepney microsoft corporation
TRANSCRIPT
Databases Unplugged
Industry Consolidation & Evolution
Cheryl Stepney
Microsoft Corporation
Agenda
• Core Components
• Database Models
• Modeling the Database
• Job Roles & Opportunities
• Database Vendors
• Industry Convergence - XML
What is a Database
• An organized set of data
• Has discrete fields with datatype definitions
• Ensures data accuracy via validation rules
• Can be easily queried using the definitions
• A core component of every computer application in the world today
Database Types
• Flat Files (1960’s – present)
    • Comma/tab delimited, no structure?
    • Order No., Customer Name, Customer Location, Product A name, Product A price, Product B name, Order Total, end record
• Hierarchical (1970’s – present)
    • Structured, non-flexible, hard to change schema
    • Example: IBM’s IMS
    • Segments of Customer, Order and Product linked by keys held until reorganization
• Relational (1980’s – present)
    • Flexible, links based on data values, primary and foreign keys
    • Tables are linked by the existence of data in a row
    • Separate Tables: Order, Customer, Location, Product, Order Line Detail, Contact at Location
• Object Oriented (1990’s – present)
    • Subject oriented, slow to gain adoption, slow performance
    • Objects: Order and Customer
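To make the flat-file versus relational contrast concrete, here is a minimal sketch in Python using the standard `csv` and `sqlite3` modules. The sample record, the table names and the column names are assumptions for illustration; only the field order comes from the flat-file bullet above. SQLite stands in for any relational engine.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file record, in the field order listed on the slide:
# order no, customer name, customer location, product A name, product A price,
# product B name, order total.
FLAT = "1001,Acme Music,Denver,Guitar,499.00,Strings,519.00\n"

def load_flat(text):
    """Read comma-delimited rows; what each position means is known only to the reader."""
    return list(csv.reader(io.StringIO(text)))

def to_relational(rows):
    """Split each flat record into customer and orders tables linked by a key."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, location TEXT);
        CREATE TABLE orders (order_no INTEGER PRIMARY KEY,
                             customer_id INTEGER REFERENCES customer(customer_id),
                             total REAL);
    """)
    for order_no, name, location, *_products, total in rows:
        cur = conn.execute(
            "INSERT INTO customer (name, location) VALUES (?, ?)", (name, location))
        conn.execute(
            "INSERT INTO orders VALUES (?, ?, ?)",
            (int(order_no), cur.lastrowid, float(total)))
    return conn

rows = load_flat(FLAT)
conn = to_relational(rows)
result = conn.execute(
    "SELECT c.name, o.total FROM orders o "
    "JOIN customer c ON c.customer_id = o.customer_id").fetchall()
```

In the flat record the customer's name is repeated on every order line; in the relational form it is stored once and reached through the key.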
Database Terminology
• Logical Design
    • Entities – things about which information needs to be known or held
    • Relationships – Connectors between appropriate data
• Physical Design – User View
    • Tables - Columns and Rows
    • Keys - Primary and Foreign
    • Tables are linked by Keys
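As a small illustration of "tables are linked by keys", the sketch below uses Python's `sqlite3` module. The `customer` and `orders` tables and their columns are hypothetical. The `PRAGMA` line is specific to SQLite, which does not enforce foreign keys unless asked to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign-key checks off by default

# Primary key on customer; foreign key on orders pointing back to it.
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""CREATE TABLE orders (
    order_no    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id))""")

conn.execute("INSERT INTO customer VALUES (1, 'Acme Music')")
conn.execute("INSERT INTO orders VALUES (1001, 1)")

def try_orphan_order():
    """Try to insert an order for a customer that does not exist."""
    try:
        conn.execute("INSERT INTO orders VALUES (1002, 99)")
    except sqlite3.IntegrityError:
        return False  # rejected by the foreign key
    return True

# The join follows the key from each order to its customer row.
joined = conn.execute(
    "SELECT o.order_no, c.name FROM orders o "
    "JOIN customer c ON c.customer_id = o.customer_id").fetchall()
```

The link between the two tables is nothing more than a matching value: an order row belongs to a customer because its `customer_id` equals that customer's primary key.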
Database Components
• Major Core Functionality
    • Data Model
    • Structured Query Language (SQL)
    • Computational Model
    • Query Optimizer
    • Extensible Markup Language (XML)
    • Security Module
• Components
    • Tables
    • Constraints – e.g. Zip code must be 5 digits, mandatory
    • Defaults – e.g. blank or null on Middle Initial
    • Indexes – Table and View
    • User-defined data types
    • Keys
    • Views
    • User-defined functions
    • Triggers
    • Stored procedures
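Several of these components can be shown in a few lines. The following is a sketch only, written against SQLite through Python's `sqlite3` module rather than any of the vendor products discussed here. The table, index, view and trigger names are made up. SQLite has no stored procedures or user-defined data types, so those are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contact (
        contact_id     INTEGER PRIMARY KEY,                -- key
        last_name      TEXT NOT NULL,                      -- mandatory
        middle_initial TEXT DEFAULT '',                    -- default: blank
        zip_code       TEXT NOT NULL                       -- constraint: exactly 5 digits
            CHECK (length(zip_code) = 5 AND zip_code NOT GLOB '*[^0-9]*')
    );
    CREATE INDEX idx_contact_zip ON contact (zip_code);    -- index
    CREATE VIEW contact_by_zip AS                          -- view
        SELECT zip_code, COUNT(*) AS contacts FROM contact GROUP BY zip_code;
    CREATE TABLE audit_log (contact_id INTEGER, note TEXT);
    CREATE TRIGGER trg_contact_insert AFTER INSERT ON contact   -- trigger
    BEGIN
        INSERT INTO audit_log VALUES (NEW.contact_id, 'inserted');
    END;
""")

conn.execute("INSERT INTO contact (last_name, zip_code) VALUES ('Smith', '98052')")

def insert_is_rejected(zip_code):
    """Return True when the CHECK constraint refuses the given zip code."""
    try:
        conn.execute(
            "INSERT INTO contact (last_name, zip_code) VALUES ('Test', ?)", (zip_code,))
    except sqlite3.IntegrityError:
        return True
    return False

default_initial = conn.execute("SELECT middle_initial FROM contact").fetchone()[0]
view_rows = conn.execute("SELECT * FROM contact_by_zip").fetchall()
audit_count = conn.execute("SELECT COUNT(*) FROM audit_log").fetchone()[0]
```

The point of the sketch is that the rules live in the database definition itself, so every application that writes to `contact` gets the same validation.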
Database Career Roles
• Database Designer
    • Data Architect
    • Database Modeler
• DBA – Database Administrator
    • Intended responsibilities
    • Current Role Definition
• Business User
    • Business Intelligence
    • Data Analyst
System Development Lifecycle
Where Data and Code Interact
• Strategic Analysis
    • Data Model
    • Functional Decomposition
• Detailed Analysis
• Design
• Code
• Test
• Production
Cost of Making A Change
• Strategic Analysis: $1 x n
    • Data Model
    • Functional Decomposition
• Detailed Analysis: $5 x n
• Design: $50 x n
• Code: $100 x n
• Test: $500 x n
• Production: $1000 x n
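The multipliers can be applied directly. The sketch below takes the phase multipliers from the slide; the base cost `n` is whatever a change costs during strategic analysis, and the value used in the usage lines is an arbitrary example.

```python
# Multipliers per lifecycle phase, as listed on the slide.
PHASE_MULTIPLIER = {
    "Strategic Analysis": 1,
    "Detailed Analysis": 5,
    "Design": 50,
    "Code": 100,
    "Test": 500,
    "Production": 1000,
}

def change_cost(phase, n):
    """Cost of making a change in the given phase, for base cost n."""
    return PHASE_MULTIPLIER[phase] * n

# Example: a change that costs 200 during strategic analysis.
design_cost = change_cost("Design", 200)
production_cost = change_cost("Production", 200)
```

With a base cost of 200, the same change costs 10,000 once design is under way and 200,000 in production.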
Data Modeling
• HIPO Charts
    • IBM – Hierarchy plus Input-Process-Output diagrams
    • Gane-Sarson – DFD (Data Flow Diagram)
• Entity / Relationship Modeling
    • IDEF1X Standards
    • System Architect
    • Oracle’s CASE Method
    • Microsoft’s Visio
• Express-G
    • Standard for Exchange of Product Model Data
• Object Role Modeling – ORM
    • Microsoft’s Visio
Relational Model
What is 1st, 2nd, 3rd Normal Form?
What is Normalization?
[Diagram: Order, Customer, Product]
• Remove repeating groups
• Remove dependencies
• Cater for time
Normalization
• First normal form (1NF)
    • It contains two-dimensional tables with rows and columns.
    • Each column corresponds to a sub-object or an attribute of the object represented by the entire table.
    • Each row represents a unique instance of that sub-object or attribute and must be different in some way from any other row (that is, no duplicate rows are possible).
    • All entries in any column must be of the same kind.
• Second normal form (2NF)
    • Each column in a table that is not a determiner of the contents of another column must itself be a function of the other columns in the table.
    • For example, in a table with three columns containing customer ID, product sold, and price of the product when sold, the price would be a function of the customer ID (entitled to a discount) and the specific product.
• Third normal form (3NF)
    • In 2NF, modification anomalies are still possible.
    • For example, using the customer table just cited, removing a row describing a customer purchase (because of a return perhaps) will also remove the fact that the product has a certain price.
    • In the third normal form, these tables would be divided into two tables so that product pricing would be tracked separately.
• Domain/key normal form (DKNF)
    • A key uniquely identifies each row in a table. A domain is the set of permissible values for an attribute. By enforcing key and domain restrictions, the database is assured of being freed from modification anomalies.
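The deletion anomaly in the 3NF example can be reproduced in a few lines. This is a simplified sketch using Python's `sqlite3` module: the table names, the product and the price are invented, and the customer discount mentioned in the 2NF example is ignored so that each product has a single price.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Single-table design: the price is recorded only on purchase rows.
conn.execute("CREATE TABLE purchase_flat (customer_id INTEGER, product TEXT, price REAL)")
conn.execute("INSERT INTO purchase_flat VALUES (1, 'Guitar', 499.0)")

# A return removes the purchase row, and the price fact disappears with it.
conn.execute("DELETE FROM purchase_flat WHERE customer_id = 1 AND product = 'Guitar'")
lost = conn.execute(
    "SELECT price FROM purchase_flat WHERE product = 'Guitar'").fetchall()

# Split design: product pricing is tracked in its own table.
conn.execute("CREATE TABLE product (name TEXT PRIMARY KEY, price REAL NOT NULL)")
conn.execute("""CREATE TABLE purchase (
    customer_id INTEGER NOT NULL,
    product     TEXT NOT NULL REFERENCES product(name))""")
conn.execute("INSERT INTO product VALUES ('Guitar', 499.0)")
conn.execute("INSERT INTO purchase VALUES (1, 'Guitar')")

# The same return now leaves the price intact.
conn.execute("DELETE FROM purchase WHERE customer_id = 1 AND product = 'Guitar'")
kept = conn.execute("SELECT price FROM product WHERE name = 'Guitar'").fetchall()
```

After the first delete, `lost` is empty: nothing in the database knows what a guitar costs. After the second, `kept` still holds the price.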
Relational Model Example
[Diagram: entities Order Detail, Contact, Order, Customer, Product, Product Type; relationship labels: of, place, by, with, on, for, supplied via, subject of, responsible for, employer of]
Object Oriented Modeling
[Diagram: shapes labeled "entity" and "value"]
Shapes:
• Objects
• Entity
• Value
• Constraints
• Connectors
• Mandatory
• Uniqueness
• Predicates
• Roles
Database Vendors
• Flat Files (1960’s – present)
    • All file systems start out as flat
• Hierarchical (1970’s – present)
    • IBM’s IMS is still in use
• Relational (RDBMS) (1980’s – present)
    • IBM – DB/2, UDB (Universal Database), Informix
    • Oracle – Oracle 7.x
    • Microsoft – SQL Server 2000
    • Sybase – Dynamic SQL
    • Computer Associates – OpenIngres
• Object Oriented (ODBMS) (1990’s – present)
    • Computer Associates - Jasmine
    • Gemstone
    • O2
    • Object Store
    • Objectivity
    • Versant ODBMS
    • IBM – Informix Illustra
Former Relational Database Vendors
• Ingres
• Informix
• Unify
• Cullinet
• Dec Digital – RDB
• Verity
• Natural Language
Relational Market Share
Gartner perspective 2002

Vendor                Product                          Based on Revenue   Based on Units
IBM                   DB/2, UDB (Universal Database)   31%                20%
Oracle                Oracle 7.x                       30%                25%
Microsoft             SQL Server 2000                  29%                50%
Sybase                Dynamic SQL                      5%                 3%
Computer Associates   OpenIngres                       5%                 2%

• IBM: Acquired Informix to gain lead
Relational Database Vendors
IBM
• IMS – 1960’s, transactional, still in use
• DB/2 – implemented around 1990
    • Data Management
    • On Version 8.1
• Initially, Mainframe based
    • AS/400 – UDB
    • DB2 for Linux
    • Different code on each platform
• Market share: 31%
Relational Database Vendors
Oracle
• IPO 1986, founded in 1977
• Project Oracle to get funding (CIA)
• Implemented IBM’s “System R” Paper
• Core to their business applications
• Multi-Platform is business goal
    • Unix, Linux, Mainframe, Windows, etc.
    • Oracle 9i
• Market share: 30%
• Several product offerings to buy
    • SQL Plus, Report Writer, Discoverer, Oracle Developer Suite
• Applications
    • Oracle Financials - Oracle 11i
    • Oracle Collaboration Suite
    • E-Business Suite
Microsoft SQL Server History
• Not a Database Company at IPO in 1986
• History
    • 1992: Beginning of SQL Server on Windows®
    • 1996: SQL Server 6.5 Ships
    • 1998: SQL Server 7.0 released – complete rewrite
    • 2000: SQL Server 2000 with Data Warehousing
    • 2001: SQL Server Wins Numerous Awards
• Scalable from Pocket PC to Intel Mainframe – but only on Windows
• Market share: 25% based on Revenue
• All components in one box for single price
    • Transactional, OLAP, Data Transformation
    • Notification Services, Reporting Services
    • Data Warehouse
    • First to support XML – no extra charge
Retail Price Comparison

# of CPUs   Oracle Enterprise *   SQL Server Enterprise **   IBM Enterprise ***
4           $320,000              $79,996                    $82,000
8           $640,000              $159,992                   $164,000
16          $1,280,000            $319,984                   $328,000
32          $2,560,000            $639,968                   $656,000

* Oracle - Additional for Reports, Data Warehouse
** Microsoft - All Services in one price
*** IBM - Different pricing depending on platform
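Dividing each figure by its CPU count shows that all three columns are straight per-CPU prices. The sketch below does that arithmetic in Python; the figures are those quoted in the comparison, while the dictionary layout and function name are only for illustration.

```python
# Retail prices from the comparison, keyed by CPU count.
PRICES = {
    4:  {"Oracle": 320_000,   "SQL Server": 79_996,  "IBM": 82_000},
    8:  {"Oracle": 640_000,   "SQL Server": 159_992, "IBM": 164_000},
    16: {"Oracle": 1_280_000, "SQL Server": 319_984, "IBM": 328_000},
    32: {"Oracle": 2_560_000, "SQL Server": 639_968, "IBM": 656_000},
}

def per_cpu_prices(prices):
    """Return {vendor: set of per-CPU prices seen across all rows}."""
    result = {}
    for cpus, row in prices.items():
        for vendor, total in row.items():
            result.setdefault(vendor, set()).add(total / cpus)
    return result

unit = per_cpu_prices(PRICES)
```

Each vendor's set contains a single value: $80,000 per CPU for Oracle, $19,999 for SQL Server and $20,500 for IBM, so the Oracle figure is roughly four times the other two at every size, before the extras noted in the footnotes.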
Market Innovation
• The Big 3
    • Oracle Corporation
    • IBM
    • Microsoft
• Transactional Databases
• Data Warehouses
• Data Analysis – Business Intelligence
• XML support
• Full Text Search
XML
• Extensible Markup Language (XML) is a simple, very flexible text format derived from SGML (ISO 8879).
• Originally designed to meet the challenges of large-scale electronic publishing
• XML is also playing an increasingly important role in the exchange of a wide variety of data on the Web and elsewhere
• Extensible Markup Language (XML) 1.0 (Second Edition)
    • W3C Recommendation 6 October 2000
• Working Groups
    • XML Coordination Group
    • XML Core Working Group
    • XSL Working Group
        • Extensible Stylesheet Language
    • XML Linking Working Group
    • XML Query Working Group
    • XML Schema Working Group
http://www.w3.org/
Leading the Web to Its Full Potential...
A SQL Query
Select ACO.Name 'Owner Name', ACC.Name 'Region Name', ACB.Name 'Site Name'
From AtriumComponent ACO
Join AtriumComponent ACC On ACC.ContainerKey = ACO.ComponentId
Join AtriumComponent ACB On ACB.ContainerKey = ACC.ComponentId
Where ACO.ContainerKey = -1
Order By ACO.Name, ACC.Name, ACB.Name
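The query joins `AtriumComponent` to itself twice to walk a three-level hierarchy: owner, then region, then site. The sketch below runs the same join against SQLite from Python so the mechanics can be tried without SQL Server. The table layout and the sample rows, including the `ComponentId` values, are assumptions inferred from the column names in the query. The column aliases are written with `AS "..."` instead of the quoted-string form used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE AtriumComponent (
    ComponentId  INTEGER PRIMARY KEY,
    ContainerKey INTEGER NOT NULL,   -- ComponentId of the parent; -1 marks the top level
    Name         TEXT NOT NULL)""")

# Hypothetical rows: one owner, two regions, one site in each region.
conn.executemany("INSERT INTO AtriumComponent VALUES (?, ?, ?)", [
    (1, -1, "Atrium Music Stores"),
    (2, 1, "Eastern Region"),
    (3, 1, "Western Region"),
    (38, 2, "Atrium Music Store 38"),
    (34, 3, "Atrium Music Store 34"),
])

QUERY = """
SELECT ACO.Name AS "Owner Name", ACC.Name AS "Region Name", ACB.Name AS "Site Name"
FROM AtriumComponent ACO
JOIN AtriumComponent ACC ON ACC.ContainerKey = ACO.ComponentId
JOIN AtriumComponent ACB ON ACB.ContainerKey = ACC.ComponentId
WHERE ACO.ContainerKey = -1
ORDER BY ACO.Name, ACC.Name, ACB.Name
"""

rows = conn.execute(QUERY).fetchall()
```

Each alias (`ACO`, `ACC`, `ACB`) is the same table read at a different level of the hierarchy; the `WHERE` clause pins the first alias to the top level.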
For XML Auto Query
Select ACO.Name, ACC.Name, ACB.Name
From AtriumComponent ACO
Join AtriumComponent ACC On ACC.ContainerKey = ACO.ComponentId
Join AtriumComponent ACB On ACB.ContainerKey = ACC.ComponentId
Where ACO.ContainerKey = -1
Order By ACO.Name, ACC.Name, ACB.Name
For XML AUTO
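With FOR XML AUTO, SQL Server returns the rows as nested XML elements rather than a flat rowset. The Python sketch below only approximates that shape: it assumes one element per table alias, with the column as a `Name` attribute and child levels nested inside their parent. The `root` wrapper element and the function name are additions for illustration, not part of SQL Server's output. The sample rows are taken from this deck's query results.

```python
import xml.etree.ElementTree as ET

ROWS = [
    ("Atrium Music Stores", "Eastern Region", "Atrium Music Store 38"),
    ("Atrium Music Stores", "Eastern Region", "Atrium Music Store 39"),
    ("Atrium Music Stores", "Western Region", "Atrium Music Store 34"),
]

def nest_rows(rows, tags=("ACO", "ACC", "ACB")):
    """Fold sorted rows into nested elements, one level per table alias."""
    root = ET.Element("root")  # wrapper so the result is a single well-formed document
    for row in rows:
        parent = root
        for tag, name in zip(tags, row):
            last = parent[-1] if len(parent) else None
            if last is not None and last.tag == tag and last.get("Name") == name:
                parent = last  # same value as the previous row: reuse the element
            else:
                parent = ET.SubElement(parent, tag, Name=name)
    return root

tree = nest_rows(ROWS)
xml_text = ET.tostring(tree, encoding="unicode")
```

Because the rows arrive ordered by owner, region and site, comparing each row only with the most recently added element at each level is enough to group them.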
Query Results

Owner Name            Region Name      Site Name
--------------------  ---------------  ---------------------
Atrium Music Stores   Eastern Region   Atrium Music Store 38
Atrium Music Stores   Eastern Region   Atrium Music Store 39
Atrium Music Stores   Eastern Region   Atrium Music Store 41
Atrium Music Stores   Western Region   Atrium Music Store 34
Atrium Music Stores   Western Region   Atrium Music Store 37
Atrium Music Stores   Western Region   Atrium Music Store 42
Atrium Music Stores   Western Region   Atrium Music Store 44

(7 row(s) affected)