Denis Reznik "Relational database design. Normalize till it hurts, then denormalize...
Golden Database Design Rule:
Normalize till it hurts
Denormalize till it works

Denis Reznik
Data Architect at Intapp, Inc.
Microsoft Data Platform MVP
http://reznik.uneta.com.ua
@denisreznik
Database History
Timeline: 1960s, 1970s, 1980s, 1990s, 2000s, Nowadays
• CODASYL, IMS
• E.F. Codd's Paper
• System R
• Ingres
• SQL
• RDBMS Commercial Success
• Object Databases
• Google BigTable Paper
• Amazon Dynamo Paper
• NoSQL (Johan Oskarsson)
• (?)
Relational Model
• Relations (Tables)
• Attribute (Column)
• Tuple (Row)

Matter
Id  UserId  Name  Date
1   1       Work  07/03/2014
2   2       Test  09/09/2015
4   2       Rest  12/08/2015

Client
Id  Name  Phone
1   Bill  +380678455732
2   John  NULL
3   Mike  +380501233427
Normalization
• Normalization is the process of organizing the columns (attributes) and tables (relations) of a relational database to minimize data redundancy
(Diagram: Redundancy versus Complexity and Table Count)
First Normal Form (1NF)
• Each cell contains an atomic value

Matters (before: several phones in one cell)
Company       User            Phone
Microsoft     John Dow        Tel1: +380969785732, Tel2: +32345409123
Microsoft     Larry McGregor  Tel: +45678904692
Oracle Corp.  John Snow       +380988958371
Amazon        Jack Snack      Home: +23348902385 Work: +69058763287

Matters (after: one phone per row)
Company       User            Phone          Phone Type
Microsoft     John Dow        +380969785732  NULL
Microsoft     John Dow        +32345409123   NULL
Microsoft     Larry McGregor  +45678904692   NULL
Oracle Corp.  John Snow       +380988958371  NULL
Amazon        Jack Snack      +23348902385   Home
Amazon        Jack Snack      +69058763287   Work
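The 1NF step on this slide can be sketched in Python with the standard sqlite3 module. This is only an illustration: the table name matters_1nf, the column names, the parsing regex, and the rule that only Home and Work count as a phone type are assumptions, since the slide shows just the before and after tables.

```python
import re
import sqlite3

# Rows as they appear in the non-atomic table on the slide.
raw_rows = [
    ("Microsoft", "John Dow", "Tel1: +380969785732, Tel2: +32345409123"),
    ("Microsoft", "Larry McGregor", "Tel: +45678904692"),
    ("Oracle Corp.", "John Snow", "+380988958371"),
    ("Amazon", "Jack Snack", "Home: +23348902385 Work: +69058763287"),
]

# Hypothetical parsing rule: an optional "Label:" followed by a phone number.
PHONE_RE = re.compile(r"(?:(\w+):\s*)?(\+\d+)")


def split_phones(cell):
    """Yield (phone, phone_type) pairs; only Home/Work are kept as a type."""
    for label, phone in PHONE_RE.findall(cell):
        yield phone, (label if label in ("Home", "Work") else None)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE matters_1nf ("
    "company TEXT, user_name TEXT, phone TEXT, phone_type TEXT)"
)

# One output row per phone number, so every cell holds a single value.
for company, user_name, cell in raw_rows:
    for phone, phone_type in split_phones(cell):
        conn.execute(
            "INSERT INTO matters_1nf VALUES (?, ?, ?, ?)",
            (company, user_name, phone, phone_type),
        )

rows_1nf = conn.execute(
    "SELECT company, user_name, phone, phone_type FROM matters_1nf"
).fetchall()
```

Each of the four source rows expands into one row per phone number, which is what makes the Phone column atomic.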
Second Normal Form (2NF)
• Table has a Key (Key = Primary Key)
• All non-key columns of the relation depend on the whole Key

Matters
Company       User            Company Address  Manager
Microsoft     John Dow        Redmond          Jane Dow
Microsoft     Duncan MacLeod  Redmond          John Dow
Microsoft     John Snow       Redmond          Tony Stark
Oracle Corp.  John Dow        California       Rick Brick
Amazon        Jack Snack      Seattle          George Black
Google        Dale Cooper     California       Diana Smith

User            Company       Manager
John Dow        Microsoft     Jane Dow
Duncan MacLeod  Microsoft     John Dow
John Snow       Microsoft     Tony Stark
John Dow        Oracle Corp.  Rick Brick
Jack Snack      Amazon        George Black
Dale Cooper     Google        Diana Smith

Clients
Company       Address
Microsoft     Redmond
Oracle Corp.  California
Amazon        Seattle
Google        California

Key: (Client, Matter)
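A partial dependency such as the one removed here can be spotted in sample data with a small helper. The sketch below is an illustration only: the function name holds, the lowercase column names, and treating (company, user) as the key are assumptions. A dependency that holds in six sample rows is evidence, not proof, that it holds in the real schema.

```python
COLUMNS = ("company", "user", "address", "manager")

# Rows of the wide table from the slide.
rows = [dict(zip(COLUMNS, r)) for r in [
    ("Microsoft", "John Dow", "Redmond", "Jane Dow"),
    ("Microsoft", "Duncan MacLeod", "Redmond", "John Dow"),
    ("Microsoft", "John Snow", "Redmond", "Tony Stark"),
    ("Oracle Corp.", "John Dow", "California", "Rick Brick"),
    ("Amazon", "Jack Snack", "Seattle", "George Black"),
    ("Google", "Dale Cooper", "California", "Diana Smith"),
]]


def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        val = tuple(row[c] for c in rhs)
        # The first value seen for a key must match every later one.
        if seen.setdefault(key, val) != val:
            return False
    return True


# Address depends on only part of the key: a 2NF violation.
address_is_partial = holds(rows, ["company"], ["address"])

# Manager needs the whole key: neither half determines it alone.
manager_needs_whole_key = (
    holds(rows, ["company", "user"], ["manager"])
    and not holds(rows, ["company"], ["manager"])
    and not holds(rows, ["user"], ["manager"])
)

# Decomposition: the address moves to its own table, stored once per company.
companies = sorted({(r["company"], r["address"]) for r in rows})
users = [(r["user"], r["company"], r["manager"]) for r in rows]
```

After the split, Redmond is stored once instead of three times, and the user rows keep only the columns that depend on the full key.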
Third Normal Form (3NF)
• Every non-prime attribute of the relation is non-transitively dependent on every Key of the relation

Matters
Key: (Client, Matter)
Company       User            Manager       Manager Age
Microsoft     John Dow        Peter Parker  23
Microsoft     Patrik Jones    Steven Wu     45
Microsoft     Jackie Adams    Steven Wu     45
Oracle Corp.  Ashley Grey     Jean Claude   67
Amazon        Scott McMillan  John Smith    34
Amazon        Mary Smith      John Smith    34

Matters
Company       User            Manager
Microsoft     John Dow        Peter Parker
Microsoft     Patrik Jones    Steven Wu
Microsoft     Jackie Adams    Steven Wu
Oracle Corp.  Ashley Grey     Jean Claude
Amazon        Scott McMillan  John Smith
Amazon        Mary Smith      John Smith

Attorneys
Manager       Manager Age
Peter Parker  23
Steven Wu     45
Jean Claude   67
John Smith    34
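The benefit of removing the transitive dependency (Key, then Manager, then Manager Age) shows up on updates. The sketch below uses a subset of the slide's rows in an in-memory SQLite database; the lowercase table and column names, the constraints, and the new age value 46 are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE attorneys (
    manager     TEXT PRIMARY KEY,
    manager_age INTEGER NOT NULL
);
CREATE TABLE matters (
    company   TEXT NOT NULL,
    user_name TEXT NOT NULL,
    manager   TEXT NOT NULL REFERENCES attorneys (manager),
    PRIMARY KEY (company, user_name)
);
""")

# Each manager's age is stored exactly once.
conn.executemany(
    "INSERT INTO attorneys VALUES (?, ?)",
    [("Steven Wu", 45), ("John Smith", 34)],
)
conn.executemany(
    "INSERT INTO matters VALUES (?, ?, ?)",
    [
        ("Microsoft", "Patrik Jones", "Steven Wu"),
        ("Microsoft", "Jackie Adams", "Steven Wu"),
        ("Amazon", "Scott McMillan", "John Smith"),
        ("Amazon", "Mary Smith", "John Smith"),
    ],
)

# Changing an age touches one row, however many users the manager has.
cur = conn.execute(
    "UPDATE attorneys SET manager_age = 46 WHERE manager = 'Steven Wu'"
)
rows_touched = cur.rowcount

# A join rebuilds the original wide rows with the new age everywhere.
joined = conn.execute("""
    SELECT m.company, m.user_name, m.manager, a.manager_age
    FROM matters AS m
    JOIN attorneys AS a ON a.manager = m.manager
    WHERE m.manager = 'Steven Wu'
    ORDER BY m.user_name
""").fetchall()
```

In the wide table the same change would have to hit every row that mentions Steven Wu, and missing one would leave two different ages for the same person.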
Fourth Normal Form (4NF)
• Eliminates independent many-to-many relationships between columns

Matters
Id  Company    Consultant
1   Microsoft  Peter Partner
2   Microsoft  John Dow
3   Microsoft  Amy Chen
4   Oracle     Jim Beam
5   Amazon     John Snow
6   Google     John Snow

Matters
Id  Company
1   Microsoft
2   Oracle
3   Amazon
4   Google

Attorneys
Id  Consultant
1   Peter Partner
2   John Dow
3   Amy Chen
4   Jim Beam
5   John Snow

MatterAttorneys
CompanyId  ConsultantId
1          1
1          2
1          3
2          4
3          5
4          5
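The three-table layout on this slide can be tried out with Python's sqlite3 module. Table and column names follow the slide; the PRIMARY KEY, UNIQUE, and REFERENCES constraints are assumptions, because the slide shows only the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Matters (
    Id      INTEGER PRIMARY KEY,
    Company TEXT NOT NULL UNIQUE
);
CREATE TABLE Attorneys (
    Id         INTEGER PRIMARY KEY,
    Consultant TEXT NOT NULL UNIQUE
);
-- Junction table: one row per (company, consultant) pair.
CREATE TABLE MatterAttorneys (
    CompanyId    INTEGER NOT NULL REFERENCES Matters (Id),
    ConsultantId INTEGER NOT NULL REFERENCES Attorneys (Id),
    PRIMARY KEY (CompanyId, ConsultantId)
);
INSERT INTO Matters VALUES
    (1, 'Microsoft'), (2, 'Oracle'), (3, 'Amazon'), (4, 'Google');
INSERT INTO Attorneys VALUES
    (1, 'Peter Partner'), (2, 'John Dow'), (3, 'Amy Chen'),
    (4, 'Jim Beam'), (5, 'John Snow');
INSERT INTO MatterAttorneys VALUES
    (1, 1), (1, 2), (1, 3), (2, 4), (3, 5), (4, 5);
""")

# Joining through the junction table rebuilds the original six rows.
pairs = conn.execute("""
    SELECT m.Company, a.Consultant
    FROM MatterAttorneys AS ma
    JOIN Matters   AS m ON m.Id = ma.CompanyId
    JOIN Attorneys AS a ON a.Id = ma.ConsultantId
    ORDER BY ma.CompanyId, ma.ConsultantId
""").fetchall()

# John Snow works with two companies but is stored only once.
snow_companies = [company for company, name in pairs if name == "John Snow"]
```

Company and consultant names now live in exactly one row each, and the junction table carries only the pairing.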
Foreign Keys

Users
Key: (User, Company)
User            Company       Manager
John Dow        Microsoft     Jane Dow
Duncan MacLeod  Microsoft     John Dow
John Snow       Microsoft     Tony Stark
John Dow        Oracle Corp.  Rick Brick
Jack Snack      Amazon        George Black
Dale Cooper     Google        Diana Smith

Companies
Company       Address
Microsoft     Redmond
Oracle Corp.  California
Amazon        Seattle
Google        California

FK_USERS_COMPANY: Users.Company references Companies.Company
UPDATE Users SET Company = 'Microsoft'
WHERE User = 'Dale Cooper' AND Company = 'Google'
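How FK_USERS_COMPANY treats this UPDATE can be checked with a small SQLite sketch. SQLite stands in for whatever engine the talk used, so the PRAGMA line is SQLite-specific. The company name 'Apple' is a hypothetical value that is absent from Companies, and the User column is double-quoted because USER is a reserved word in some engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Companies (
    Company TEXT PRIMARY KEY,
    Address TEXT NOT NULL
);
CREATE TABLE Users (
    "User"  TEXT NOT NULL,
    Company TEXT NOT NULL,
    Manager TEXT NOT NULL,
    PRIMARY KEY ("User", Company),
    CONSTRAINT FK_USERS_COMPANY
        FOREIGN KEY (Company) REFERENCES Companies (Company)
);
INSERT INTO Companies VALUES
    ('Microsoft', 'Redmond'), ('Oracle Corp.', 'California'),
    ('Amazon', 'Seattle'), ('Google', 'California');
INSERT INTO Users VALUES
    ('John Dow', 'Microsoft', 'Jane Dow'),
    ('Jack Snack', 'Amazon', 'George Black'),
    ('Dale Cooper', 'Google', 'Diana Smith');
""")

# The slide's UPDATE succeeds: 'Microsoft' exists in Companies.
conn.execute("""
    UPDATE Users SET Company = 'Microsoft'
    WHERE "User" = 'Dale Cooper' AND Company = 'Google'
""")
dale_company = conn.execute(
    """SELECT Company FROM Users WHERE "User" = 'Dale Cooper'"""
).fetchone()[0]

# Pointing a user at a company that does not exist is rejected.
try:
    conn.execute(
        """UPDATE Users SET Company = 'Apple' WHERE "User" = 'Jack Snack'"""
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The constraint leaves valid changes alone and blocks only the ones that would leave a Users row pointing at a missing company.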
The Law of Diminishing Returns
Thank You!
@denisreznik
[email protected]
http://reznik.uneta.com.ua/
https://www.facebook.com/denis.reznik.5
https://www.linkedin.com/pub/denis-reznik/3/502/234