avoiding common database pitfalls

Avoiding Common Database Pitfalls Derek Binkley @DerekB_WI

Upload: derek-binkley

Post on 09-Jan-2017





Page 1: Avoiding common database pitfalls

Avoiding Common Database Pitfalls

Derek Binkley @DerekB_WI

Page 2: Avoiding common database pitfalls

About Me
★ Lead Developer at National Conference of Bar Examiners
★ PHP and Java Developer
★ MySQL DBA
★ Father of Three
★ First ever talk

Page 3: Avoiding common database pitfalls

About NCBE

Examining Bars?

Page 4: Avoiding common database pitfalls

About NCBE

No, developing the bar exam for future lawyers.

Plus, supporting state admission authorities.

Page 5: Avoiding common database pitfalls

What is this talk about anyway?

Relational databases:
• Why to use them
• Avoiding common mistakes
• Getting the most out of a database
• Good data design

Page 6: Avoiding common database pitfalls

Common ways to store data
★ Relational database (MySQL, SQL Server, Oracle, Postgres)
★ NoSQL/document database/key-value store (CouchDB, MongoDB, Redis)
★ File system
★ Custom-built solutions

Page 7: Avoiding common database pitfalls

Database Normalization

1st Normal Form
• Primary Key enforcing uniqueness
• Consistent Field Types
• Consistent Column Count

2nd Normal Form
• Parent/Child Relations
• Foreign Keys

3rd Normal Form
• No transitive dependencies
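As a rough illustration of the three forms listed above, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for the databases named in the talk, and the `customer` and `customer_order` tables and their columns are hypothetical names chosen for this example, not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- 1NF: every row has a primary key and every column has one declared type.
    CREATE TABLE customer (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        email TEXT NOT NULL
    );
    -- 2NF as summarized on the slide: child rows point at their parent
    -- through a foreign key.
    -- 3NF: the customer's email is not copied here, because it depends on
    -- customer_id rather than on the order itself.
    CREATE TABLE customer_order (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        total_cents INTEGER NOT NULL
    );
""")
conn.execute("INSERT INTO customer (id, name, email) VALUES (1, 'Ada', 'ada@example.com')")
conn.execute("INSERT INTO customer_order (customer_id, total_cents) VALUES (1, 2500)")

# The email is stored once and reached through the relationship.
order_with_email = conn.execute("""
    SELECT o.id, o.total_cents, c.email
    FROM customer_order AS o
    JOIN customer AS c ON c.id = o.customer_id
""").fetchone()
print(order_with_email)
```

If the email changes, only one row in `customer` needs updating, which is the practical payoff of removing the transitive dependency.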

Page 8: Avoiding common database pitfalls

Pitfall #1

Not understanding how to structure data

Page 9: Avoiding common database pitfalls

No clearly defined columns

Columns were left unlabeled so that customers could define their own data.

Very difficult for future developers

Page 10: Avoiding common database pitfalls

SQL with no defined columns

How maintainable is this?
Do you know what this does?
How would you map this to an object in PHP?

Page 11: Avoiding common database pitfalls

SQL with well-defined columns

This statement makes it clear what you are selecting and what you will get.
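The SQL statements shown on these two slides are not preserved in this transcript. As a hypothetical stand-in, the sketch below contrasts a table of generic columns with a table of named columns, using Python's sqlite3 module. The table names, column names and data are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical "no defined columns" design: meaning lives outside the schema.
    CREATE TABLE record_generic (id INTEGER PRIMARY KEY, field1 TEXT, field2 TEXT, field3 TEXT);
    -- Hypothetical well-defined design: the schema documents itself.
    CREATE TABLE applicant (id INTEGER PRIMARY KEY, last_name TEXT, state_code TEXT, exam_date TEXT);
    INSERT INTO record_generic VALUES (1, 'Smith', 'WI', '2017-02-21');
    INSERT INTO applicant VALUES (1, 'Smith', 'WI', '2017-02-21');
""")
conn.row_factory = sqlite3.Row  # rows expose their column names

generic = conn.execute(
    "SELECT field1, field3 FROM record_generic WHERE field2 = 'WI'"
).fetchone()
named = conn.execute(
    "SELECT last_name, exam_date FROM applicant WHERE state_code = 'WI'"
).fetchone()

# The column names are all a mapper has to go on when building an object.
print(generic.keys())
print(named.keys())
```

Both queries return the same values, but only the second result could be mapped to an object without consulting outside documentation.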

Page 12: Avoiding common database pitfalls

A more flexible solution?

If Developer Needs
• Flexible data storage
• Customer ability to define meaning of data

NoSQL – Great use case.
• Allows data design to change on the fly
• Document defines meaning of underlying data.

Page 13: Avoiding common database pitfalls

Not using foreign keys

Not taking advantage of the database’s ability to store and index a number as a foreign key.

Violates 2nd Normal Form.
Cannot ensure data consistency.
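A minimal sketch of what a declared foreign key buys you, using Python's sqlite3 module. The `department` and `employee` tables are hypothetical. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`; other databases have their own defaults.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE employee (
        id            INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        department_id INTEGER NOT NULL REFERENCES department(id)
    );
    INSERT INTO department (id, name) VALUES (1, 'Testing');
""")
conn.execute("INSERT INTO employee (name, department_id) VALUES ('Ada', 1)")

try:
    # Department 99 does not exist, so the database refuses the row.
    conn.execute("INSERT INTO employee (name, department_id) VALUES ('Bob', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```

Without the `REFERENCES` clause the second insert would succeed, and the application would be left to discover the orphaned row later.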

Page 14: Avoiding common database pitfalls

Keys with meaning

In this example any ID with a value < 100 gets administrative rights.

Problems?
• The meaning of the data is not clear without the PHP code above.
• Administrative access could accidentally be given.
• Artificially limited to 99 administrators over the life of the system.
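The PHP code from this slide is not preserved in the transcript. The sketch below is a hypothetical Python rendering of the same idea, next to one possible alternative: an explicit `is_admin` column. The `app_user` table, the column and both function names are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (
        id       INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        is_admin INTEGER NOT NULL DEFAULT 0 CHECK (is_admin IN (0, 1))
    );
    INSERT INTO app_user (id, name, is_admin) VALUES (42, 'Regular user with a low id', 0);
    INSERT INTO app_user (id, name, is_admin) VALUES (5000, 'Late-arriving administrator', 1);
""")

def is_admin_by_id(user_id):
    # The pattern the slide warns about: the key itself carries meaning.
    return user_id < 100

def is_admin_by_column(conn, user_id):
    # The meaning is stored as data, so the schema explains itself.
    row = conn.execute(
        "SELECT is_admin FROM app_user WHERE id = ?", (user_id,)
    ).fetchone()
    return row is not None and row[0] == 1

print(is_admin_by_id(42), is_admin_by_column(conn, 42))
print(is_admin_by_id(5000), is_admin_by_column(conn, 5000))
```

User 42 is granted rights by accident under the id rule, and user 5000 can never be an administrator under it, which matches the problems listed on the slide.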

Page 15: Avoiding common database pitfalls

Natural Key v. Surrogate Key
• A surrogate key has no meaning, e.g. GUID, incremental key
• A natural key is a unique data attribute already in the table, e.g. two-letter state code.

Page 16: Avoiding common database pitfalls

Reduce Joins with Natural Key

Surrogate keys require a join to get the state for a customer's address.

A natural key allows understanding of the state field without a join.
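To make the join difference concrete, here is a small sketch with Python's sqlite3 module. It assumes a state code such as 'WI' as the natural key; the table names (`state_s`, `address_s`, `state_n`, `address_n`) and the surrogate id 17 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Surrogate key: state rows are identified by a meaningless integer.
    CREATE TABLE state_s (id INTEGER PRIMARY KEY, code TEXT NOT NULL UNIQUE);
    CREATE TABLE address_s (
        id INTEGER PRIMARY KEY, city TEXT,
        state_id INTEGER REFERENCES state_s(id)
    );
    INSERT INTO state_s VALUES (17, 'WI');
    INSERT INTO address_s VALUES (1, 'Madison', 17);

    -- Natural key: the state code itself is the primary key.
    CREATE TABLE state_n (code TEXT PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE address_n (
        id INTEGER PRIMARY KEY, city TEXT,
        state_code TEXT REFERENCES state_n(code)
    );
    INSERT INTO state_n VALUES ('WI', 'Wisconsin');
    INSERT INTO address_n VALUES (1, 'Madison', 'WI');
""")

# Surrogate design: the address row only holds 17, so a join is needed.
with_join = conn.execute("""
    SELECT a.city, s.code
    FROM address_s AS a
    JOIN state_s AS s ON s.id = a.state_id
""").fetchone()

# Natural design: the address row is readable on its own.
without_join = conn.execute("SELECT city, state_code FROM address_n").fetchone()

print(with_join, without_join)
```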

Page 17: Avoiding common database pitfalls

Natural Key May Change

A primary key must be stable over time.
In the United States, two-letter state codes are historically stable.
Data thought to be stable may not be. Examples:
• Country code – Yugoslavia? Soviet Union?
• SSN – Privacy concerns require SSN to be encrypted?
• Naturalized surrogate key – Once a surrogate key is shown to the user it starts to have meaning.

Page 18: Avoiding common database pitfalls

Pitfall #2

Not using the database for what it is good at

Page 19: Avoiding common database pitfalls

Not defining keys/indexes
1. Ensure Data Integrity
2. Boost Performance

The database system optimizes queries for you.
An explain plan can show how efficiently your keys are being used.
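As a sketch of what an explain plan can tell you, the example below uses SQLite's `EXPLAIN QUERY PLAN` through Python's sqlite3 module. MySQL, Postgres and others have their own `EXPLAIN` syntax and output, so treat this as one illustration only. The `exam_result` table, the index name and the `plan` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE exam_result (id INTEGER PRIMARY KEY, applicant_id INTEGER, score INTEGER)"
)

def plan(conn, sql):
    # Each row of EXPLAIN QUERY PLAN ends with a human-readable detail string.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT score FROM exam_result WHERE applicant_id = 7"

before = plan(conn, query)  # no index yet: expect a full table scan
conn.execute("CREATE INDEX idx_exam_result_applicant ON exam_result (applicant_id)")
after = plan(conn, query)   # the plan should now mention the index

print(before)
print(after)
```

The exact wording of the plan differs between SQLite versions, but the second plan names the index while the first does not.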

Page 20: Avoiding common database pitfalls

Custom Transaction Handling
• Databases are built to handle transactions.
• App created its own locking table.
• Developers spent time recreating a feature they had already paid for.
• Deadlock condition required contacting tech support.
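For contrast with a hand-built locking table, here is a minimal sketch of letting the database handle the transaction, using Python's sqlite3 module. The `account` table and the `transfer` function are hypothetical; the point is only that a failed step undoes the whole unit of work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100), (2, 0)")
conn.commit()

def transfer(conn, src, dst, amount):
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))

transfer(conn, 1, 2, 30)

try:
    # Overdraws account 1, so the CHECK fails after account 2 was credited.
    transfer(conn, 1, 2, 500)
except sqlite3.IntegrityError:
    pass

balances = conn.execute("SELECT balance FROM account ORDER BY id").fetchall()
print(balances)
```

The credit to account 2 in the failed transfer is rolled back together with the rejected debit, with no application-level locking code.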

Page 21: Avoiding common database pitfalls

Not using processing power of DB

Database is not just a storage engine.
Powerful platform for sorting, filtering, grouping and summarizing data.

Where do you put your domain logic?

Page 22: Avoiding common database pitfalls

Domain Logic In Memory

Logic is entirely in your PHP code, database is merely used for storage.
• Easy to refactor
• Easy to unit test
• Part of your version control system
• Easy to deploy
• Inefficient use of SQL
• High overhead for database connections and queries

Page 23: Avoiding common database pitfalls

Domain Logic In Database

Some logic is in SQL (procedures, views or direct SQL).
More efficient: less memory in PHP, less connection time.
SQL statement will be much smaller and possibly easier to understand.
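The two placements of domain logic can be compared side by side in a small sketch. It is written in Python with sqlite3 rather than the PHP the talk assumes, and the `exam_score` table and its data are made up for the example.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exam_score (jurisdiction TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO exam_score VALUES (?, ?)",
    [("WI", 140), ("WI", 150), ("MN", 130), ("MN", 160), ("MN", 148)],
)

# In memory: fetch every row, then group and average in application code.
scores = defaultdict(list)
for jurisdiction, score in conn.execute("SELECT jurisdiction, score FROM exam_score"):
    scores[jurisdiction].append(score)
in_memory = sorted((j, sum(s) / len(s)) for j, s in scores.items())

# In the database: one statement returns only the summary rows.
in_database = conn.execute("""
    SELECT jurisdiction, AVG(score)
    FROM exam_score
    GROUP BY jurisdiction
    ORDER BY jurisdiction
""").fetchall()

print(in_memory)
print(in_database)
```

Both approaches give the same averages. The in-memory version transfers every row to the application, while the SQL version transfers one row per jurisdiction.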

Page 24: Avoiding common database pitfalls

Object Relational Mapping - ORM
• Simplifies data access by mapping tables to objects.
• Can lead to more complex in-memory manipulation of data.
• Can tie in to query mechanism to offload work and logic to database.

Page 25: Avoiding common database pitfalls

Mishandling of nulls
• Null should not be meaningful.
• Null = “I don’t know”
• Null column not equal to another null column
• Check for “is null”
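The null rules above can be checked directly. This is a minimal sketch with Python's sqlite3 module, where SQL NULL arrives in Python as `None`; the `applicant` table and `middle_name` column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE applicant (id INTEGER PRIMARY KEY, middle_name TEXT)")
conn.executemany(
    "INSERT INTO applicant VALUES (?, ?)",
    [(1, None), (2, None), (3, "Lee")],
)

# NULL = NULL evaluates to NULL (unknown), not true.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# "= NULL" therefore matches nothing; "IS NULL" is the correct test.
with_equals = conn.execute(
    "SELECT COUNT(*) FROM applicant WHERE middle_name = NULL"
).fetchone()[0]
with_is_null = conn.execute(
    "SELECT COUNT(*) FROM applicant WHERE middle_name IS NULL"
).fetchone()[0]

print(null_equals_null, with_equals, with_is_null)
```

Two rows have no middle name, yet the `= NULL` comparison finds none of them.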

Page 26: Avoiding common database pitfalls

Resources
• A Simple Guide to Five Normal Forms in Relational Database Theory - http://www.bkent.net/Doc/simple5.htm
• Normalization: Friend or Foe? - http://www.treelinedesign.com/slides/normalization/confoo14.pdf
• Domain Logic and SQL - http://martinfowler.com/articles/dblogic.html
• Choosing a Primary Key: Natural or Surrogate? - http://www.agiledata.org/essays/keys.html
• Surrogate Key vs. Natural Key - http://sqlmag.com/business-intelligence/surrogate-key-vs-natural-key

Slides are available at ????

Please give feedback at https://joind.in/16419