5 key considerations for data modeling
DESCRIPTION
See the companion webinar at: http://embt.co/1sc5YZl
According to a recent report from SINTEF, an independent research organization in Scandinavia, 90% of all the data in the world has been generated over the last two years. Where does it all go? And how do we make sense of it all? To get control of the data, you have to know more about that data and how it relates to other data. Data modeling provides the infrastructure needed to capture the right level of information about the data and its associated metadata. In this webinar, Torquil Harkness will discuss five key considerations for effective data modeling, including:
+ Aspects of model design
+ Attributes such as naming standards
+ The importance of planning for future growth
TRANSCRIPT
EMBARCADERO TECHNOLOGIES
5 Key considerations for data modeling
Presenter: Torquil Harkness
Technical Writer
Data…
• Every minute of every day we create:
- More than 204 million email messages
- Over 2 million Google search queries
- 48 hours of new YouTube videos
- More than 100,000 tweets
• Megabyte, gigabyte, terabyte, petabyte, exabyte, zettabyte, yottabyte
The new NSA facility in Utah can hold 5 zettabytes of data.
To store only 1 ZB of data, it would take 62.5 billion iPhones!
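The slide does not say which iPhone capacity the comparison uses. As a rough check, the figure works out if one assumes a 16 GB iPhone and decimal (SI) units; both are assumptions, not statements from the presentation.

```python
# Sanity check of the "62.5 billion iPhones" comparison.
# Assumptions (not stated on the slide): 16 GB per iPhone, decimal SI units.
ZETTABYTE = 10**21                # bytes in 1 ZB
ASSUMED_IPHONE_BYTES = 16 * 10**9 # bytes in a 16 GB iPhone

iphones_per_zb = ZETTABYTE // ASSUMED_IPHONE_BYTES
iphones_for_facility = 5 * iphones_per_zb  # the 5 ZB capacity quoted for the Utah facility

print(f"{iphones_per_zb:,} iPhones per zettabyte")
print(f"{iphones_for_facility:,} iPhones for 5 zettabytes")
```

With a different assumed capacity the count scales inversely, so the headline number is best read as an order-of-magnitude illustration.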
Island Clinic - Ebola Treatment Centre
WHO /C. Black http://www.who.int/features/2014/liberia-ebola-island-clinic/en/
Data… used to save lives
Mobile phone location data
Data… used to save lives
Integrating data sets from anonymised mobile phone usage and demographic indicators.
Image Credit: PLOS Currents.
Topics
• Model Design
• Planning for Growth
• Naming Standards
• Data Lineage
• Big Data
Model Design
Data
Sales
Logic
Model Design
Visualizing can be useful to see the results
Chart values, in slide order: 75%, 25%, 25%, 12.50%, 12.50%
Chart labels, in slide order: All purpose flour, Cake flour, Granulated sugar, Butter, Eggs
Model Design
• Logical Model
- The organisation of your data. Basically, the Blueprint.
• Physical Model
- The ‘physical structure’ of the data in the database.
• Normalisation
- Eliminating redundancy and mitigating corruption.
- 1NF: the key, 2NF: the whole key, 3NF: nothing but the key.
So help me Codd, Edgar F.
Model Design
Customer Name | Customer Address | Customer Tel No. | Product | Cost
Holmes, S | 221B Baker St, London | +44 1632 960957 | Hat | 44.99
Holmes, S | 221B Baker St, London | +44 1632 960957 | Pipe | 22.99
Fletcher, J | 698 Candlewood Lane, Cabot Cove, Maine | +001 1632 960428 | Typewriter | 129.99
Fletcher, J | 698 Candlewood Lane, Cabot Cove, Maine | +001 1632 960428 | Hat | 44.99
Model Design
Customer
Customer ID | Customer Name | Customer Address | Customer Tel No.
20 | Holmes, S | 221B Baker St, London | +44 1632 960957
30 | Fletcher, J | 698 Candlewood Lane, Cabot Cove, Maine | +001 1632 960428
Orders
Order ID | Customer ID
ORD001 | 20
ORD002 | 30
ORD003 |
Order details
Order ID | Product ID | Quantity
ORD001 | 001 | 1
ORD001 | 002 | 1
ORD002 | 003 | 1
ORD002 | 001 | 1
Products
Product ID | Product | Cost
001 | Hat | 44.99
002 | Pipe | 22.99
003 | Typewriter | 129.99
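To make the split into Customer, Orders, Order details and Products concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names are hypothetical, chosen only for illustration; the rows come from the slide. ORD003 is left out because the slide shows no customer for it. The final query joins the four tables to rebuild the original flat sales listing, which suggests that no information was lost by normalising.

```python
import sqlite3

# Hypothetical schema names; the data values are taken from the slide.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    address     TEXT NOT NULL,
    tel_no      TEXT NOT NULL
);
CREATE TABLE product (
    product_id TEXT PRIMARY KEY,
    name       TEXT NOT NULL,
    cost       REAL NOT NULL
);
CREATE TABLE orders (
    order_id    TEXT PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
);
CREATE TABLE order_details (
    order_id   TEXT NOT NULL REFERENCES orders(order_id),
    product_id TEXT NOT NULL REFERENCES product(product_id),
    quantity   INTEGER NOT NULL,
    PRIMARY KEY (order_id, product_id)
);
""")

# Each customer and product is stored exactly once.
conn.executemany("INSERT INTO customer VALUES (?, ?, ?, ?)", [
    (20, "Holmes, S", "221B Baker St, London", "+44 1632 960957"),
    (30, "Fletcher, J", "698 Candlewood Lane, Cabot Cove, Maine", "+001 1632 960428"),
])
conn.executemany("INSERT INTO product VALUES (?, ?, ?)", [
    ("001", "Hat", 44.99), ("002", "Pipe", 22.99), ("003", "Typewriter", 129.99),
])
conn.executemany("INSERT INTO orders VALUES (?, ?)", [("ORD001", 20), ("ORD002", 30)])
conn.executemany("INSERT INTO order_details VALUES (?, ?, ?)", [
    ("ORD001", "001", 1), ("ORD001", "002", 1),
    ("ORD002", "003", 1), ("ORD002", "001", 1),
])

# Rebuild the flat, unnormalised view by joining the four tables.
flat_rows = conn.execute("""
    SELECT c.name, c.address, c.tel_no, p.name, p.cost
    FROM order_details AS d
    JOIN orders   AS o ON o.order_id = d.order_id
    JOIN customer AS c ON c.customer_id = o.customer_id
    JOIN product  AS p ON p.product_id = d.product_id
    ORDER BY d.order_id, d.product_id
""").fetchall()

for row in flat_rows:
    print(row)
```

Because an address or a price now lives in a single row, changing it is one update rather than an edit to every sales line that repeats it.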
Naming Standards
• An example of a very short naming standard.
tNYEZC - table of NY Employees Zip Code.
Naming Standards
• Be clear and understandable to everyone.
• Add a descriptive detail, such as a tbl prefix for a table.
• Use a ‘naming standards template’ to ensure consistency.
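One way to apply a naming standards template is to express it as a pattern and check names against it automatically. The sketch below is only an illustration: the template itself (an object-type prefix such as tbl, followed by lowercase words joined by underscores) is a hypothetical convention, not one prescribed in the presentation.

```python
import re

# Hypothetical template: object-type prefix, then lowercase words joined by underscores.
# The prefixes tbl, vw and idx are illustrative assumptions.
NAME_TEMPLATE = re.compile(r"(tbl|vw|idx)_[a-z]+(_[a-z]+)*")

def follows_standard(name: str) -> bool:
    """Return True when the whole name matches the assumed template."""
    return NAME_TEMPLATE.fullmatch(name) is not None

candidates = ["tNYEZC", "tbl_ny_employee_zip_code"]
results = {name: follows_standard(name) for name in candidates}

for name, ok in results.items():
    print(f"{name}: {'ok' if ok else 'does not follow the template'}")
```

The terse tNYEZC from the earlier slide fails the check, while the spelled-out name passes, which is the kind of consistency a shared template is meant to enforce.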
Planning for Growth
• Each engine of a jet on a flight from London to New York generates 10TB of data every 30 minutes.
Source: Pratt and Whitney.
• 90% of the world’s data has been generated over the last two years.
Source: Science Daily.
Planning for Growth
• Planning for Storage
• Predicting Growth
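As a rough illustration of predicting growth, the 90% figure quoted earlier can be turned into an implied growth rate: if 90% of all data is less than two years old, the total has grown tenfold in two years, or roughly 3.16 times per year. The sketch below does that arithmetic and projects a storage estate forward. The 100 TB starting size is a hypothetical example, and assuming the rate stays constant is a simplification.

```python
import math

def implied_annual_growth(new_share: float, years: float) -> float:
    """Annual growth factor implied when `new_share` of all data is newer than `years`."""
    total_over_old = 1.0 / (1.0 - new_share)  # 90% new means total = 10x the old data
    return total_over_old ** (1.0 / years)

def projected_size(current_tb: float, annual_factor: float, years: int) -> float:
    """Projected size in TB, assuming the growth factor stays constant (a simplification)."""
    return current_tb * annual_factor ** years

annual = implied_annual_growth(0.90, 2)
print(f"Implied annual growth factor: {annual:.2f}")

# Hypothetical starting point of 100 TB, projected over three years.
for year in range(1, 4):
    print(f"Year {year}: {projected_size(100.0, annual, year):,.0f} TB")
```

Real capacity planning would use measured growth from the organisation's own systems rather than a global headline figure.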
Data Lineage
• FACT: 73.8 percent of facts are made up!
Data Lineage
• The Data Trail
Big Data
Summary
• Model Design
• Planning for Growth
• Naming Standards
• Data Lineage
• Big Data
Concluding Remarks
Thank you!
• Product Videos: http://www.embarcadero.com/products/er-studio/product-videos
• Wiki and Documentation: http://docs.embarcadero.com/
• Learn more about the ER/Studio product family: http://www.embarcadero.com/data-modeling
• Trial Downloads: http://www.embarcadero.com/downloads
• To arrange a demo, please contact Embarcadero Sales: [email protected], (888) 233-2224