NoSQL databases

NoSQL Database: New Era of Databases for Big data Analytics - Classification, Characteristics and Comparison A B M Moniruzzaman and Syed Akhter Hossain 06/07/22 1 CSC 8710

DESCRIPTION

This presentation explains why NoSQL databases emerged even though SQL databases have been a successful technology for more than twenty years. It also discusses the characteristics and classification of NoSQL databases. Finally, the slides briefly cover four categories of NoSQL databases.

TRANSCRIPT

Page 1: NoSQL databases

NoSQL Database: New Era of Databases for Big data Analytics - Classification, Characteristics and

Comparison

A B M Moniruzzaman and Syed Akhter Hossain

04/10/23 CSC 8710 1

Page 2: NoSQL databases

Contents

• NoSQL databases definition

• Why NoSQL databases?

• Characteristics of NoSQL Databases

• Primary Uses of NoSQL Database

• Key-Value databases

• Document databases

• Column-Family databases

• Graph databases

• Adoption of NoSQL Database

• Conclusion


Page 3: NoSQL databases

NoSQL Database

• NoSQL, for "Not Only SQL", refers to an eclectic and increasingly familiar group of non-relational data management systems.

• These databases are not built primarily on tables and generally do not use SQL for data manipulation.

• NoSQL systems are distributed, non-relational databases designed for large-scale data storage and for massively parallel data processing across a large number of commodity servers.


Page 4: NoSQL databases

NoSQL Database

• They also use non-SQL languages and mechanisms to interact with data.

• NoSQL database systems arose alongside major Internet companies such as Google, Amazon, and Facebook, which faced challenges in dealing with huge quantities of data.

• These systems are designed to scale to thousands or millions of users performing updates as well as reads, in contrast to traditional DBMSs and data warehouses.


Page 5: NoSQL databases

Why NoSQL?

• Relational DBMSs have been a successful technology for many years, providing persistence, concurrency control and integration mechanisms.

• The need to process large amounts of data has shifted the direction from scaling vertically (a bigger single server) to scaling horizontally across clusters of servers.
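
Horizontal scaling usually means splitting the data across the servers of a cluster. The following is a minimal sketch of one common way to do that, hash partitioning; the node names, the key format and the helper functions are assumptions made for illustration and do not come from the slides.

```python
import hashlib

def node_for_key(key, nodes):
    """Pick the node responsible for a key by hashing the key.

    The same key always maps to the same node, so reads know where to look.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]

def partition(keys, nodes):
    """Group keys by the node that would store them."""
    placement = {node: [] for node in nodes}
    for key in keys:
        placement[node_for_key(key, nodes)].append(key)
    return placement

# Hypothetical three-node cluster and twelve user records.
nodes = ["node-a", "node-b", "node-c"]
keys = ["user:%d" % i for i in range(12)]
placement = partition(keys, nodes)

for node, stored in placement.items():
    print(node, len(stored), "keys")
```

With plain modulo hashing, adding a node changes where most keys live; production systems typically use schemes such as consistent hashing to limit that movement.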


Page 6: NoSQL databases

Why NoSQL?

• NoSQL databases focus on analytical processing of large-scale datasets, offering increased scalability on commodity hardware.

• Organizations that collect large amounts of unstructured data are increasingly turning to non-relational databases (NoSQL databases).


Page 7: NoSQL databases

Big Data


Page 8: NoSQL databases

Characteristics of NoSQL Databases

• Strong Consistency: all clients see the same version of the data.

• High Availability: the data is always available; at least one copy of the requested data can be served even if one of the nodes is down.

• Partition Tolerance: the system as a whole keeps its characteristics even when it is deployed across different servers.
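
The tension between availability and consistency can be illustrated with a toy replication model. This is only a sketch under simplifying assumptions: the `Replica` class and the `write`/`read` helpers are hypothetical, and no real database works exactly this way.

```python
class Replica:
    """One node holding a full copy of the data (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}

def write(replicas, key, value):
    """Apply the write to every replica that is currently reachable."""
    for replica in replicas:
        if replica.up:
            replica.data[key] = value

def read(replicas, key):
    """Serve the read from the first reachable replica that holds the key."""
    for replica in replicas:
        if replica.up and key in replica.data:
            return replica.data[key]
    raise LookupError("no reachable replica holds %r" % key)

replicas = [Replica("n1"), Replica("n2"), Replica("n3")]
write(replicas, "cart:42", ["book"])

replicas[0].up = False                           # n1 goes down
value_during_outage = read(replicas, "cart:42")  # still served by n2

write(replicas, "cart:42", ["book", "pen"])      # n1 misses this update
replicas[0].up = True                            # n1 returns with an old copy
stale_value = read(replicas, "cart:42")          # n1 answers first
```

The read during the outage succeeds (availability), but after `n1` returns, clients can see an older version until the replicas are reconciled, which is exactly what strong consistency rules out.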


Page 9: NoSQL databases

Characteristics of NoSQL Databases


Page 10: NoSQL databases

Primary Uses of NoSQL Database

1. Large-scale data processing

2. Exploratory analytics on semi-structured data (expert level)

3. Large volume data storage.


Page 11: NoSQL databases

Classification of NoSQL Databases

• Key-Value databases

• Document databases

• Column-Family databases

• Graph databases


Page 12: NoSQL databases

Key-Value Databases

• These data management systems store items under alphanumeric identifiers (keys). Each key has associated values.

• The values can be simple text strings or more complex lists and sets.

• Search is performed only against keys and is limited to exact matches.

• Search cannot be performed against values.
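
The access pattern described above can be sketched with an in-memory store. The `KeyValueStore` class and the key names are hypothetical and stand in for no particular product.

```python
class KeyValueStore:
    """Toy key-value store: values are opaque, lookups are exact-match on keys."""

    def __init__(self):
        self._items = {}

    def put(self, key, value):
        self._items[key] = value

    def get(self, key, default=None):
        # Only the exact key matches; there is no prefix or value search.
        return self._items.get(key, default)

    def delete(self, key):
        self._items.pop(key, None)

store = KeyValueStore()
store.put("product:1001:name", "Espresso machine")           # simple string
store.put("cart:alice", ["product:1001", "product:2040"])    # list value
store.put("tags:product:1001", {"kitchen", "coffee"})        # set value

name = store.get("product:1001:name")
missing = store.get("product:1001")   # a partial key is not a match
```

Finding every cart that contains a given product would require reading all values in application code, because the store itself offers no search against values.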


Page 13: NoSQL databases

Key-Value Databases


Page 14: NoSQL databases

Key-Value Characteristics

• The simplicity of key-value stores makes them very quick and lightweight.

• Highly scalable retrieval of the values needed for application tasks, such as retrieving product names.

• This is why Amazon uses a key-value system, Dynamo, in its shopping cart. Dynamo is a highly available key-value storage system.

• Examples: Dynamo (Amazon), Voldemort (LinkedIn), Redis, BerkeleyDB, Riak


Page 15: NoSQL databases

Pros and Cons

• Pros: anything can be stored in an aggregate.

• Cons: only a key lookup that returns the entire aggregate is allowed (there is no query mechanism and no way to retrieve part of an aggregate).


Page 16: NoSQL databases

Document Database

• Designed to manage and store documents.

• These documents are encoded in a standard data exchange format such as XML, JSON (JavaScript Object Notation) or BSON (Binary JSON).


Page 17: NoSQL databases

Document Database


Page 18: NoSQL databases

Primary Uses

• Document databases are good for storing and managing Big Data-size collections of literal documents, such as text documents and email messages.


Page 19: NoSQL databases

Pros And Cons

• Pros: allows structured queries and partial retrieval of an aggregate based on the fields in the aggregate.

• Cons: imposes a limit on what can be placed in the database.
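
Field-based queries and partial retrieval can be sketched with JSON documents held in memory. The sample emails and the `find` helper are invented for illustration; real document databases expose their own query languages.

```python
import json

# Hypothetical email documents, encoded as JSON text.
raw_documents = [
    '{"_id": 1, "type": "email", "from": "ann@example.com", '
    '"subject": "Q3 report", "body": "Draft attached."}',
    '{"_id": 2, "type": "note", "title": "Shopping", "body": "Milk, eggs"}',
    '{"_id": 3, "type": "email", "from": "bob@example.com", '
    '"subject": "Lunch?", "body": "Noon works."}',
]
collection = [json.loads(text) for text in raw_documents]

def find(collection, criteria, fields=None):
    """Return documents whose fields equal `criteria`.

    If `fields` is given, return only those fields of each match
    (partial retrieval) instead of the whole document.
    """
    results = []
    for doc in collection:
        if all(doc.get(name) == value for name, value in criteria.items()):
            if fields is None:
                results.append(doc)
            else:
                results.append({f: doc[f] for f in fields if f in doc})
    return results

email_subjects = find(collection, {"type": "email"}, fields=["subject"])
from_ann = find(collection, {"from": "ann@example.com"})
```

Unlike a key-value lookup, the query inspects the contents of each document, which is why the store must be able to parse the document format.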


Page 20: NoSQL databases

Column-Family Databases

• A column-family database consists of key-value pairs in which each value is a set of columns.

• Column-family databases are represented as tables, with each key-value pair being a row.

• All related data can be grouped into one family.
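
One way to picture this layout is as nested maps: row key, then column family, then column. The `ColumnFamilyTable` class and the sample families below are hypothetical and only sketch the data model.

```python
class ColumnFamilyTable:
    """Toy column-family table: row key -> family -> {column: value}."""

    def __init__(self, families):
        self.families = set(families)   # families are fixed up front
        self.rows = {}

    def put(self, row_key, family, column, value):
        if family not in self.families:
            raise KeyError("unknown column family: %s" % family)
        row = self.rows.setdefault(row_key, {})
        row.setdefault(family, {})[column] = value

    def get_family(self, row_key, family):
        """Return a copy of all columns of one family for one row."""
        return dict(self.rows.get(row_key, {}).get(family, {}))

users = ColumnFamilyTable(families=["profile", "activity"])
users.put("user:1", "profile", "name", "Alice")
users.put("user:1", "profile", "city", "Detroit")
users.put("user:1", "activity", "last_login", "2014-11-02")
users.put("user:2", "profile", "name", "Bob")   # rows may have different columns

profile = users.get_family("user:1", "profile")
```

Reading one family touches only the related columns of a row, and different rows are free to carry different columns within the same family.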


Page 21: NoSQL databases

Primary Uses

1. Large-scale, batch-oriented data processing: sorting, parsing and conversion, such as:

- conversions between hexadecimal, binary and decimal code values.

2. Exploratory and predictive analytics performed by expert statisticians and programmers.
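
The conversion task named in item 1 is simple for a single value; the following sketch uses only Python built-ins, and the `convert` helper is a name chosen for illustration. A batch job would apply the same step to every record.

```python
def convert(text, from_base):
    """Parse `text` in the given base and render it in all three notations."""
    value = int(text, from_base)
    return {
        "decimal": str(value),
        "binary": format(value, "b"),
        "hexadecimal": format(value, "x"),
    }

from_hex = convert("ff", 16)     # hexadecimal input
from_bin = convert("1010", 2)    # binary input

print(from_hex)   # {'decimal': '255', 'binary': '11111111', 'hexadecimal': 'ff'}
print(from_bin)   # {'decimal': '10', 'binary': '1010', 'hexadecimal': 'a'}
```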


Page 22: NoSQL databases

Column-Family


Page 23: NoSQL databases

Graph Databases

• Graph databases replace relational tables with structured relational graphs of interconnected key-value pairings.

• Graph databases are useful when you are more interested in the relationships between data than in the data itself, which makes them a good fit for social networks.

• They are optimized for traversing relationships, not for general querying.

• Examples: Neo4j, InfoGrid, Sones GraphDB, AllegroGraph, InfiniteGraph
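
Relationship traversal can be sketched as a breadth-first search over an adjacency list. The `follows` graph and the `within_hops` function are made up for this example and do not reflect the API of any of the products listed.

```python
from collections import deque

# Hypothetical social graph: who follows whom.
follows = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave", "erin"],
    "dave": [],
    "erin": ["alice"],
}

def within_hops(graph, start, max_hops):
    """Return {person: hop count} for everyone reachable in at most max_hops."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue                      # do not expand past the hop limit
        for neighbour in graph.get(node, []):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    del distance[start]                   # the start node is not a result
    return distance

friends_of_friends = within_hops(follows, "alice", 2)
```

The traversal follows edges directly from node to node, which is the kind of operation a graph database is built to make cheap; a relational schema would need a join per hop.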


Page 24: NoSQL databases

Graph Databases


Page 25: NoSQL databases

Adoption of NoSQL Database

• Organizations with massive data storage needs are looking seriously at NoSQL.

• NoSQL database experts are in high demand at most development organizations.

• The next graph shows job trends for five NoSQL databases, based on data from Indeed.com.


Page 26: NoSQL databases

Job Trends of Five NoSQL Databases


Page 27: NoSQL databases

Adoption of NoSQL Database

• MongoDB's growth means that it has cemented its place as the most popular NoSQL database.

• According to LinkedIn profile statistics, mentions of NoSQL technologies account for 45% in LinkedIn profiles.


Page 28: NoSQL databases

LinkedIn statistics


Page 29: NoSQL databases

Conclusion

• The computational and storage requirements of applications such as Big Data analytics, business intelligence and social networking over petabyte-scale datasets have driven the shift from SQL to NoSQL databases.

• This led to the development of horizontally scalable, distributed, non-relational NoSQL databases.

• MongoDB is the most in-demand of them.


Page 30: NoSQL databases

Resources

• http://arxiv.org/ftp/arxiv/papers/1307/1307.0191.pdf

• http://en.wikipedia.org/wiki/Column_family

• http://en.wikipedia.org/wiki/NoSQL
