Page 1: NoSQL databases

NoSQL Database: New Era of Databases for Big data Analytics - Classification, Characteristics and Comparison

A B M Moniruzzaman and Syed Akhter Hossain

04/10/23 CSC 8710 1

Page 2: NoSQL databases

Contents

• NoSQL databases definition

• Why NoSQL databases?

• Characteristics of NoSQL Databases

• Primary Uses of NoSQL Database

• Key-Value databases

• Document databases

• Column-Family databases

• Graph databases

• Adoption of NoSQL Database

• Conclusion

Page 3: NoSQL databases

NoSQL Database

• NoSQL, for "Not Only SQL", refers to an eclectic and increasingly familiar group of non-relational data management systems.

• These databases are not built primarily on tables and generally do not use SQL for data manipulation.

• NoSQL systems are distributed, non-relational databases designed for large-scale data storage and massively parallel data processing across a large number of commodity servers.

Page 4: NoSQL databases

NoSQL Database

• They also use non-SQL languages and mechanisms to interact with data.

• NoSQL database systems arose alongside major Internet companies such as Google, Amazon, and Facebook, which faced challenges in dealing with huge quantities of data.

• These systems are designed to scale to thousands or millions of users performing updates as well as reads, in contrast to traditional DBMSs and data warehouses.

Page 5: NoSQL databases

Why NoSQL?

• Relational DBMSs have been a successful technology for many years, providing persistence, concurrency control and integration mechanisms.

• The need to process large amounts of data has shifted the direction from scaling vertically (a bigger single server) to scaling horizontally on clusters.
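As a rough illustration of scaling horizontally, the sketch below spreads keys across several nodes by hashing each key. The names shard_for and ShardedStore are hypothetical, and each "node" is only an in-memory dict standing in for a server; real systems use more elaborate schemes such as consistent hashing.

```python
import hashlib


def shard_for(key: str, num_nodes: int) -> int:
    """Map a key to a node index using a stable hash (hypothetical helper)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes


class ShardedStore:
    """Toy cluster: each node is a plain dict standing in for one server."""

    def __init__(self, num_nodes: int) -> None:
        self.nodes = [dict() for _ in range(num_nodes)]

    def put(self, key: str, value) -> None:
        # Only the node that owns the key stores it.
        self.nodes[shard_for(key, len(self.nodes))][key] = value

    def get(self, key: str):
        # The same hash routes the read to the same node.
        return self.nodes[shard_for(key, len(self.nodes))].get(key)


cluster = ShardedStore(num_nodes=4)
for i in range(100):
    cluster.put(f"user:{i}", {"id": i})
```

Adding capacity in this model means adding nodes rather than buying a larger machine, although plain modulo hashing would force most keys to move when the node count changes.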

Page 6: NoSQL databases

Why NoSQL?

• NoSQL databases focus on analytical processing of large-scale datasets, offering increased scalability over commodity hardware.

• Organizations that collect large amounts of unstructured data are increasingly turning to non-relational databases (NoSQL databases).

Page 7: NoSQL databases

Big Data

Page 8: NoSQL databases

Characteristics of NoSQL Databases

• Strong Consistency: all clients see the same version of data.

• High Availability: data is always available; at least one copy of the requested data can be served even if one of the nodes is down.

• Partition-tolerance: the total system keeps its characteristics even when it is deployed across different servers.
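The availability property above can be sketched with a toy replicated store: every write goes to all live replicas, and a read is served by any live replica that holds the key. ReplicatedStore and its up flags are hypothetical names used only for illustration, not the design of any particular product.

```python
class ReplicatedStore:
    """Toy store that keeps a copy of each item on several replicas."""

    def __init__(self, num_replicas: int = 3) -> None:
        self.replicas = [dict() for _ in range(num_replicas)]
        self.up = [True] * num_replicas  # simulated node health

    def put(self, key, value) -> None:
        # Write to every replica that is currently reachable.
        for i, replica in enumerate(self.replicas):
            if self.up[i]:
                replica[key] = value

    def get(self, key):
        # Serve the read from the first live replica that has the key.
        for i, replica in enumerate(self.replicas):
            if self.up[i] and key in replica:
                return replica[key]
        raise KeyError(key)


store = ReplicatedStore()
store.put("cart:alice", ["book"])
store.up[0] = False                 # one node goes down
value = store.get("cart:alice")     # still served by another replica
```

In this sketch a replica that is down during a write misses the update, which hints at why availability and strong consistency pull against each other.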

Page 9: NoSQL databases

Characteristics of NoSQL Databases

Page 10: NoSQL databases

Primary Uses of NoSQL Database

1. Large-scale data processing

2. Exploratory analytics on semi-structured data (expert level)

3. Large volume data storage.

Page 11: NoSQL databases

Classification of NoSQL Databases

• Key-Value databases

• Document databases

• Column-Family databases

• Graph databases

Page 12: NoSQL databases

Key-Value Databases

• These data management systems store items as alphanumeric identifiers (keys), each with an associated value.

• The values can be simple text strings or more complex lists and sets.

• Searches can be performed only against keys and are limited to exact matches.

• Searches cannot be performed against values.
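A minimal sketch of the key-value model described above, using a Python dict as the underlying table. The class name KeyValueStore and the sample keys are hypothetical; the point is that lookups succeed only on an exact key and the store never inspects values.

```python
class KeyValueStore:
    """Toy key-value store: opaque values, exact-match lookup by key only."""

    def __init__(self) -> None:
        self._items = {}

    def put(self, key: str, value) -> None:
        # The value may be a string, a list, a set, or any other object.
        self._items[key] = value

    def get(self, key: str):
        # Raises KeyError unless the key matches exactly.
        return self._items[key]


kv = KeyValueStore()
kv.put("product:42", "Laptop")
kv.put("cart:alice", ["product:42", "product:7"])
name = kv.get("product:42")
```

There is deliberately no method for searching by value or by partial key, which mirrors the restriction stated in the bullets.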

Page 13: NoSQL databases

Key-Value Databases

Page 14: NoSQL databases

Key-Value characteristics

• The simplicity of key-value stores makes them very quick and lightweight.

• They offer highly scalable retrieval of the values needed for application tasks, such as looking up product names.

• This is why Amazon uses a key-value system, Dynamo, for its shopping cart. Dynamo is a highly available key-value storage system.

• Examples: Dynamo (Amazon), Voldemort (LinkedIn), Redis, BerkeleyDB, Riak

Page 15: NoSQL databases

Pros and Cons

• pros: anything can be stored in an aggregate

• cons: the entire aggregate can be accessed only by key lookup (no queries on the aggregate and no retrieval of part of it)

Page 16: NoSQL databases

Document Database

• Designed to manage and store documents.

• These documents are encoded in a standard data exchange format such as XML, JSON (JavaScript Object Notation) or BSON (Binary JSON).
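To make the encoding step concrete, here is a small sketch that serializes a document to JSON and reads it back with Python's standard json module. The field names and the sample email are invented for illustration.

```python
import json

# A hypothetical email message represented as a document.
document = {
    "_id": "msg-001",
    "type": "email",
    "from": "alice@example.com",
    "to": ["bob@example.com"],
    "subject": "Meeting",
    "body": "See you at 10.",
}

encoded = json.dumps(document)   # document -> JSON text for storage or exchange
decoded = json.loads(encoded)    # JSON text -> document again
```

Because the structure travels with the document, different documents in the same collection can carry different fields.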

Page 17: NoSQL databases

Document Database

Page 18: NoSQL databases

Primary Uses

• Document databases are good for storing and managing Big Data-size collections of literal documents such as text documents and email messages.

Page 19: NoSQL databases

Pros And Cons

• pros: allows structured queries and partial aggregate retrieval based on the fields in the aggregate

• cons: imposes limits on the structure and types of what can be placed in the database
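The "structured queries and partial aggregate retrieval" point can be sketched as a filter over fields plus a projection. The function find and the sample collection are hypothetical and only imitate the idea; they do not reproduce the query API of any specific document database.

```python
def find(collection, criteria, fields=None):
    """Return documents whose fields equal the criteria.

    If fields is given, return only those fields of each match
    (partial retrieval); otherwise return a copy of the whole document.
    """
    results = []
    for doc in collection:
        if all(k in doc and doc[k] == v for k, v in criteria.items()):
            if fields is None:
                results.append(dict(doc))
            else:
                results.append({f: doc[f] for f in fields if f in doc})
    return results


emails = [
    {"_id": 1, "from": "alice", "subject": "Hi", "body": "Hello Bob"},
    {"_id": 2, "from": "bob", "subject": "Re: Hi", "body": "Hello Alice"},
    {"_id": 3, "from": "alice", "subject": "Report", "body": "Attached"},
]

subjects_from_alice = find(emails, {"from": "alice"}, fields=["subject"])
```

A key-value store could only hand back each whole aggregate by its key; here the query looks inside the documents and returns just the requested part.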

Page 20: NoSQL databases

Column-Family Databases

• A column-family database consists of key-value pairs in which each value is a set of columns.

• Column-family databases are represented as tables, with each key-value pair forming a row.

• Related data can be grouped together as one column family.
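One way to picture this layout is a nested mapping: row key, then column family, then individual columns. The class ColumnFamilyTable and the family names below are assumptions made for this sketch, not the storage format of any real system.

```python
class ColumnFamilyTable:
    """Toy table: row key -> column family -> {column name: value}."""

    def __init__(self, families) -> None:
        self.families = set(families)  # families are declared up front
        self.rows = {}

    def put(self, row_key, family, column, value) -> None:
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        row = self.rows.setdefault(row_key, {})
        row.setdefault(family, {})[column] = value

    def get_family(self, row_key, family):
        # Fetch all related columns of one row in a single lookup.
        return dict(self.rows.get(row_key, {}).get(family, {}))


users = ColumnFamilyTable(families=["profile", "activity"])
users.put("user:1", "profile", "name", "Alice")
users.put("user:1", "profile", "city", "Detroit")
users.put("user:1", "activity", "last_login", "2023-04-10")
profile = users.get_family("user:1", "profile")
```

Rows need not share the same columns; only the set of families is fixed in this sketch.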

Page 21: NoSQL databases

Primary Uses

1. Large-scale, batch-oriented data processing: sorting, parsing, and conversion (for example, conversions between hexadecimal, binary and decimal code values).

2. Exploratory and predictive analytics performed by expert statisticians and programmers.

Page 22: NoSQL databases

Column-Family

Page 23: NoSQL databases

Graph Databases

• Graph databases replace relational tables with structured relational graphs of interconnected key-value pairings.

• Graph databases are useful when you are more interested in the relationships between data than in the data itself, which makes them a good fit for social networks.

• They are optimized for traversing relationships, not for general querying.

• Examples: Neo4j, InfoGrid, Sones GraphDB, AllegroGraph, InfiniteGraph
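Relationship traversal can be sketched with a breadth-first walk over an adjacency list. The function within_hops and the sample follows graph are hypothetical; a real graph database would use its own storage and query language.

```python
from collections import deque


def within_hops(graph, start, max_hops):
    """Return {node: distance} for nodes reachable from start in <= max_hops."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    del seen[start]
    return seen


# Hypothetical social graph: who follows whom.
follows = {
    "ann": ["bob", "cat"],
    "bob": ["dan"],
    "cat": ["dan"],
    "dan": ["eve"],
}

friends_of_friends = within_hops(follows, "ann", max_hops=2)
```

The work done depends on the neighborhood actually visited rather than on the total size of the data, which is the kind of access pattern the bullets above describe.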

Page 24: NoSQL databases

Graph Databases

Page 25: NoSQL databases

Adoption of NoSQL Database

• Organizations that have massive data storage are looking seriously at NoSQL.

• NoSQL database experts are in high demand at most developing organizations.

• The next graph shows job trends for five NoSQL databases from Indeed.com.

Page 26: NoSQL databases

Job Trends of Five NoSQL Databases

Page 27: NoSQL databases

Adoption of NoSQL Database

• MongoDB's growth means that it has cemented its place as the most popular NoSQL database.

• According to LinkedIn profile mentions, the mentions of NoSQL technologies form 45% in LinkedIn profiles.

Page 28: NoSQL databases

LinkedIn statistics

Page 29: NoSQL databases

Conclusion

• The computational and storage requirements of applications such as Big Data analytics, business intelligence and social networking over petabyte-scale datasets have driven the shift from SQL to NoSQL databases.

• This led to the development of horizontally scalable, distributed, non-relational NoSQL databases.

• MongoDB is the one in highest demand.

Page 30: NoSQL databases

Resources

• http://arxiv.org/ftp/arxiv/papers/1307/1307.0191.pdf

• http://en.wikipedia.org/wiki/Column_family

• http://en.wikipedia.org/wiki/NoSQL
