Hands-on lab for designing MongoDB solutions for the Internet of Things and Big Data - by...

Designing MongoDB solutions for the Internet of Things and Big Data

INNOVATION COMPANY

Roberto Contiero [email protected] @contieroroberto

“Learn quickly and Think Well!”

Stefano Dindo [email protected] @stefanodindo

“We are an Innovation Company. We design and develop cutting edge software to drive our customers’ digital transformation, through Agile Methodologies and continuous delivery”

WE HELP OUR CUSTOMERS TO

DESIGN IDEAS: We get powerful ideas to market fast

CREATE PRODUCTS: We design and develop innovative and better software solutions

EXTRACT VALUE FROM DATA: We collect and analyze data to help your decisions

OUR APPROACH

[Diagram: Discover, Experiment, Delivery. Labels: Idea, Customer & zero12 collaboration meeting (CanvUX), Continuous Design & Integration, repeated MVP iterations, End Users, Customer feedback.]

Today we are dealing with

“Internet Of Things”

“Internet of Things is a neologism referring to the extension of the Internet to the world of objects and concrete places.”

2020 IoT Market Share

4 Billion: Connected People

$4 Trillion: Business Opportunity

25+ Billion: Integrated systems connected to the Web

50 Trillion: GBs of data

Source: IDC

IoT Architecture:

[Diagram. Components: MQTT Broker, Authentication API, Business Logic API, Predictive Engine API, Application Frontend, MongoDB. Actors and flows: Users, Things, Predictive Algorithm, Data Operation, User Interaction, Authentication.]

MongoDB for GIANT Ideas

What is MongoDB?

General Purpose, Document Database, Open Source

MongoDB

General Info

SQL-like      MongoDB World
Database      Database
Table         Collection
Row           Document
Column        Field

What’s a document?

{
  "name": "John",
  "surname": "Doe",
  "email": "[email protected]",
  "cell": 3281432896,
  "sport": ["swimming", "football"]
}

{
  _id: ObjectId("4b2b9…"),
  first_name: "Paul",
  surname: "Miller",
  city: "London",
  location: [45.123, 47.232],
  cars: [
    { model: "Bentley", year: 1973, value: 100000, … },
    { model: "Rolls Royce", year: 1965, value: 330000, … }
  ]
}

Comparison Relational vs Document

Document

{
  _id: ObjectId("4b2b9…"),
  first_name: "Paul",
  surname: "Miller",
  city: "London",
  location: [45.123, 47.232],
  cars: [
    { model: "Bentley", year: 1973, value: 100000, … },
    { model: "Rolls Royce", year: 1965, value: 330000, … }
  ]
}

Example Data Types

Null                  { x: null }
Boolean               { x: true }
Number                { x: 3.14 }  { x: 3 }
String                { x: "zero12" }
Date                  { x: new Date() }
Array                 { x: ["a", "b", "c"] }
Embedded documents    { x: { y: "a" } }

Aggregation

[Diagram: documents such as { "name": "John", "surname": "Doe", "email": "[email protected]" } flow through the pipeline stages op1, op2, ..., opn.]

Main Operators

• $project
• $match
• $limit
• $skip
• $sort
• $unwind
• $group
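A pipeline built from the operators above could look like the following minimal sketch. It assumes documents shaped like the "John Doe" example shown earlier (a `sport` array field); the collection name `people` is hypothetical and appears only in a comment.

```javascript
// Sketch: an aggregation pipeline is an ordered array of stage documents.
// Each stage receives the documents produced by the previous one.
const pipeline = [
  { $unwind: "$sport" },                            // one document per array element
  { $group: { _id: "$sport", total: { $sum: 1 } } }, // count people per sport
  { $sort: { total: -1 } }                          // most popular sport first
];

// In the mongo shell this would be run as (hypothetical collection name):
//   db.people.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

The stage order matters: `$unwind` has to come before `$group` here, because grouping on the whole array would count distinct arrays rather than individual sports.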

Join?

Yes, use the $lookup operator.
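As a rough illustration, a `$lookup` stage takes four fields: `from`, `localField`, `foreignField` and `as`. The sketch below assumes two hypothetical collections, `people` and `cars`, where each car document carries an `owner_id` that matches a person's `_id`; none of these names come from the slides.

```javascript
// Sketch of a $lookup stage (left outer join, available from MongoDB 3.2).
// Collection and field names are hypothetical.
const lookupStage = {
  $lookup: {
    from: "cars",             // collection to join with
    localField: "_id",        // field in the input (people) documents
    foreignField: "owner_id", // field in the "cars" documents
    as: "cars"                // output array field holding the matches
  }
};

// In the mongo shell: db.people.aggregate([lookupStage])
console.log(JSON.stringify(lookupStage, null, 2));
```

People without any matching car are still returned, with an empty `cars` array.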

MongoDB 3.2

• MongoDB Compass
• 3 Storage Engines: WiredTiger, MMAPv1, In-Memory (beta)
• Data Encryption
• Business Intelligence Connectors
• $lookup operator
• Document Validation

MongoDB Architectures

ReplicaSet Configuration

[Diagram: Node 3 (Primary) replicates to Node 1 (Secondary) and Node 2 (Secondary); the nodes exchange heartbeats.]

Sharding Configuration

How to Use MongoDB in IoT Area

Time Series

Definition

Set of values of a variable detected at different timestamps.

[Plot: values f(t0), f(t1), f(t2), f(t3) against Time at t0, t1, t2, t3.]

Time Series Data is Everywhere

1. Financial markets pricing

2. Sensors (temperature, pressure, proximity)

3. Industrial Fleets (Location, velocity, operational)

4. Social Networks (status update)

5. System (server logs, application logs)

6. Mobile devices (calls, texts)

Time Series Data at a Higher Level

1. Widely applicable data model

2. Various schema and modeling options

3. Application requirements drive schema design

Time Series - Schema Design


Designing for writing and reading

1. One document per event

2. One document per minute (average)

3. One document per minute (by second)

4. One document per hour (by second)

One document per event

{
  server: "server1",
  load: 92,
  ts: ISODate("2014-10-16T22:07:38.000-0500")
}

1. Relational-centric approach

2. Insert-driven workload

3. Aggregations computed at application-level
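The three points above can be sketched as follows: every reading becomes its own document, and the average is computed in application code. The helper names and the sample values other than the first are illustrative assumptions; the collection name `metrics` in the comment is hypothetical.

```javascript
// One document per event: each sensor reading is inserted as-is.
function makeEvent(server, load, ts) {
  return { server: server, load: load, ts: ts };
}

// Aggregation at application level: average the load of already-fetched events.
function averageLoad(events) {
  if (events.length === 0) return null;
  const sum = events.reduce((acc, e) => acc + e.load, 0);
  return sum / events.length;
}

// Illustrative readings (only the first one appears in the slides).
const events = [
  makeEvent("server1", 92, new Date("2014-10-16T22:07:38.000-05:00")),
  makeEvent("server1", 88, new Date("2014-10-16T22:07:39.000-05:00")),
  makeEvent("server1", 90, new Date("2014-10-16T22:07:40.000-05:00"))
];

// In the mongo shell each event would be written with: db.metrics.insert(event)
console.log(averageLoad(events)); // 90
```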

One document per minute (average)

{
  server: "server1",
  load_num: 92,
  load_sum: 4500,
  ts: ISODate("2014-10-16T22:07:00.000-0500")
}

1. Pre-aggregation to compute average per minute more easily

2. Update-driven workload

3. Minute-level resolution
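One possible way to drive this schema is an upsert that increments both counters, so the first reading of a minute creates the document and later readings update it. The sketch assumes `load_num` is the number of samples and `load_sum` their running total, which the slides do not state explicitly; the helper names and the `metrics` collection are hypothetical.

```javascript
// Truncate a timestamp to the start of its minute (the document's bucket key).
function minuteBucket(ts) {
  return new Date(Math.floor(ts.getTime() / 60000) * 60000);
}

// Build the filter/update/options triple for one incoming reading.
// Assumption: load_num counts samples, load_sum accumulates their values.
function averageUpdate(server, load, ts) {
  return {
    filter: { server: server, ts: minuteBucket(ts) },
    update: { $inc: { load_num: 1, load_sum: load } },
    options: { upsert: true } // create the minute document on first write
  };
}

const op = averageUpdate("server1", 92, new Date("2014-10-16T22:07:38.000-05:00"));
// In the mongo shell: db.metrics.updateOne(op.filter, op.update, op.options)
console.log(op.filter.ts.toISOString()); // 2014-10-17T03:07:00.000Z
```

Under that assumption, the per-minute average is simply `load_sum / load_num` when the document is read back.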

One document per minute (by second)

{
  server: "server1",
  load: { 0: 15, 1: 20, ..., 58: 45, 59: 40 },
  ts: ISODate("2014-10-16T22:07:00.000-0500")
}

1. Store per second data at minute level


3. Pre-allocate structure to avoid document moves
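A minimal sketch of this schema, assuming the pre-allocated placeholder value is `0` and that readings are keyed by their second within the minute. The helper names and the `metrics` collection are hypothetical.

```javascript
// Pre-allocate all 60 per-second slots so later updates overwrite values
// in place instead of growing the document.
function preallocateMinute(server, minuteTs) {
  const load = {};
  for (let s = 0; s < 60; s++) {
    load[s] = 0; // placeholder value (an assumption; null would also work)
  }
  return { server: server, load: load, ts: minuteTs };
}

// Build the update that stores one reading in its per-second slot.
function secondUpdate(server, load, ts) {
  const minuteTs = new Date(Math.floor(ts.getTime() / 60000) * 60000);
  const field = "load." + ts.getUTCSeconds(); // dot notation, e.g. "load.38"
  return {
    filter: { server: server, ts: minuteTs },
    update: { $set: { [field]: load } }
  };
}

const reading = new Date("2014-10-16T22:07:38.000-05:00");
const upd = secondUpdate("server1", 92, reading);
// In the mongo shell: db.metrics.updateOne(upd.filter, upd.update)
console.log(Object.keys(upd.update.$set)[0]); // load.38
```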

One document per hour (by second)

{
  server: "server1",
  load: { 0: 15, 1: 20, ..., 3598: 45, 3599: 40 },
  ts: ISODate("2014-10-16T22:00:00.000-0500")
}

1. Store per second data at hourly level

2. Update-driven workload


4. Updating the last second requires 3599 steps

One document per hour (by second, nested)

{
  server: "server1",
  load: {
    0: { 0: 15, ..., 59: 45 },
    ...,
    59: { 0: 25, ..., 59: 75 }
  },
  ts: ISODate("2014-10-16T22:00:00.000-0500")
}

1. Store per second data at hourly level with nesting



4. Updating the last second requires 59+59 steps
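The step counts quoted for the flat and nested hourly documents can be reproduced with a small sketch. It assumes, as the slides do, that reaching a field costs one step per field skipped at each nesting level; the function names are hypothetical.

```javascript
// Dot-notation path of a reading inside the nested hourly document,
// given its second within the hour (0..3599).
function nestedPath(secondOfHour) {
  const minute = Math.floor(secondOfHour / 60);
  const second = secondOfHour % 60;
  return "load." + minute + "." + second;
}

// Flat layout: fields 0..3599 sit at one level, so field n is n steps away.
function flatSteps(secondOfHour) {
  return secondOfHour;
}

// Nested layout: skip to the minute first, then to the second inside it.
function nestedSteps(secondOfHour) {
  return Math.floor(secondOfHour / 60) + (secondOfHour % 60);
}

console.log(nestedPath(3599));  // load.59.59
console.log(flatSteps(3599));   // 3599
console.log(nestedSteps(3599)); // 118, i.e. 59 + 59
```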

Writing operation analysis

1. Example: data generated every second

2. Capturing data per minute requires:

- One document per event: 60 writes

- One document per minute: 1 write, 59 updates

3. Transition from “insert-driven” to “update-driven”

- Individual writes are smaller

- Performance and concurrency benefits

Read operation analysis

1. Example: data generated every second

2. Reading data for a single hour requires:

- One document per event: 3600 reads

- One document per minute: 60 reads

3. Read performance is greatly improved:

- Fewer disk seeks

- Optimization with tuned block sizes and read ahead
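The write and read counts in this analysis follow from simple arithmetic, sketched below for a sensor that emits one reading per second. The schema labels `"per-event"` and `"per-minute"` are illustrative names, not MongoDB terms.

```javascript
const SECONDS_PER_MINUTE = 60;
const MINUTES_PER_HOUR = 60;

// Operations needed by each schema, assuming one reading per second.
function operationCounts(schema) {
  if (schema === "per-event") {
    return {
      insertsPerMinute: SECONDS_PER_MINUTE,               // one insert per reading
      updatesPerMinute: 0,
      readsPerHour: SECONDS_PER_MINUTE * MINUTES_PER_HOUR // one document per reading
    };
  }
  if (schema === "per-minute") {
    return {
      insertsPerMinute: 1,                                // first reading creates the document
      updatesPerMinute: SECONDS_PER_MINUTE - 1,           // remaining readings update it
      readsPerHour: MINUTES_PER_HOUR                      // one document per minute
    };
  }
  throw new Error("unknown schema: " + schema);
}

console.log(operationCounts("per-event"));
console.log(operationCounts("per-minute"));
```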

Live Demo

Simple exercises created for the Lab can be downloaded from:

http://www.zero12.it/lab-festivalict/
