back to basics webinar 4: advanced indexing, text and geospatial indexes

Upload: mongodb

Post on 10-Jan-2017

Page 1: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

MongoDB Europe 2016
Old Billingsgate, London

15th November

Use my code JD20 for 20% off tickets
mongodb.com/europe

Page 2: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Back to Basics 2016 : Webinar 4

Advanced Indexing – Text and Geospatial Indexes

Joe Drumgoole
Director of Developer Advocacy, EMEA

@jdrumgoole

V1.1

Page 3: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Recap

• Webinar 1 – Introduction to NoSQL
  – The different types of NoSQL databases
  – What kind of database is MongoDB? A document database.
• Webinar 2 – My First Application
  – Creating databases and collections
  – CRUD operations
  – Indexes and Explain
• Webinar 3 – Schema Design
  – Dynamic schema
  – Embedding approaches
  – Examples

Page 4: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Indexing

• An efficient way to look up data by its value
• Avoids table scans

Page 5: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Traditional Databases Use Btrees

• … and so does MongoDB

Page 6: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Queries, Inserts, Deletes: O(log(n)) Time
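To make the O(log(n)) claim concrete, here is a minimal sketch in plain JavaScript (not MongoDB internals) that counts how many levels a balanced tree needs to hold n keys. The helper name `levelsNeeded` and the fan-out of 100 children per node are assumptions chosen purely for illustration.

```javascript
// Hypothetical helper: number of levels a balanced tree needs to hold `n` keys
// when each node can point at `fanout` children. Illustrative only; the
// fan-out is an assumed value, not a MongoDB setting.
function levelsNeeded(n, fanout) {
  let levels = 1;
  let capacity = fanout; // keys reachable with a single level
  while (capacity < n) {
    capacity *= fanout;  // each extra level multiplies the capacity
    levels += 1;
  }
  return levels;
}

for (const n of [1e3, 1e6, 1e9]) {
  console.log(n + " keys -> " + levelsNeeded(n, 100) + " levels");
}
```

Growing the data set a million-fold (from 1e3 to 1e9 keys) adds only a few levels in this sketch, which is why lookups stay fast as collections grow.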

Page 7: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Creating a Simple Index

db.coll.createIndex( { fieldName : <Direction> } )

• db : Database Name
• coll : Collection Name
• createIndex : Command
• fieldName : Field Name to be indexed
• <Direction> : Ascending : 1, Descending : -1

Page 8: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Two Other Kinds of Indexes

• Full Text Index
  – Allows searching inside the text of a field (as in Lucene, Solr and Elasticsearch)
• Geospatial Index
  – Allows searching by location (e.g. people near me)
• These indexes do not use Btrees

Page 9: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Full Text Indexes

• An "inverted index" on all the words inside a single field (only one text index per collection)

{ "comment" : "I think your blog post is very interesting and informative. I hope you will post more info like this in the future" }

>> db.posts.createIndex( { "comment" : "text" } )

MongoDB Enterprise > db.posts.find( { $text: { $search : "info" }} )
{ "_id" : ObjectId("…"), "comment" : "I think your blog post is very interesting and informative. I hope you will post more info like this in the future" }
MongoDB Enterprise >
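The idea of an inverted index can be sketched in a few lines of plain JavaScript. This is only an illustration of the data structure, not how MongoDB implements it: the sketch does no stemming and keeps stop words, the helper names `buildInvertedIndex` and `search` are hypothetical, and the second sample document is invented for contrast.

```javascript
// Minimal inverted index: maps each lower-cased word to the positions of the
// documents that contain it. No stemming or stop-word removal is attempted.
function buildInvertedIndex(docs, field) {
  const index = new Map();
  docs.forEach((doc, position) => {
    const words = String(doc[field] || "").toLowerCase().match(/[a-z0-9]+/g) || [];
    for (const word of words) {
      if (!index.has(word)) index.set(word, new Set());
      index.get(word).add(position);
    }
  });
  return index;
}

// Look up a single term (case-insensitively) and return the matching documents.
function search(index, docs, term) {
  const hits = index.get(term.toLowerCase()) || new Set();
  return [...hits].map((position) => docs[position]);
}

const posts = [
  { comment: "I think your blog post is very interesting and informative. I hope you will post more info like this in the future" },
  { comment: "Nothing relevant here" }, // invented document, for contrast only
];
const index = buildInvertedIndex(posts, "comment");

console.log(search(index, posts, "info").length);    // 1
console.log(search(index, posts, "Post").length);    // 1 (word appears twice, document counted once)
console.log(search(index, posts, "mongodb").length); // 0
```

A lookup touches only the entry for the search term, so the cost does not depend on how long each comment is.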

Page 10: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Results

MongoDB Enterprise > db.posts.getIndexes()
...
{
    "v" : 1,
    "key" : {
        "_fts" : "text",
        "_ftsx" : 1
    },
    "name" : "comment_text",
    "ns" : "test.posts",
    "weights" : {
        "comment" : 1
    },
    "default_language" : "english",
    "language_override" : "language",
    "textIndexVersion" : 3
}

Page 11: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Dropping Text Indexes

• We drop text indexes by name rather than shape

db.posts.getIndexes()
{
    "v" : 1,
    "key" : {
        "_fts" : "text",
        "_ftsx" : 1
    },
    "name" : "comment_text_tags_text",
    "ns" : "test.posts",
    "weights" : {
        "comment" : 5,
        "tags" : 10
    },
    "default_language" : "english",
    "language_override" : "language",
    "textIndexVersion" : 3
}

Page 12: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Hence

MongoDB Enterprise > db.posts.dropIndex( "comment_text_tags_text" )
{ "nIndexesWas" : 2, "ok" : 1 }
MongoDB Enterprise >

• You can give an index an explicit name to make this easier

MongoDB Enterprise > db.posts.createIndex( { "comment" : "text", "tags" : "text" }, { "name" : "text_index" } )
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}
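When no explicit name is given, the default name is built from the indexed fields and their direction or type, joined with underscores. A small sketch of that convention follows; `defaultIndexName` is a hypothetical helper written for this illustration, not a shell function.

```javascript
// Hypothetical helper mirroring the default naming convention: each
// "<field>_<direction or type>" pair, joined with underscores.
function defaultIndexName(keySpec) {
  return Object.entries(keySpec)
    .map(([field, kind]) => field + "_" + kind)
    .join("_");
}

console.log(defaultIndexName({ comment: "text" }));               // comment_text
console.log(defaultIndexName({ comment: "text", tags: "text" })); // comment_text_tags_text
console.log(defaultIndexName({ loc: "2dsphere" }));               // loc_2dsphere
```

Names grow with every field added, which is one reason to pass an explicit "name" option for multi-field text indexes.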

Page 13: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

On The Server

I INDEX [conn275] build index on: test.posts properties: { v: 1, key: { _fts: "text", _ftsx: 1 }, name: "comment_text", ns: "test.posts", weights: { comment: 1 }, default_language: "english", language_override: "language", textIndexVersion: 3 }
I INDEX [conn275] building index using bulk method
I INDEX [conn275] build index done. scanned 3 total records. 0 secs

Page 14: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

More Detailed Example

>> db.posts.insert( { "comment" : "Red yellow orange green" } )
>> db.posts.insert( { "comment" : "Pink purple blue" } )
>> db.posts.insert( { "comment" : "Red Pink" } )

>> db.posts.find( { "$text" : { "$search" : "Red" }} )
{ "_id" : ObjectId("…"), "comment" : "Red yellow orange green" }
{ "_id" : ObjectId("…"), "comment" : "Red Pink" }
>> db.posts.find( { "$text" : { "$search" : "Red Green" }} )
{ "_id" : ObjectId("…"), "comment" : "Red Pink" }
{ "_id" : ObjectId("…"), "comment" : "Red yellow orange green" }
>> db.posts.find( { "$text" : { "$search" : "red" }} )   // <- Case insensitive
{ "_id" : ObjectId("…"), "comment" : "Red yellow orange green" }
{ "_id" : ObjectId("…"), "comment" : "Red Pink" }
>>

Page 15: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Using Weights

• We can assign different weights to different fields in the text index
• E.g. I want to favour tags over comments in searching
• So I increase the weight for the tags field

>> db.blog.createIndex( { comment: "text", tags : "text" }, { weights: { comment: 5, tags : 10 }} )

• Now searches will favour tags

Page 16: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

textScore

• Weights impact the textScore:

>> db.posts.find( { "$text" : { "$search" : "Red" }}, { score: { $meta: "textScore" }} ).sort( { score: { $meta: "textScore" } } )
{ "_id" : …, "comment" : "hello", "tags" : "Red green orange", "score" : 6.666666666666666 }
{ "_id" : …, "comment" : "Red Pink", "score" : 3.75 }
{ "_id" : …, "comment" : "Red yellow orange green", "score" : 3.125 }
>>

Page 17: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Other Parameters

• Language : Pick the language you want to search in, e.g.
  – $language : "spanish"
• Support case sensitive searching
  – $caseSensitive : true (default false)
• Support accented characters (diacritic sensitive search, e.g. café is distinguished from cafe)
  – $diacriticSensitive : true (default false)
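These parameters sit alongside $search inside the $text document. The sketch below assembles such a filter in plain JavaScript; `textFilter` is a hypothetical helper, and the search term and language are example values only.

```javascript
// Hypothetical helper that assembles a $text filter document from a search
// string and the optional parameters listed above.
function textFilter(search, options = {}) {
  const text = { $search: search };
  if (options.language) text.$language = options.language;
  if (options.caseSensitive !== undefined) text.$caseSensitive = options.caseSensitive;
  if (options.diacriticSensitive !== undefined) text.$diacriticSensitive = options.diacriticSensitive;
  return { $text: text };
}

const filter = textFilter("café", { language: "spanish", diacriticSensitive: true });
console.log(JSON.stringify(filter));
// {"$text":{"$search":"café","$language":"spanish","$diacriticSensitive":true}}
```

In the shell, the resulting document would be passed straight to find(), for example db.posts.find(filter).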

Page 18: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Geospatial Indexes

Page 19: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Geospatial Indexes

• MongoDB supports 2dsphere indexes
• Allows a user to represent a location on the earth (which is a sphere)
• Coordinates are stored in GeoJSON format
• The geospatial index supports a subset of the GeoJSON operations
• The index is based on a QuadTree representation
• The index is based on the WGS 84 standard

Page 20: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Coordinates

• Coordinates are represented as longitude, latitude
• Longitude
  – Measured from the Greenwich meridian in London (0 degrees); locations east are positive (up to 180 degrees)
  – Locations west are specified as negative values
• Latitude
  – Measured from the equator north and south (0 to 90 north, 0 to -90 south)
• Coordinates in MongoDB are stored in Longitude/Latitude order
• Coordinates in Google are stored in Latitude/Longitude order
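Because the two orderings are easy to mix up, a small conversion helper can make the swap explicit. The sketch below is an assumption-laden illustration: `toGeoJsonPoint` is a hypothetical name, and the sample coordinates are simply a point in New York.

```javascript
// Hypothetical helper: take a "latitude, longitude" pair (the order used by
// Google Maps) and return a GeoJSON Point, which stores longitude first.
function toGeoJsonPoint(latitude, longitude) {
  if (latitude < -90 || latitude > 90) {
    throw new RangeError("latitude must be between -90 and 90");
  }
  if (longitude < -180 || longitude > 180) {
    throw new RangeError("longitude must be between -180 and 180");
  }
  return { type: "Point", coordinates: [longitude, latitude] };
}

const point = toGeoJsonPoint(40.82302903, -73.93414657);
console.log(point.coordinates); // [ -73.93414657, 40.82302903 ]
```

The range checks catch the most common mistake: a swapped pair with a longitude beyond ±90 fails the latitude test straight away.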

Page 21: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

2dSphere Versions

• Three versions of 2dSphere index in MongoDB
• Version 1 : Up to MongoDB 2.4
• Version 2 : From MongoDB 2.6 onwards
• Version 3 : From MongoDB 3.2 onwards
• We will only be talking about Version 3 in this webinar

Page 22: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Creating a 2dSphere Index

db.collection.createIndex( { <location field> : "2dsphere" } )

• Location field must contain coordinate pairs or GeoJSON data

Page 23: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Example

>> db.test.createIndex( { loc : "2dsphere" } )
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}

Page 24: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Output

>> db.test.getIndexes()
[
    {
        "v" : 1,
        "key" : {
            "loc" : "2dsphere"
        },
        "name" : "loc_2dsphere",
        "ns" : "geo.test",
        "2dsphereIndexVersion" : 3
    }
]
>>

Page 25: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Use a Simple Dataset to investigate Geo Queries

• Let's search for restaurants in Manhattan
• Using two candidate collections
  – https://raw.githubusercontent.com/mongodb/docs-assets/geospatial/neighborhoods.json
  – https://raw.githubusercontent.com/mongodb/docs-assets/geospatial/restaurants.json
• Import them into MongoDB
  – mongoimport -c neighborhoods -d geo neighborhoods.json
  – mongoimport -c restaurants -d geo restaurants.json

Page 26: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Neighborhood Document

MongoDB Enterprise > db.neighborhoods.findOne()
{
    "_id" : ObjectId("55cb9c666c522cafdb053a1a"),
    "geometry" : {
        "coordinates" : [
            [
                [ -73.94193078816193, 40.70072523469547 ],
                ...
                [ -73.94409591260093, 40.69897295461309 ]
            ]
        ],
        "type" : "Polygon"
    },
    "name" : "Bedford"
}

Page 27: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Restaurant Document

MongoDB Enterprise > db.restaurants.findOne()
{
    "_id" : ObjectId("55cba2476c522cafdb053adf"),
    "location" : {
        "coordinates" : [
            -73.98241999999999,
            40.579505
        ],
        "type" : "Point"
    },
    "name" : "Riviera Caterer"
}
MongoDB Enterprise >

• You can type these coordinates into Google Maps, but remember to reverse the coordinate order

Page 28: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Add Indexes

MongoDB Enterprise > db.restaurants.createIndex({ location: "2dsphere" })
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}
MongoDB Enterprise > db.neighborhoods.createIndex({ geometry: "2dsphere" })
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}
MongoDB Enterprise >

Page 29: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Use $geoIntersects to find our Neighborhood

• Assume we are at -73.93414657, 40.82302903
• What neighborhood are we in? Use $geoIntersects

db.neighborhoods.findOne({ geometry: { $geoIntersects: { $geometry: { type: "Point", coordinates: [ -73.93414657, 40.82302903 ]}}}})
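Conceptually, this query asks which polygon contains the point. The sketch below shows a flat-plane ray-casting test in plain JavaScript as a rough mental model only: MongoDB works with spherical geometry and a 2dsphere index, the helper `pointInRing` is hypothetical, polygon holes are ignored, and the two box-shaped "neighborhoods" are invented data.

```javascript
// Ray-casting test on a flat plane: count how many polygon edges a horizontal
// ray starting at the point crosses; an odd count means the point is inside.
function pointInRing(point, ring) {
  const [x, y] = point;
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    const crosses =
      (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Invented sample data: two rectangular areas in [longitude, latitude] order.
const areas = [
  { name: "Box A", geometry: { type: "Polygon", coordinates: [[
    [-73.95, 40.81], [-73.92, 40.81], [-73.92, 40.83], [-73.95, 40.83], [-73.95, 40.81],
  ]] } },
  { name: "Box B", geometry: { type: "Polygon", coordinates: [[
    [-73.99, 40.70], [-73.96, 40.70], [-73.96, 40.72], [-73.99, 40.72], [-73.99, 40.70],
  ]] } },
];

const here = [-73.93414657, 40.82302903];
const match = areas.find((area) => pointInRing(here, area.geometry.coordinates[0]));
console.log(match ? match.name : "no match"); // Box A
```

Without an index, every polygon would have to be tested like this; the geospatial index narrows the candidates first.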

Page 30: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Results

{
    "geometry" : {
        "coordinates" : [
            [
                -73.9338307684026,
                40.81959665747723
            ],
            ...
            [
                -73.93383000695911,
                40.81949109558767
            ]
        ],
        "type" : "Polygon"
    },
    "name" : "Central Harlem North-Polo Grounds"
}

Page 31: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Find All Restaurants within 0.35 km

db.restaurants.find({ location: { $geoWithin: { $centerSphere: [ [ -73.93414657, 40.82302903 ], 0.35 / 6378.1 ] } } })

• 0.35 : Distance in km
• 6378.1 : Divide by the radius of the earth (in km) to convert to radians
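The radians conversion and the distance check can be reproduced outside the database. The sketch below uses the haversine formula in plain JavaScript as an approximation of what the query computes; the helper names are hypothetical, and the second point is the Domino's Pizza location that appears in this deck's query results.

```javascript
const EARTH_RADIUS_KM = 6378.1; // equatorial radius, the value used in the query

// Convert a distance on the surface into the radians $centerSphere expects.
function kmToRadians(km) {
  return km / EARTH_RADIUS_KM;
}

// Great-circle distance in km between two [longitude, latitude] pairs
// (haversine formula on a perfect sphere).
function haversineKm([lon1, lat1], [lon2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

const centre = [-73.93414657, 40.82302903];
const restaurant = [-73.93795159999999, 40.823376];

const distanceKm = haversineKm(centre, restaurant);
console.log("radius in radians:", kmToRadians(0.35));
console.log("distance in km:", distanceKm.toFixed(2));
console.log("within 0.35 km:", distanceKm <= 0.35);
```

Forgetting the division is a common slip: passing a raw kilometre value as the radius would be read as radians and match far more of the globe than intended.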

Page 32: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Results – (Projected)

{ "name" : "Gotham Stadium Tennis Center Cafe" }
{ "name" : "Chuck E. Cheese'S" }
{ "name" : "Red Star Chinese Restaurant" }
{ "name" : "Tia Melli'S Latin Kitchen" }
{ "name" : "Domino'S Pizza" }

• Without projection

{ "_id" : ObjectId("55cba2476c522cafdb0550aa"), "location" : { "coordinates" : [ -73.93795159999999, 40.823376 ], "type" : "Point" }, "name" : "Domino'S Pizza" }

Page 33: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Summary of Operators

• $geoIntersects: Find areas or points that overlap or are adjacent
• $geoWithin: Find areas or points that lie within a specific area
• $geoNear: Returns locations in order from nearest to furthest away

Page 34: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Summary

• Text Indexes : Full text searching of all the text items in a collection

• Geospatial Indexes : Search by location, by intersection or by distance from a point

Page 35: Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes

Q & A
