Enhanced Site Search with Cognitive APIs Glynn Bird Developer Advocate @ IBM Cloud Data Services [email protected] @glynn_bird

Upload: data-driven-innovation

Post on 09-Apr-2017


TRANSCRIPT

Page 1: Enhanced site search with cognitive APIs - Glynn Bird

Enhanced Site Search with Cognitive APIs
Glynn Bird
Developer Advocate @ IBM Cloud Data Services
[email protected]
@glynn_bird

Page 2: Enhanced site search with cognitive APIs - Glynn Bird

Agenda

● What is search?
● Simple Search
● Adding some "cognitive"

Page 3: Enhanced site search with cognitive APIs - Glynn Bird

Primary search

Page 4: Enhanced site search with cognitive APIs - Glynn Bird

In-site search

Page 5: Enhanced site search with cognitive APIs - Glynn Bird

Elasticsearch

• Stores JSON Documents
• Search based on Apache Lucene
• Provides HTTP search API
• Pay per-GB on compose.com

Page 6: Enhanced site search with cognitive APIs - Glynn Bird

Cloudant

• Stores JSON Documents
• Based on Apache CouchDB
• Search based on Apache Lucene
• Provides HTTP search API
• PAYG/Dedicated-as-a-service or Local

Page 7: Enhanced site search with cognitive APIs - Glynn Bird

Get started - Simple Search Service

https://developer.ibm.com/clouddataservices/simple-search-service/

Page 8: Enhanced site search with cognitive APIs - Glynn Bird

Game of Thrones search demo

http://sss-got-theme.mybluemix.net/

Page 9: Enhanced site search with cognitive APIs - Glynn Bird

Structured vs Unstructured Data

Structured Data

● known schema
● predictable
● indexable

Unstructured Data

● unknown schema
● difficult to parse and index

Page 10: Enhanced site search with cognitive APIs - Glynn Bird

Example data

{
  "url": "http://www.bbc.co.uk/news/business-37742991",
  "title": "AT&T announces it will buy Time Warner",
  "description": "US telecoms giant AT&T announces it will buy entertainment group Time Warner",
  "date": "2016-10-22T23:44:03.000Z",
  "image_url": "http://c.files.bbci.co.uk/_91950162_breaking_image_large-3-1.png"
}

Page 11: Enhanced site search with cognitive APIs - Glynn Bird

Structured data

{
  "url": "http://www.bbc.co.uk/news/business-37742991",
  "title": "AT&T announces it will buy Time Warner",
  "description": "US telecoms giant AT&T announces it will buy entertainment group Time Warner",
  "date": "2016-10-22T23:44:03.000Z",
  "image_url": "http://c.files.bbci.co.uk/_91950162_breaking_image_large-3-1.png"
}

Page 12: Enhanced site search with cognitive APIs - Glynn Bird

Unstructured data

{
  "url": "http://www.bbc.co.uk/news/business-37742991",
  "title": "AT&T announces it will buy Time Warner",
  "description": "US telecoms giant AT&T announces it will buy entertainment group Time Warner",
  "date": "2016-10-22T23:44:03.000Z",
  "image_url": "http://c.files.bbci.co.uk/_91950162_breaking_image_large-3-1.png"
}

Page 13: Enhanced site search with cognitive APIs - Glynn Bird

Let's build a news website

● take RSS feeds
● put the data into a database
● index it
  ○ newest articles first
  ○ keyword search
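
The first two steps can be sketched as a small conversion from a parsed RSS item to the document shape shown in the example data. This is a minimal sketch, not the talk's actual flow: the input field names (link, pubDate, imageUrl) and the function name rssItemToDoc are assumptions, and real feed parsers may name things differently.

```javascript
// Convert a parsed RSS item into the JSON document stored in the database.
// The input shape (link, title, description, pubDate, imageUrl) is hypothetical.
function rssItemToDoc(item) {
  return {
    url: item.link,
    title: item.title.trim(),
    description: item.description.trim(),
    // ISO 8601 strings sort chronologically when compared as plain text
    date: new Date(item.pubDate).toISOString(),
    image_url: item.imageUrl || null
  };
}

const doc = rssItemToDoc({
  link: "http://www.bbc.co.uk/news/business-37742991",
  title: "AT&T announces it will buy Time Warner",
  description: "US telecoms giant AT&T announces it will buy entertainment group Time Warner",
  pubDate: "Sat, 22 Oct 2016 23:44:03 GMT",
  imageUrl: "http://c.files.bbci.co.uk/_91950162_breaking_image_large-3-1.png"
});
console.log(JSON.stringify(doc, null, 2));
```

Storing the date as an ISO 8601 string is what makes a "newest articles first" index straightforward, because text order and time order agree.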

Page 14: Enhanced site search with cognitive APIs - Glynn Bird

Node-RED

● visual programming tool
● https://nodered.org/

Page 15: Enhanced site search with cognitive APIs - Glynn Bird

Indexing data in Cloudant - MapReduce

function(doc) {
  emit(doc.date, doc.title);
}

● Build an index to sort articles by date
● Create custom 'map' function
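
To see what the map function produces, it can be run locally against a stand-in for the database. This is only a sketch: the emit helper and the rows array below imitate what the database does at index time, and two of the three sample documents are made-up placeholders.

```javascript
// The map function from the slide. In the database, `emit` is supplied for you.
function map(doc) {
  emit(doc.date, doc.title);
}

// Local stand-in for the view index: collect emitted rows, then sort by key.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

const docs = [
  { date: "2016-10-22T23:44:03.000Z", title: "AT&T announces it will buy Time Warner" },
  { date: "2016-10-21T09:00:00.000Z", title: "Older placeholder article" }, // made up
  { date: "2016-10-23T08:30:00.000Z", title: "Newer placeholder article" }  // made up
];
docs.forEach(map);

// A view keeps its rows ordered by key; here we sort them ourselves.
rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));

// Reading the rows in reverse key order gives newest articles first.
const newestFirst = rows.slice().reverse();
console.log(newestFirst.map(r => r.value));
```

When querying a real CouchDB-style view, the same reversal is requested with the descending=true query parameter rather than done in application code.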

Page 16: Enhanced site search with cognitive APIs - Glynn Bird

Indexing data in Cloudant - MapReduce

Page 17: Enhanced site search with cognitive APIs - Glynn Bird

Front end

Page 18: Enhanced site search with cognitive APIs - Glynn Bird

Indexing data in Cloudant - Search

function(doc) {
  index('default', doc.title);
  index('default', doc.description);
}

● Build full-text index
● Create custom 'map' function
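
For context, a search index function like this is stored as a string inside a design document and queried over HTTP. The sketch below assumes a database called "news", a design document called "search" and an index called "articles"; all three names, the account URL and the searchUrl helper are placeholders invented for illustration.

```javascript
// Design document holding the search index. The names "search" and "articles"
// are placeholders chosen for this sketch.
const designDoc = {
  _id: "_design/search",
  indexes: {
    articles: {
      analyzer: "standard",
      // Index functions are stored as strings inside the design document
      index: "function(doc) { index('default', doc.title); index('default', doc.description); }"
    }
  }
};

// Build the URL for a keyword query against that index.
function searchUrl(baseUrl, db, query, limit) {
  const params = new URLSearchParams({ q: query, limit: String(limit) });
  return `${baseUrl}/${db}/_design/search/_search/articles?${params}`;
}

console.log(JSON.stringify(designDoc, null, 2));
console.log(searchUrl("https://ACCOUNT.cloudant.com", "news", "time warner", 10));
```

Because both fields are indexed under 'default', a query without a field prefix matches text from either the title or the description.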

Page 19: Enhanced site search with cognitive APIs - Glynn Bird

Cloudant Search

● Punctuation removal
● Word splitting/stemming
● Stop-word removal
● Full-text indexing using Apache Lucene
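
A toy version of the first three steps gives a feel for what happens to text before it reaches the index. This is an illustration only and not Lucene's implementation: the stop-word list is a small hand-picked sample, and stemming is left out entirely.

```javascript
// A small, hand-picked stop-word list for illustration (not Lucene's list).
const STOP_WORDS = new Set(["a", "an", "and", "it", "the", "to", "will"]);

// Lower-case the text, strip punctuation, split into words, drop stop words.
function analyze(text) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ") // punctuation removal
    .split(/\s+/)                 // word splitting
    .filter(t => t.length > 0 && !STOP_WORDS.has(t)); // stop-word removal
}

console.log(analyze("AT&T announces it will buy Time Warner"));
```

Note how the naive punctuation step splits "AT&T" into two tokens, which is one reason keyword search alone struggles with company names.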

Page 20: Enhanced site search with cognitive APIs - Glynn Bird

Front end

Page 21: Enhanced site search with cognitive APIs - Glynn Bird

Front end

Page 22: Enhanced site search with cognitive APIs - Glynn Bird

Summary so far...

Page 23: Enhanced site search with cognitive APIs - Glynn Bird

But can we do better?

Page 24: Enhanced site search with cognitive APIs - Glynn Bird

Watson Alchemy Language API

● Feed it text or a URL
● Returns:
  ○ entities - people/places/companies
  ○ taxonomy

Page 25: Enhanced site search with cognitive APIs - Glynn Bird

Watson Alchemy Language API

Entities

Country: US
Company: AT&T
Company: Time Warner
JobTitle: Telecoms

Taxonomy

/art and entertainment
/technology and computing/internet technology/isps
/business and industrial/company/merger and acquisition
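
One way to carry these results forward is to attach them to the article document before it is saved. The sketch below assumes a simplified analysis object with entities and taxonomy arrays; that shape, and the enrich function itself, are hypothetical and do not reproduce the real API response format.

```javascript
// Merge analysis results into an article document.
// `analysis` is a hypothetical, simplified shape, not the real API response.
function enrich(doc, analysis) {
  // Flatten taxonomy paths such as "/a/b/c" into a unique list of levels
  const levels = new Set();
  analysis.taxonomy.forEach(label => {
    label.split("/").filter(Boolean).forEach(level => levels.add(level));
  });
  return Object.assign({}, doc, {
    entities: analysis.entities.map(e => ({ type: e.type, text: e.text })),
    taxonomy: Array.from(levels)
  });
}

const enriched = enrich(
  { title: "AT&T announces it will buy Time Warner" },
  {
    entities: [
      { type: "Country", text: "US" },
      { type: "Company", text: "AT&T" },
      { type: "Company", text: "Time Warner" },
      { type: "JobTitle", text: "Telecoms" }
    ],
    taxonomy: [
      "/art and entertainment",
      "/technology and computing/internet technology/isps",
      "/business and industrial/company/merger and acquisition"
    ]
  }
);
console.log(JSON.stringify(enriched, null, 2));
```

The original document is left untouched and a new object is returned, which keeps the step easy to slot into a pipeline of transformations.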

Page 26: Enhanced site search with cognitive APIs - Glynn Bird

How can we use Alchemy in our workflow?

Page 27: Enhanced site search with cognitive APIs - Glynn Bird

How can we use Alchemy in our workflow?

Page 28: Enhanced site search with cognitive APIs - Glynn Bird

More indexing

● Index the Alchemy entities
  ○ e.g. Country:US
● Index the Alchemy taxonomy
  ○ e.g. ["Finance","Investing"]
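
An index function along these lines could cover both points. It is a sketch under assumptions: the document fields entities (a list of type/text pairs) and taxonomy (a list of labels) are hypothetical names, and the index stub below only records calls so the function can be tried outside the database.

```javascript
// Local stub standing in for the database's `index` function: it records calls.
const indexed = [];
function index(field, value, options) {
  indexed.push({ field: field, value: value, options: options || {} });
}

// Index function of the kind that could live in a Cloudant design document.
// `doc.entities` and `doc.taxonomy` are hypothetical field names.
function searchIndex(doc) {
  index("default", doc.title);
  index("default", doc.description);
  (doc.entities || []).forEach(function (e) {
    // One field per entity type, e.g. Country:US
    index(e.type, e.text, { facet: true });
  });
  (doc.taxonomy || []).forEach(function (label) {
    index("taxonomy", label, { facet: true });
  });
}

searchIndex({
  title: "AT&T announces it will buy Time Warner",
  description: "US telecoms giant AT&T announces it will buy entertainment group Time Warner",
  entities: [{ type: "Country", text: "US" }],
  taxonomy: ["Finance", "Investing"]
});
console.log(indexed);
```

Marking the entity and taxonomy fields as facets is what would let a front end show counts per country, company or category alongside the keyword results.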

Page 29: Enhanced site search with cognitive APIs - Glynn Bird

Front end

Page 30: Enhanced site search with cognitive APIs - Glynn Bird

Page 31: Enhanced site search with cognitive APIs - Glynn Bird

Page 32: Enhanced site search with cognitive APIs - Glynn Bird

Page 33: Enhanced site search with cognitive APIs - Glynn Bird

Demo

https://glynnbird.github.io/alchemy-news/

Page 34: Enhanced site search with cognitive APIs - Glynn Bird

It's not just language...

Page 35: Enhanced site search with cognitive APIs - Glynn Bird

Watson saw...

Page 36: Enhanced site search with cognitive APIs - Glynn Bird

Just one more

Page 37: Enhanced site search with cognitive APIs - Glynn Bird

Watson saw...

Page 38: Enhanced site search with cognitive APIs - Glynn Bird

Page 39: Enhanced site search with cognitive APIs - Glynn Bird

Page 40: Enhanced site search with cognitive APIs - Glynn Bird

Page 41: Enhanced site search with cognitive APIs - Glynn Bird

Summary

● Node-RED
● Cloudant
● Alchemy Language API

Bluemix: https://www.ibm.com/cloud-computing/bluemix/

Simple Search Service: https://developer.ibm.com/clouddataservices/simple-search-service/

News Demo: https://glynnbird.github.io/alchemy-news/

Page 42: Enhanced site search with cognitive APIs - Glynn Bird

Thanks

Glynn Bird
Developer Advocate
[email protected]

Blog: www.glynnbird.com
Twitter: @glynn_bird