“Leveraging SharePoint 2010 Search Technologies” with Ivan Neganov



Page 1: “ Leveraging SharePoint 2010 Search Technologies ” With: Ivan Neganov

“Leveraging SharePoint 2010 Search Technologies”

With: Ivan Neganov

Page 2

Sponsors

Page 3

Agenda
• Open Discussion
• Topic of the day
• Q&A

Page 4

Leveraging SharePoint 2010 Search Technologies

Mississauga SharePoint User Group, October 19, 2010

Page 5

About the Speaker

Ivan Neganov
Founder of SoftForte, Inc.
11 years of experience in developing WCM solutions based on ASP.NET and SharePoint platforms.
Focusing on SharePoint since 2007.
Blog: neganov.blogspot.com

the Science of Quality

Page 6

Agenda
• Enterprise Search defined
• Common search concepts and terms
• Search architecture
• SharePoint search technologies

Page 7

What Is Enterprise Search?

• Why not use the Google Appliance, aka “Google Box”?

• Why not use an open source engine like Lucene?

• Why isn’t SharePoint search enough?

• Do I need taxonomy & faceted search?

• Can users just go ahead and tag everything?

Page 8

Enterprise is not just a large Intranet

• Large volumes of data
• Usually there exists a “right” or highly relevant document
• Security is critical
• Taxonomies and vocabularies are important
• Dates are important
• Corporate data does have structure

• Search is convenient for surfacing content
• Search is promising for future BI applications

Page 9

Search Scenarios
• Two types of scenarios in an enterprise:
  o Productivity search
    • Intranet/team collaboration search
    • People search/Social computing
    • Site search
  o Search applications
    • Parts search (fuzzy search requirement)
    • Intelligence & Investigation (heavy use of entity extraction)
    • IP protection
    • Compliance/Records management
    • E-commerce
    • Knowledge management & Support
    • BI applications

Page 10

Microsoft Search Technologies

• Desktop search, successor of Index Server
• SQL Server Search – Full Text Search (FTS)
• Exchange Search – uses same iFilters as SharePoint
• Bing (formerly Live Search)
  o Bing + Yahoo = 9.5%
• SharePoint & FAST Search

Page 11

SharePoint 2010 Search Technologies

• Microsoft SharePoint Foundation (Free)
  o Single site collection, 10 million items
  o No external search
  o Automatic configuration

• Microsoft Search Server 2010 Express (Free)
  o Enterprise-level search, 10 million items but single search server only
  o No people search

• Microsoft Search Server 2010
  o Enterprise-level, redundancy support, 100 million items
  o No people search

• Microsoft SharePoint Server 2010
  o 100 million items, added people search, tagging

• Microsoft FAST Search Server for SharePoint
  o Over 200 million items
  o Improved and flexible relevancy
  o Entity extraction

• Microsoft FAST ESP Server
  o Advanced entity extraction
  o Standalone product

Page 12

Relevancy

• Google: PageRank algorithm

• Same approach is used in FAST and SharePoint 2010

• FAST provides ability to dynamically boost rank
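The link-based ranking idea behind PageRank can be illustrated with a minimal power-iteration sketch. This is the textbook formulation only; the function name `pagerank`, the damping factor of 0.85, and the four-page link graph are assumptions made for illustration, and nothing here describes how FAST or SharePoint 2010 computes rank internally.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Every link target must also appear as a key in links.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution
    for _ in range(iterations):
        # Every page receives the same small "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, targets in links.items():
            if targets:
                # A page splits its damped rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical intranet link graph (illustrative data only).
graph = {
    "home": ["policies", "projects"],
    "policies": ["home"],
    "projects": ["home", "policies"],
    "orphan": ["home"],
}
ranks = pagerank(graph)
best_page = max(ranks, key=ranks.get)
```

In this toy graph the page that most other pages link to ends up with the highest score, while a page nobody links to keeps only the baseline share.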

Page 13

Index

Page 14

Linguistics

• Word stemming

• Word lemmatization

• Word morphology
  o Collapsing indices
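The difference between stemming and lemmatization can be shown with a toy sketch. The suffix list, the `LEMMAS` dictionary, and the function names `stem` and `lemmatize` are all hypothetical and deliberately tiny; real search engines use far richer language-specific rules and dictionaries.

```python
# Suffixes tried in order; a deliberately crude, English-only assumption.
SUFFIXES = ("ing", "ed", "es", "s")

# Tiny hand-made dictionary standing in for a real lemma lexicon.
LEMMAS = {
    "ran": "run",
    "running": "run",
    "mice": "mouse",
    "indices": "index",
}


def stem(word):
    """Strip a known suffix without consulting any dictionary."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def lemmatize(word):
    """Map a word to its dictionary form, or return it unchanged."""
    word = word.lower()
    return LEMMAS.get(word, word)


# Regular inflections: stemming alone brings them to one form.
regular = {stem(w) for w in ("searching", "searched", "searches")}

# Irregular forms: only the dictionary lookup finds the base word.
irregular = [(w, stem(w), lemmatize(w)) for w in ("mice", "ran", "indices")]
```

Applying the same normalization to both indexed text and query terms is what lets a query for one word form match documents containing another.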

Page 15

Other Common Search Concepts

• Crawling
• Querying
• Crawled & Managed Properties
• Best Bets
• Refiners aka Facets
• Linguistics: Stemming & Lemmatization
• Entity Extraction
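As a rough illustration of refiners (facets) from the list above, the sketch below counts how many results carry each value of a property and then narrows the result set to one value. The result records, the property names `FileType` and `Author`, and the helper names are hypothetical; this is not the SharePoint query object model.

```python
from collections import Counter


def refiner_counts(results, managed_property):
    """Count results per value of a property (what a refinement panel displays)."""
    return Counter(r[managed_property] for r in results if managed_property in r)


def refine(results, managed_property, value):
    """Keep only the results whose property equals the chosen refiner value."""
    return [r for r in results if r.get(managed_property) == value]


# Hypothetical search results with property values already populated.
results = [
    {"Title": "Q3 report", "FileType": "docx", "Author": "Ana"},
    {"Title": "Q3 budget", "FileType": "xlsx", "Author": "Ben"},
    {"Title": "Q3 summary", "FileType": "docx", "Author": "Ben"},
]

by_type = refiner_counts(results, "FileType")
only_docx = refine(results, "FileType", "docx")
```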

Page 16

High Level Search Architecture

Page 17

Demo: Search Experience

Page 18

FAST Search Server 2010 for SharePoint

• Advanced scalability & performance
• Advanced content processing
• Extensibility

FAST Content Processing Pipeline:

Page 19

FAST ESP

• Essentially re-packaged FAST ESP 5.3

• Planned two SKUs (according to SPC 2009)
  o FAST Search Server for Internet Sites
  o FAST Search Server for Internal Applications

• Updates?

Page 20

Planning Enterprise Search

• Search is redundant and scalable

Page 21

Planning FAST Search

Page 22

Which Search Technology Is Appropriate?

• FAST Search Server requires enterprise CALs

Page 23

Estimating Costs

SharePoint Enterprise Search       FAST Search Server for SharePoint
4 – 6 query and index servers      4 – 6 query and index servers, 0 – 2 content distributor & web analyzer servers
1 – 2 database clusters (share)    1 – 2 database clusters (share)

40 million documents, medium dedicated search farm

Page 24

Search UI
• Search Web Parts
• Search Center
• Thick clients

Page 25

Extending Search
• Federation - OpenSearch
• Query Object Model
• BCS Connectors
• RANK & XRANK
• Tapping into the Document Processing Pipeline

Page 26

Federation

[Diagram: federated locations in Madrid, Los Angeles, Hong Kong, and South Africa]
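Federation in the OpenSearch style relies on each location publishing a URL template in which `{searchTerms}` is replaced by the user's query. The sketch below only builds the per-location request URLs and makes no network calls; the host names, the `q` and `format` query parameters, and the helper names are hypothetical, while `{searchTerms}` is the placeholder defined by OpenSearch.

```python
from urllib.parse import quote


def build_query_url(template, search_terms):
    """Substitute the URL-encoded query for the OpenSearch {searchTerms} token."""
    return template.replace("{searchTerms}", quote(search_terms))


def federate(locations, search_terms):
    """Return the request URL to send to each federated location."""
    return {
        name: build_query_url(template, search_terms)
        for name, template in locations.items()
    }


# Hypothetical federated locations, one per regional search service.
LOCATIONS = {
    "Madrid": "https://madrid.example.com/search?q={searchTerms}&format=rss",
    "Los Angeles": "https://la.example.com/search?q={searchTerms}&format=rss",
    "Hong Kong": "https://hk.example.com/search?q={searchTerms}&format=rss",
}

urls = federate(LOCATIONS, "annual report")
```

Each location runs the query against its own index and returns its own result list, so no content has to be crawled centrally.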

Page 27

Demo: Search Federation

Page 28

Connector Framework
• Leverage tooling (SPD, VS2010)

Page 29

Entity Extraction in FAST

• Automatically create crawled properties for a given vocabulary

• Useful for advanced scenarios, for example:
  1. Extract property at crawl time
  2. Enrich a property
  3. Index enriched property
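The extract, enrich, and index steps above can be sketched as a small dictionary-based pipeline. The vocabulary, the city-to-country enrichment table, and the in-memory inverted index are all assumptions made for illustration; this is not the FAST content processing pipeline API.

```python
import re

# Hypothetical vocabulary: lowercase surface form -> canonical entity name.
VOCABULARY = {"toronto": "Toronto", "mississauga": "Mississauga", "madrid": "Madrid"}

# Hypothetical enrichment table: city -> country.
COUNTRY_OF = {"Toronto": "Canada", "Mississauga": "Canada", "Madrid": "Spain"}


def extract_entities(text):
    """Step 1: find vocabulary terms in the document text at crawl time."""
    words = re.findall(r"[a-z]+", text.lower())
    return sorted({VOCABULARY[w] for w in words if w in VOCABULARY})


def enrich(cities):
    """Step 2: derive an additional property from the extracted one."""
    return sorted({COUNTRY_OF[c] for c in cities})


def index_document(index, doc_id, text):
    """Step 3: store both the extracted and the enriched property values."""
    cities = extract_entities(text)
    countries = enrich(cities)
    for prop, values in (("city", cities), ("country", countries)):
        for value in values:
            index.setdefault((prop, value), set()).add(doc_id)


index = {}
index_document(index, "doc1", "Kick-off meeting in Mississauga, follow-up in Madrid.")
index_document(index, "doc2", "Toronto office relocation plan")
canadian_docs = index[("country", "Canada")]
```

Because the enriched `country` values are indexed alongside `city`, a query or refiner on country can find documents that never mention the country by name.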

Page 30

Search in the Enterprise: Future

• Amount of content will continue to grow
• Search will integrate with Business Intelligence applications
• Entity, Sentiment and Fact extraction
• Search as navigation
• Search visualization
• Search as a service
• Many more custom applications leveraging search

Page 31

Resources
• Microsoft TechNet, MSDN
• Professional Microsoft Search 2010

Page 32

Questions