cloud hybrid search with sharepoint
Understanding and Applying Cloud Hybrid Search
@jefffried
Jeff Fried CTO, BA Insight
we love hybrid search - it's amazing how fast usage is growing
Jeff Teper @jeffteper
Today’s Session
Focused on Search and SharePoint since 2004
Longtime Search Nerd
• CTO, BA Insight
• Senior PM, Microsoft
• VP, FAST
• SVP, LingoMotors
About Jeff Fried
Passionate About
• Search
• SharePoint
• Search-driven applications
• Information Strategy
Blog:
BAinsight.com/blog
TechNet Column: “A View from the Crawlspace”
About BA Insight
– Connectivity
– Applications
– Classification
– Analytics
Demo
The Evolution of SharePoint:
[Diagram: Experiences, Management, Extensibility; Server; HYBRID]
Team Sites
Portals
Enterprise Content Mngt
BI
Why Hybrid SharePoint?
The Future of SharePoint Search, with Expert Jeff Fried
by Christian Buckley. March 23, 2015
“Classic” Hybrid Search is Federated
not a single result set OOB
Cloud Hybrid Search
Access anywhere
Consistent user experience
Unified search results
No upgrades
No infrastructure mgt
Index storage scalable
Benefits of Cloud Hybrid Search
Reduce Your Footprint
Servers needed, by volume of content (indexable items):
Volume of content     Pattern               On-prem Search Farm   Cloud Hybrid Search
0-10 million items    Small                 4 App + 2 DB          1 or 2
10-40 million items   Medium                12 App + 2 DB         2
40-100 million items  Large                 28 App + 4 DB         2
400 million items     XL example (SP2016)   86 App + 4 DB         2 or 3
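The sizing table above can be read as a simple lookup. The following is a minimal sketch of it, not a sizing tool: the names `Footprint` and `footprint_for` are hypothetical, the upper bound of each band is treated as inclusive (the slide does not say which band a boundary value belongs to), and the single 400-million XL data point is left out because the slide gives no range for it.

```python
from typing import NamedTuple, Optional


class Footprint(NamedTuple):
    pattern: str
    on_prem_servers: str       # servers for an on-prem search farm
    cloud_hybrid_servers: str  # servers for cloud hybrid search


# (upper bound in millions of indexable items, footprint), copied from the slide.
# Assumption: upper bounds are inclusive.
_TIERS = [
    (10, Footprint("Small", "4 App + 2 DB", "1 or 2")),
    (40, Footprint("Medium", "12 App + 2 DB", "2")),
    (100, Footprint("Large", "28 App + 4 DB", "2")),
]


def footprint_for(items_millions: float) -> Optional[Footprint]:
    """Return the slide's footprint band, or None if the slide has no band."""
    if items_millions < 0:
        raise ValueError("item count cannot be negative")
    for upper_bound, footprint in _TIERS:
        if items_millions <= upper_bound:
            return footprint
    return None  # above 100 million the slide only gives one XL example


if __name__ == "__main__":
    print(footprint_for(25))
```

Returning `None` above 100 million items is a deliberate choice: it avoids extrapolating from the one XL example.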
SharePoint Server
(On-premises or Hosted)
Office 365
SharePoint Online Content
OneDrive for Business Content
SharePoint Content
Cloud Hybrid Search
SharePoint 2013/2016 Search Architecture
Web Service (CEWS)
Walk-through: indexing & queries
SharePoint Server
(On-premises or Hosted)
Office 365
Case Study: Large University
Setting up Cloud Hybrid Search
• Create a Cloud Search Service Application in SharePoint Server 2016
• Set up search architecture in SharePoint Server 2016 for cloud hybrid search
• Connect your Cloud Search Service Application to your Office 365 tenant
• Create a content source to crawl for cloud hybrid search
• Set up Search Center to validate hybrid search results in O365
• Start full crawl of on-premises content for cloud hybrid search
• Verify that cloud hybrid search works
• Tune cloud hybrid search experiences
Support
Sales & Marketing
Knowledge Articles
Fileshares
OneDrive
Support forum
SPO
Search Farm
SP 2013 content SP 2010 content
On-premises
Office 365
SPO content
SP 2013/2016
Cloud SSA
Example: Support Content
Setup for Support Search
The Support Search vertical only searches sites that are relevant to the Support team.
It uses Local SharePoint results plus a filter on which sites to include in the search results
Result source query:
{searchTerms} (
Path:"http://sp2010" OR
Path:"file://fileshare" OR
Path:"http://demohybrid.../../supportforum")
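The result source query above follows a regular pattern: the user's search terms, followed by an OR-group of Path restrictions. As a hedged illustration, the sketch below builds such a query string from a list of paths. The function name `build_result_source_query` is hypothetical, and the sketch assumes the paths contain no double quotes; it only assembles text and does not call any SharePoint API.

```python
from typing import Iterable


def build_result_source_query(paths: Iterable[str]) -> str:
    """Build a query template that limits results to the given paths.

    Each path becomes a Path:"..." restriction; the restrictions are
    OR-ed together and appended to the {searchTerms} placeholder.
    """
    path_list = list(paths)
    if not path_list:
        raise ValueError("at least one path is required")
    for path in path_list:
        if '"' in path:
            # Assumption of this sketch: quoting is not escaped, so reject it.
            raise ValueError(f"path must not contain double quotes: {path!r}")
    restrictions = " OR ".join(f'Path:"{path}"' for path in path_list)
    return "{searchTerms} (" + restrictions + ")"


if __name__ == "__main__":
    # Two of the paths shown on the slide.
    print(build_result_source_query(["http://sp2010", "file://fileshare"]))
```

Generating the template from a list keeps the set of Support-relevant sites in one place, so a site can be added without hand-editing the query.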
SharePoint Online Support Search
Demo
Search
Unified search across SharePoint on-premises and Office 365 content and people
SharePoint 2013/2016
Deliver unified search results
from Office 365 and on-
premises in a single search
Search & discovery architecture wireframe --Online, on-premises, and hybrid
External Content
(on-premises and/or
in the cloud)
SharePoint Server
(On-premises or Hosted)
Office 365
SharePoint Online Content
OneDrive for Business Content
Connectors
SharePoint Content
Adding External Content
Cloud Hybrid Search
Also drives:
• Office Graph (Delve, ...)
• Compliance (DLP, …)
Connectors to Many Enterprise Systems
• Aderant
• Amazon S3
• Alfresco
• Box
• Confluence
• CuadraSTAR
• Elite / 3E
• EMC Documentum
• EMC eRoom
• Google Drive
• HP Consolidated Archive (EAS, aka Zantaz)
• HPE Records Manager/HP TRIM
• IBM Connections
• IBM Content Manager
• IBM DB2
• IBM FileNet P8
• IBM Lotus Notes
• IBM WebSphere
• iManage Work
• Jive
• LegalKEY
• LexisNexis Interaction
• Lotus Notes Databases
• Microsoft Dynamics CRM
• Microsoft Exchange
• Microsoft Exchange Public Folders
• Microsoft SQL Server
• MySQL
• NetDocuments
• Neudesic The Firm Directory
• Objective
• OpenText LiveLink/RM
• OpenText eDOCS DM
• Oracle Database
• Oracle WebCenter
• Oracle WebCenter Content (UCM/Stellent)
• PLC/Practical Law
• ProLaw
• Salesforce.com
• SAP ERP
• ServiceNow
• SharePoint Online
• SharePoint 2016
• SharePoint 2013
• SharePoint 2010
• SharePoint 2007
• Sitecore
• Any SQL-based CRM system
• Veeva Vault
• Veritas Enterprise Vault (Symantec eVault)
• West km
• Xerox DocuShare
• Yammer
Plus a proven architecture and process for creating new connectors to complex systems
External Content in O365 UX
Unified view across all content
- on-premises and on-line
- inside and outside SharePoint
Current Caveats:
1) No thumbnails are shown, just file icons
2) External content only shows up when you query for it
[Screenshot: unified results labeled by source: External blog, SP OnPrem, Yammer, OneDrive, SP Online]
Case Study: Cloud SSA, external content
Large global company
in materials science
Issues with Cloud Hybrid Search (1)
Cloud Hybrid Search "annoyances"
Performance Characteristics
• slower query latency for on-prem queries against Cloud SSA
SharePoint Online Limitations
• no synonyms
• no site-level schema
• no full trust code access
Hybrid Administration Weaknesses
• clunky metadata mapping
• can't remove on-premises search results from Cloud SSA
• trickier to test & debug crawls
• can't reset index from Cloud SSA
Be aware of these & compensate for them
(Fixed in August PU)
(Semi-addressed in June PU)
And it’s getting better:
2017
Performance
https://<<tenant_name>>-admin.sharepoint.com/_layouts/15/searchadmin/TA_SearchAdministration.aspx
Item Limits and Pricing
Licensing: 1M items of external content in index for every 1TB storage in O365
1TB included by default
+ 0.5 GB per licensed O365 user
No limit on number of items from O365 in the index
Default throttling at 20M external items; current threshold at 25M
2,000 users x 0.5 GB = 1 TB
+ 1 TB default = 2 TB total
-> 2M external items indexed
+ Can also buy the “Office 365 Extra File Storage” Add-on
$0.20/GB/Month = $200/TB/Month = $200/M items/Month
50,000 users x 0.5 GB = 25 TB
+ 1 TB default = 26 TB total
-> 26M external items indexed
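The quota arithmetic above can be written out as a short calculation. This is a sketch under the slide's own assumptions: 1 TB is taken as 1,000 GB (which is what makes $0.20/GB equal $200/TB), and the figures are the ones quoted on the slide, which may have changed since. The function names are hypothetical, and the throttling thresholds mentioned above are not modeled.

```python
DEFAULT_STORAGE_GB = 1000        # 1 TB included by default
GB_PER_LICENSED_USER = 0.5       # + 0.5 GB per licensed O365 user
ITEMS_PER_GB = 1000              # 1M external items per 1 TB of storage
ADDON_USD_PER_GB_MONTH = 0.20    # "Office 365 Extra File Storage" add-on


def external_item_quota(licensed_users: int, addon_gb: float = 0) -> int:
    """External items allowed in the index for a tenant of this size."""
    storage_gb = (
        DEFAULT_STORAGE_GB + GB_PER_LICENSED_USER * licensed_users + addon_gb
    )
    return int(storage_gb * ITEMS_PER_GB)


def addon_monthly_cost_usd(addon_gb: float) -> float:
    """Monthly cost of buying extra storage to raise the item quota."""
    return addon_gb * ADDON_USD_PER_GB_MONTH


if __name__ == "__main__":
    for users in (2_000, 50_000):
        print(users, "users ->", external_item_quota(users), "external items")
```

With these constants, 2,000 users yield 2,000,000 items and 50,000 users yield 26,000,000 items, matching the two worked examples on the slide.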
Should I run index reset?
NO! Use DeleteAllCloudHybridSearchContent() instead:
https://blogs.technet.microsoft.com/beyondsharepoint/2016/07/07/cloud-hybrid-search-service-application-removing-items-from-the-office-365-search-index/
Issues with Cloud Hybrid Search (2)
Limitations of Cloud SSA
Content Enrichment
• no CEWS
• no Entity Extraction
Security
• no Custom Security Trimming
• can't crawl across multiple domains
• can't crawl SP in Classic Auth mode
Data Sovereignty
• export-restricted content can't be put in O365 index
External Content
(on-premises and/or
in the cloud)
SharePoint Server
(On-premises or Hosted)
SPO Content
OneDrive Content
Connectors
SharePoint Content
Connector
Framework
Office 365
AutoClassifier
(app version)
CEWS
Custom
Processing
Case study: Content Enrichment
Content
Cloud SSA
Connector Framework
Indexing
Connectors
Smart Pipeline
AutoClassifier
Custom Stage A
Custom Stage B
Custom Stage C
Online
On-Prem
Cloud Hybrid Search under the covers
Security = identity sync + ACL mapping
Cloud SSA
Crawl
Parse
SCS
ACL Map Process
Blob store
queue
Directory Synchronization
SID                                    S-1-5-21-1212121212-1212121212-1212
msOnline-OnPremiseSecurityIdentifier   S-1-5-21-1212121212-1212121212-1212
PUID                                   PUID-XXXX-XXXXXXXXXX
Mapping of Access Control Lists
Allow: S-1-5-21-1212121212-1212121212-1212 -> Allow: PUID-XXXX-XXXXXXXXXX
• User SIDs are mapped to PUIDs
• Group SIDs are mapped to Object IDs
• "Everyone" and "Authenticated users" are mapped to "Everyone except external users"
Only AD users and groups, and only from one domain
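The mapping rules above can be sketched as a small function. This is an illustration of the rules as stated on the slide, not the service's actual implementation: the function name `map_acl`, the dictionary-based lookups, and the choice to return unmapped principals separately are all assumptions made for the example. In the real system the lookups come from directory synchronization.

```python
from typing import Dict, Iterable, List, Tuple

# Well-known principals that the slide says are mapped to one cloud group.
BROAD_PRINCIPALS = {"Everyone", "Authenticated users"}
CLOUD_BROAD_PRINCIPAL = "Everyone except external users"


def map_acl(
    principals: Iterable[str],
    user_sid_to_puid: Dict[str, str],
    group_sid_to_object_id: Dict[str, str],
) -> Tuple[List[str], List[str]]:
    """Map on-premises ACL principals to their cloud equivalents.

    Returns (mapped, unmapped). Principals with no synced identity
    (for example, accounts from another domain) end up in `unmapped`.
    """
    mapped: List[str] = []
    unmapped: List[str] = []
    for principal in principals:
        if principal in BROAD_PRINCIPALS:
            mapped.append(CLOUD_BROAD_PRINCIPAL)
        elif principal in user_sid_to_puid:
            mapped.append(user_sid_to_puid[principal])        # user SID -> PUID
        elif principal in group_sid_to_object_id:
            mapped.append(group_sid_to_object_id[principal])  # group SID -> Object ID
        else:
            unmapped.append(principal)
    return mapped, unmapped


if __name__ == "__main__":
    users = {"S-1-5-21-1212121212-1212121212-1212": "PUID-XXXX-XXXXXXXXXX"}
    print(map_acl(["S-1-5-21-1212121212-1212121212-1212", "Everyone"], users, {}))
```

Keeping unmapped principals visible, rather than silently dropping them, makes the single-domain restriction easy to spot when testing a crawl.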
Case Study: Crawling Cross-Domain
A global single index solution
Cloud SSA (x5)
BUT export-restricted content can’t be in the global index
Hybrid search
Federated search
Azure
Issues with Cloud Hybrid Search OOB
Content Enrichment
  Limitations of Cloud SSA: no CEWS; no Entity Extraction
  BA Insight Solution: Connector Framework; AutoClassifier
Security
  Limitations of Cloud SSA: no Custom Security Trimming; can't crawl across multiple domains; can't crawl SP in Classic Auth mode
  BA Insight Solution: Connector Framework can 'map down' to AD groups, can 'map across' cross-domain, can crawl and map security
Data Sovereignty
  Limitations of Cloud SSA: export-restricted content can't be put in O365 index
  BA Insight Solution: Federator
Federated / Hybrid
Compliance
Constraints
Desired UX
Complexity of environment
A/A, Trusts and Federation
Extension of SP farm design
Skill-set required
• Identity, Security
• Networking
• SP Infrastructure
• Information mgmt. design