DB2 Net Search Extender: IBM DB2 Data Management, March 2003


Upload: curran-marquez

Post on 30-Dec-2015



TRANSCRIPT

Page 1: DB2 Net Search Extender IBM DB2 Data Management March 2003

IBM Software Group | DB2 Data Management Software

IBM DB2 Net Search Extender © 2003 IBM Corporation

DB2 Net Search Extender

IBM DB2 Data Management, March 2003

Page 2: DB2 Net Search Extender IBM DB2 Data Management March 2003


Agenda

– Overview of Search Products in IBM

– Product Objectives: DB2 Net Search Extender

– Product Overview

– Key Features

– Positioning of the Text Extender family

– Customer Scenarios

– Future direction

Page 3: DB2 Net Search Extender IBM DB2 Data Management March 2003


Products with Text Search in IBM Today

Categories: Brokered Search; Index-Based Search

Product and positioning:

– IBM Lotus Extended Search: A brokered, index-free search for parallel, distributed, heterogeneous search of specific content sources. Bundled with DB2 II, II for Content, WP.

– WebSphere Portal search: Portal product that includes a full-text search library designed for high-precision search on small and mid-sized collections.

– DB2 Information Integrator for Content: Federated text and parametric search for content and data sources. Web crawler for indexing web sites.

– Lotus Discovery Server: Knowledge management system for full-text search and expertise location.

– DB2 Net Search Extender: DB2 extension for fast, scalable full-text search with a SQL/MM dialect. For text stored in DB2 and federated databases. Integrated with DB2 Content Manager.

Page 4: DB2 Net Search Extender IBM DB2 Data Management March 2003


Objective of the DB2 Net Search Extender

DB2 Net Search Extender is recommended for applications that need:

– Full-text search to handle, for example, the demands of e-Business applications with important textual content

– A relational database and a rich data schema to support the application requirements

DB2 Net Search Extender is:

– An extension to DB2 designed to provide excellent text search capabilities for e-Business applications

– Seamlessly integrated into the SQL query language through an extension of SQL: SQL/MM (multimedia)

The DB2 Net Search Extender is NOT:

– An Internet search product like Google

– A generalized free text search product like Verity or Autonomy

Page 5: DB2 Net Search Extender IBM DB2 Data Management March 2003


Overview of DB2 Net Search Extender

– Provides a parallel, scalable, full-text search

– Delivers excellent performance and scalability

– Tailored for e-Business applications such as e-Commerce and Content Management with text search requirements

– Works seamlessly with text documents contained in DB2 and other federated databases

– Extends existing DB2 applications easily by using standard extensions to SQL

– Provides very fast indexing and dynamic index update, which is the basis for a high-speed search solution

– Integrates with the DB2 Control Center for seamless and easy-to-use administration

Page 6: DB2 Net Search Extender IBM DB2 Data Management March 2003


DB2 Net Search Extender Key Features

Search functions: Options to refine the search process

– Boolean operations

– Proximity search for words in the same sentence or paragraph

– "Fuzzy" searches for words with a spelling similar to the search term

– Wildcard searches, using front, middle, and end masking

– Thesaurus support to broaden the query

– Search within sections within documents for more targeted search

– Search on numeric attributes

– Supports search in 37 languages

– Highlight function
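A few of the search functions listed above (Boolean operations, same-sentence proximity, fuzzy matching, and wildcard masking) can be illustrated with a minimal Python sketch. This is a conceptual toy over an in-memory collection, not how Net Search Extender implements or exposes these functions; the sample documents, function names, and the similarity cutoff are all assumptions made for illustration.

```python
import difflib
import fnmatch
import re

# Hypothetical sample documents (not from the presentation).
# Document 3 deliberately misspells "query" to exercise the fuzzy search.
DOCS = {
    1: "The database server stores text. Search is fast.",
    2: "Full-text search extends the relational database.",
    3: "Thesaurus support broadens a querry.",
}


def sentences(text):
    """Split text into sentences, each a list of lowercase words."""
    parts = re.split(r"[.!?]+", text.lower())
    return [re.findall(r"[\w-]+", p) for p in parts if p.strip()]


def words(text):
    """All lowercase words of a text, in order."""
    return [w for s in sentences(text) for w in s]


def boolean_and(*terms):
    """Boolean AND: documents that contain every term."""
    return sorted(d for d, t in DOCS.items()
                  if all(term in words(t) for term in terms))


def same_sentence(a, b):
    """Proximity: documents where both terms occur in one sentence."""
    return sorted(d for d, t in DOCS.items()
                  if any(a in s and b in s for s in sentences(t)))


def fuzzy(term, cutoff=0.8):
    """Fuzzy: documents with a word spelled similarly to the term."""
    return sorted(d for d, t in DOCS.items()
                  if difflib.get_close_matches(term, words(t), n=1, cutoff=cutoff))


def wildcard(pattern):
    """Wildcard: documents with a word matching a mask such as 'data*'."""
    return sorted(d for d, t in DOCS.items()
                  if any(fnmatch.fnmatchcase(w, pattern) for w in words(t)))
```

For example, `wildcard("data*")` uses end masking, `wildcard("*base")` front masking, and `wildcard("s*r")` middle masking. Thesaurus expansion, section search, and numeric attributes are left out of the sketch.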

Page 7: DB2 Net Search Extender IBM DB2 Data Management March 2003


DB2 Net Search Extender Key Features

Search results: Presentation of search results and responsiveness

– Set a result limit on queries where a high hit count is anticipated

– Built-in SQL functionality is combined with the optimizer automatically to select the best optimization plan according to the expected search results

– Order the results by the document score

– Returns results quickly: a high-performance search solution

Search methods: Programming mechanisms tailored to different e-Business requirements

– SQL scalar search function: SQL function for general text search applications

– SQL table-valued function: General text search on views and presorted indexes

– Text Search Stored Procedure: High-performance dedicated text search
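The result limit and score ordering mentioned on this slide can be sketched in a few lines of Python. The scoring formula here (the share of words in a document equal to the search term) is a hypothetical stand-in; the presentation does not say how Net Search Extender computes document scores, and the function names are assumptions.

```python
import heapq


def score(text, term):
    """Hypothetical score: fraction of the words in text equal to term."""
    tokens = text.lower().split()
    return tokens.count(term.lower()) / len(tokens) if tokens else 0.0


def top_hits(docs, term, limit):
    """Return at most `limit` (doc_id, score) pairs, best score first."""
    scored = ((score(text, term), doc_id) for doc_id, text in docs.items())
    hits = [(s, d) for s, d in scored if s > 0]
    # nlargest keeps only `limit` entries instead of sorting every hit.
    best = heapq.nlargest(limit, hits)
    return [(d, round(s, 3)) for s, d in best]
```

Capping the result set this way avoids materializing and sorting every hit when a query matches a large share of the collection.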

Page 8: DB2 Net Search Extender IBM DB2 Data Management March 2003


SQL Search: SQL scalar search function

– The recommended search method, useful for most situations

– Use where standard SQL would be used

– Use when text search results are combined with other, different conditions

– Integrated with the DB2 optimizer for excellent performance where a JOIN of data is needed

[Diagram: SQL scalar search data flow inside the DB2 server. Labels: “CONTAINS”, Index, Extract matching primary keys, Join, DB2 table, Return results. Arrows are data flows.]
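The data flow on this slide (the text index yields matching primary keys, which are then joined with the table and combined with other conditions) can be sketched with Python's built-in `sqlite3` module. This is only an illustration of the flow: SQLite stands in for DB2, a plain dictionary stands in for the text index, and the `books` table, its columns, and the `search` helper are hypothetical. In Net Search Extender the text predicate is the `CONTAINS` function named on the slide; its exact syntax is not given in the presentation and is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (id INTEGER PRIMARY KEY, price REAL, abstract TEXT)"
)
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?)",
    [
        (1, 20.0, "relational database tuning"),
        (2, 55.0, "database text search"),
        (3, 15.0, "cooking with herbs"),
    ],
)

# Toy text index: word -> set of primary keys of rows containing the word.
text_index = {}
for pk, abstract in conn.execute("SELECT id, abstract FROM books").fetchall():
    for word in abstract.split():
        text_index.setdefault(word, set()).add(pk)


def search(term, max_price):
    """Text condition plus an ordinary SQL condition on the same table."""
    # Step 1: extract matching primary keys from the text index.
    keys = sorted(text_index.get(term, set()))
    if not keys:
        return []
    # Step 2: join the keys with the table and apply the other condition.
    placeholders = ",".join("?" * len(keys))
    rows = conn.execute(
        f"SELECT id, price FROM books "
        f"WHERE id IN ({placeholders}) AND price < ? ORDER BY id",
        (*keys, max_price),
    )
    # Step 3: return results.
    return rows.fetchall()
```

For example, `search("database", 30.0)` keeps only the matching book that also satisfies the price condition.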

Page 9: DB2 Net Search Extender IBM DB2 Data Management March 2003


Text Search on Views: SQL table-valued function for search

Use where you would normally use an SQL scalar function, but you want to exploit text indexes on views or presorted text indexes.

[Diagram: TextSearch table-valued search function data flow inside the DB2 server. Labels: “db2ext.textsearch”, Index, Extract matching primary keys, Join, DB2 table, Return results. Arrows are data flows.]

Page 10: DB2 Net Search Extender IBM DB2 Data Management March 2003


High Performance, Dedicated Text Search: Stored Procedures for Search

Use for high performance/high scalability applications that need text search-only queries

Use for queries that do not need to join text search results with the results of other complex SQL conditions.

[Diagram: TextSearch stored procedure search data flow inside the DB2 server. Labels: “db2ext.textsearch”, Index, Cache, DB2 table. Arrows are data flows. Columns in cache are defined at text index creation.]

Page 11: DB2 Net Search Extender IBM DB2 Data Management March 2003


DB2 Net Search Extender Key Features

Indexing: Very fast indexing and dynamic index update are the basis for a high-speed search solution

– Provides fast indexing of large data volumes

– Provides incremental updates of indexes

– Indexes text documents stored in DB2 and federated databases

– Provides a choice of the command line or the DB2 Control Center interface for indexing

– Supports language-specific stopword lists to reduce the index size and improve search speed

– Monitors the progress of indexing

– Optional: supports presorted text indexes

– Optional: provides caching of table columns in main memory at indexing time to avoid physical read operations at search time

Page 12: DB2 Net Search Extender IBM DB2 Data Management March 2003


Indexing: Very fast indexing and dynamic index update are the basis for a high-speed search solution

[Diagram: incremental indexing data flow. Labels: DB2 Server, RDBMS tables, log table, trigger, Insert/Update/Delete, Net Search Extender Instance Services, Index, read, update, “UPDATE INDEX…”.]
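The diagram's labels suggest a common incremental-indexing pattern: triggers record inserts, updates, and deletes in a log table, and a later index update reads the log and applies only those changes. The sketch below illustrates that pattern with Python's built-in `sqlite3` module. It is an assumption-laden toy, not the Net Search Extender implementation: SQLite stands in for DB2, a dictionary stands in for the text index, and the `docs` and `doc_log` tables, trigger names, and `update_index` function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT);
CREATE TABLE doc_log (seq INTEGER PRIMARY KEY AUTOINCREMENT, op TEXT, id INTEGER);

-- Triggers record every change to docs in the log table.
CREATE TRIGGER docs_ins AFTER INSERT ON docs
BEGIN INSERT INTO doc_log (op, id) VALUES ('I', NEW.id); END;
CREATE TRIGGER docs_upd AFTER UPDATE ON docs
BEGIN INSERT INTO doc_log (op, id) VALUES ('U', NEW.id); END;
CREATE TRIGGER docs_del AFTER DELETE ON docs
BEGIN INSERT INTO doc_log (op, id) VALUES ('D', OLD.id); END;
""")

index = {}  # toy text index: word -> set of document ids


def update_index():
    """Apply logged changes to the index, clear the log, return the count."""
    entries = conn.execute("SELECT op, id FROM doc_log ORDER BY seq").fetchall()
    for op, doc_id in entries:
        # Drop any existing postings for the changed document.
        for postings in index.values():
            postings.discard(doc_id)
        if op in ("I", "U"):
            row = conn.execute(
                "SELECT body FROM docs WHERE id = ?", (doc_id,)
            ).fetchone()
            if row is not None:  # the row may have been deleted later in the log
                for word in row[0].lower().split():
                    index.setdefault(word, set()).add(doc_id)
    conn.execute("DELETE FROM doc_log")
    return len(entries)
```

Only documents named in the log are re-read, so the cost of an update depends on the number of changes rather than on the size of the table.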

Page 13: DB2 Net Search Extender IBM DB2 Data Management March 2003


Indexing Performance with DB2 Net Search Extender

DB2 Net Search Extender shows excellent scalability and performance when it is used together with a partitioned database setup.

[Chart: indexing throughput in GB per hour (axis from 0 to 12) by number of nodes: 1, 2, 4, and 8 nodes.]

Page 14: DB2 Net Search Extender IBM DB2 Data Management March 2003


Summary

Search and information mining is a complex problem

– The amount of accessible data (petabytes)

– Diversity of sources, types & formats

Despite heterogeneity, users would like seamless use of all kinds of information

– Parametric & Text

– Multilingual

– Without syntax/protocol differences

– And they want good results!

We have core technologies

– Historic trends are toward integrating technologies

– Key IBM products are being extended with search and mining capabilities

– Search and mining technologies are evaluated as standalone products as well as embedded components

– There is a search product available to solve your business problem

Page 15: DB2 Net Search Extender IBM DB2 Data Management March 2003


Backup charts

Page 16: DB2 Net Search Extender IBM DB2 Data Management March 2003


More info

DB2 V8.1 announcement:

– at: http://www.ibmlink.ibm.com/usalets&parms=H_202-214

– found at: http://www-3.ibm.com/software/data/db2/udb/v8/

"What's new in DB2" PDF document:

– http://www-3.ibm.com/software/data/db2/udb/pdfs/db2q0.pdf

DB2 Net Search Extender web site with Data Sheet:

– http://www-3.ibm.com/software/data/db2/extenders/netsearch/

Page 17: DB2 Net Search Extender IBM DB2 Data Management March 2003


Positioning the three DB2 Text-based Extenders

DB2 Net Search Extender V8 – for use with DB2 UDB V8

– The strategic product going forward

– Improvements over both TIE and NSE V7 capability

– Merges the functionality of the TIE and NSE V7 products

– Backward compatibility for DB2 NSE V7 and TIE V7 applications

DB2 Net Search Extender V7

– Designed to support web site traffic

– Uses a faster underlying search engine than Text Extender

– Caches all potential results

– Data scalability limited only by physical memory

– Less SQL functionality and flexibility than the Text Information Extender

DB2 Text Information Extender (TIE) V7.2

– Uses the same underlying search engine as NSE

– Has the SQL flexibility of Text Extender

DB2 Text Extender (TE) is the original text extender

– Limited new investment in this Extender

– High functionality but limited scalability

Page 18: DB2 Net Search Extender IBM DB2 Data Management March 2003


DB2 Net Search Extender - Formats and Languages

The text document formats supported are:

– HTML: Hypertext Markup Language (document models supported)

– XML: Extensible Markup Language (document models supported)

– GPP: General Purpose Format (flat text with user-defined tags; document models supported)

– TEXT: Flat text

– INSO: Plug-in for the Outside In filtering software by Stellent

Language support is defined as follows:

– Tokenization of textual data

– Applying language-specific processing where required (e.g. "new paragraph" indicator for Hindi)

– Support for DBCS languages using the proven bi-gram approach for tokenization
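The bi-gram approach mentioned for DBCS languages can be illustrated with a short Python sketch: text without word delimiters is split into overlapping two-character tokens, and a query is tokenized the same way. This is a generic illustration of bi-gram tokenization under stated assumptions, not Net Search Extender's tokenizer; the function names and sample strings are hypothetical.

```python
def bigrams(text):
    """Return overlapping two-character tokens, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        # A single character is kept as its own token.
        return ["".join(chars)] if chars else []
    return [chars[i] + chars[i + 1] for i in range(len(chars) - 1)]


def build_index(docs):
    """Map each bi-gram to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for token in bigrams(text):
            index.setdefault(token, set()).add(doc_id)
    return index


def candidates(index, query):
    """Documents containing every bi-gram of the query (candidate matches)."""
    tokens = bigrams(query)
    if not tokens:
        return []
    result = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(result)
```

For example, `bigrams("全文検索")` yields three tokens, so a two-character query such as "検索" can be matched without a dictionary-based word segmenter. Because the test only checks that all bi-grams are present, a real engine would still need to verify token positions for an exact phrase match.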

Language/Codesets as follows:

– 19 Group One languages (English through Korean)

– 15 Group Two languages (Arabic through Turkish)

– 17 Group Three languages (Albanian through Vietnamese)

– 5 Group Four languages (Indonesian through Telugu/India)