search and classification trex


Upload: siva-balan

Post on 18-Apr-2015


Search and Classification (TREX)

SAP NetWeaver Search and Classification (TREX) finds information in both structured and unstructured data. TREX provides SAP applications with services for searching and classifying large collections of documents and for searching and aggregating business objects.

The Search Engine Service (SES) enables users to search for business objects. SES is part of the SAP NetWeaver Application Server and accesses TREX functions through the TREX ABAP client (for more information, see Administration of the Search Engine Service).

Technical System Landscape
For an introduction to the TREX technical system landscape, see Technical System Landscape.

Tools
For a list of the tools that you use to administer and monitor TREX and SES, see Tools.

Prerequisites
TREX offers a flexible architecture that supports a distributed installation that can be adapted to different requirements. A minimal system consists of a single host that provides all TREX functions. Starting with a single-host system, you can extend TREX to a distributed system and thus increase its capacity. For more detailed information, see the SAP Service Marketplace at service.sap.com/instguidesNW70:

● Single-host installation: SAP NetWeaver 7.0 Search and Classification (TREX) Single Host installation guide

● Distributed installation: SAP NetWeaver 7.0 Search and Classification (TREX) Multiple Hosts installation guide

Tasks On Demand
The following section lists the most important administration tasks that are performed as required:

Tasks Additional Information

Start and stop TREX servers More information: Starting and Stopping TREX

Optimize performance More information:
● Delta Index Configuration
● Configuring Queue Parameters
● Performance Settings for the Operating System
● Reorganization of the TREX System Landscape

Configure security More information: Configuration of the TREX Security Settings

Software logistics More information: Software Logistics. Information about quality and test management, release and upgrade management and support packages, and implementing SAP Notes.

Change the TREX host name More information: Changing the TREX Host Name (single and multiple host installation)

Periodic Tasks
The following tasks must be completed periodically:

Tasks Additional Information

Monitor TREX More information: Monitoring

Back up and restore data More information: Data Backup and Restore for TREX

Troubleshoot More information: Troubleshooting

Additional Information
For more information about TREX, see Search and Classification (TREX).

Search and Classification (TREX) (SAP Library - Technical Operations Manual for SAP NetWeaver) 6/6/2012

http://help.sap.com/saphelp_nw70/helpdata/en/70/0837ced133304eba452c45b6047c74/content.htm 1 / 13


Delta Index

When TREX updates an index, it rewrites the majority of the index files. If the indexes are large, this process can take a long time and generate a high system load.

TREX allows you to activate a delta index to speed up the update. The delta index is a separate index that TREX creates in addition to the main index. The main index and its delta index differ only inside TREX. Outside of TREX they form a unit.

If the delta index is activated, changes flow into the delta index. Because the delta index is smaller than the main index, fewer documents are affected by the update. The delta index can therefore be updated more quickly.

The delta index is deactivated by default. The following rules apply to its activation:
· If you have a single-host system, activation is optional. However, it is recommended once the main index has reached a certain size. If you activate the delta index too soon, performance does not improve.
· If you have a distributed TREX system, activation is mandatory. However, you still activate it only once the main index has reached a certain size.

Activating the delta index not only speeds up the update of the master index, it also enables fast index replication with a low network load. When index replication takes place, the master index server replicates all changed master index files. Because the delta index consists of fewer files, there are fewer files to replicate. This means that index replication is quicker. Moreover, if you have decentralized data storage, the network load is also lower because TREX has to copy fewer files to the slave hosts.

The delta index only speeds up the update if it is kept small. If it becomes too large, it no longer improves performance. When it reaches a certain size, you have to integrate it into the main index. You can integrate the delta index manually or configure TREX so that TREX regularly integrates it automatically. TREX creates a new delta index automatically when the integration of the previous delta index is complete.
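
The interplay of main index and delta index described above can be illustrated with a small toy model. This is only a sketch for intuition and not TREX code: the class name ToyIndex, its methods, and the cost counter docs_rewritten are all hypothetical, and the cost of an update is simply assumed to be the number of documents in whichever index is rewritten.

```python
class ToyIndex:
    """Toy model of a main index plus an optional delta index (not TREX code)."""

    def __init__(self, documents=None):
        self.main = dict(documents or {})  # large index, expensive to rewrite
        self.delta = {}                    # small index, cheap to rewrite
        self.delta_active = False          # deactivated by default
        self.docs_rewritten = 0            # assumed cost: documents rewritten so far

    def update(self, doc_id, text):
        if self.delta_active:
            # Changes flow into the delta index; only it is rewritten.
            self.delta[doc_id] = text
            self.docs_rewritten += len(self.delta)
        else:
            # Without a delta index, the main index is rewritten.
            self.main[doc_id] = text
            self.docs_rewritten += len(self.main)

    def search(self, doc_id):
        # From the outside, both indexes form a unit; the newer delta entry wins.
        if doc_id in self.delta:
            return self.delta[doc_id]
        return self.main.get(doc_id)

    def merge_delta(self):
        # Integration rewrites the main index and starts a fresh delta index.
        self.main.update(self.delta)
        self.docs_rewritten += len(self.main)
        self.delta = {}
```

In this model, an update against a main index of 1,000 documents costs 1,000 rewritten documents without the delta index and only a handful with it, while merge_delta pays the full cost once. That is the reason the delta index has to stay small and integration should not run too often.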


Activating the Delta Index

Use
The delta index is deactivated by default. You can activate it using the TREX admin tool. You activate it per index, not globally.
The best time to activate it depends on your indexing process.

SAP recommends the following:
· Initial indexing of large document sets
Activate the delta index after the initial indexing run. If you do not do this, the delta index grows too quickly and you have to integrate it into the main index earlier than you would wish. This means that you need twice the indexing time: first to index the documents in the delta index, and then to integrate the delta index into the main index.
· No initial indexing of large data sets
Monitor the size of the main index during routine operation. Activate the delta index if the main index reaches 100,000 to 1,000,000 documents or 500 MB.

Procedure
1. Go to the window Index Admin → Index Info in the TREX admin tool.
2. Select the index for which you want to activate the delta index. Choose Delta Index On.
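
The activation recommendation above can be written down as a simple check, for example as part of a monitoring script. The following is a minimal sketch under stated assumptions: the function name and parameters are hypothetical, the document count and index size have to be read from the TREX admin tool by some other means, and the lower bound of the recommended range (100,000 documents) is assumed as the trigger.

```python
# Thresholds taken from the recommendation in the text.
DOC_THRESHOLD = 100_000    # assumed: lower bound of the 100,000 to 1,000,000 range
SIZE_THRESHOLD_MB = 500


def should_activate_delta_index(doc_count, size_mb, initial_indexing_done=True):
    """Return True if the main index is large enough to benefit from a delta index.

    During the initial indexing of a large document set the delta index
    should stay off, so the check returns False until that run is done.
    """
    if not initial_indexing_done:
        return False
    return doc_count >= DOC_THRESHOLD or size_mb >= SIZE_THRESHOLD_MB
```

A site with very large documents may reach the size limit long before the document limit, which is why the two conditions are combined with "or".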


Integrating a Delta Index into the Main Index

Use
A delta index only speeds up the update of the corresponding index if it is small. If it becomes too large, you have to integrate it into the main index. After the integration has taken place, TREX creates a new delta index.

During the integration process, TREX rewrites all main index files. The duration of the integration process depends on the size of the main index. It can last a few minutes or several hours.

In a distributed system, the entire main index has to be replicated after the integration has taken place. This replication takes about the same amount of time as the initial replication.

The index server cannot index new documents during the integration of the delta index. This has the following effects:
· If indexing takes place with a queue server, the queue server retains the documents until the integration process has been completed. Then the queue server transmits the documents to the index server.
· If indexing takes place without a queue server, the application can continue to send indexing requests to the index server. However, the index server only processes them after the completion of the integration process. This means that it takes longer for indexing requests to be processed and for the application to receive the relevant response.
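
The retention behavior of the queue server during integration can be pictured with a toy model. This is a sketch for illustration only and not how TREX is implemented: the class ToyQueueServer and its attributes are hypothetical, and the index server is reduced to a plain list.

```python
from collections import deque


class ToyQueueServer:
    """Toy model: documents wait in the queue while integration is running."""

    def __init__(self):
        self.pending = deque()            # documents retained by the queue server
        self.indexed = []                 # stands in for the index server
        self.integration_running = False

    def submit(self, doc):
        # The application hands over the document and returns immediately.
        self.pending.append(doc)
        self.flush()

    def flush(self):
        # Nothing is transmitted while the delta index is being integrated.
        if self.integration_running:
            return
        while self.pending:
            self.indexed.append(self.pending.popleft())
```

Without a queue server there is no such buffer, so in that case the application itself waits until the integration has finished.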

You can trigger the integration process manually or carry it out at defined time intervals. There are two different procedures for time-dependent integration. The procedure that you use depends on whether indexing takes place with or without a queue server (QS). The table below gives an overview of the procedures.

Procedure                                     Use with indexing with QS    Use with indexing without QS
Manual                                        Yes                          Yes
Time-dependent using the queue server         Yes                          -
Time-dependent using the Python scheduler     -                            Yes

We recommend the following for the time of the integration:
· Trigger the first integration process if the delta index is bigger than 500 MB. You can find out the size of the delta index in the window Index Admin → Index Info in the TREX admin tool.
· The integration process should take place at times when the system is not too busy.
· Do not carry out the integration process too often. With large indexes, the integration and subsequent replication of the main index take a correspondingly long time.

Integrating the Delta Index Manually
1. Go to the window Index Admin → Index Info in the TREX admin tool.
2. Select the index in question and choose Merge Delta Index.

Integrating the Delta Index Time-Dependently Using the Queue Server
In the queue parameters, enter the time for the integration in Merge Time for Delta Index.

Use All (4:00) to trigger the integration every morning at 4 a.m.

You do not need to coordinate the integration time with other activities carried out by the queue server and index server. If the activities collide, the index server decides in which order it carries out the actions.

For more information on changing queue parameters, see Configuring Queue Parameters.

Integrating the Delta Index Time-Dependently Using the Python Scheduler
Change the following configuration files on all master name servers:

Configuration file: TREXDaemon.ini
Change:
1. Activate the Python scheduler by changing the TREX configuration file TREXDaemon.ini in the TREX admin tool, menu path Landscape → Ini, as follows:
[daemon]
programs=<other_sections>,cron
2. Once you have saved the changes, the TREX admin tool asks you whether it should trigger reconfiguration so that the changes to the configuration file take effect. Confirm this query by choosing Yes.

Configuration file: crontab.ini
Change:
Remove the comment sign from the following line:
<schedule> python mergeDeltaIndex.py silentallIndexes=1 ''
Modify the schedule if necessary. For information on syntax and for examples, see the configuration file.
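
To make the TREXDaemon.ini change concrete, the sketch below shows what appending cron to the programs entry of the [daemon] section amounts to. It is an illustration only: the documented way to make the change is the TREX admin tool, the function enable_cron is hypothetical, the program names in the sample are placeholders, and it is an assumption that the file can be handled by Python's standard configparser module. The sketch therefore works on a text string rather than on a real file.

```python
import configparser
import io


def enable_cron(ini_text):
    """Return ini_text with 'cron' appended to [daemon] programs (idempotent)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(ini_text)
    if not parser.has_section("daemon"):
        parser.add_section("daemon")
    current = parser.get("daemon", "programs", fallback="")
    programs = [name.strip() for name in current.split(",") if name.strip()]
    if "cron" not in programs:
        programs.append("cron")
    parser.set("daemon", "programs", ",".join(programs))
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


# Placeholder program names; a real file lists the sections of the installation.
sample = "[daemon]\nprograms=nameserver,indexserver\n"
updated = enable_cron(sample)
```

Running the function a second time on its own output leaves the entry unchanged, so cron is never listed twice.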


Software Logistics

Software Logistics standardizes and automates software distribution and maintenance as well as test procedures for complex software landscapes and various software development platforms. These functions support your project, development, and application support teams.

Software Logistics aims to implement a consistent, cross-solution change management that provides several procedures for maintenance, global rollouts, and localization, as well as open integration with products offered by third-party organizations.

Quality and Test Management
New TREX releases are always tested internally using a predefined test package with a standard test landscape and with verifiable test data. In particular, the handling of mass data (mass tests), load restrictions (stress tests), and the performance of TREX are checked. The test package calls test atoms in the form of Python scripts that test the basic functionality of TREX and are stored in the following directory:

● UNIX: /usr/sap/<sapsid>/TRX<instance_number>/exe/python_support

● Windows: <disk_drive>:\usr\sap\<SAPSID>\TRX<instance_number>\exe\python_support

When you have installed TREX, you execute the Python script runInstallationTest.py, which tests the basic TREX functions. This script calls a subset of TREX test atoms to check the functional correctness of TREX. If the Python script is executed successfully, you know that TREX has been installed properly, the configuration files contain the necessary entries, and the TREX servers are running.
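
The directory templates above can be filled in programmatically, for example in a script that checks several TREX instances. The following is a minimal sketch: the function name python_support_dir and the sample values ABC and 48 are hypothetical, and the default drive C: is an assumption. Only the directory layout itself comes from the text.

```python
from pathlib import PurePosixPath, PureWindowsPath


def python_support_dir(sapsid, instance_number, windows=False, drive="C:"):
    """Build the python_support directory path for a TREX instance."""
    instance_dir = f"TRX{instance_number}"
    if windows:
        # <disk_drive>:\usr\sap\<SAPSID>\TRX<instance_number>\exe\python_support
        return PureWindowsPath(f"{drive}\\", "usr", "sap", sapsid,
                               instance_dir, "exe", "python_support")
    # /usr/sap/<sapsid>/TRX<instance_number>/exe/python_support
    return PurePosixPath("/usr/sap", sapsid, instance_dir,
                         "exe", "python_support")


unix_dir = python_support_dir("ABC", "48")
windows_dir = python_support_dir("ABC", "48", windows=True, drive="D:")
```

The pure path classes only compose strings and never touch the file system, so the sketch behaves the same on any operating system.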

Transporting TREX Data and Configuration
On the TREX side, the data exists in the form of indexes that TREX builds from the original data in the application. TREX indexes are therefore redundant copies of the application data and can be restored from the original data at any time. There are no data transports for TREX that are comparable with the usual procedures for SAP systems and that you could use to transport TREX data and configuration settings from a development system through a consolidation system to a production system. However, it is possible to use Python scripts to copy TREX indexes and configuration files between servers. There are customer-specific solutions that make transports of this type possible for your specific situation. In the case of a data transport of this type, you must always copy the application data consistently with the TREX data. A procedure of this type is expressly not recommended and is therefore not supported.

Release and Upgrade Management and Support Packages
The documentation below supports you in the installation, upgrade, and migration of TREX.

Make sure that you use the current version of the TREX guides. You can find them on the SAP Service Marketplace at service.sap.com/instguidesNW70

Guide   Description

SAP NetWeaver 7.0 Search and Classification (TREX) Multiple Hosts installation guide   For the installation and configuration of distributed TREX systems

SAP NetWeaver 7.0 Search and Classification (TREX) Single Host installation guide   For single-host installation of TREX

Support Package Stack Guides – SAP NetWeaver   For updates to a new support package stack

Central Update of a System Landscape By Replicating the Binaries
In a distributed TREX system, you perform an update on the TREX host on which the central TREX instance and the global file system were installed. In this case, the central TREX instance selected is updated first. Once TREX is restarted, all TREX dialog instances in the TREX system landscape are updated automatically by means of the replication of the current binaries.

Implementation of SAP Notes
The following SAP Notes contain current installation information and corrections to the installation documentation.

Make sure that you use the current version of the SAP Notes. You can find SAP Notes on the SAP Service Marketplace at service.sap.com/notes.

Relevant SAP Notes

SAP Note   Title   Comment

843360   Installing TREX 7.0   Contains current information on installation of SAP NetWeaver 7.0 TREX

802987   TREX 7.0: Central Note   Central SAP Note on SAP NetWeaver 7.0 TREX

492305   INST: SAP J2EE Engine / Dialog Inst. / Gateway Inst. 6.20   Relevant for an installation with an RFC connection: Contains current information on the SAP Gateway installation.

658052   TREX 6.0/6.1/7.0: Additional Information About TREX ABAP Client   Relevant for an installation with an RFC connection: Contains recommendations for using the queue server and information on which application uses which version of the ABAP client. This information is relevant for configuration steps after the TREX installation.
