IBM ® InfoSphere Master Data Management Server Best Practices Achieving high availability and scalability with IBM InfoSphere MDM Server Nick Kanellos MDM Product Architect Dennis Shi MDM Infrastructure Specialist Managing Consultant


IBM® InfoSphere™ Master Data Management Server

Best Practices

Achieving high availability and scalability with IBM InfoSphere MDM Server

Nick Kanellos MDM Product Architect

Dennis Shi MDM Infrastructure Specialist Managing Consultant



Executive summary

Master data, by its nature, is mission critical. Critical information such as customer names and addresses, accounts, customer assets, contact information, product definitions and descriptions, customer or supplier demographics, customer relationships, marketing assets, and so on, is all considered master data, and IBM® InfoSphere™ Master Data Management (MDM) Server is the hub that stores and maintains the integrity of this data. The multitude of IT applications that an organization relies on, such as customer relationship management, marketing, payroll, accounting, and billing systems, might all rely on InfoSphere MDM Server for at least a portion of the data that they need to function correctly. The loss of InfoSphere MDM Server capabilities could have a significant impact on the operation of an organization, the quality of service that it offers its customers, and its profitability.

This document describes the strategies you can use to ensure that IBM InfoSphere Master Data Management Server is available when it is needed. It describes various IBM components that you can use together to ensure that InfoSphere MDM Server remains available to meet its commitments to provide timely, accurate, mission-critical data within an organization.

The key to achieving high availability (HA) is to ensure redundancy. IBM WebSphere® Application Server Network Deployment Edition (WebSphere Application Server Network Deployment) provides this redundancy with its ability to cluster many servers so that they behave as one. Redundancy ensures that, if any server fails, the system as a whole can continue functioning. This document introduces the key concepts and terms related to the ability of WebSphere Application Server Network Deployment to create and maintain clusters.

InfoSphere MDM Server is a Java Enterprise Edition (JEE) application. It contains both EJB and web application components and is designed to be deployed onto WebSphere clusters. This document describes how to deploy InfoSphere MDM Server onto a cluster and also describes the clustering strategies available to you. These include horizontal and vertical clustering, the deployment of InfoSphere MDM Server components onto clusters dedicated to each component, the use of IBM HTTP Server to provide clustering support of the InfoSphere MDM Server web components, and the configuration of InfoSphere MDM Server resources to take full advantage of the clusters. The deployment of IBM HTTP Server onto its own cluster is discussed briefly as well.

Clients that access InfoSphere MDM Server for their data must also be built with the clustering, or high availability, requirements of the system in mind. This document provides coding examples that illustrate how you can access InfoSphere MDM Server in a cluster, how you can build robust code that recognizes when a server on a cluster might not be operational, and how you can take appropriate action.

Finally, because high availability is a broad topic, additional references are provided at the end of this document.


Table of contents

Executive summary
Table of contents
Introduction: High availability, fault tolerance, and workload balancing
Introduction to clustering with WebSphere Application Server and HTTP Server
    WebSphere Application Server and clusters
        Overview
        Vertical or horizontal clustering
    Accessing EJBs on a cluster
        Installing an application on a cluster
        A client accessing the EJBs on a cluster with workload management and failover
    Accessing web content or web services on a cluster
        Accessing web content or web services directly
        Accessing web content or web services by using IBM HTTP Server
    Java Message Service
Database high availability
Deploying InfoSphere MDM Server onto a cluster
    InfoSphere MDM Server architecture and components
        InfoSphere MDM Server Java EE components
    Deploying InfoSphere MDM Server onto a cluster
        Overview
        Our cluster strategy
        Installing InfoSphere MDM Server onto our cluster
        Installing the non-Java EE components of InfoSphere MDM Server
        Postinstallation configuration steps
            Pointing client applications to the InfoSphere MDM Server on the cluster
            Updating the Configuration Management Component repository
            Updating the config/bootstrap.properties file
    Other scenarios
        Converting InfoSphere MDM Server from a stand-alone instance to a cluster
    Configuring the IBM HTTP Servers to access InfoSphere MDM Server web services and web content
Testing the installation for high availability
    Overview
    Using the Installation Verification Tool
    Using the Batch Processor
    Testing the web user interfaces
    Testing the web services
Writing clients that access InfoSphere MDM Server components on a cluster
    Accessing the Enterprise JavaBeans
    Accessing web services and web applications
Building fault tolerance applications that access MDM Services
    Enterprise JavaBeans
    Web services
Best Practices
Conclusion
Appendix 1 - Converting InfoSphere MDM Server from a stand-alone server to a cluster
    Overview
    Postinstallation configuration
Appendix 2 – Further reading
    WebSphere Application Server network deployment high availability
    DB2 high availability
Notices
    Trademarks


Introduction: High availability, fault tolerance, and workload balancing

A system is highly available when it achieves no unwanted “downtime.” Downtime is the time that a system spends not servicing its users or clients and not meeting its commitments. A system can be “down” for any number of reasons. The computers on which it operates might be down. The system might have experienced a fault that caused it to stop functioning and responding to client requests. Or the system might be experiencing such a heavy volume of service requests by its clients that it is unable to service any more.

A system is fault tolerant if it can keep operating in spite of having experienced a fault. A fault is an event that a system does not expect or cannot handle. It might arise from errors in programming, receiving incorrect instructions from clients, receiving corrupt data, or being unable to access some of the resources on which it depends (such as computer memory, processor capacity, databases, networks, and so on). A system can achieve high availability without being fault tolerant if it can be built in such a way that it never experiences any faults. But that is never going to happen, and so high availability and fault tolerance go hand in hand.

One cause of system downtime is that the system is simply not “big enough” to handle the load placed on it by its users and clients. That is, it does not “scale” to the needs of its clients and users. There are two ways to rectify such a limitation. You can make the system bigger (that is, put it on bigger computers with more memory, more CPU capacity, more disk capacity, higher network bandwidth, and so on) or you can spread the work around. Computers get exponentially more expensive the bigger they get; and if your system is largely a transactional one that handles large volumes of small, relatively straightforward, independent transactions, you do not need large, complex computers with massive processing capacity. A more cost-effective approach might be to install your system on more than one computer and share the workload among them to make your system available to more users and clients. This approach is called workload balancing.¹

¹ Workload balancing, strictly speaking, is not in the domain of “high availability.” It generally falls into the domain of scalability whereby, as the load on a system grows, the IT infrastructure is able to grow proportionally. But as you will see, the strategies that you employ to achieve high availability also confer scalability, and vice versa. So if your goal is high availability, you get scalability for free; and if your goal is scalability, you get high availability for free. Consequently, this document discusses both.


Introduction to clustering with WebSphere Application Server and HTTP Server

WebSphere Application Server and clusters

Overview

As we have discussed, high availability is itself a goal that is achieved through fault tolerance and workload balancing. The best way to achieve both fault tolerance and workload balancing is by introducing redundancy to your system (no, we are not recommending that you make your system itself redundant). “Redundancy” in this sense means crafting a system in such a way that no one component of the system is critical to the operation of the overall system itself. If any one component of a system were to fail, the remaining components can continue to work, enabling the system to continue to meet its commitments.

With InfoSphere MDM Server running as a Java Platform, Enterprise Edition (Java EE) application within WebSphere Application Server, the simplest way to achieve HA is to install InfoSphere MDM Server onto a WebSphere cluster. A cluster is a group of application servers that all behave as a single server. The workload of the cluster is shared among the individual servers that make up the cluster. If one server experiences a fault and is disabled, the other servers can take over and redistribute the load. The overall system remains available. Clients accessing the system might never even know that a server has experienced any difficulties.
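As a rough illustration of why redundancy improves availability, suppose that each cluster member is available a fraction a of the time and that members fail independently of one another. Both are simplifying assumptions made here for the arithmetic and are not taken from this document. The cluster can serve requests as long as at least one of its n members is running:

```latex
A_{\text{cluster}} = 1 - (1 - a)^{n}
% Worked example with a = 0.99:
%   n = 1:  A = 0.99
%   n = 2:  A = 1 - (0.01)^2 = 0.9999
%   n = 3:  A = 1 - (0.01)^3 = 0.999999
```

Real failures are often correlated (for example, members that share a computer, a network, or a database), and the surviving members must also have enough capacity to carry the load, so this figure is best read as an optimistic estimate.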

If you want to set up WebSphere clusters, you have to use WebSphere Application Server Network Deployment edition (WAS ND). The Base edition does not support clusters. The WebSphere Application Server Network Deployment documentation provides more details. In this document we only briefly introduce the various components that make up WebSphere Application Server Network Deployment, the terms that describe them, and how they are employed by InfoSphere MDM Server to achieve high availability. Refer to Appendix 2 for additional reading material about this topic.

The following diagram illustrates the basic components that make up WebSphere Application Server Network Deployment: a cell, cluster, node, and an application server.


Figure 1: WebSphere Application Server Network Deployment components consist of, at a minimum, a cell, a cluster, a node, and an application server.

In WebSphere Application Server Network Deployment, a cell is a collection of nodes that are jointly managed.

A node is the equivalent of a WebSphere Application Server Network Deployment application server profile. An application server profile is a runtime environment in which actual application servers run. The application server profile is what you see manifested on a computer after you install WebSphere Application Server Network Deployment. Inside the directory where you installed WebSphere Application Server is a profiles directory. Each new profile that you create on that computer will get its own directory. Inside that directory will be the executable and configuration components that are required to make a runtime environment.

An application server can generally be equated to the Java virtual machine (JVM), the EJB container, and the web container in which all Java and Java EE applications, such as InfoSphere MDM Server, run. An application server runs within the context of a node.

A cluster is a logical grouping of application servers. The application servers that belong to a cluster are called its members. A cluster can encompass as its members the application servers of many nodes. The application servers continue to run within the runtime environment of the node; however, they operate jointly within the cluster as a single, virtual, distributed application server.

A deployment manager is a type of node (application server profile) that is used to manage the other nodes in a cell. For example, through the deployment manager and its user interface console, you can add or remove application servers from a cell, create clusters, add or remove application servers to or from clusters, start or stop application servers or nodes, federate more nodes into a cell, remove nodes from a cell, deploy applications onto application servers or clusters, and so on.

Vertical or horizontal clustering

There are two strategies that you can employ when deciding how to configure a WebSphere Application Server Network Deployment cluster: vertical clustering or horizontal clustering. You can also use a combination of both. When you install WebSphere Application Server Network Deployment on a computer, the installer usually generates an application server profile and creates a single application server within that profile.² You can create more. The following illustration shows a cell in which four application nodes are installed on three computers:

[Diagram: a cell managed by a deployment manager, containing one cluster whose application servers run on nodes installed on several computers.]

Figure 2: This depiction of a WebSphere Application Server cluster consists of 7 application servers (also known as members) deployed onto 4 nodes installed on 3 separate computers. The deployment manager is installed as its own node on a fourth computer. All 4 computers are managed within a single cell.

In the preceding illustration the deployment manager is installed on a fourth computer. The deployment manager is used to manage the cell. In the cell, one cluster consists of seven application servers (also known as cluster members) configured on the four nodes. Each node has two application servers.³ The illustration shows that it is possible to spread a cluster across more than one computer and to configure more than one node on a computer. It is also possible to configure more than one application server on a single node.

The practice of adding more computers to a cluster is known as horizontal clustering. This type of clustering spreads the cluster out broadly across many computers – horizontally. The practice of deploying many application servers on a single node is known as vertical clustering because you stack the application servers onto a single computer – vertically.

² You can also choose other installation options. You can choose to install a cell that includes a deployment manager profile along with an application server profile (application node). Or, you can choose to install a custom application server profile with no application servers installed.

³ Note that on one application node, an application server is not part of the cluster. Although it is managed as part of the cell, it is a stand-alone server. It might be used to deploy non-critical applications or it might belong to another cluster. We include it to illustrate all the possibilities available to you when you configure your server topology.


Guideline:

Horizontal clustering increases the fault tolerance of an application that runs on the cluster. In the event of a hardware failure on a single computer, the remaining computers remain unaffected. It also increases the scalability of the system by distributing the processing across many CPUs. When people think of clustering, they generally think of horizontal clustering.

Vertical clustering can also be useful. You might implement this type of cluster if you have sufficiently powerful computers and you are not getting the throughput that you want. Sometimes the JVM itself can be the bottleneck, and because each application server is its own JVM, the simplest way to increase throughput is to have more JVMs running. Also, because system faults are not always due to hardware problems, having more than one application server on each computer further increases your system’s fault tolerance (and hence its availability) without additional investment in hardware.

Recommendation: A combination of vertical and horizontal clustering can be the most effective approach. The precise mix-and-match of horizontal and vertical clustering will require testing in your environment.
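To make the trade-off between the two strategies concrete, the following minimal Java sketch models cluster members by the computer that hosts them and counts how many members survive the loss of one computer. The class, method, host, and server names are hypothetical and are not part of any WebSphere API; the sketch only illustrates the reasoning in the guideline above.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of a cluster topology; not a WebSphere API. */
final class ClusterTopology {

    /** One cluster member (application server) and the computer that hosts it. */
    static final class Member {
        final String computer;
        final String server;

        Member(String computer, String server) {
            this.computer = computer;
            this.server = server;
        }
    }

    /** Counts the members that keep running if the given computer fails. */
    static int survivorsAfterComputerFailure(List<Member> members, String failedComputer) {
        int survivors = 0;
        for (Member m : members) {
            if (!m.computer.equals(failedComputer)) {
                survivors++;
            }
        }
        return survivors;
    }

    public static void main(String[] args) {
        // Purely vertical: two members stacked on one computer.
        List<Member> vertical = Arrays.asList(
                new Member("hostA", "server1"),
                new Member("hostA", "server2"));

        // Combined: two computers (horizontal), two members on each (vertical).
        List<Member> combined = Arrays.asList(
                new Member("hostA", "server1"),
                new Member("hostA", "server2"),
                new Member("hostB", "server3"),
                new Member("hostB", "server4"));

        System.out.println(survivorsAfterComputerFailure(vertical, "hostA")); // prints 0
        System.out.println(survivorsAfterComputerFailure(combined, "hostA")); // prints 2
    }
}
```

In this model a purely vertical cluster has no surviving members after a hardware failure, whereas the combined topology keeps half of its members; the vertical members still protect against faults that affect only a single JVM.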

Accessing EJBs on a cluster

Let’s take a quick look at a cluster from the standpoint of an application running on the cluster and a client accessing the EJBs within the application.

Installing an application on a cluster

When you install an EJB application or a web application onto a cluster, you are simultaneously installing it on all the application servers that make up that cluster. The following diagram illustrates this:


[Diagram: a JEE application and its EJBs deployed onto every application server in the cluster. Deploying a JEE application onto a cluster effectively deploys it simultaneously onto all the application servers in the cluster.]

Figure 3: Java EE application on a cluster. When a Java EE application is installed onto a cluster, the executables for that application are deployed onto each application server on that cluster.

By installing a Java EE application onto the cluster, we have effectively installed it onto all seven application servers within the cluster. You cannot control that, because it is automatic. That is what it means to install an application on a cluster.

A client accessing the EJBs on a cluster with workload management and failover

The following diagram illustrates how a client application accesses the EJBs that are running on a cluster:


[Diagram: an EJB client uses an ORB with the WLM plug-in to reach the application servers in the cluster, which listen on bootstrap ports 2810 to 2813. A client application can access the EJBs of a JEE application from any cluster member.]

Figure 4: Client application accessing EJBs on a cluster. A client application, outside the cluster, can use an Object Request Broker with a workload management plug-in to access the EJBs of the JEE application.

A client accesses an EJB by locating it on any one of the servers running on the cluster, using the RMI over IIOP protocol. The client application does not do this by itself. It uses an Object Request Broker (ORB) with a workload management (WLM) plug-in. The WebSphere Application Server run time provides the infrastructure, and it is transparent to the client application (we discuss writing clients to access EJBs in a cluster in “Writing clients that access InfoSphere MDM Server components on a cluster” later in this document). The client must know the DNS name or the IP address of any computer in the cluster. It also must know the port number of any application server on that computer. This is known as the bootstrap port. WebSphere Application Server Network Deployment assigns a unique bootstrap port number to each application server on a computer. The syntax that a client uses to access an application server on a cluster looks like this example:

corbaloc:iiop:<hostname>:<bootstrap port number>

It does not matter to which application server the client goes. The application servers that make up a cluster all know about each other and which are operating. To learn more, refer to the WebSphere Application Server Network Deployment documentation about clusters and high availability listed in Appendix 2. Any application server in the cluster that the client accesses will return to the WLM plug-in the addresses of all the other application servers in the cluster that are also functioning.
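The following Java sketch shows one way a client might prepare its JNDI settings from the bootstrap addresses described above. It is a sketch under stated assumptions rather than the method this document prescribes: the helper class and host names are hypothetical, the ports are illustrative values in the range shown in Figure 4, and listing several addresses in one URL relies on the comma-separated address list that the CORBA corbaloc syntax allows. The code only builds the settings; performing the lookup requires the WebSphere client run time on the classpath.

```java
import java.util.Arrays;
import java.util.Hashtable;
import java.util.List;
import javax.naming.Context;

/** Hypothetical helper that prepares JNDI settings for a clustered EJB lookup. */
final class ClusterBootstrap {

    /** Joins host:port pairs into one corbaloc URL with several IIOP addresses. */
    static String providerUrl(List<String> hostPorts) {
        StringBuilder url = new StringBuilder("corbaloc:");
        for (int i = 0; i < hostPorts.size(); i++) {
            if (i > 0) {
                url.append(',');
            }
            url.append("iiop:").append(hostPorts.get(i));
        }
        return url.toString();
    }

    /** Builds the environment that would be passed to new InitialContext(env). */
    static Hashtable<String, Object> jndiEnvironment(List<String> hostPorts) {
        Hashtable<String, Object> env = new Hashtable<String, Object>();
        // Assumption: the WebSphere client run time supplies this factory class
        // when the lookup is actually performed.
        env.put(Context.INITIAL_CONTEXT_FACTORY,
                "com.ibm.websphere.naming.WsnInitialContextFactory");
        env.put(Context.PROVIDER_URL, providerUrl(hostPorts));
        return env;
    }

    public static void main(String[] args) {
        // Placeholder host names and illustrative bootstrap ports.
        List<String> members = Arrays.asList("computer1:2810", "computer2:2811");
        // prints corbaloc:iiop:computer1:2810,iiop:computer2:2811
        System.out.println(jndiEnvironment(members).get(Context.PROVIDER_URL));
    }
}
```

Listing more than one bootstrap address means that the initial contact does not depend on a single application server being up; after the first contact, the WLM plug-in works from the member list that the cluster returns.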


Having a list of the application servers in a cluster, the ORB does not limit itself to the application server it first accessed. It uses an algorithm that is specified in the cluster to access a different application server for each invocation of an EJB method, even if it is accessing the same method. That is how the workload is distributed among all the members of a cluster.
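The effect of choosing a different member for each invocation can be pictured with a small stand-alone Java sketch. It assumes a plain round-robin policy with equal weights purely for illustration; the actual algorithm is whatever is configured on the cluster, and the class below is hypothetical and is not the WebSphere WLM plug-in.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the routing decision; not the WebSphere WLM plug-in. */
final class RoundRobinSelector {
    private final List<String> members;
    private final AtomicInteger next = new AtomicInteger(0);

    RoundRobinSelector(List<String> members) {
        this.members = members;
    }

    /** Returns the member that should receive the next method invocation. */
    String nextMember() {
        // floorMod keeps the index valid even if the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), members.size());
        return members.get(index);
    }

    public static void main(String[] args) {
        RoundRobinSelector selector = new RoundRobinSelector(
                Arrays.asList("computer1:2810", "computer1:2811", "computer2:2810"));
        // Four consecutive calls visit each member once and then wrap around.
        for (int call = 0; call < 4; call++) {
            System.out.println("call " + call + " -> " + selector.nextMember());
        }
    }
}
```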

If one of the cluster members fails, the other members will soon detect it and update their list of active cluster members. Any clients that access the cluster will receive an updated list of active servers. Any clients that already have a list might end up calling the failed cluster member for some of their transactions. Those transactions will fail. The client can try again, and the likelihood is high that it will access a different server. Eventually that client will receive an updated list of active cluster members and will bypass the failed one altogether.
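The "try again" behavior described above can be sketched as a small retry helper. This is a minimal illustration, not the approach this document prescribes: the class and method names are hypothetical, the EJB call is simulated, and the sketch assumes that the failed call is safe to repeat.

```java
import java.util.function.Supplier;

/** Hypothetical retry helper for calls that can land on a failed cluster member. */
final class RetryingCaller {

    /** Runs the call up to maxAttempts times and rethrows the last failure. */
    static <T> T callWithRetry(Supplier<T> call, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                // Assumption: the failure means the member was unavailable and the
                // call is safe to repeat; a real client should check both.
                lastFailure = e;
            }
        }
        throw lastFailure;
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        // Simulated EJB call: the first two attempts hit a failed member.
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("member unavailable");
            }
            return "ok after " + calls[0] + " attempts";
        }, 5);
        System.out.println(result); // prints "ok after 3 attempts"
    }
}
```

Bounding the number of attempts keeps a client from looping indefinitely if the whole cluster is unavailable.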

Accessing web content or web services on a cluster

Accessing web content or web services directly

So far we have discussed how a client might access EJBs in a cluster. A different approach is required to access web applications or web services. Instead of using an Object Request Broker, web and web services clients access a web container directly. Also, instead of using RMI, web clients use HTTP. Like the bootstrap port number for the ORB, WebSphere Application Server Network Deployment also assigns a port number used by HTTP clients to access web content or web services. This is known as the WC_defaulthost port number. The same port number is used for both web content and web services. Because it is HTTP, the syntax is familiar:

http://<computer DNS name or IP address>:<WC_defaulthost number>/<name of web content or web service>

As with the EJBs described earlier, you can access the web content or the web services on any application server in the cluster. Unlike with EJBs, however, there is no WLM-like plug-in that enables you to switch from application server to application server based on an algorithm. After an application client has established a session with a particular cluster member, it cannot change members. If that cluster member fails, so does the application accessing it. The client itself is responsible for maintaining a list of all the cluster members hosting that web service or web content.
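Because the client must keep its own list of cluster members in this scenario, one possible shape for that logic is sketched below in Java. The class, the host names, and the service name SampleService are hypothetical, and the HTTP call is replaced by a simulated transport function so that the sketch runs without a server. It also assumes that each request is independent, so a new request can safely go to a different member.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Hypothetical client-side failover across the HTTP endpoints of cluster members. */
final class EndpointFailover {

    /** Builds http://<host>:<port>/<path> for every known cluster member. */
    static List<String> endpoints(List<String> hostPorts, String path) {
        List<String> urls = new ArrayList<String>();
        for (String hostPort : hostPorts) {
            urls.add("http://" + hostPort + "/" + path);
        }
        return urls;
    }

    /** Tries each endpoint in order and returns the first successful response. */
    static String invokeFirstAvailable(List<String> urls, Function<String, String> transport) {
        RuntimeException lastFailure = new IllegalStateException("no endpoints configured");
        for (String url : urls) {
            try {
                return transport.apply(url);
            } catch (RuntimeException e) {
                lastFailure = e; // this member is unreachable; try the next one
            }
        }
        throw lastFailure;
    }

    public static void main(String[] args) {
        List<String> urls = endpoints(
                Arrays.asList("computer1:9810", "computer2:9810"), "SampleService");
        // Simulated transport: the member on computer1 is down.
        String response = invokeFirstAvailable(urls, url -> {
            if (url.contains("computer1")) {
                throw new IllegalStateException("connection refused");
            }
            return "response from " + url;
        });
        System.out.println(response); // prints "response from http://computer2:9810/SampleService"
    }
}
```

Passing the transport as a function keeps the failover logic separate from the HTTP or web services library that a real client would use.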

The following diagram illustrates how an application client accesses web content or web services on a cluster member.


[Diagram: an HTTP client calls a web service of the JEE application directly on a cluster member, for example http://<Computer 1>:9812/<Web Service Name>; the application servers listen on HTTP ports 9810 to 9813. A client application can access the web content or web services of a JEE application from any cluster member.]

Figure 5: Application client accessing web content or web services on a cluster member. A client application, outside the cluster, uses HTTP to access the web service of a JEE application or web content from a web server.

Accessing web content or web services by using IBM HTTP Server

You can use the IBM HTTP Server or any other web server with a supported WebSphere Application Server plug-in. This document focuses on IBM HTTP Server. Refer to the WebSphere Application Server documentation to make sure that there is a plug-in for your web server. You can configure IBM HTTP Server with the WebSphere Application Server plug-in to route requests to WebSphere Application Server Network Deployment and to distribute the load to all the members of the WebSphere Application Server cluster.

You need to install IBM HTTP Server and the corresponding WebSphere Application Server plug-in. You also need to copy some files to the WebSphere Application Server Network Deployment deployment manager bin directory. Follow the detailed instructions that accompany the installation programs for IBM HTTP Server and the WebSphere Application Server plug-in for IBM HTTP Server. After you install both, you can also configure WebSphere Application Server so that you can start and stop IBM HTTP Server from the WebSphere Application Server administrative console. To do this, click Servers > Web Servers in the administrative console, click the New button at the top of the list, and follow the instructions in the wizard.


After IBM HTTP Server is installed and running, and the plug-in has been installed and configured, you can access the web content and web services installed on the cluster through the HTTP Server. How does the IBM HTTP Server know to which computers it can route the requests? And how does it recognize which requests it must route and which it must service on its own (that is, normal web pages)? The answer lies in a plug-in configuration file named, as one might guess, plugin-cfg.xml. This file is created when the plug-in is installed and configured. If you configured your WebSphere Application Server console to access the web server, you can view the contents of the file from the console. Click Servers > Web Servers, and then click the web server that you configured. Then click the Plug-in Properties link. In the resulting pane, click the View button next to the text box containing “plugin-cfg.xml” to view the contents of the file.

It seems as though the HA chain has a weak link. What happens if IBM HTTP Server experiences a problem and stops operating? To ensure that you maintain HA, you can add several computers running IBM HTTP Server in a cluster (note that this is not a WebSphere Application Server Network Deployment cluster). Describing IBM HTTP Server clusters is beyond the scope of this document, so we will not go into details here. Here is a picture of how our hypothetical cluster might now look:

[Diagram: the same cell, deployment manager, and cluster of application servers on Computers 0 to 3 (ports 9810 to 9813), hosting the JEE application with its web content and web services. In front of the cluster, Computer 4 and Computer 5 each run IBM HTTP Server with the WAS plug-in. A web client and a web services client reach the two HTTP servers through a single virtual IP address.]

Figure 6: IBM HTTP Server. By adding IBM HTTP Server, web clients and web services clients can enjoy a degree of high availability similar to JEE clients. HTTP Server can redirect an HTTP request to any application server on the cluster. A virtual IP address can abstract a cluster of HTTP Servers to further enhance high availability.


Guideline:

Accessing web services or web content directly in a cluster offers neither workload balancing nor failover. Consequently, you cannot achieve high availability for your applications without handling both workload balancing and failover explicitly in your application.

IBM HTTP Server enables you to achieve workload balancing and failover with WebSphere Application Server clusters. By setting up an IBM HTTP Server cluster, you complete your HA topology and ensure that all the components in your system have backups.

Java Message Service

In addition to using RMI and web services, InfoSphere MDM Server allows you to access services by using Java Message Service (JMS). Additionally, the InfoSphere MDM Server notification framework uses JMS to issue notifications of events and transactions that other applications can consume. A key consideration when employing messaging for InfoSphere MDM Server is to ensure that after you post a message onto the queue, it will be processed successfully. There is a risk that a server in the cluster will pick up the message from the queue and then fail, in which case you will not know that the message was not successfully processed. Ensure that your messaging engine provides the ability to persist messages, and to remove them from the queue only after they have been successfully processed.
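The "remove only after successful processing" rule can be illustrated with a small plain-Java simulation. This sketch does not use the JMS API and is not how any particular provider is implemented; the class AckAfterProcessingQueue and its methods are hypothetical. In JMS itself, comparable behavior is typically obtained with a transacted session or client acknowledgement.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;

/** In-memory stand-in for a persistent queue (illustration only, not the JMS API). */
final class AckAfterProcessingQueue {
    private final Deque<String> messages = new ArrayDeque<>();

    void post(String message) {
        messages.addLast(message);
    }

    int depth() {
        return messages.size();
    }

    /**
     * Hands the head message to the consumer and removes it only when the consumer
     * returns normally. Returns true when a message was processed and removed.
     */
    boolean deliverNext(Consumer<String> consumer) {
        String head = messages.peekFirst();
        if (head == null) {
            return false; // nothing to deliver
        }
        try {
            consumer.accept(head);
        } catch (RuntimeException e) {
            return false; // consumer failed: the message stays queued for redelivery
        }
        messages.removeFirst(); // acknowledge: remove only after successful processing
        return true;
    }
}
```

If the consuming server fails partway through, the message is still on the queue and another cluster member can pick it up.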

Two JMS implementations are available: WebSphere Application Server Network Deployment embedded JMS provider and IBM WebSphere MQ. While it is beyond the scope of this document to describe WebSphere MQ in detail, it is recommended that you use WebSphere MQ as your JMS provider.


Database high availability

Database high availability is an extensive topic by itself. It is beyond the scope of this document to describe IBM DB2® high availability in detail. Here we briefly survey the significant features offered by DB2.

To achieve high availability for IBM DB2 for Linux, UNIX and Windows, Version 9.5 and 9.7, the following features are available:

• DB2 high availability disaster recovery (HADR) is a database replication feature that protects against data loss and provides failover ability to its client applications by replicating data from a source database, called the primary database, to a target database, called the standby database. Updates to the standby database are made by sending the log records to the standby database. The HADR standby database continuously replays all the log records to keep in sync with the primary database.

• Automatic client reroute is a database feature that allows you to provide an alternate server location that clients can redirect requests to if the primary database server fails.

• DB2 high availability feature enables the integration between the IBM Data Server and cluster management software, which automates the process of monitoring the system and the failover process if the database server goes down.

Other features that are available for IBM DB2 on z/OS® are listed in Appendix 2.
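As an illustration of automatic client reroute from the client side, the following Java sketch assembles a DB2 JDBC URL that names an alternate server. It assumes the IBM Data Server Driver for JDBC and SQLJ and its clientRerouteAlternateServerName and clientRerouteAlternatePortNumber properties; verify both property names against the driver documentation for your version. The class name, host names, port, and database name are hypothetical, and in a WebSphere Application Server environment these values would normally be set on the data source rather than in code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: builds a DB2 JDBC URL that carries client reroute properties. */
final class Db2RerouteUrl {
    static String build(String host, int port, String database, String altHost, int altPort) {
        Map<String, String> props = new LinkedHashMap<>(); // preserves insertion order
        props.put("clientRerouteAlternateServerName", altHost);
        props.put("clientRerouteAlternatePortNumber", String.valueOf(altPort));

        // URL form assumed here: jdbc:db2://host:port/database:property=value;property=value;
        StringBuilder url = new StringBuilder("jdbc:db2://")
                .append(host).append(':').append(port)
                .append('/').append(database).append(':');
        props.forEach((key, value) -> url.append(key).append('=').append(value).append(';'));
        return url.toString();
    }
}
```

For example, Db2RerouteUrl.build("dbprimary", 50000, "MDMDB", "dbstandby", 50000) yields a URL that points at the primary server and lists the standby as the alternate.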


Deploying InfoSphere MDM Server onto a cluster

InfoSphere MDM Server architecture and components

InfoSphere MDM Server is a large, sophisticated application with components spanning the three tiers of an enterprise application. The following diagram illustrates the significant components of InfoSphere MDM Server:

[Diagram: an application server with an EJB container and a web container hosting the MDMServer application. MDMServer comprises the MDM Server EJBs, MDM Server Web Services, MDM Server Admin EJBs, MDM Server Event Manager MDBs, MDM Server JMS Adapter, MDM Server Notification Framework, and MDM Server Config Repository. Three Java EE clients run outside the application server: the MDM Server Batch Processor, the MDM Server Event Manager Client, and the MDM Server Config Management Console. The Data Stewardship UI and the Admin UI connect to MDMServer over RMI; other connections are labeled JMS.]

Figure 7: InfoSphere MDM Server consists of these main components: the MDMServer JEE application, the Data Stewardship UI, and the Admin UI.

InfoSphere MDM Server Java EE components

Note: In this document the term “InfoSphere MDM Server” refers to the InfoSphere MDM Server product. “MDMServer” refers to the EJB application, a component of InfoSphere MDM Server, which is installed onto the cluster.

InfoSphere MDM Server consists of three main components that are deployed onto an application server. The first is MDMServer, the core business application that other applications access to store and retrieve their master data. The other two are web-based UI applications: the Data Stewardship UI and the Business Administration UI.


• MDMServer is the enterprise server application that contains the EJBs and web services that comprise the core of InfoSphere MDM Server capabilities. In addition to the EJBs and web services, it also includes the InfoSphere MDM Server notification framework that enables JMS-based communication with external applications; the InfoSphere MDM Server JMS adapter that enables JMS-based communications with InfoSphere MDM Server; the administration EJBs that enable the maintenance of the InfoSphere MDM Server code tables and the configuration of its business parameters; and InfoSphere MDM Server Configuration Management Component that enables the configuration of InfoSphere MDM Server behavior and the enabling or disabling of various features.

• The Data Stewardship UI is used by the data stewards of an organization to manage the data stored within InfoSphere MDM Server. Activities performed within the Data Stewardship UI include editing parties (people or organizations), managing duplicates, and so on.

• The Business Administration UI is used by InfoSphere MDM Server system administrators to configure InfoSphere MDM Server, including configuring code tables, enabling or disabling behavioral extensions, and so on.

The Data Stewardship UI and the Business Administration UI are web applications that are also clients to the MDMServer EJB application. They use RMI to connect to MDMServer and follow the same HA mechanism as external Java clients that access MDM EJB components.

InfoSphere MDM Server also includes several components that run outside MDMServer and access the MDMServer application as Java EE clients. They include the Event Manager client, the Configuration Management Component, the Batch Processor, and the Installation Verification Tool.

• The Event Manager client is a Java EE client application used to control the time-based events within Event Manager.

• The Configuration Management Component is used to edit configuration items in the InfoSphere MDM Server configuration repository.

• The Batch Processor is used to perform bulk operations such as bulk loads or bulk updates in InfoSphere MDM Server.

• The Installation Verification Tool is a test client that you can use to validate the InfoSphere MDM Server installation. You can also use it to validate your XML-based transactions as you develop them.

Deploying InfoSphere MDM Server onto a cluster

Overview

As mentioned earlier, a WebSphere Application Server Network Deployment cluster is like a large, distributed, virtual server spanning many application servers (and likely spanning several computers). When you install InfoSphere MDM Server onto a cluster, you are physically installing it on several computers simultaneously. Fortunately, WebSphere Application Server Network Deployment does all the work for you. All you need to do is specify the cluster (or clusters) and let WebSphere Application Server Network Deployment do the rest. Furthermore, the InfoSphere MDM Server installer allows you to specify whether you want to install InfoSphere MDM Server and its components on a cluster.

Guideline:

Where possible, install InfoSphere MDM Server directly onto a cluster. Avoid the complexities of installing InfoSphere MDM Server on a single WebSphere Application Server server, and then converting it to a cluster.

Our cluster strategy

InfoSphere MDM Server consists of three Java EE applications: MDMServer, the EJB server; the Data Stewardship user interface; and the Business Administration user interface. These conform to the Model-View-Controller design pattern of Java EE by separating the business logic from the user interface. Being separate applications, they might be subject to different frequencies of updates or patches. Consequently, it is advisable to separate the applications from each other onto separate clusters.

Guideline:

After you install and configure WebSphere Application Server Network Deployment on the computers you will use to run InfoSphere MDM Server, create at least two clusters: one for MDMServer and one for the Data Stewardship user interface application (see footnote 4).

Guideline:

Employ a mix of vertical and horizontal clustering as your hardware capability warrants.

The following diagram provides an overview of a hypothetical deployment of InfoSphere MDM Server with four computers (named Pickerel, Pike, Sturgeon, and Rockbass):

Footnote 4: InfoSphere MDM Server includes a third application, the Business Admin UI. This application is used to perform configuration functions for MDM, such as changing a code in a code table. These functions are performed infrequently and by a relatively small number of people. Consequently, clustering the Business Admin UI for HA is not vital; the choice is yours.


[Diagram: InfoSphere MDM Server Three Tiers Architecture Scenario on Rockbass, Sturgeon, Pike, and Pickerel. Client laptops connect to Pike and Pickerel, each of which runs IBM HTTP Server (IHS) with the plug-in. Rockbass and Sturgeon each run a node agent and host the application servers MDM_Server11, MDM_Server21, MDM_Server22, AdminUI_Server11, AdminUI_Server21, AdminUI_Server22, and three Data Stewardship UI servers (each labeled DataStewardUI_Server21 in the diagram), together with the deployment manager. The database is MDMSVCS1, schema rocsvcs1. Diagram note: for simplicity, lines between the AdminUI and InfoSphere MDM Server are not drawn, and the WebSphere MQ Server is not shown.]

Figure 8: Example deployment of InfoSphere MDM Server on four computers. Two computers run IBM HTTP Server and form the front end to InfoSphere MDM Server. Two computers run a cluster that runs InfoSphere MDM Server and the InfoSphere MDM Server UIs.

Following the recommendations that we made earlier, we will create two clusters: MDM_Cluster, onto which we will install InfoSphere MDM Server, and DataStewardUI_Cluster, onto which we will install the Data Stewardship web application. For consistency and symmetry in our example, we will also create a third cluster, AdminUI_Cluster, onto which we will install the MDM Business Admin UI web application. The following screen capture shows the three clusters configured and running in the WebSphere Application Server Network Deployment administrative console.


Figure 9: This screen capture shows how cluster names and status are displayed in the WebSphere Application Server administrative console.

We will follow our recommendations and implement both a vertical and a horizontal clustering strategy. Our clusters will span two computers: Rockbass and Sturgeon (horizontal clustering). Our IT department informed us that Sturgeon is the more powerful computer, so for each of our three clusters we will create two application servers on Sturgeon (vertical clustering) and a single application server on Rockbass. These nine application servers (three per cluster), along with the deployment manager and two node agents, make up the DefaultCoreGroup to provide high availability.

We will also install IBM HTTP Server and the WebSphere Application Server plug-in on the Pike and Pickerel computers. By creating a cluster of the IBM HTTP Server, we will use a single virtual IP address between both. The default computer in the cluster is Pike. If Pike fails, Pickerel will take over; however, our clients will continue to address Pike because both Pike and Pickerel share the same virtual IP address.

Installing InfoSphere MDM Server onto our cluster

After preparing the cluster environment, we can install InfoSphere MDM Server, the BusinessAdmin web application, and the DataStewardship web application. During installation, for each of the three InfoSphere MDM Server Java EE applications, you must specify that you want to install it onto a cluster. You do this by selecting a check box as shown in the following illustration:


Figure 10: Installing Java EE applications onto a cluster. After you pick the computer on which Deployment Manager is installed, the MDM installer lists the clusters that are managed by that deployment manager.

After you specify that you want to install InfoSphere MDM Server on a cluster, you can select the cluster from the list, as shown in the preceding illustration. The same is true for the other InfoSphere MDM Server Java EE applications as shown in the following illustrations:

Figure 11: Selecting a cluster. As this illustration shows, if you created separate clusters for the InfoSphere MDM Server UIs, you can select the appropriate cluster, depending on whether you are installing the InfoSphere MDM Server JEE application or the Admin or Data Stewardship UIs. In this instance, the cluster for the Admin UI is selected during the installation of the Admin UI component.


Figure 12: Selecting a cluster. As this illustration shows, if you created separate clusters for the InfoSphere MDM Server UIs, you can select the appropriate cluster, depending on whether you are installing the InfoSphere MDM Server JEE application or the Admin or Data Stewardship UIs. In this instance, the cluster for the Data Stewardship UI is selected during the installation of the Data Stewardship UI component.

After the installation, you will see all applications are deployed onto the cluster, instead of on separate application servers.

All resources defined for InfoSphere MDM Server are defined at the cluster level by default. These resources include JDBC data sources, queue connection factories, queue destinations, topic connection factories, topic destinations, and, if you use the WebSphere Application Server default messaging provider for InfoSphere MDM Server, activation specifications. Defining them at the cluster level ensures that they are available to all the members of the cluster. Also, if you change one of those resources, the change applies to all the members of the cluster.

Installing the non-Java EE components of InfoSphere MDM Server

These components can be installed anywhere you like. They can be installed on client computers or other server computers. After you install them, you have to configure them so that they know on which computers the Java EE components are installed. This topic is discussed more fully in the next section.

Postinstallation configuration steps

Pointing client applications to the InfoSphere MDM Server on the cluster

For each client component, you have to update a properties file to ensure that the client component can find the MDMServer application on the cluster where it is installed. Each client component has a properties file that includes an entry named NAME_SERVER_URL, ServerConfiguration.provider_url, ProcessControllerInternal.PROVIDER_URL, or java.naming.provider.url. The applications use that entry to locate the bootstrap port address for the EJBs that make up the MDMServer Java EE application. For the cluster that we created, we will set that entry to the following value:


corbaloc:iiop:rockbass:2811,:sturgeon:2811,2812

where rockbass and sturgeon are the computer DNS names and the numbers 2811 and 2812 are the bootstrap port numbers for the application servers in the cluster named MDM_Cluster.

The following list shows the InfoSphere MDM Server client components and the entry in each of the properties files for the bootstrap locations of the InfoSphere MDMServer application:

• Installation Verification Tool (verify.sh): NAME_SERVER_URL=corbaloc:iiop:rockbass:2811,:sturgeon:2811,2812

• BatchProcessor (Batch.properties): ServerConfiguration.provider_url=corbaloc:iiop:rockbass:2811,:sturgeon:2811,2812

• eventManagerClient (EventManagerClient.properties): ProcessControllerInternal.PROVIDER_URL=corbaloc:iiop:rockbass:2811,:sturgeon:2811,2812

• BusinessAdmin web application (mdmUIConfiguration.properties within propertiesUI.jar): java.naming.provider.url=corbaloc:iiop:rockbass:2811,:sturgeon:2811,2812

• DataStewardship web application (mdmUIConfiguration.properties within propertiesUI.jar): java.naming.provider.url=corbaloc:iiop:rockbass:2811,:sturgeon:2811,2812

Guideline:

In order for the client applications to take full advantage of InfoSphere MDM Server running on a cluster, update the provider URLs to ensure that they point to all the servers on the cluster on which InfoSphere MDM Server is running. This step ensures that there is no single point of failure when a client application attempts to obtain the initial context of an EJB from a single cluster member that might not be operational.
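A Java client can assemble such a provider URL from a list of bootstrap addresses rather than hard-coding it. The following sketch is an illustration under assumptions: the class ProviderUrlBuilder is hypothetical, and the sketch writes out every host and port pair in full (one corbaloc entry per bootstrap address) instead of the abbreviated form shown in the list above. Only the standard JNDI constant Context.PROVIDER_URL is used; the WebSphere-specific initial context factory is not shown.

```java
import java.util.Hashtable;
import java.util.List;
import java.util.StringJoiner;
import javax.naming.Context;

/** Hypothetical helper: builds a multi-address corbaloc provider URL for JNDI lookups. */
final class ProviderUrlBuilder {

    /** Joins "host:port" bootstrap addresses into one corbaloc URL, one entry per address. */
    static String corbaloc(List<String> bootstrapAddresses) {
        if (bootstrapAddresses.isEmpty()) {
            throw new IllegalArgumentException("no bootstrap addresses");
        }
        StringJoiner url = new StringJoiner(",", "corbaloc:", "");
        for (int i = 0; i < bootstrapAddresses.size(); i++) {
            // In corbaloc syntax both "iiop:" and a bare ":" select the IIOP protocol.
            url.add((i == 0 ? "iiop:" : ":") + bootstrapAddresses.get(i));
        }
        return url.toString();
    }

    /** JNDI environment that a client could pass to new InitialContext(env). */
    static Hashtable<String, String> jndiEnvironment(List<String> bootstrapAddresses) {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.PROVIDER_URL, corbaloc(bootstrapAddresses));
        return env;
    }
}
```

Listing every bootstrap address means that the initial context can still be obtained when any single cluster member is down.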

Updating the Configuration Management Component repository

The Configuration Management Component stores configuration items that govern the behavior of InfoSphere MDM Server in its repository. One of the behaviors that it governs is the generation of primary keys for each of the business tables in the InfoSphere MDM Server database. To ensure that instances of InfoSphere MDM Server running on separate cluster members do not inadvertently create identical primary keys for different rows in the same business table, you can use the Configuration Management Component to define separate application instances and to assign each one an instance identifier that is applied to the primary keys it creates. We do this by running the following SQL on the Configuration Management database (Note: Your values might be different based on the data that is already in your tables):

insert into appinstance (INSTANCE_ID,DEPLOYMENT_ID,NAME,LAST_UPDATE_DT,LAST_UPDATE_USER) values(00000001,9361271823498366,'instance01',current_timestamp,null);
insert into appinstance (INSTANCE_ID,DEPLOYMENT_ID,NAME,LAST_UPDATE_DT,LAST_UPDATE_USER) values(00000002,9361271823498366,'instance02',current_timestamp,null);
insert into appinstance (INSTANCE_ID,DEPLOYMENT_ID,NAME,LAST_UPDATE_DT,LAST_UPDATE_USER) values(00000003,9361271823498366,'instance03',current_timestamp,null);
insert into configelement (ELEMENT_ID,DEPLOYMENT_ID,NAME,VALUE,VALUE_DEFAULT,INSTANCE_ID,LAST_UPDATE_DT,LAST_UPDATE_USER) values(00000011,9361271823498366,'/IBM/CoreUtilities/KeyGeneration/instancePKIdentifier','01','',00000001,current_timestamp,null);
insert into configelement (ELEMENT_ID,DEPLOYMENT_ID,NAME,VALUE,VALUE_DEFAULT,INSTANCE_ID,LAST_UPDATE_DT,LAST_UPDATE_USER) values(00000012,9361271823498366,'/IBM/CoreUtilities/KeyGeneration/instancePKIdentifier','02','',00000002,current_timestamp,null);
insert into configelement (ELEMENT_ID,DEPLOYMENT_ID,NAME,VALUE,VALUE_DEFAULT,INSTANCE_ID,LAST_UPDATE_DT,LAST_UPDATE_USER) values(00000013,9361271823498366,'/IBM/CoreUtilities/KeyGeneration/instancePKIdentifier','03','',00000003,current_timestamp,null);
insert into configelement (ELEMENT_ID,DEPLOYMENT_ID,NAME,VALUE,VALUE_DEFAULT,INSTANCE_ID,LAST_UPDATE_DT,LAST_UPDATE_USER) values(00000021,9361271823498366,'/IBM/DWLCommonServices/KeyGeneration/instancePKIdentifier','01','',00000001,current_timestamp,null);
insert into configelement (ELEMENT_ID,DEPLOYMENT_ID,NAME,VALUE,VALUE_DEFAULT,INSTANCE_ID,LAST_UPDATE_DT,LAST_UPDATE_USER) values(00000022,9361271823498366,'/IBM/DWLCommonServices/KeyGeneration/instancePKIdentifier','02','',00000002,current_timestamp,null);
insert into configelement (ELEMENT_ID,DEPLOYMENT_ID,NAME,VALUE,VALUE_DEFAULT,INSTANCE_ID,LAST_UPDATE_DT,LAST_UPDATE_USER) values(00000023,9361271823498366,'/IBM/DWLCommonServices/KeyGeneration/instancePKIdentifier','03','',00000003,current_timestamp,null);

The DEPLOYMENT_ID in the SQL shown in the preceding example can be retrieved from the deployment table.

Guideline:

To ensure that the primary keys that InfoSphere MDM Server generates for its business tables are unique, you must add an entry to the APPINSTANCE table for each member of the cluster on which MDMServer is installed.
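For clusters with more members, statements of this shape can be generated rather than typed by hand. The following Java sketch is a convenience illustration, not an InfoSphere MDM Server tool. The class AppInstanceSql is hypothetical, and it assumes that every instance needs one configelement row for each of the two KeyGeneration paths and that element IDs follow the 11 to 19 and 21 to 29 numbering suggested by the example. Check the generated values against the data already in your tables before running them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical generator for the APPINSTANCE and CONFIGELEMENT inserts shown in the text. */
final class AppInstanceSql {
    private static final String[] KEY_PATHS = {
        "/IBM/CoreUtilities/KeyGeneration/instancePKIdentifier",
        "/IBM/DWLCommonServices/KeyGeneration/instancePKIdentifier"
    };

    static List<String> statements(long deploymentId, int members) {
        if (members < 1 || members > 9) {
            throw new IllegalArgumentException("this sketch supports 1 to 9 cluster members");
        }
        List<String> sql = new ArrayList<>();
        // One APPINSTANCE row per cluster member: instance01, instance02, ...
        for (int i = 1; i <= members; i++) {
            sql.add(String.format(Locale.ROOT,
                "insert into appinstance (INSTANCE_ID,DEPLOYMENT_ID,NAME,LAST_UPDATE_DT,LAST_UPDATE_USER) "
                + "values(%08d,%d,'instance%02d',current_timestamp,null);",
                i, deploymentId, i));
        }
        // One CONFIGELEMENT row per key-generation path and member.
        for (int p = 0; p < KEY_PATHS.length; p++) {
            for (int i = 1; i <= members; i++) {
                int elementId = (p + 1) * 10 + i; // assumed numbering: 11.. and 21..
                sql.add(String.format(Locale.ROOT,
                    "insert into configelement (ELEMENT_ID,DEPLOYMENT_ID,NAME,VALUE,VALUE_DEFAULT,"
                    + "INSTANCE_ID,LAST_UPDATE_DT,LAST_UPDATE_USER) "
                    + "values(%08d,%d,'%s','%02d','',%08d,current_timestamp,null);",
                    elementId, deploymentId, KEY_PATHS[p], i, i));
            }
        }
        return sql;
    }
}
```

Calling AppInstanceSql.statements with the DEPLOYMENT_ID from the deployment table and a member count of 3 produces nine statements: three for appinstance and six for configelement.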

Updating the config/bootstrap.properties file

On each cluster member, you must update the config/bootstrap.properties file in the properties.jar file. The file is in the MDMServer application EAR file. The following example shows a command that lists the file within properties.jar, run from the application server profile on the Rockbass computer:

ws7admin:rockbass:/usr/IBM/WebSphere7/AppServer/profiles/DemoAppSrv01/installedApps/rockbassCell01/MDM-App.ear> jar -tvf properties.jar config
0 Tue Apr 20 23:21:08 EDT 2010 config/
9635 Thu Apr 22 18:35:18 EDT 2010 config/bootstrap.properties

In our case we have two cluster members running MDMServer on MDM_Cluster. We will do the updates as follows:

For cluster member one, Rockbass:

……
#--------------------------------------------------------------------------------
# Application Runtime Instance Name
#--------------------------------------------------------------------------------
#
# Specifies the name of the application runtime instance.
# The meaning of this name depends on the edition of the application.
# J2EE : name of the cluster node on which the instance runs
# J2SE : convention-based name to reflect the purpose of running the instance
#
# An instance name is not typically required and can be left empty.
#
# Examples:
# instance.name=SERVER-01
# instance.name=Monthly
instance.name=instance01

The name in instance.name must match the name of the instance that was created in the APPINSTANCE table in the Configuration Management repository.

Restart each cluster member after you update the properties file. We have now finished installing and configuring InfoSphere MDM Server to operate on our clusters.

Guideline:

To ensure that InfoSphere MDMServer generates unique primary keys across all cluster members, you must update the config/bootstrap.properties file in the properties.jar file for InfoSphere MDM Server on each cluster member with a unique instance name.
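Because a mismatch between instance.name and the APPINSTANCE table is easy to miss, a small check can be run against the contents of each member's bootstrap.properties file. The following Java sketch is an assumption-laden illustration: the class InstanceNameCheck is hypothetical, the file content is passed in as a string, and the set of registered names is supplied by the caller (for example, after querying the NAME column of APPINSTANCE).

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.Set;

/** Hypothetical check that instance.name matches a name registered in APPINSTANCE. */
final class InstanceNameCheck {

    /** Reads instance.name from bootstrap.properties content; returns "" when it is absent. */
    static String instanceName(String bootstrapPropertiesContent) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(bootstrapPropertiesContent)); // '#' lines are comments
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("instance.name", "").trim();
    }

    /** True when the configured name is non-empty and is one of the registered names. */
    static boolean isRegistered(String instanceName, Set<String> appInstanceNames) {
        return !instanceName.isEmpty() && appInstanceNames.contains(instanceName);
    }
}
```

Running the check on every cluster member, and confirming that no two members report the same name, covers both halves of the guideline.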

Other scenarios

Converting InfoSphere MDM Server from a stand-alone instance to a cluster

The scenario described here assumes that the cluster has been configured before you install InfoSphere MDM Server. This is the recommended approach. However, if you find that you need to redeploy InfoSphere MDM Server from a single server instance onto a cluster, we present an approach in Appendix 1 - Converting InfoSphere MDM Server from a stand-alone server to a cluster.

Configuring the IBM HTTP Servers to access InfoSphere MDM Server web services and web content

After installing MDMServer and the InfoSphere MDM Server web-based user interfaces, you can configure IBM HTTP Server. Recall from Figure 8 that IBM HTTP Server and the WebSphere Application Server plug-in are installed on two additional computers: Pike and Pickerel. Recall also from our earlier discussion that IBM HTTP Server uses a file named plugin-cfg.xml. That file contains a list of the URLs of all the web content and web services, and the associated host computers, to which the IBM HTTP Server will have access. You can view the contents of the plugin-cfg.xml file from the WebSphere Application Server administrative console by clicking Servers > Web Servers, and then clicking the web server that you configured. Then click the Plug-in Properties link. In the resulting pane, click the View button next to the text box containing “plugin-cfg.xml” to view the contents of the file. It will contain the URLs and port numbers of the Data Stewardship UI, the Business Administration UI, and the InfoSphere MDM Server Web Services endpoints.


Testing the installation for high availability

Overview

After the deployment and configuration of InfoSphere MDM Server is completed, it is a good practice to verify that failover and workload balancing are indeed functioning. InfoSphere MDM Server provides the Installation Verification Tool (IVT) for testing. It invokes simple transactions on InfoSphere MDM Server. You can also use the InfoSphere MDM Server Batch Processor as a test tool. It performs high volume operations that will enable you to confirm that WebSphere Application Server is correctly balancing the load across all the servers in the cluster. Finally, you have to test that you can access InfoSphere MDM Server web services and the InfoSphere MDM Server web applications (the Data Stewardship UI and the Business Administration UI). In a later section we describe how to write code to access InfoSphere MDM Server components. You can adapt the examples to write your own applications that test and stress test InfoSphere MDM Server on a cluster.

Using the Installation Verification Tool

This is a simple test. The Installation Verification Tool (IVT) invokes a single transaction on InfoSphere MDM Server. You can use the WebSphere Application Server Network Deployment administrative console to alternately shut down one of the servers on the MDM_Cluster. You can then use the IVT to execute InfoSphere MDM Server services, such as adding a party or an organization. If everything is configured and running correctly, InfoSphere MDM Server returns a “SUCCESS” status to the IVT. You can run the test repeatedly while alternating which servers have been shut down.

Using the Batch Processor

This is a more sophisticated test. You can use it to verify both fault tolerance and workload balancing for InfoSphere MDM Server operating on the MDM_Cluster.

The InfoSphere MDM Server Batch Processor is designed to load large volumes of data into InfoSphere MDM Server quickly. Consequently it is an excellent tool to verify the workload balancing and fault tolerance of an HA topology. It works by reading an input file consisting of MDM transactions formatted according to an MDM XML-based API, and executing each one sequentially. It can also be configured to invoke the MDM transactions in parallel. You can control the number of parallel threads by specifying the number of “submitters” in the Batch.properties file that is in the directory into which you installed the Batch Processor. For example, to set up the Batch Processor to invoke two streams, you can set the number of submitters to 2 as follows:

Submitter.number = 2

You can create an input file consisting of several hundred or several thousand “AddPerson” transactions. You can then start the Batch Processor to run the “AddPerson” transactions in your input file.

To verify that Workload Management is distributing the load to each cluster member, you can use a performance monitor such as Tivoli Performance Viewer to monitor the load on each cluster member. You should see each cluster member receive a load proportional to the weighting that you assigned to it when you created the cluster member.
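When you compare the figures in the performance monitor against your expectations, it can help to work out the expected proportions beforehand. The following Java sketch computes each member's expected share of requests as its weight divided by the sum of all weights. The weights shown are hypothetical, and the class and method names are ours. Observed figures may deviate from these ideal proportions, for example because of request affinity or a short test run.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ExpectedLoadShare {

    /** Returns each member's expected fraction of requests: weight / sum of all weights. */
    static Map<String, Double> expectedShares(Map<String, Integer> weights) {
        int total = 0;
        for (int weight : weights.values()) {
            if (weight < 0) {
                throw new IllegalArgumentException("weight must not be negative");
            }
            total += weight;
        }
        if (total == 0) {
            throw new IllegalArgumentException("at least one weight must be positive");
        }
        Map<String, Double> shares = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : weights.entrySet()) {
            shares.put(entry.getKey(), entry.getValue() / (double) total);
        }
        return shares;
    }

    public static void main(String[] args) {
        // Hypothetical weights for the three members of MDM_Cluster.
        Map<String, Integer> weights = new LinkedHashMap<>();
        weights.put("MDM_Server11", 2);
        weights.put("MDM_Server21", 1);
        weights.put("MDM_Server22", 1);
        expectedShares(weights).forEach((member, share) ->
                System.out.printf("%s: %.0f%% of requests expected%n", member, share * 100));
    }
}
```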

Alternatively, while the Batch Processor is running, you can shut down a cluster member, either gracefully or by pulling the plug on the actual computer that hosts the cluster member. You should expect to see all your “AddPerson” transactions succeed as the WebSphere Application Server Network Deployment HA capabilities fail over the transactions to the remaining cluster members.

Testing the web user interfaces

InfoSphere MDM Server includes two web-based user interfaces: the Data Stewardship UI and the Business Administration UI. The first test is quite straightforward. To test that IBM HTTP Server is effectively routing requests through to one of the cluster members, you can use your web browser and type in the address:

http://<Server DNS Name or IP Address>/CustomerDataStewardshipWeb/ to access the Data Stewardship UI, or

http://<Server DNS Name or IP Address>/CustomerBusinessAdminWeb/ to access the Business Administration UI

where <Server DNS Name or IP Address> is the virtual address of your IBM HTTP Server cluster. The default HTTP port number 80 is omitted. If you are not using the IBM HTTP Server and are accessing the web applications directly on the application servers, the <Server DNS Name or IP Address> will be the host name or IP address of any of the host computers on the cluster on which InfoSphere MDM Server UI applications are running. In that case, the default HTTP port does not apply and you have to include the WC_defaulthost port number in the URL for the application.
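The two addressing cases can be captured in a small helper. This is a minimal sketch: the class and method names are ours, Pike stands in for the virtual address of the IBM HTTP Server cluster, and port 9081 is only an assumed WC_defaulthost value for a UI cluster member. Use the port that your own configuration reports.

```java
final class MdmUiUrls {

    /**
     * Builds the URL of an MDM web application. Port 80 is the HTTP default
     * and is left out of the URL; any other port is written explicitly.
     */
    static String uiUrl(String host, int port, String contextRoot) {
        String authority = (port == 80) ? host : host + ":" + port;
        return "http://" + authority + "/" + contextRoot + "/";
    }

    public static void main(String[] args) {
        // Through IBM HTTP Server: default port, so none appears in the URL.
        System.out.println(uiUrl("Pike", 80, "CustomerDataStewardshipWeb"));
        // Directly on a cluster member: the WC_defaulthost port must be given
        // (9081 is an assumed value).
        System.out.println(uiUrl("Rockbass", 9081, "CustomerBusinessAdminWeb"));
    }
}
```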

Another straightforward test is to test the fault tolerance of the application. By using one of the two addresses shown above, you can open several tabs and browse through the various panes of the application UIs while shutting down one or more of the cluster members that contain Data Stewardship or the Business Administration applications. While you will lose any HTTP sessions you might have had with one of the servers that was shut down, you will be able to continue working because HTTP Server will redirect you to an alternate cluster member. You will likely have to log in again, and re-enter any data that was on your screen.

Testing the web services

A simple test to ensure that the web services are active and that HTTP Server is routing requests to them also involves using a web browser. Type the following address into the address bar of your web browser:

http://<Server DNS Name or IP Address>/PartyWS_HTTPRouter/services/PartyPort

If the web service is active and the IBM HTTP Server is successfully routing the requests to the cluster, the browser will display the following text:

{http://www.ibm.com/xmlns/prod/websphere/wcc/party/service}PartyPort Hi there, this is a Web service!
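If you want to repeat this check from a script rather than a browser, a plain HTTP GET is enough. The following is a minimal sketch, assuming Java 9 or later (for readAllBytes). The class and method names are ours, and the check simply looks for the banner text shown above in the response body.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

final class PartyServiceSmokeTest {

    /** True when the response body contains the banner text that the browser test looks for. */
    static boolean looksLikeServiceBanner(String body) {
        return body != null
                && body.contains("PartyPort")
                && body.contains("Hi there, this is a Web service!");
    }

    /** Fetches the address with a plain HTTP GET and returns the body as text. */
    static String fetch(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the PartyPort address of your IBM HTTP Server, as shown above.
        String body = fetch(args[0]);
        System.out.println(looksLikeServiceBanner(body)
                ? "web service reachable"
                : "unexpected response");
    }
}
```

Running the check repeatedly while you stop and start cluster members gives a quick indication of whether IBM HTTP Server keeps routing requests to the surviving members.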

Writing clients that access InfoSphere MDM Server components on a cluster

InfoSphere MDM Server is a Java EE application whose services can be accessed by using Enterprise JavaBeans (EJBs), web services, and web-based user interfaces. To access the InfoSphere MDM Server EJBs or web services, your application must be a Java EE application client or be running within the context of the WebSphere Application Server runtime environment. This is done most easily if you are using IBM Rational® Application Developer or Rational Software Architect.

Guideline:

To access InfoSphere MDM Server EJBs and web services, it is highly recommended that you use Rational Software Architect or Rational Application Developer. Your applications must be either a Java EE application client or a Java EE application that runs in a WebSphere Application Server application container. In both cases, it will be running in a WebSphere Application Server runtime environment. The underlying infrastructure for workload balancing and failover is implemented with the WebSphere Application Server runtime environment. If you do not use it, you will be reinventing the wheel, and it is a very big wheel. This section and the next are predicated on building your application as a Java EE application client or as a Java EE application.

Accessing the Enterprise JavaBeans

InfoSphere MDM Server is a headless application (that is, one with no primary user interface) that is integrated with an organization’s line of business applications around customers, products, and contracts. These applications integrate InfoSphere MDM Server into their processes by using the InfoSphere MDM Server API. The API is in the form of several hundred services that are accessed through a single InfoSphere MDM Server stateless session bean. The EJB through which all of InfoSphere MDM Server services are accessed is called the DWLServiceControllerEJB.

Let’s refer again to the cluster that we installed and deployed in the previous section. We show the illustration here again for convenience:

[Diagram: InfoSphere MDM Server three-tier architecture scenario on Rockbass, Sturgeon, Pike, and Pickerel. For simplicity, the diagram does not draw lines between AdminUI and InfoSphere MDM Server or show the WebSphere MQ server.]

Figure 13: Example deployment of InfoSphere MDM Server on four computers. Two computers run IBM HTTP Server and form the front end to InfoSphere MDM Server. Two computers run a cluster that runs InfoSphere MDM Server and the InfoSphere MDM Server UIs.

Recall that we created three clusters: MDM_Cluster into which we installed the InfoSphere MDM Server application, AdminUI_Cluster into which we installed the MDM Business Administration UI web application, and DataStewardUI_Cluster into which we installed the Data Stewardship web application. Each cluster spans two computers: Sturgeon and Rockbass. Each cluster has two cluster members on Sturgeon and one cluster member on Rockbass.

So where does one get the EJB in order to use it? After all, the EJB is deployed and running on two separate computers and in three separate cluster members. The following code snippet illustrates how you can access the home interface of the DWLServiceController EJB on a cluster that spans two computers and encompasses three cluster members:

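    // Imports needed by this snippet. DWLServiceControllerHome comes from the
    // InfoSphere MDM Server client libraries.
    import java.util.Hashtable;
    import javax.naming.Context;
    import javax.naming.InitialContext;
    import javax.naming.NamingException;
    import javax.rmi.PortableRemoteObject;

    public static DWLServiceControllerHome createDWLServiceController2() {
        // Cluster-qualified JNDI name of the DWLServiceController EJB.
        String ControllerJNDIName = "cell/clusters/MDM_Cluster/com/dwl/base/"
                + "requestHandler/beans/DWLServiceController";
        DWLServiceControllerHome controllerHome = null;
        Hashtable<String, String> env = new Hashtable<String, String>();
        env.put(Context.INITIAL_CONTEXT_FACTORY,
                "com.ibm.websphere.naming.WsnInitialContextFactory");
        // One bootstrap address (host:port) for each member of MDM_Cluster.
        env.put(Context.PROVIDER_URL,
                "corbaloc:iiop:Sturgeon:2811,:Sturgeon:2812,:Rockbass:2811");
        try {
            InitialContext initCtx = new InitialContext(env);
            Object obj = initCtx.lookup(ControllerJNDIName);
            Class theClass = DWLServiceControllerHome.class;
            controllerHome = (DWLServiceControllerHome)
                    PortableRemoteObject.narrow(obj, theClass);
            return controllerHome;
        } catch (NamingException e) {
            e.printStackTrace();
            return null;
        } catch (Exception e) {
            e.printStackTrace();
            return null;
        }
    }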

Guideline:

In this snippet and in subsequent snippets we use string literals in the program for instructional purposes only. We recommend that when you actually write your code, you externalize these strings into properties files and use string variables instead.
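One way to follow this guideline is to read the JNDI name and the provider URL from a properties file with java.util.Properties. The sketch below is only an illustration: the class name and the property keys (mdm.controller.jndiName and mdm.provider.url) are hypothetical, and the main method reads from an in-memory string where real code would open a file.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

final class MdmClientSettings {
    final String jndiName;
    final String providerUrl;

    private MdmClientSettings(Properties props) {
        // Both keys are hypothetical names chosen for this example.
        jndiName = require(props, "mdm.controller.jndiName");
        providerUrl = require(props, "mdm.provider.url");
    }

    /** Loads the settings from any character source in properties-file format. */
    static MdmClientSettings load(Reader source) throws IOException {
        Properties props = new Properties();
        props.load(source);
        return new MdmClientSettings(props);
    }

    private static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("missing property: " + key);
        }
        return value.trim();
    }

    public static void main(String[] args) throws IOException {
        String text =
                "mdm.controller.jndiName=cell/clusters/MDM_Cluster/com/dwl/base/requestHandler/beans/DWLServiceController\n"
              + "mdm.provider.url=corbaloc:iiop:Sturgeon:2811\n";
        MdmClientSettings settings = load(new StringReader(text));
        System.out.println(settings.jndiName);
        System.out.println(settings.providerUrl);
    }
}
```

With the strings externalized, adding a cluster member later means editing the properties file, not recompiling the client.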

The following examples show the two pieces of the snippet that interest us.

    String ControllerJNDIName = "cell/clusters/MDM_Cluster/com/dwl/base/"
            + "requestHandler/beans/DWLServiceController";

The actual JNDI name for the DWLServiceController is com/dwl/base/requestHandler/beans/DWLServiceController. You can still use that JNDI name if you plan to access the EJB directly from a single cluster member such as MDM_Server21 on Sturgeon. For example, the following code will do just that:

    ...
    String ControllerJNDIName = "com/dwl/base/requestHandler/beans/DWLServiceController";
    DWLServiceControllerHome controllerHome = null;
    Hashtable<String, String> env = new Hashtable<String, String>();
    env.put(Context.INITIAL_CONTEXT_FACTORY,
            "com.ibm.websphere.naming.WsnInitialContextFactory");
    env.put(Context.PROVIDER_URL, "corbaloc:iiop:Sturgeon:2811");
    ...

But what happens if that cluster member is not working for some reason? Your program will not be able to get the home interface of the bean, and there goes our high availability.

This brings us to the second interesting piece of code from our snippet:

    env.put(Context.PROVIDER_URL,
            "corbaloc:iiop:Sturgeon:2811,:Sturgeon:2812,:Rockbass:2811");

To use an EJB that is running remotely on another computer, you need the URI of the computer and bootstrap port number through which the application server can return the EJB interface. In this case, the code that we wrote specifies all the members of MDM_Cluster where the EJB is residing. The WebSphere Application Server runtime environment will pick one from the list and use it. If it cannot connect, it moves to a second one on the list, and so on. Of course if it cannot connect to any one on the list, then the entire system is down.
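Because the provider URL has to name every cluster member, it can be convenient to assemble it from a list of bootstrap addresses. The following is a minimal sketch under stated assumptions: the class and method names are ours, and every list entry is a complete host:port pair. The first address carries the explicit iiop: protocol prefix and the remaining addresses use the ":" shorthand that the corbaloc syntax allows for IIOP.

```java
import java.util.Arrays;
import java.util.List;

final class ProviderUrls {

    /** Joins host:port bootstrap addresses into a single corbaloc provider URL. */
    static String corbaloc(List<String> bootstrapAddresses) {
        if (bootstrapAddresses.isEmpty()) {
            throw new IllegalArgumentException("at least one bootstrap address is required");
        }
        StringBuilder url = new StringBuilder("corbaloc:");
        for (int i = 0; i < bootstrapAddresses.size(); i++) {
            // "iiop:" for the first address; ",:" separates and prefixes the rest.
            url.append(i == 0 ? "iiop:" : ",:").append(bootstrapAddresses.get(i));
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // The three members of MDM_Cluster: two on Sturgeon, one on Rockbass.
        System.out.println(corbaloc(
                Arrays.asList("Sturgeon:2811", "Sturgeon:2812", "Rockbass:2811")));
    }
}
```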

After you have the home interface of the EJB, you can use the EJB create method to get the actual EJB and carry on with your operations. All the rest is handled by the WebSphere Application Server runtime environment. You might think that if you got a home interface from a specific cluster member, then that is where all your future service requests will be directed. But that is not the case. The WebSphere Application Server runtime environment will actually distribute your requests to all the cluster members, depending on how much “weight” you assigned to each cluster member. The higher the weight assigned to a cluster member, the greater the proportion of requests that will be routed to that member. But, as we have said, all that is handled by the WebSphere Application Server runtime environment and there is not much that we need to worry about in our code.

Accessing web services and web applications

All InfoSphere MDM Server services can be accessed as web services in addition to EJB services. The same protocols used to access the UIs are used to access the web services. The WC_DefaultHost port number is used instead of the bootstrap port number. Also, because these web services use HTTP, the familiar HTTP syntax is used. For example, when accessing the Party Web Service, your code might look something like this:

    com.ibm.www.xmlns.prod.websphere.wcc.party.service.PartyService ps =
            new com.ibm.www.xmlns.prod.websphere.wcc.party.service.PartyServiceLocator();
    PartyService thePartyService = ps.getPartyPort(new java.net.URL(
            "http://rockbass:9081/PartyWS_HTTPRouter/services/PartyPort"));

The number 9081 is the WC_DefaultHost port number of the InfoSphere MDM Server application server. With this URL, however, your program is bound to MDM_Server11 running on Rockbass. If that server goes down, so does your program, unless it explicitly establishes a connection to another cluster member.
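To give an idea of what such explicit reconnection involves, the following sketch tries a list of endpoints in order and returns the first connection that succeeds. It is an illustration only: the class, the Connector interface, and the endpoint list are ours, and the Sturgeon port numbers are assumed values. In real code the connector would wrap the getPartyPort call shown above.

```java
import java.util.Arrays;
import java.util.List;

final class EndpointFailover {

    /** Hypothetical callback that opens a connection to one endpoint. */
    interface Connector<T> {
        T connect(String endpoint) throws Exception;
    }

    /** Tries each endpoint in order; rethrows the last failure if none succeeds. */
    static <T> T connectToFirstAvailable(List<String> endpoints, Connector<T> connector)
            throws Exception {
        Exception lastFailure = null;
        for (String endpoint : endpoints) {
            try {
                return connector.connect(endpoint);
            } catch (Exception e) {
                lastFailure = e; // remember the failure and try the next member
            }
        }
        if (lastFailure == null) {
            throw new IllegalArgumentException("no endpoints supplied");
        }
        throw lastFailure;
    }

    public static void main(String[] args) throws Exception {
        List<String> endpoints = Arrays.asList(
                "http://rockbass:9081/PartyWS_HTTPRouter/services/PartyPort",
                "http://sturgeon:9081/PartyWS_HTTPRouter/services/PartyPort",  // assumed port
                "http://sturgeon:9082/PartyWS_HTTPRouter/services/PartyPort"); // assumed port
        // Stand-in connector that just echoes the endpoint it was given.
        String chosen = connectToFirstAvailable(endpoints, endpoint -> endpoint);
        System.out.println("connected to " + chosen);
    }
}
```

Every client would have to carry and maintain such a list of cluster members, which is the burden that a routing tier in front of the cluster removes.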

IBM HTTP Server to the rescue

As with the web user interfaces, you can use IBM HTTP Server to route your web services requests to the cluster members.

Instead of accessing the InfoSphere MDM Server web services directly from the cluster members, you can point to the IBM HTTP Server as follows:

    com.ibm.www.xmlns.prod.websphere.wcc.party.service.PartyService ps =
            new com.ibm.www.xmlns.prod.websphere.wcc.party.service.PartyServiceLocator();
    PartyService thePartyService = ps.getPartyPort(new java.net.URL(
            "http://Pike/PartyWS_HTTPRouter/services/PartyPort"));

In this case the default port, port 80, is left off the URL. The IBM HTTP Server on Pike will receive the request and recognize the /PartyWS_HTTPRouter/services/PartyPort as being a web service provided within WebSphere Application Server. It will then route the request to one of the cluster members.

Building fault-tolerant applications that access MDM Services

Guideline:

Building robust applications with fault tolerance means knowing when your applications have attempted to invoke a transaction on a failed server and trying the transaction again. The WebSphere Application Server runtime environment will route your call to another server.

Enterprise JavaBeans

Whenever a cluster member experiences a fault and stops operating, the WebSphere Application Server runtime environment will automatically reroute all subsequent requests to the remaining cluster members. All the members of a cluster will send out “heartbeats” to the other members that indicate that they are operational. If a cluster member goes down, and it stops issuing its heartbeats, it might take a few seconds for WebSphere Application Server Network Deployment to detect the missing heartbeats. If your program happened to issue a service request that was routed by WebSphere Application Server Network Deployment to the faulty cluster member, you will likely get back an exception. Typically the exception is a MarshalException that contains something like this in the message:

    java.rmi.MarshalException: CORBA COMM_FAILURE 0x4942f306 Maybe; nested exception is:
        org.omg.CORBA.COMM_FAILURE: purge_calls:1866 Reason: CONN_ABORT (1), State: ABORT (5) vmcid: IBM minor code: 306 completed: Maybe

If your program makes a service request to InfoSphere MDM Server and receives such an exception, simply try it again. If the cause of the exception was the failure of a cluster member, subsequent attempts should be successful as the WebSphere Application Server runtime environment reroutes the RMI requests to the remaining cluster members. The following code snippet provides a simple illustration:

    // ControllerJNDIName, myHashMap, and theRequest are assumed to be defined
    // elsewhere in the class.
    public Serializable callMDM() throws Exception {
        try {
            DWLServiceController thisDWLServiceController;
            DWLServiceControllerHome controllerHome = null;
            Hashtable<String, String> env = new Hashtable<String, String>();
            env.put(Context.INITIAL_CONTEXT_FACTORY,
                    "com.ibm.websphere.naming.WsnInitialContextFactory");
            env.put(Context.PROVIDER_URL,
                    "corbaloc:iiop:Sturgeon:2811,:Sturgeon:2812,:Rockbass:2811");
            InitialContext initCtx = new InitialContext(env);
            Object obj = initCtx.lookup(ControllerJNDIName);
            Class theClass = DWLServiceControllerHome.class;
            controllerHome = (DWLServiceControllerHome)
                    PortableRemoteObject.narrow(obj, theClass);
            int maxTries = 3;
            int numTries = 0;
            Serializable theResponse;
            while (true) {
                try {
                    thisDWLServiceController = controllerHome.create();
                    theResponse = thisDWLServiceController
                            .processRequest(myHashMap, theRequest);
                    return theResponse;
                } catch (DWLResponseException DWLe) {
                    // This could be a business error identified by MDM Server.
                    throw DWLe;
                } catch (java.rmi.MarshalException e) {
                    // Possibly a failed cluster member: give up after maxTries
                    // attempts, otherwise pause briefly and try again.
                    numTries++;
                    if (numTries >= maxTries) throw e;
                    Thread.sleep(300);
                } catch (Exception e) {
                    // Something else is wrong.
                    throw e;
                }
            }
        } catch (Exception e) {
            // Either the retries were exhausted or there was an error getting
            // the initial context in the first place. Something could be wrong
            // that is not related to HA. Let the caller decide what to do.
            throw e;
        }
    }

Web services

The same approach applies to InfoSphere MDM Server web services. Instead of receiving a MarshalException, however, your program will likely receive a WebServicesFault. The same type of algorithm applies as the one shown in the preceding example. Your code could do something similar to the following code snippet.

    public PersonSearchResultsResponse searchPerson()
            throws javax.xml.rpc.ServiceException,
                   java.net.MalformedURLException,
                   com.ibm.ws.webservices.engine.WebServicesFault,
                   com.ibm.www.xmlns.prod.websphere.wcc.common.intf.schema.ProcessingException,
                   java.rmi.RemoteException {
        int numTries = 0;
        int maxTries = 3;
        boolean gotTheService = false;
        com.ibm.www.xmlns.prod.websphere.wcc.party.service.PartyService partyServiceLocator =
                new PartyServiceLocator();
        PartyService thePartyService = null;

        // Make up to maxTries attempts to acquire the web service.
        while (!gotTheService) {
            try {
                thePartyService = partyServiceLocator.getPartyPort(new java.net.URL(
                        "http://Pike/PartyWS_HTTPRouter/services/PartyPort"));
                gotTheService = true;
            } catch (javax.xml.rpc.ServiceException serviceException) {
                numTries++;
                if (numTries >= maxTries) throw serviceException;
            } catch (java.net.MalformedURLException malformedURLException) {
                throw malformedURLException;
            }
        }
        // If you made it to this point, you have successfully retrieved the
        // web service from the remote computer.

        PersonSearch ps = new PersonSearch();
        ps.setGivenNameOne("Fred");
        ps.setLastName("Flintstone");
        numTries = 0;
        PersonSearchResultsResponse response = null;
        Control control = new Control();
        control.setRequesterName("user1");
        control.setRequesterLanguage(100);
        control.setRequesterLocale("us-en");
        control.setRequestId(1234567);
        String[] roles = {"Superuser", "normal"};
        control.setRequesterRole(roles);
        control.setReturnAvailableResultCount(true);

        // Make up to maxTries attempts to run the search.
        while (numTries < maxTries) {
            try {
                response = thePartyService.searchPerson(control, ps);
                break; // success: stop retrying
            } catch (com.ibm.ws.webservices.engine.WebServicesFault e) {
                numTries++;
                if (numTries >= maxTries) throw e;
            } catch (ProcessingException e) {
                throw e;
            } catch (java.rmi.RemoteException e) {
                throw e;
            }
        }
        return response;
    }

This snippet makes three attempts to acquire the web service. If it cannot acquire the service it re-throws the exception it received during its attempt to acquire the service. After it has successfully acquired the service, it makes three attempts to execute the PersonSearch service. After the third attempt has failed, it simply re-throws the web services fault it received. Any other exceptions are thrown immediately.
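Both snippets repeat the same retry logic, so you might prefer to factor it into one helper. The following sketch is a suggestion only, not part of InfoSphere MDM Server: the Retry class and its call method are ours. It retries an operation only when it fails with the exception type that you designate as retryable (for example, java.rmi.MarshalException or WebServicesFault) and rethrows every other exception immediately.

```java
import java.util.concurrent.Callable;

final class Retry {

    /**
     * Runs the operation up to maxTries times. Only exceptions of the
     * retryable type trigger another attempt; anything else is rethrown at once.
     */
    static <T> T call(Callable<T> operation, Class<? extends Exception> retryable,
                      int maxTries, long pauseMillis) throws Exception {
        if (maxTries < 1) {
            throw new IllegalArgumentException("maxTries must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                if (!retryable.isInstance(e) || attempt >= maxTries) {
                    throw e; // not retryable, or no attempts left
                }
                Thread.sleep(pauseMillis); // give the runtime time to reroute
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated service call that fails twice before it succeeds.
        String result = call(() -> {
            if (++calls[0] < 3) {
                throw new java.rmi.MarshalException("simulated COMM_FAILURE");
            }
            return "SUCCESS";
        }, java.rmi.MarshalException.class, 3, 300L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Keeping the retry count and pause in one place also makes them easy to externalize into a properties file.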

Best Practices

• When deploying InfoSphere MDM Server for high availability, use a combination of vertical and horizontal clustering to achieve your business needs. The precise mix-and-match of horizontal and vertical clustering will require testing in your environment.

• Where possible, install InfoSphere MDM Server directly onto a cluster. Avoid the complexities of installing the application on a single WebSphere Application Server server, and then converting it to a cluster.

• Always use IBM HTTP Server in a clustered topology for web services and web applications.

• Separate the core server and the UI application components. Create more than one cluster, one for InfoSphere MDM Server and one for the business applications such as the Data Stewardship UI.

• Ensure that the provider URL for all client applications points to all the servers on the cluster for InfoSphere MDM Server, so that the failure of a single cluster member does not make InfoSphere MDM Server unavailable to the client.

• Ensure the uniqueness of primary keys generated for operational tables by making updates to the InfoSphere MDM Server Configuration and Management component and by making changes to the application properties files.

• Include fault tolerant logic in any client applications that you write that will invoke InfoSphere MDM Server services that are available in a clustered environment.

Conclusion

This document has shown how you can achieve high availability for InfoSphere MDM Server by installing it on a WebSphere Application Server Network Deployment cluster. It has also shown how you can write programs that use InfoSphere MDM Server services to achieve a high degree of fault tolerance. As we have discussed, fault tolerance and high availability go hand in hand. Finally, we discussed how you can configure the InfoSphere MDM Server components that are themselves clients for InfoSphere MDM Server services, so that they too can achieve increased fault tolerance and high availability.

Contributors

The authors would like to recognize the following individuals for their feedback on this document and their contributions to this topic:

Stephanie Hazlewood

Product Architect, Manager of AdTech, InfoSphere MDM Server

Alex Jia

Implementation Services Desk Technical Manager

Nigel Jones

InfoSphere Tools Architecture

Sriram Padmanabhan

Distinguished Engineer, Chief Architect InfoSphere Servers

John Thomas

WW MDM Server Competency Manager

Lena Woolf

Senior Product Architect, InfoSphere MDM Server

Paul van Run

Senior Technical Staff Member, Chief MDM Architect

Wei Zheng

Product Architect, InfoSphere MDM Server

Michelle Corbin

Information Architect, ID Team Lead, InfoSphere Information Server

Appendix 1 - Converting InfoSphere MDM Server from a stand-alone server to a cluster

Overview

Many organizations start small and grow as time goes on. Or they might have installed InfoSphere MDM Server and focused on its functionality and only later considered its high availability aspects. Consequently, an organization might have elected to install InfoSphere MDM Server on a stand-alone server. Later, the organization might decide to redeploy InfoSphere MDM Server onto a cluster. This section describes the process of doing so.

We will use the same configuration of computers that we used when describing the deployment of InfoSphere MDM Server directly onto a cluster. We include the diagram here for convenience:

[Diagram: InfoSphere MDM Server three-tier architecture scenario on Rockbass, Sturgeon, Pike, and Pickerel. For simplicity, the diagram does not draw lines between AdminUI and InfoSphere MDM Server or show the WebSphere MQ server.]

Figure 14: Example deployment of InfoSphere MDM Server on four computers. Two computers run IBM HTTP Server and form the front end to InfoSphere MDM Server. Two computers run a cluster that runs InfoSphere MDM Server and the InfoSphere MDM Server UIs.

Recall our setup: Rockbass and Sturgeon have WebSphere Application Server installed. Each will have an application server profile (a node) configured. Pike and Pickerel have IBM HTTP Server with the WebSphere Application Server plug-in installed. For details about configuring IBM HTTP Server in a cluster, see “Configuring the IBM HTTP Servers to access InfoSphere MDM Server web services and web content” in the main body of this document.

It is typical (and recommended) practice when employing WebSphere Application Server Network Deployment to install a WebSphere Application Server deployment manager profile alongside an application server profile (that is, a node). This first application server profile (node) will be federated into the Deployment Manager profile. To be federated means to be managed by the Deployment Manager instead of being self-managed (that is, stand-alone).

Initially we will have installed InfoSphere MDM Server onto a single computer. The first computer on which we installed InfoSphere MDM Server is Rockbass. Because it is the first computer, it will have the Deployment Manager profile and the first federated application server profile (that is, node). To install InfoSphere MDM Server onto our first computer, we do the following steps:

1. Ensure that the Deployment Manager installed on Rockbass is running.

2. Ensure that the application server profile on Rockbass has been federated and is being managed by the Deployment Manager.

3. Install InfoSphere MDM Server onto the node on Rockbass.

4. After you run the installation script, a new server (for example, MDM_Server11) will be created under node Rockbass. You can pre-create the application server for InfoSphere MDM Server without any resources defined or let the installer create the application server for you during the installation. A new application (for example, MDM-App) will be installed and mapped to the server MDM_Server11 in the node Rockbass.

Having installed and configured InfoSphere MDM Server onto a single computer, we now turn to redeploying it onto a cluster. We follow these steps after we acquire Sturgeon as our second computer:

1. Install WebSphere Application Server Network Deployment onto Sturgeon and configure a single application server profile on it.

2. From the Administration Console of our Deployment Manager, we “federate” the application server profile on Sturgeon into our cell. The application server profile on Sturgeon now becomes a node in the cell managed by the Deployment Manager.

3. Create the MDM Server cluster. Create a new cluster (for example, MDM_Cluster) by converting the existing server MDM_Server11 in node Rockbass into the first cluster member, using either the WebSphere Application Server Network Deployment administrative console or a wsadmin script. Also, select the server MDM_Server11 as the template server. Add one or more new cluster members into the cluster on other nodes in the cell (that is, horizontal clustering). A new server (for example, MDM_Server21) will be created in node Sturgeon and the MDM-App will be mapped to the cluster. You can create additional application servers (cluster members) in each node (vertical clustering).

4. After the cluster has been created, you can go to the Enterprise Application page in the WebSphere Application Server Network Deployment administrative console to verify that the MDM-App is now mapped to the cluster MDM_Cluster. (Before creating the cluster, the MDM-App was mapped to the MDM_Server11.) From that page you can also go to the application server and see to what Server/Cluster the MDM-App belongs.

5. If necessary, create additional clusters for other applications separately.

Postinstallation configuration

(For All IBM WCC/MDM Server versions)

Because we have cloned everything from the existing InfoSphere MDM Server template, all application properties will be identical across all the cluster members. We will need to change the following properties to reflect the actual name of cluster members and actual log file location. Or you can use a JVM variable to point to your designated log file location.

• MessageLog.properties

• JDKLog.properties

• armfile.properties

• log.properties

• Log4J.properties
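As an illustration of the per-member change, the following sketch rewrites a single log file location in the text of a properties file. It is a sketch under stated assumptions: the key log4j.appender.file.File, the appender name, and the paths are hypothetical, and the actual keys in the files listed above may differ in your installation. The helper set_property is also a hypothetical name.

```python
def set_property(text, key, value):
    """Return properties text with `key` set to `value`.

    Comment lines and all other entries are preserved. If the key is
    not present, it is appended at the end.
    """
    out = []
    found = False
    for line in text.splitlines():
        stripped = line.strip()
        name = stripped.split("=", 1)[0].strip()
        if not stripped.startswith("#") and name == key:
            out.append("%s=%s" % (key, value))
            found = True
        else:
            out.append(line)
    if not found:
        out.append("%s=%s" % (key, value))
    return "\n".join(out) + "\n"


# Hypothetical content cloned from the template server.
sample = ("# logging\n"
          "log4j.appender.file.File=/logs/MDM_Server11/mdm.log\n"
          "log4j.rootLogger=INFO, file\n")

# Point the new member on node Sturgeon at its own log file (hypothetical path).
updated = set_property(sample, "log4j.appender.file.File",
                       "/logs/Sturgeon_MDM_Server11/mdm.log")
print(updated)
```

Applying the same function to each cluster member's copy of the file, with a member-specific path, keeps the members from writing to the same log file.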

If you are running a version of the IBM MDM/WCC product earlier than Version 8, you need to update the instancePKIdentifier property in the following files to differentiate each cluster member. For example, for cluster member one, set instancePKIdentifier to 01; for cluster member two, set it to 02; and so on. This step is not required in InfoSphere MDM Server, Version 8 or later, because instancePKIdentifier has been migrated into the Configuration Management tables (APPINSTANCE and CONFIGELEMENT).

• DWLCommon_extension.properties

• tcrm_extension.properties
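The numbering scheme for instancePKIdentifier can be sketched as follows. This is only an illustration: the member labels and the function name assign_instance_ids are hypothetical, and the limit of 99 members is an assumption based on the two-digit values (01, 02) used in the example above.

```python
def assign_instance_ids(members):
    """Map each cluster member to a unique two-digit identifier: 01, 02, ..."""
    # Assumption: identifiers stay two digits wide, as in the 01/02 example.
    if len(members) > 99:
        raise ValueError("two-digit identifiers cover at most 99 members")
    return dict((m, "%02d" % (i + 1)) for i, m in enumerate(members))


# Hypothetical node/server labels for the two cluster members.
ids = assign_instance_ids(["Rockbass/MDM_Server11", "Sturgeon/MDM_Server11"])
for member in sorted(ids):
    print("%s -> instancePKIdentifier=%s" % (member, ids[member]))
```

The point of the sketch is that each member gets a distinct value and that no value is reused when members are added later.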

If you are running IBM InfoSphere MDM Server, Version 8 or later, refer to “Postinstallation configuration steps” in this document.

Follow the same steps outlined above for deploying the MDM UI applications (for example, the Data Stewardship UI Application and the Business Administration Console UI Applications).

The approach described here has advantages and disadvantages. It is easy to set up an InfoSphere MDM Server cluster. You can start with a single InfoSphere MDM Server instance and you do not have to worry about HA at the beginning of the project. You can add any number of new cluster members whenever you are ready and resources are available. However, all WebSphere Application Server resources employed by InfoSphere MDM Server (such as JMS and JDBC resources) will have been defined at the application server level (that is, their scope is the individual application server). Those resources will be replicated for each member that you add to the cluster. You can remove the redundant resource definitions and redefine them at the cluster level. Further, if you are using WebSphere MQ as the InfoSphere MDM Server messaging provider, you might want to have another MQ queue manager for each additional cluster member. In this case, you will need to update the configuration from the WebSphere Application Server Network Deployment administrative console for each additional cluster member.

If you plan to use the WebSphere Application Server embedded messaging provider for InfoSphere MDM Server, after you create the cluster you have to remove the original SIB and messaging engine definitions and re-create them at the cluster level. This is because the InfoSphere MDM Server installer by default creates the SIB buses at the application server level and creates a single UUID, stored in the SIB tables, for the messaging engine. The new cluster members will not be able to share this SIB bus and messaging engine.

Appendix 2 – Further reading

WebSphere Application Server network deployment high availability

Establishing highly available services for applications

http://publib.boulder.ibm.com/infocenter/wasinfo/v7r0/topic/com.ibm.websphere.nd.multiplatform.doc/info/ae/ae/welc6tophighavail.html

The WebSphere Contrarian: Run-time management high availability options, redux

http://www.ibm.com/developerworks/websphere/techjournal/1001_webcon/1001_webcon.html

The WebSphere Contrarian: A better Web application configuration for high availability

http://www.ibm.com/developerworks/websphere/techjournal/0802_webcon/0802_webcon.html

The WebSphere Contrarian: High availability (again) versus continuous availability

http://www.ibm.com/developerworks/websphere/techjournal/1004_webcon/1004_webcon.html

DB2 high availability

High Availability with DB2 Database products

http://publib.boulder.ibm.com/infocenter/db2luw/v9r5/topic/com.ibm.db2.luw.admin.ha.doc/doc/c0051346.html

High Availability and Disaster Recovery options for DB2 on Linux, UNIX and Windows

http://www.redbooks.ibm.com/redbooks/pdfs/sg247363.pdf

Implementing High Availability with DB2 9.5

http://www.ibm.com/developerworks/data/library/techarticle/dm-0807wright/index.html

Building a high availability database environment using WebSphere middleware, Part 1

http://www.ibm.com/developerworks/websphere/techjournal/0705_lee/0705_lee.html?S_TACT=105AGX01&S_CMP=LP

Building a high availability database environment using WebSphere middleware, Part 2

http://www.ibm.com/developerworks/websphere/techjournal/0706_banerjee/0706_banerjee.html

Building a high availability database environment using WebSphere middleware, Part 3

http://www.ibm.com/developerworks/websphere/techjournal/0710_barghouthi/0710_barghouthi.html

Configuring client reroute for applications that use DB2 database

http://publib.boulder.ibm.com/infocenter/wasinfo/v7r0/index.jsp?topic=/com.ibm.websphere.express.doc/info/exp/ae/tdat_clientreroute.html

WebSphere Application Server Data Source Properties

http://publib.boulder.ibm.com/infocenter/wasinfo/v7r0/index.jsp?topic=/com.ibm.websphere.express.doc/info/exp/ae/udat_jdbcdatasorprops.html

Best Practices: DB2 High Availability Disaster Recovery

http://www.ibm.com/developerworks/data/bestpractices/hadr/

HADR Simulator

http://www.ibm.com/developerworks/wikis/display/data/HADR_sim

Compare the distributed DB2 9.5 Data Servers

http://www.ibm.com/developerworks/data/library/techarticle/0301zikopoulos/0301zikopoulos1.html

Automated Cluster Controlled HADR Configuration Setup using the IBM DB2 High Availability Instance Configuration Utility (db2haicu)

ftp://ftp.software.ibm.com/software/data/pubs/papers/HADR_db2haicu.pdf

IBM DB2 V8.2 Automatic Client Reroute Facility

http://www.ibm.com/developerworks/data/library/techarticle/dm-0512zikopoulos/

Automatic Client Reroute limitations

http://publib.boulder.ibm.com/infocenter/db2luw/v9r5/topic/com.ibm.db2.luw.admin.ha.doc/doc/c0011977.html

Configuring client reroute for applications that use DB2 databases

http://publib.boulder.ibm.com/infocenter/wasinfo/v7r0/index.jsp?topic=/com.ibm.websphere.express.doc/info/exp/ae/tdat_clientreroute.html

Java client support for high availability on IBM data servers

http://publib.boulder.ibm.com/infocenter/dzichelp/v2r2/index.jsp?topic=/com.ibm.db29.doc.dshare/db2z_dshare.htm

Automating IBM DB2 UDB HADR with HACMP

ftp://ftp.software.ibm.com/software/data/pubs/papers/hadr-hacmp.pdf

Automating DB2 HADR failover on Linux using Tivoli System Automation on Multiplatforms

ftp://ftp.software.ibm.com/software/data/pubs/papers/hadr_tsa.pdf

Availability with a click of a button: DB2 HADR

http://www.dbazine.com/db2/db2-disarticles/zikopoulos19

Restrictions for HADR DB2 9.7

http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/topic/com.ibm.db2.luw.admin.ha.doc/doc/c0011760.html

DB2 LUW v9.8 (pureScale)

Transparent Application Scaling with DB2 pureScale

http://download.boulder.ibm.com/ibmdl/pub/software/data/sw-library/db2/papers/db2-pure-scale-wp.pdf

Information center (internal access only)

http://db2id.torolab.ibm.com/v98/CoralGA/index.jsp

What is DB2 pureScale

http://www.ibm.com/developerworks/data/library/dmmag/DBMag_2010_Issue1/DBMag_Issue109_pureScale/index.html

DB2 Chat with the Lab: pureScale Technology Preview

http://www.channeldb2.com/events/db2-purescale-scaling

Oracle 11g

Technical Comparison of DB2 HADR and Oracle Data Guard

ftp://ftp.software.ibm.com/software/data/pubs/papers/hadr-comp.pdf

Oracle 11g High Availability Overview

http://www.filibeto.org/sun/lib/nonsun/oracle/11.1.0.6.0/B28359_01/server.111/b28281.pdf

Oracle Database Concepts

http://download.oracle.com/docs/cd/B19306_01/server.102/b14220/toc.htm

Oracle Maximum Availability Architecture

http://www.oracle.com/technology/deploy/availability/htdocs/maa.htm

Oracle Real Application Clusters 11g

http://www.oracle.com/technology/products/database/clustering/pdf/twp_rac11g.pdf

Oracle Real Application Cluster Installation Guide

http://www.filibeto.org/sun/lib/nonsun/oracle/11.1.0.6.0/B28359_01/install.111/b28264.pdf

Oracle High Availability Documentation

http://download.oracle.com/docs/cd/E11882_01/server.112/e10804/overview.htm#i1006492

Oracle 11g Administrators Guide

http://download.oracle.com/docs/cd/B28359_01/server.111/b28310.pdf

Oracle High Availability Case Studies

http://www.oracle.com/technology/deploy/availability/htdocs/HA_CaseStudies.html

DB2 z/OS 9

DB2 9 for z/OS: Distributed Functions

http://www.redbooks.ibm.com/redbooks/pdfs/sg246952.pdf

DB2 9 for z/OS Data Sharing: Distributed Load Balancing and Fault Tolerant Configuration

http://www.redbooks.ibm.com/redpapers/pdfs/redp4449.pdf

Parallel Sysplex Clustering Technique

http://publib.boulder.ibm.com/infocenter/zos/basics/topic/com.ibm.zos.zmainframe/zconc_clusterPlSys.htm?resultof=%22%73%79%73%70%6c%65%78%22%20

How to Set Up Application Server to Access DB2 z/OS with High Availability

http://www-01.ibm.com/software/os/systemz/telecon/oct6/prz/

DB2 z/OS Information Center

http://publib.boulder.ibm.com/infocenter/dzichelp/v2r2/index.jsp?topic=/com.ibm.db29.doc/db2prodhome.htm

DB2 for z/OS Sysplex Workload Balancing in Java Applications

http://publib.boulder.ibm.com/infocenter/dzichelp/v2r2/topic/com.ibm.db29.doc.java/com.ibm.db2.luw.apdv.java.doc/doc/c0056067.htm

DB2 z/OS 8

DB2 UDB for z/OS: Application Design for High Performance and Availability

http://www.redbooks.ibm.com/redbooks/pdfs/sg247134.pdf

Notices

This information was developed for products and services offered in Canada.

IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to:

IBM Director of Licensing IBM Corporation North Castle Drive Armonk, NY 10504-1785 U.S.A.

For license inquiries regarding double-byte (DBCS) information, contact the IBM Intellectual Property Department in your country/region or send inquiries, in writing, to:

Intellectual Property Licensing Legal and Intellectual Property Law IBM Japan Ltd. 1623-14, Shimotsuruma, Yamato-shi Kanagawa 242-8502 Japan

The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

This document may provide links or references to non-IBM Web sites and resources. IBM makes no representations, warranties, or other commitments whatsoever about any non-IBM Web sites or third-party resources that may be referenced, accessible from, or linked from this document. A link to a non-IBM Web site does not mean that IBM endorses the content or use of such Web site or its owner. In addition, IBM is not a party to or responsible for any transactions you may enter into with third parties, even if you learn of such parties (or use a link to such parties) from an IBM site. Accordingly, you acknowledge and agree that IBM is not responsible for the availability of such external sites or resources, and is not responsible or liable for any content, services, products, or other materials on or available from those sites or resources. Any software provided by third parties is subject to the terms and conditions of the license that accompanies that software.

IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created programs and other programs (including this one) and (ii) the mutual use of the information that has been exchanged, should contact:

IBM Canada Limited Office of the Lab Director 8200 Warden Avenue Markham, Ontario L6G 1C7 CANADA

Such information may be available, subject to appropriate terms and conditions, including in some cases payment of a fee.

The licensed program described in this document and all licensed material available for it are provided by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement, or any equivalent agreement between us.

Any performance data contained herein was determined in a controlled environment. Therefore, the results obtained in other operating environments may vary significantly. Some measurements may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual results may vary. Users of this document should verify the applicable data for their specific environment.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

All statements regarding IBM's future direction or intent are subject to change or withdrawal without notice, and represent goals and objectives only.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

COPYRIGHT LICENSE:

This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.

Each copy or any portion of these sample programs or any derivative work must include a copyright notice as follows:

© (your company name) (year). Portions of this code are derived from IBM Corp. Sample Programs. © Copyright IBM Corp. _enter the year or years_. All rights reserved.

If you are viewing this information softcopy, the photographs and color illustrations may not appear.

Trademarks

IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml.

Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.

Windows is a trademark of Microsoft Corporation in the United States, other countries, or both.

UNIX is a registered trademark of The Open Group in the United States and other countries.

Java and all Java-based trademarks and logos are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.

Other company, product, or service names may be trademarks or service marks of others.