WHITE PAPER | June 2010

Oracle Coherence cache for a high performance eCommerce Application


© 2010, HCL Technologies. Reproduction Prohibited. This document is protected under Copyright by the Author, all rights reserved.

TABLE OF CONTENTS

Abstract
Abbreviations
Challenges in E-commerce Application
Caching solution – which one to use?
Coherence
Cache Topologies
Coherence Cache Settings
Coherence Extend
Coherence Configuration
Best Practices
Common Issues
Conclusion
Reference


Abstract

This document discusses how an in-memory data grid based on Oracle Coherence can be used to implement a high-performing, highly scalable e-commerce application. It provides the basic knowledge needed to understand, plan and set up Coherence data grids, and touches on a few critical implementation tasks such as sizing the cache and configuring the relevant cache parameters. Finally, it covers typical issues that can be encountered during implementation and suggested resolutions for them.

Abbreviations

This section lists abbreviations required to accurately interpret the details in this document.

ORM – Object Relational Mapping

JDBC – Java Database Connectivity

JVM – Java Virtual Machine

GC – Garbage Collection

JMX – Java Management Extensions


Challenges in E-commerce Application

A typical e-commerce application has very demanding performance and scalability requirements, such as a page response time of under 3 seconds, 120K customer sessions per hour and 100K orders per day.

E-commerce applications are commonly built on ORM technologies. Compared with pure JDBC, ORM results in multiple select queries. Combined with high concurrent load, frequent database access for mostly read-only data, high session memory that increases garbage collection frequency, and non-linear growth of mostly read-only data over time, this degrades application performance. This is the major problem domain for the e-commerce application.

Mostly read-only data plays a major role in e-commerce application performance. Choosing the number of JVMs and caching read-only data reduce database access bottlenecks and their performance impact. By default, ORM supports orthogonal caching, and each caching technique has pros and cons. With OpenJPA, for example, an extremely high frequency of data reads causes thread contention, and different object instances can be returned even for the same entity. Application caching means that the application layer contains explicit logic for retrieving data from the cache or populating it; a content cache is an example of application caching.

Caching more data in application memory increases GC activity and leads directly to performance degradation. Distributed caching is the best option for a large amount of cached data. Avoiding unnecessary caching, optimizing the heap size and choosing a suitable garbage collector reduce GC frequency.

The size of the read-only data varies with the number of countries, languages, vendors and sites used in the application. The example below shows the predicted growth of product content data in an e-commerce application.

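[Figure: chart with exponential trend line. Categories: Lang Level; Site Level (One Lang); Site & Lang; Site, Country & Lang; Site, Country, Lang, Vendor. Data labels (in extraction order): 104, 161, 2496, 3744, 322. Trend line: y = 28.179e^(0.9907x), R² = 0.9319.]

Example of predicted growth of mostly read only data of e-commerce application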
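As a rough cross-check, the trend line quoted with the figure, y = 28.179e^(0.9907x), can be evaluated directly. The sketch below assumes that x is the 1-based position of each category on the horizontal axis, which the figure does not state explicitly; the class and method names are illustrative only.

```java
/** Evaluates the exponential trend line quoted with the growth figure (illustrative only). */
final class GrowthTrendSketch {
    // Coefficients copied from the trend line y = 28.179e^(0.9907x).
    static final double SCALE = 28.179;
    static final double RATE = 0.9907;

    /** Predicted data volume at position x (assumption: x = 1 for the first category). */
    static double predict(double x) {
        return SCALE * Math.exp(RATE * x);
    }

    public static void main(String[] args) {
        for (int x = 1; x <= 5; x++) {
            System.out.printf("x=%d -> %.1f%n", x, predict(x));
        }
    }
}
```

Because the rate is close to 1, each additional dimension (site, country, language, vendor) multiplies the predicted volume by roughly e^0.9907, or about 2.7.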


Caching solution – which one to use?

Many caching solutions are available for Java web applications. A suitable one should be chosen based on the performance and scalability requirements of the web application. Some of the available caching solutions are as follows:

a) Java Caching System

JCS is an open source distributed caching system written in Java for web applications. It is designed to speed up dynamic web applications by providing a means to manage cached data of various dynamic natures. Like any caching system, JCS is most useful for high read, low put applications.

b) WhirlyCache

Whirlycache is a fast, configurable in-memory object cache for Java. It is open source software and can be used to increase the performance of a website or an application by caching objects.

c) Jofti

Jofti is a simple-to-use, fast object indexing and searching solution for objects in a caching layer or storage structure that supports the Map interface.

d) EHCache

A replacement for JCS, EHCache started out as some patches for JCS to correct threading and memory leak problems. EHCache is faster than JCS and acts as a pluggable cache for Hibernate 2.1.

e) JBossCache

JBoss Cache is a product designed to cache frequently accessed Java objects in order to dramatically improve the performance of e-business applications. By avoiding unnecessary database access, JBoss Cache decreases network traffic and increases the scalability of applications.

f) OSCache

OSCache is a widely used, high performance J2EE caching solution. Its features include fast in-memory caching, persistent caching, clustering support, a flexible caching system, a comprehensive API, exception handling, cache flushing, portable caching and i18n awareness.

g) OpenJPA

ORM products also have built-in caching solutions; OpenJPA and Hibernate are good examples. Apache OpenJPA is a Java persistence framework that can be used as a stand-alone POJO persistence layer or integrated into any Java EE compliant container and many other lightweight frameworks.


h) Hibernate

Hibernate is a powerful, ultra-high performance object/relational persistence and query service for Java. Hibernate lets the developer develop persistent objects following common Java idiom - including association, inheritance, polymorphism, composition and the Java collections framework. Hibernate is now the most popular ORM solution for Java.

Coherence

What is Coherence?

Coherence provides caching services that deliver performance, reliability and scalability, with data held in either a distributed or a replicated manner. Its caching grid is highly available and extensible, so capacity can be added linearly as demand increases. It also runs outside of any application in its own JVM, which removes the need to tie it to any particular application.

Coherence supports unicast and multicast UDP clustering protocols. New servers can be added dynamically to an existing Coherence grid. If a server is disconnected from the network, its data is automatically redistributed from the other clustered nodes. Coherence provides continuous availability for application data and processing, and provides standard JMX APIs.

Why Coherence?

For a typical e-commerce application with high performance & high scalability requirements, the following are the features which should be considered in a caching solution:

A true In-Memory Distributed Data-Grid (not just the notion of distributed cache which is either based on replication or invalidation in distributed/clustered environments)

A number of topologies (Distributed, Replicated, Distributed-Near & Local)

Write-Behind Caches (this feature will be used for synchronizing caches between datacenters)

Dynamic Discovery and Automatic-Failover of Clustered Node in the Data-Grid

JTA Transactional support where the Cache acts as Resource

Reliable peer-to-peer communication using a proprietary protocol over UDP and proprietary object serialization, making the communication extremely fast


Comprehensive Monitoring & Management

Cache partitioning

Most open-source and commercial caching solutions nowadays provide the above set of features, which makes it difficult to choose one over another. Oracle Coherence appears to be the most mature and reliable of them, and it is the caching/data-grid solution considered in the rest of this document.

Cache Topologies

Coherence supports three types of caches: Distributed, Near and Replicated. In a distributed cache, each node in the cluster holds a unique set of the application data. To scale the capacity of the cache, add nodes to the cluster. Read and write access to cached application data involves serialization/deserialization and network transfers. A distributed cache is the best fit when the application reads and writes a heavy volume of application data.

In a near cache, each client node holds a small amount of data in a local cache while the larger data set resides in the distributed cache, and the two caches are kept synchronized with each other. There is some overhead involved in synchronizing the caches.

In a replicated cache, each node in the cluster holds all of the application data. A replicated cache is the best fit when the application has a small amount of data that is read heavily.
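The difference in storage behaviour between the distributed and replicated topologies can be sketched in plain Java. This is a toy model, not Coherence code: it assumes a simple hash-modulo owner per key, whereas Coherence itself assigns keys to partitions and keeps backups, and all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of where entries are stored in each topology (not Coherence code). */
final class TopologySketch {
    private final List<Map<String, String>> nodes = new ArrayList<>();
    private final boolean replicated;

    TopologySketch(int nodeCount, boolean replicated) {
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
        }
        this.replicated = replicated;
    }

    /** Replicated: every node stores the entry. Distributed: only the owning node does. */
    void put(String key, String value) {
        if (replicated) {
            for (Map<String, String> node : nodes) {
                node.put(key, value);
            }
        } else {
            nodes.get(ownerOf(key)).put(key, value);
        }
    }

    /** Assumption: the owner is chosen by hash modulo the node count. */
    int ownerOf(String key) {
        return Math.floorMod(key.hashCode(), nodes.size());
    }

    int entriesOn(int nodeIndex) {
        return nodes.get(nodeIndex).size();
    }

    /** Total number of stored copies across the cluster. */
    int totalStoredCopies() {
        int total = 0;
        for (Map<String, String> node : nodes) {
            total += node.size();
        }
        return total;
    }
}
```

With four nodes and 100 keys, the distributed model stores 100 copies in total, so adding nodes adds capacity, while the replicated model stores 400 copies and every node must be able to hold the full data set.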

Coherence Cache Settings

The cache settings to decide are the number of nodes, the number of JVMs per node, the JVM heap size, the type of serialization, the cache size, the cache invalidation strategy and the cache eviction policy.

Sizing

When trying to determine how many systems and how many JVMs per system are required, a good starting point is to determine how much data needs to be cached in the application. Once this is calculated, all other factors can be decided.

First find the cached object size by performing the steps below:

Set the unit calculator to BINARY in the local scheme of the cache configuration file:

<unit-calculator>BINARY</unit-calculator>


Add JMX monitoring settings to the Coherence grid server JVMs:

-Dcom.sun.management.jmxremote
-Dtangosol.coherence.management.remote=true
-Dtangosol.coherence.management=all
-Dcom.sun.management.jmxremote.port=<JMX port>
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false

Once the application is started, connect to the Coherence grid server JVM using JConsole and, in the MBeans tab, find the size and count of the objects cached on the grid server. The size of a single cached object can be calculated from this information.

Based on the application requirements, calculate the number of objects to be stored in the Coherence grid server cache. The number of objects and the size of a single cached object determine the JVM heap size and the number of grid server nodes.

To maintain a highly available cluster, sufficient cache servers should be deployed in the cluster. As a general rule, provide one additional grid server so that the application keeps working well if a grid server fails.
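The sizing steps above amount to simple arithmetic, sketched below. The usable heap fraction and all class, method and parameter names are assumptions made for illustration; the backup count mirrors the backup-count cache setting, and the single spare server follows the rule of thumb above.

```java
/** Back-of-the-envelope sizing arithmetic for a cache grid (illustrative only). */
final class GridSizingSketch {

    /**
     * Average size of one cached object, from the total units (bytes when the
     * BINARY unit calculator is used) and the object count read in JConsole.
     */
    static long averageEntryBytes(long totalUnitsBytes, long entryCount) {
        if (entryCount <= 0) {
            throw new IllegalArgumentException("entryCount must be positive");
        }
        return Math.round((double) totalUnitsBytes / entryCount);
    }

    /**
     * Number of grid server JVMs needed to hold the planned objects plus their
     * backups, with one additional server to survive a single failure.
     * usableHeapFraction is an assumed share of the heap left for cache data.
     */
    static int requiredServers(long plannedEntries, long entryBytes, int backupCount,
                               long heapBytesPerJvm, double usableHeapFraction) {
        double dataBytes = (double) plannedEntries * entryBytes * (1 + backupCount);
        double usableBytesPerJvm = heapBytesPerJvm * usableHeapFraction;
        int serversForData = (int) Math.ceil(dataBytes / usableBytesPerJvm);
        return serversForData + 1; // one spare grid server
    }
}
```

For example, 40,960,000 bytes across 10,000 measured objects gives 4,096 bytes per object. One million such objects with one backup, on 2 GB heaps of which half is assumed usable, need 8 servers for the data plus 1 spare.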

Serialization

The network is an important factor in Coherence, since a lot of data is transferred over the wire and all the clustered instances communicate with each other. Efficient serialization of the application objects in the cache therefore gives better performance. The POF (Portable Object Format) serializer is the best choice, as it takes less serialization/deserialization time. Application objects should implement the PortableObject interface, and the PortableObjectSerializer implementation is used as the PofSerializer.
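To illustrate why a compact, field-ordered format is cheaper on the wire than default Java serialization, the following sketch compares the two using only JDK classes. It does not use POF itself; ContentSample and its methods are hypothetical stand-ins, and the compact encoding only imitates the idea of writing field values in a fixed order.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

/** Hypothetical cached value used only for this size comparison. */
final class ContentSample implements Serializable {
    private static final long serialVersionUID = 1L;
    final long uidPk;
    final String value;

    ContentSample(long uidPk, String value) {
        this.uidPk = uidPk;
        this.value = value;
    }

    /** Default Java serialization: the stream carries class metadata as well as the fields. */
    static byte[] javaSerialize(ContentSample sample) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(sample);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Compact encoding: only the field values, written in a fixed order. */
    static byte[] compactSerialize(ContentSample sample) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(sample.uidPk);
            out.writeUTF(sample.value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the fields back in exactly the order in which they were written. */
    static ContentSample compactDeserialize(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            long uidPk = in.readLong();
            String value = in.readUTF();
            return new ContentSample(uidPk, value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For a small object the compact form is only the 8-byte long plus the length-prefixed string, while the default form also embeds the class description, so the gap matters most when many small objects cross the network.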

Cache Invalidation Strategy

Invalidating objects keeps the local cache in sync with the grid server cache. Coherence has pre-defined events for this purpose: depending on the invalidation strategy, the local cache is updated from the back cache. The invalidation strategies are Listen None, Listen Present, Listen All and Listen Auto.

With the Listen None strategy, the local cache does not listen for invalidation events. Data freshness in the cache is maintained only through the eviction policy.

With the Listen Present strategy, the local cache listens to the back cache only for the objects that are currently in the local cache.

With the Listen All strategy, the local cache listens to the back cache for all objects.

With the Listen Auto strategy, the local cache automatically uses Listen Present or Listen All based on the cache statistics. This is the default invalidation strategy.
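The effect of the first three strategies can be sketched with a simplified near-cache model in plain Java. This is not the Coherence implementation: the front and back caches are ordinary maps, remoteUpdate() stands in for another cluster member changing the back cache, Listen Auto is not modelled, and all names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified near-cache model: a local front map over a back map (illustrative only). */
final class NearCacheSketch {
    enum Strategy { NONE, PRESENT, ALL }

    private final Map<String, String> front = new HashMap<>();
    private final Map<String, String> back;
    private final Strategy strategy;
    private int eventsReceived;

    NearCacheSketch(Map<String, String> back, Strategy strategy) {
        this.back = back;
        this.strategy = strategy;
    }

    /** Serves from the front map, otherwise fetches from the back map and keeps a local copy. */
    String get(String key) {
        String value = front.get(key);
        if (value == null) {
            value = back.get(key);
            if (value != null) {
                front.put(key, value);
            }
        }
        return value;
    }

    /** Simulates another member changing an entry in the back cache. */
    void remoteUpdate(String key, String newValue) {
        back.put(key, newValue);
        boolean listening = strategy == Strategy.ALL
                || (strategy == Strategy.PRESENT && front.containsKey(key));
        if (listening) {
            eventsReceived++;
            front.remove(key); // drop the stale local copy; the next get() reloads it
        }
    }

    int eventsReceived() {
        return eventsReceived;
    }
}
```

In this model NONE keeps serving the stale local value, PRESENT receives events only for keys it already holds, and ALL receives an event for every change, which is the trade-off between freshness and event traffic described above.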


Cache Eviction Policy

Eviction is mainly used to free memory in the cache by removing unused objects. The eviction policies are LRU, LFU and HYBRID.

LRU is the least recently used eviction policy: it evicts first the entries that have gone longest without being accessed, so the cache keeps the most recently used objects.

LFU is the least frequently used eviction policy that chooses which entries to evict based on how often they are being accessed, evicting those that are accessed least first.

HYBRID takes into account both how often and how recently the objects were accessed, evicting first those that are accessed least frequently and were not accessed for the longest period. This is the default eviction policy.
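The LRU policy can be sketched with the JDK's LinkedHashMap in access order. This is only an illustration of the policy, not how Coherence implements it; the class name is hypothetical and highUnits here simply counts entries.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal LRU cache built on LinkedHashMap's access ordering (not Coherence code). */
final class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int highUnits;

    LruCacheSketch(int highUnits) {
        super(16, 0.75f, true); // true = order entries by access, least recently used first
        this.highUnits = highUnits;
    }

    /** Called by LinkedHashMap after each insertion; evicts once the limit is exceeded. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > highUnits;
    }
}
```

With a limit of two entries, putting "a" and "b", reading "a" and then putting "c" evicts "b", because "b" is the entry that was used least recently.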

TCMP Protocol

Coherence uses TCMP, a clustered IP-based protocol, for server discovery, cluster management, service provisioning and data transmission. To ensure true scalability, the TCMP protocol is completely asynchronous, meaning that communication is never blocked, even when many threads on a server are communicating at the same time. Further, the asynchronous nature also means that the latency of the network does not affect the cluster throughput, although it will affect the speed of certain operations. TCMP uses a combination of UDP/IP multicast, UDP/IP unicast and TCP/IP as follows:

a) Multicast

Cluster discovery: Is there a cluster already running that a new member can join?

Cluster heartbeat: The most senior member in the cluster issues a periodic heartbeat via multicast; the rate is configurable and defaults to once per second.

Message delivery: Messages that need to be delivered to multiple cluster members will often be sent via multicast, instead of unicasting the message one time to each member.

b) Unicast

Direct member-to-member (“point-to-point”) communication, including messages, asynchronous acknowledgments (ACKs), asynchronous negative acknowledgments (NACKs) and peer-to-peer heartbeats.

Under some circumstances, a message may be sent via unicast even if the message is directed to multiple members. This is done to shape traffic flow and to reduce CPU load in very large clusters.


c) TCP

An optional TCP/IP ring is used as an additional “death detection” mechanism, to differentiate between actual node failure and an unresponsive node, such as when a JVM conducts a full GC.

TCP/IP is not used as a data transfer mechanism due to the intrinsic overhead of the protocol and its synchronous nature.

Multicast Vs WKA

Multicast is generally preferable to WKA (Well Known Address) settings, provided that the network infrastructure supports it; otherwise use WKA. This choice is configured in the tangosol configuration XML file.

Node discovery can be done using multicast or well known addresses. The choice depends on the network topology. If Coherence clients and servers are in the same subnet, multicast is the best choice, since it requires less configuration. If clients and servers are located in different subnets (for example, in different datacenters) or the network does not support multicast packets, well known addresses are used with a predefined configuration. The disadvantage of WKA is that adding a new Coherence server to the cluster requires updating the configuration and restarting all the servers and clients.

The multicast-listener element defines the address and port that the cluster uses for cluster-wide and point-to-multipoint communications. All the clients and servers should use the same address and port. This multicast IP address and port are used for socket listening and publishing. A sample multicast configuration is as follows:

<multicast-listener>
  <address system-property="tangosol.coherence.clusteraddress">xx.xx.xx.xx</address>
  <port system-property="tangosol.coherence.clusterport">9000</port>
  <time-to-live system-property="tangosol.coherence.ttl">1</time-to-live>
  <packet-buffer>
    <maximum-packets>64</maximum-packets>
  </packet-buffer>
  <priority>8</priority>
  <join-timeout-milliseconds>30000</join-timeout-milliseconds>
  <multicast-threshold-percent>25</multicast-threshold-percent>
</multicast-listener>
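Elements that carry a system-property attribute can be overridden from the command line with the corresponding -D flag, for example -Dtangosol.coherence.clusterport. A minimal sketch of that resolution rule follows; the OverridableSetting class is hypothetical and is not part of Coherence.

```java
/** Mimics how a value with a system-property attribute is resolved (illustrative only). */
final class OverridableSetting {
    private final String systemProperty;
    private final String configuredValue;

    OverridableSetting(String systemProperty, String configuredValue) {
        this.systemProperty = systemProperty;
        this.configuredValue = configuredValue;
    }

    /** Returns the -D override when the property is set, otherwise the value from the XML. */
    String resolve() {
        return System.getProperty(systemProperty, configuredValue);
    }

    public static void main(String[] args) {
        OverridableSetting port =
                new OverridableSetting("tangosol.coherence.clusterport", "9000");
        System.out.println("cluster port = " + port.resolve());
    }
}
```

This keeps one configuration file usable across environments, with only the address, port or time-to-live supplied per JVM.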


The socket-address element defines the IP addresses and ports of the WKA list. If one or more WKAs are specified, a member can join the cluster only if it is itself a WKA member or at least one WKA member is already running. If the list is empty or unspecified, multicast communication is used. A sample WKA configuration is as follows:

<unicast-listener>
  <well-known-addresses>
    <socket-address id="1">
      <address>xx.xx.xx.xx</address>
      <port>8088</port>
    </socket-address>
  </well-known-addresses>
</unicast-listener>
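The join rule for a WKA list can be expressed as a small decision function. The sketch below is a plain Java restatement of the rule described above, not Coherence code; members are identified by hypothetical "host:port" strings and all names are illustrative.

```java
import java.util.Set;

/** Restates the WKA join rule as a decision function (illustrative only). */
final class WkaJoinRule {
    enum Decision { USE_MULTICAST, CAN_JOIN, CANNOT_JOIN }

    /**
     * @param joiningMember  the member that wants to join, as "host:port"
     * @param wkaList        the configured well known addresses
     * @param runningMembers the members that are currently running
     */
    static Decision decide(String joiningMember, Set<String> wkaList, Set<String> runningMembers) {
        if (wkaList.isEmpty()) {
            return Decision.USE_MULTICAST; // no WKA configured: fall back to multicast
        }
        if (wkaList.contains(joiningMember)) {
            return Decision.CAN_JOIN; // the joining member is itself a WKA member
        }
        for (String wka : wkaList) {
            if (runningMembers.contains(wka)) {
                return Decision.CAN_JOIN; // at least one WKA member is already running
            }
        }
        return Decision.CANNOT_JOIN;
    }
}
```

The CANNOT_JOIN branch is the case to plan for operationally: if every WKA member is down, non-WKA members cannot form or join the cluster until one of them is started.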

Coherence Extend

Coherence Extend extends the reach of the Coherence TCMP cluster to a wider range of consumers, including desktops, remote servers and machines located across WAN connections. Typical uses of Coherence Extend include providing desktop applications with access to Coherence caches, and Coherence cluster “bridges” that link together multiple Coherence clusters connected via a high-latency, unreliable WAN.

Coherence Extend consists of two basic components: a client and a Coherence Extend clustered service hosted by one or more DefaultCacheServer processes. The client adapter library includes implementations of both the CacheService and InvocationService interfaces that route all requests to a Coherence Extend clustered service instance running within the Coherence cluster. The Coherence Extend clustered service in turn responds to client requests by delegating to an actual Coherence clustered service. The client adapter library and the Coherence Extend clustered service use a low-level messaging protocol to communicate with each other.

The transport bindings for this protocol are Extend-JMS and Extend-TCP. Extend-JMS uses existing JMS infrastructure to connect to the server. Extend-TCP provides a high-performance, scalable TCP/IP-based communication layer to connect to the server.

Example

Two data centers run the same application with the same data. When data in one data center is updated through the application, the data in the other data center is replicated by streaming in the backend database. The same functionality may also be required at the application cache level, and this is achieved using the Coherence Extend feature. Define one data center as the master server and the other data center as the replica server. An example of the <remote-cache-scheme> element for the master server configuration file is as follows:

<remote-cache-scheme>
  <scheme-name>example-remote</scheme-name>
  <service-name>RemoteCache</service-name>
  <initiator-config>
    <tcp-initiator>
      <remote-addresses>
        <socket-address>
          <address>hostname</address>
          <port>9099</port>
        </socket-address>
      </remote-addresses>
    </tcp-initiator>
    <outgoing-message-handler>
      <request-timeout>10s</request-timeout>
    </outgoing-message-handler>
  </initiator-config>
</remote-cache-scheme>

An example of the <proxy-scheme> element for the replica server configuration file is as follows:

<proxy-scheme>
  <service-name>Proxy</service-name>
  <thread-count>4</thread-count>
  <acceptor-config>
    <tcp-acceptor>
      <local-address>
        <address>hostname</address>
        <port>9099</port>
        <reusable>true</reusable>
      </local-address>
    </tcp-acceptor>
  </acceptor-config>
  <autostart>true</autostart>
</proxy-scheme>


Coherence Configuration

Three configuration files are used in Coherence: the tangosol configuration XML, the cache configuration XML and the POF configuration XML.

Tangosol Config XML

The tangosol configuration XML file specifies the operational and runtime elements that control clustering, communication and data management services. To override it, specify the file as a Java argument using -Dtangosol.coherence.override. An example XML file is as follows:

<?xml version='1.0'?>
<coherence>
  <cluster-config>
    <member-identity>
      <cluster-name system-property="tangosol.coherence.cluster">example-node1</cluster-name>
    </member-identity>
    <unicast-listener>
      <well-known-addresses>
        <socket-address id="1">
          <address>xx.xx.xx.xx</address>
          <port>8088</port>
        </socket-address>
      </well-known-addresses>
    </unicast-listener>
    <packet-publisher>
      <packet-delivery>
        <timeout-milliseconds>60000</timeout-milliseconds>
      </packet-delivery>
    </packet-publisher>
    <service-guardian>
      <timeout-milliseconds system-property="tangosol.coherence.guard.timeout">65000</timeout-milliseconds>
    </service-guardian>
  </cluster-config>
  <logging-config>
    <severity-level system-property="tangosol.coherence.log.level">5</severity-level>
    <character-limit system-property="tangosol.coherence.log.limit">1048576</character-limit>
  </logging-config>
</coherence>


Cache Config XML

Coherence cache attributes and settings are defined in the cache configuration XML file. The cache attributes include the cache type, the cache invalidation strategy, the eviction policy, the type of serialization and the object count of the cache. Specify the cache configuration XML file as a Java argument using -Dtangosol.coherence.cacheconfig. An example XML file is as follows:

<?xml version="1.0"?>
<cache-config>
  <caching-scheme-mapping>
    <cache-mapping>
      <cache-name>demoCache</cache-name>
      <scheme-name>demo-near-content</scheme-name>
    </cache-mapping>
  </caching-scheme-mapping>
  <caching-schemes>
    <distributed-scheme>
      <scheme-name>distributed-content</scheme-name>
      <service-name>DistributedCache-pof</service-name>
      <serializer>
        <class-name>com.tangosol.io.pof.ConfigurablePofContext</class-name>
        <init-params>
          <init-param>
            <param-value>C:/coherence/pof-config.xml</param-value>
            <param-type>String</param-type>
          </init-param>
        </init-params>
      </serializer>
      <backup-count>1</backup-count>
      <thread-count>8</thread-count>
      <backing-map-scheme>
        <local-scheme>
          <scheme-ref>dist-binary-backing-map</scheme-ref>
        </local-scheme>
      </backing-map-scheme>
      <autostart>true</autostart>
    </distributed-scheme>
    <local-scheme>
      <scheme-name>dist-binary-front-map</scheme-name>
      <eviction-policy>HYBRID</eviction-policy>
      <high-units>50000</high-units>
      <expiry-delay>1h</expiry-delay>
    </local-scheme>
    <local-scheme>
      <scheme-name>dist-binary-backing-map</scheme-name>
      <eviction-policy>HYBRID</eviction-policy>
      <high-units>1000000</high-units>
      <expiry-delay>1h</expiry-delay>
    </local-scheme>
    <near-scheme>
      <scheme-name>demo-near-content</scheme-name>
      <front-scheme>
        <local-scheme>
          <scheme-ref>dist-binary-front-map</scheme-ref>
        </local-scheme>
      </front-scheme>
      <back-scheme>
        <distributed-scheme>
          <scheme-ref>distributed-content</scheme-ref>
        </distributed-scheme>
      </back-scheme>
      <invalidation-strategy>all</invalidation-strategy>
      <autostart>true</autostart>
    </near-scheme>
  </caching-schemes>
</cache-config>


POF Config XML

This XML file defines the objects to be serialized and deserialized by the POF serializer. The name and location of this file can be configured in the Coherence cache configuration XML file or through the system property tangosol.pof.config. It is recommended that all the nodes within a cluster use identical POF user type descriptors. An example file is as follows:

<?xml version="1.0"?>
<!DOCTYPE pof-config SYSTEM "pof-config.dtd">
<pof-config>
  <user-type-list>
    <include>coherence-pof-config.xml</include>
    <user-type>
      <type-id>1004</type-id>
      <class-name>com.example.test.SampleOneDataImpl</class-name>
    </user-type>
    <user-type>
      <type-id>1005</type-id>
      <class-name>com.example.test.SampleTwoDataImpl</class-name>
    </user-type>
    <user-type>
      <type-id>1006</type-id>
      <class-name>com.example.test.SampleThreeDataImpl</class-name>
    </user-type>
  </user-type-list>
</pof-config>

The cached application objects should implement the com.tangosol.io.pof.PortableObject interface to achieve POF serialization. PortableObject is a simple interface with two methods, readExternal() and writeExternal(). The example below shows sample code for a PortableObject implementation:

// Reads each property back using the same index that writeExternal() wrote it under.
public void readExternal(final PofReader reader) throws IOException {
    uidPk = reader.readLong(0);
    value = reader.readString(1);
    contentType = (SampleContentType) reader.readObject(2);
    contentData = (Set<SampleContentData>) reader.readCollection(3, contentData);
}

// Writes each property under a fixed index; readExternal() must mirror these indexes.
public void writeExternal(final PofWriter writer) throws IOException {
    writer.writeLong(0, uidPk);
    writer.writeString(1, value);
    writer.writeObject(2, contentType);
    writer.writeCollection(3, contentData, SampleOneDataImpl.class);
}


Best Practices

This section describes the best practices to follow while implementing the Coherence cache in the application.

Cache Warm Up

Preloading all the objects into the Coherence cache with a cache warm-up utility gives better performance from application start-up. Write a small program that warms the application cache server by fetching the data directly from the source, which in most cases is the database. This warm-up program should be run as part of the application start-up process.
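A warm-up utility of this kind can be sketched as a batched bulk load. The example treats the cache as a java.util.Map (the Coherence NamedCache interface extends Map), while the bulkLoader function, which stands in for a database query, the batch size and all names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Batched cache warm-up (illustrative only; the loader stands in for a database query). */
final class CacheWarmUpSketch {

    /**
     * Loads all ids from the source in batches and puts each batch into the cache.
     *
     * @return the number of entries loaded into the cache
     */
    static <K, V> int warmUp(Map<K, V> cache, List<K> ids,
                             Function<List<K>, Map<K, V>> bulkLoader, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int loaded = 0;
        for (int start = 0; start < ids.size(); start += batchSize) {
            List<K> batch = ids.subList(start, Math.min(start + batchSize, ids.size()));
            Map<K, V> rows = bulkLoader.apply(batch); // one source query per batch
            cache.putAll(rows);                       // one bulk put per batch
            loaded += rows.size();
        }
        return loaded;
    }
}
```

Loading in batches with putAll keeps the number of round trips to both the source and the cache low compared with putting entries one at a time.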

Minimal Data Cache Size in Application Heap

Some applications hold a large amount of application data, which causes performance problems and memory shortages. The Coherence distributed cache makes it possible to keep a small amount of data in the application heap and the remaining data in the Coherence grid server cache. Adding more nodes to the cluster increases the size of the cache. How much application data is kept in the Coherence local cache depends on the application's requirements.
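The idea of a small cache in the application heap in front of a much larger grid cache can be illustrated with plain Java collections. The sketch below is not Coherence code: in Coherence this behavior comes from cache configuration rather than from application classes. The class SmallFrontCache, its methods and the limit of 100 entries are hypothetical, and a HashMap stands in for the grid.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of a small size-limited front cache in the application heap over a larger back cache. */
class SmallFrontCache<K, V> {
    private final Map<K, V> back;   // stand-in for the distributed cache held by the grid servers
    private final Map<K, V> front;  // small LRU map held in the application heap

    SmallFrontCache(Map<K, V> back, final int maxFrontEntries) {
        this.back = back;
        // accessOrder=true orders entries from least to most recently used.
        this.front = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxFrontEntries;   // evict the least recently used entry
            }
        };
    }

    synchronized V get(K key) {
        V value = front.get(key);
        if (value == null) {
            value = back.get(key);         // fall through to the grid
            if (value != null) {
                front.put(key, value);     // keep a local copy for later reads
            }
        }
        return value;
    }

    synchronized int frontSize() {
        return front.size();
    }

    public static void main(String[] args) {
        Map<String, String> grid = new HashMap<>();
        for (int i = 0; i < 1000; i++) {
            grid.put("key-" + i, "value-" + i);
        }
        SmallFrontCache<String, String> cache = new SmallFrontCache<>(grid, 100);
        for (int i = 0; i < 1000; i++) {
            cache.get("key-" + i);
        }
        // Only the 100 most recently used entries remain in the application heap.
        System.out.println("Entries in application heap: " + cache.frontSize());
    }
}
```

The heap footprint of the application is bounded by the front limit regardless of how large the back cache grows, which is the property the paragraph above relies on.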

Number of JVMs per System

The number of JVMs per node is decided by the size of the application data in the cache and the number of processors per node. Too many JVMs cause context switching and contention for processors. Too few JVMs lead to memory shortages. Running more than one JVM per server also affects how efficiently network resources are used.

Common Issues

This section describes common issues encountered when implementing Coherence and explains how to resolve them.

Unique Cluster Name

All Coherence client and grid server instances of an application should share one cluster name that is unique within the network. Client applications connect to the Coherence grid servers based on this name. The name is defined in the tangosol configuration XML file as follows:

<cluster-name system-property="tangosol.coherence.cluster">demo-sample1</cluster-name>

If the cluster name is not properly defined, another application on the same network that also uses Coherence may join the cluster. This leads to erratic behavior in the application, such as optimistic lock issues from the DAO layer or deadlocked threads.
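One way to guard against an accidentally shared cluster is to check the cluster name before the application starts any Coherence work. The sketch below assumes a stricter policy than the configuration file itself requires: it insists that the tangosol.coherence.cluster system property is set explicitly instead of falling back to the default in the XML. The class ClusterNameCheck and its method are hypothetical.

```java
/** Start-up guard sketch: refuses to continue unless a cluster name has been supplied. */
class ClusterNameCheck {
    // System property referenced by the cluster-name element of the tangosol configuration file.
    static final String CLUSTER_PROPERTY = "tangosol.coherence.cluster";

    /** Returns the trimmed cluster name, or throws if it is missing or blank. */
    static String requireClusterName(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            throw new IllegalStateException(
                "Set -D" + CLUSTER_PROPERTY + " to a name unique to this application");
        }
        return configured.trim();
    }

    public static void main(String[] args) {
        // Example start-up: java -Dtangosol.coherence.cluster=demo-sample1 ClusterNameCheck
        String name = requireClusterName(System.getProperty(CLUSTER_PROPERTY));
        System.out.println("Joining cluster: " + name);
    }
}
```

Failing fast at start-up is easier to diagnose than the lock and deadlock symptoms that appear once two unrelated applications have joined the same cluster.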

Communication Delay Warning

If the application experiences frequent multi-second communication pauses across multiple cluster nodes, it may be necessary to increase the switch's buffer space. These pauses can be identified by a series of Coherence log messages that report communication delays with multiple nodes and are not attributable to local or remote GCs:

Experienced a 8275 ms communication delay (probable remote GC) with Member(Id=7, Timestamp=2010-04-03 12:15:47.511, Address=255.255.255.255:8088, MachineId=13838); 320 packets rescheduled, PauseRate=0.31, Threshold=512

Some switches, such as the Cisco 6500 series, support configuring the amount of buffer space available to each Ethernet port or ASIC. In high-load applications it may be necessary to increase the default buffer space. On Cisco this can be accomplished by executing:

fabric buffer-reserve high

For other switches, contact Oracle Coherence support.
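To decide whether the delays involve multiple nodes, the delay messages can be extracted from the Coherence log and grouped by member. The sketch below assumes the message layout of the sample line shown in this section; the class DelayLogScan, its method and the one-second threshold are hypothetical, and other Coherence versions may word the message differently.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Log scan sketch: collects the longest reported communication delay per cluster member. */
class DelayLogScan {
    // Pattern assumed from the sample log line in this section.
    private static final Pattern DELAY = Pattern.compile(
        "Experienced an? (\\d+) ms communication delay.*?Member\\(Id=(\\d+)");

    /** Returns member id -> longest delay in ms, ignoring delays shorter than minMillis. */
    static Map<Integer, Long> longestDelays(List<String> logLines, long minMillis) {
        Map<Integer, Long> worst = new TreeMap<>();
        for (String line : logLines) {
            Matcher m = DELAY.matcher(line);
            if (m.find()) {
                long millis = Long.parseLong(m.group(1));
                int memberId = Integer.parseInt(m.group(2));
                if (millis >= minMillis) {
                    worst.merge(memberId, millis, Long::max);   // keep the larger delay
                }
            }
        }
        return worst;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "Experienced a 8275 ms communication delay (probable remote GC) with Member(Id=7, "
                + "Timestamp=2010-04-03 12:15:47.511, Address=255.255.255.255:8088, MachineId=13838); "
                + "320 packets rescheduled, PauseRate=0.31, Threshold=512");
        Map<Integer, Long> worst = longestDelays(lines, 1000);
        // More than one member in the result suggests a network-level cause rather than a single slow JVM.
        System.out.println("Members with multi-second delays: " + worst);
    }
}
```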

UnicastUdpSocket Failed Warning

If a packet loss warning appears in the Coherence log, it may be necessary to increase the maximum socket buffer size to get good performance. The warning message looks as follows:

2010-01-14 22:35:30.606/2.051 Oracle Coherence GE 3.5/459 <Warning> (thread=Main Thread, member=n/a): UnicastUdpSocket failed to set receive buffer size to 1428 packets (2096304 bytes); actual size is 92 packets (135168 bytes). Consult your OS documentation regarding increasing the maximum socket buffer size. Proceeding with the actual value may cause sub-optimal performance.

Though it is safe to operate with the smaller buffers, it is recommended to configure the OS to allow for larger buffers.

On Linux execute (as root):

sysctl -w net.core.rmem_max=2096304

sysctl -w net.core.wmem_max=2096304

On Solaris execute (as root):

ndd -set /dev/udp udp_max_buf 2096304

On AIX execute (as root):

no -o rfc1323=1

no -o sb_max=4194304
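The buffer size to configure can be read directly from the warning message. The sketch below extracts the requested and actual sizes from a log line in the format shown above and prints the matching Linux command; the class SocketBufferWarning and its parse() method are hypothetical, and the pattern is an assumption based on that single sample message.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: reads requested and actual socket buffer sizes from the Coherence warning. */
class SocketBufferWarning {
    // Pattern assumed from the sample warning in this section.
    private static final Pattern SIZES = Pattern.compile(
        "buffer size to \\d+ packets \\((\\d+) bytes\\); actual size is \\d+ packets \\((\\d+) bytes\\)");

    /** Returns {requestedBytes, actualBytes}, or null if the line is not a buffer size warning. */
    static long[] parse(String logLine) {
        Matcher m = SIZES.matcher(logLine);
        if (!m.find()) {
            return null;
        }
        return new long[] { Long.parseLong(m.group(1)), Long.parseLong(m.group(2)) };
    }

    public static void main(String[] args) {
        String line = "UnicastUdpSocket failed to set receive buffer size to 1428 packets "
            + "(2096304 bytes); actual size is 92 packets (135168 bytes).";
        long[] sizes = parse(line);
        if (sizes != null && sizes[1] < sizes[0]) {
            System.out.println("OS limit of " + sizes[1] + " bytes is below the requested "
                + sizes[0] + " bytes");
            // Same command as listed for Linux above, with the requested size filled in.
            System.out.println("sysctl -w net.core.rmem_max=" + sizes[0]);
        }
    }
}
```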

Conclusion

Coherence offers extensive features such as dynamic high availability, automatic load balancing, transactional consistency, capacity on demand, grid-wide JMX for management and monitoring, and true linear scalability. A mature and comprehensive caching solution is important for meeting the non-functional requirements (performance and scalability) of a typical e-commerce application, and if the budget allows, Coherence is a reliable option for meeting these non-functional objectives.

About the Author

Rajendran Sundaram is a Sun Certified Java Programmer (SCJP 1.4). He has 5 years and 8 months of experience in J2EE application development, performance analysis, tuning, and fixing complex production issues, and works as a performance analyst with HCL Technologies Limited, Chennai.

About HCL Technologies

HCL Technologies is a leading global IT services company, working with clients in the areas that impact and redefine the core of their businesses. Since its inception into the global landscape after its IPO in 1999, HCL has focused on ‘transformational outsourcing’, underlined by innovation and value creation, and offers an integrated portfolio of services including software-led IT solutions, remote infrastructure management, engineering and R&D services and BPO. HCL leverages its extensive global offshore infrastructure and network of offices in 26 countries to provide holistic, multi-service delivery in key industry verticals including Financial Services, Manufacturing, Consumer Services, Public Services and Healthcare. HCL takes pride in its philosophy of ‘Employee First’, which empowers its 58,129 transformers to create real value for customers. HCL Technologies, along with its subsidiaries, had consolidated revenues of US$ 2.6 billion (Rs. 12,048 crores), as on 31st March 2010 (on LTM basis). For more information, please visit www.hcltech.com

About HCL Enterprise

HCL is a $5 billion leading global Technology and IT Enterprise that comprises two companies listed in India - HCL Technologies & HCL Infosystems. Founded in 1976, HCL is one of India’s original IT garage start-ups, a pioneer of modern computing, and a global transformational enterprise today. Its range of offerings spans Product Engineering, Custom & Package Applications, BPO, IT Infrastructure Services, IT Hardware, Systems Integration, and distribution of ICT products across a wide range of focused industry verticals. The HCL team comprises over 62,000 professionals of diverse nationalities, who operate from 26 countries including over 500 points of presence in India. HCL has global partnerships with several leading Fortune 1000 firms, including leading IT and Technology firms. For more information, please visit www.hcl.in
