was liberty at scale

10,000 Servers and Climbing: Achieving Liberty at Scale
Session AAI-2827
Michael C Thompson [email protected]
© 2014 IBM Corporation


Post on 17-Jul-2015


Page 1: Was liberty at scale


10,000 Servers and Climbing – Achieving Liberty at Scale Session AAI-2827 Michael C Thompson [email protected]

Page 2: Was liberty at scale

Agenda
• "The Mission"
• Topology overview
• Get stressed!
• Tuning details
• System test environment

Page 3: Was liberty at scale

WAS System Test had a mission
• Build an "Internet Scale" Liberty Collective topology
  − 10,000 collective members
  − Stress the system management layer
  − Stress against applications running on Liberty
  − All over 7+ days
• Monitor & watch it go!

Page 4: Was liberty at scale

History Collective Scale in System Test

[Chart: collective size in System Test by release (8.5.5.0, 8.5.5.1, 8.5.5.2, 8.5.5.3, 8.5.5.4), Target vs. Actual, vertical axis from 0 to 12,000 members. Annotations: 3 Controllers, 5 Controllers, + MBean Stress, + Application Workload.]

* Initial test (2,200 members) was larger than the previous Full Profile topology (2,000 members)

Presenter
Presentation Notes
History of scale: size of deployment and test strategies over time.
• 8.5.5: 3 controllers and 2,200 members (larger than the previous Full Profile topology of 2,000 members)
• 8.5.5.1: 5 controllers and 5,000 members
• 8.5.5.2: 5 controllers and 10,000 members
• 8.5.5.3: 5 controllers and 10,000 members
• 8.5.5.4: 5 controllers and 10,000 members plus application workload
Page 5: Was liberty at scale

Topology Overview

Page 6: Was liberty at scale

Internet Scale Collective Topology
• 5 IHS Servers
  − 5 Virtual Machines
• 5 Collective Controllers
  − 5 Virtual Machines
• 10,000 Collective Members
  − 225 Collective Members per Virtual Machine
  − 2,000 per Collective Controller
• 5 clusters
  − 2,000 members each
• 1 application (PingServlet) per member
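The member density listed above implies a VM count that the slide does not state. A minimal sketch of that arithmetic, assuming members are packed at up to 225 per VM and divided evenly across the controllers (the function name is illustrative, not from the deck):

```python
import math

TOTAL_MEMBERS = 10_000   # from the slide
MEMBERS_PER_VM = 225     # from the slide
CONTROLLERS = 5          # from the slide


def member_vm_count(total_members, members_per_vm):
    """Smallest number of VMs that can host all members at the given density."""
    return math.ceil(total_members / members_per_vm)


vms = member_vm_count(TOTAL_MEMBERS, MEMBERS_PER_VM)
members_per_controller = TOTAL_MEMBERS // CONTROLLERS

print("member VMs needed:", vms)
print("members per controller:", members_per_controller)
```

Under these assumptions the last VM is only partly filled, since 10,000 is not a multiple of 225.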

Page 7: Was liberty at scale

Internet Scale Collective Topology

[Diagram: Collective Controller Replica Set (five CCs), five IHS servers, and groups of Liberty Profile AppServers running the clustered app inside the collective; machine boundaries are marked.]

Page 8: Was liberty at scale

Topology – IHS Servers

• WebSphere Application Server 8.5.5 IHS
• Hosted on VMware ESX
• 4 CPU with 16 GB of RAM
• Red Hat 6.5 x64
• Hosting merged plugin-cfg.xml for 2,000 Liberty Servers
• Tuning Parameters
  − Standard application workload tuning

Page 9: Was liberty at scale

Topology – Collective Controller
• WebSphere Liberty Profile 8.5.5.4
• Hosted on VMware ESX
• 6 CPU with 32 GB of RAM
• Red Hat 6.5 x64
• Features used in server.xml
    <feature>jsp-2.2</feature>
    <feature>collectiveController-1.0</feature>
    <feature>restConnector-1.0</feature>
    <feature>monitor-1.0</feature>
    <feature>adminCenter-1.0</feature>
• Tuning Parameters
  − OS: ulimit file handles increased
  − Java: heap size increased
  − WLP: thread pool increased

Page 10: Was liberty at scale

Topology – Liberty Collective Member
• WebSphere Liberty Profile 8.5.5.4
• Hosted on VMware ESX
• 8 CPU with 64 GB of RAM
• Red Hat 6.5 x64
• Hosting one application
• Features used in server.xml
    <feature>jsp-2.2</feature>
    <feature>collectiveMember-1.0</feature>
    <feature>clusterMember-1.0</feature>
    <feature>restConnector-1.0</feature>
    <feature>monitor-1.0</feature>
• Tuning Parameters
  − OS: ulimit file handles increased
  − WLP: TCP configuration (for application workload)

Page 11: Was liberty at scale

Get Stressed!

Page 12: Was liberty at scale

Management & Monitoring Workload
• Apply stress at the system management layer
• Invocation of Liberty MBeans through REST connector
• ThreadPool – Display Active Threads and Pool Size
• JVM Statistics – Display UsedMemory, FreeMemory, and Heap Size
• File Transfer Operation – Transfer files of various sizes from Collective Controller to Collective members
• Continuously over a period of 7 days
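The MBean reads listed above could be driven by a small script that builds one REST request per attribute. The sketch below only constructs the URLs and sends nothing. The `/IBMJMXConnectorREST/mbeans` path, the MBean object names, and the attribute names are assumptions that do not appear in the slides and should be checked against the Liberty level in use; the host name is a placeholder.

```python
from urllib.parse import quote

# Assumed values, not taken from the slides: verify against your Liberty level.
REST_BASE = "/IBMJMXConnectorREST/mbeans"
THREAD_POOL_MBEAN = "WebSphere:type=ThreadPoolStats,name=Default Executor"
JVM_MBEAN = "WebSphere:type=JvmStats"


def attribute_url(host, port, object_name, attribute):
    """Build the URL for reading one MBean attribute over the REST connector."""
    # The object name contains ':', '=', ',' and spaces, so encode all of it.
    encoded_name = quote(object_name, safe="")
    return "https://{}:{}{}/{}/attributes/{}".format(
        host, port, REST_BASE, encoded_name, attribute
    )


def monitoring_urls(host, port):
    """URLs for the thread pool and JVM statistics named in the workload."""
    targets = [
        (THREAD_POOL_MBEAN, "ActiveThreads"),
        (THREAD_POOL_MBEAN, "PoolSize"),
        (JVM_MBEAN, "UsedMemory"),
        (JVM_MBEAN, "FreeMemory"),
        (JVM_MBEAN, "Heap"),
    ]
    return [attribute_url(host, port, name, attr) for name, attr in targets]


for url in monitoring_urls("member.example.com", 9443):
    print(url)
```

A stress driver would issue authenticated GET requests against these URLs in a loop; authentication and TLS trust are left out of the sketch.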

Page 13: Was liberty at scale

Application Workload
• Light-weight application workload: PingServlet
• Other Persona scenarios cover application workload
• Continuously over a period of 7 days

Page 14: Was liberty at scale

Stress Workload Flow

[Diagram: system management load is applied to the five collective controllers, each with 2,000 servers in 150-200 clusters; application traffic flows through the five IHS servers.]

Page 15: Was liberty at scale

Tuning Details

Page 16: Was liberty at scale

IHS – configuration & tuning

• No changes required to handle large scale collective
• Collective size does not impact application workload
  − No application workload on controllers
• Modified httpd.conf to accommodate general application stress (followed standard practices for application load)
  − MaxClients raised to 1600 (up from 600)

Page 17: Was liberty at scale

Collective Controller – configuration & tuning
• server.xml
    <!-- Increase the operation timeouts to 10m, up from 1m for long running gen cluster plugin config -->
    <serverCommands startServerTimeout="600" stopServerTimeout="600" />
    <executor name="LargeThreadPool" id="default" coreThreads="150" maxThreads="400" keepAlive="120s" stealPolicy="STRICT" rejectedWorkPolicy="CALLER_RUNS" />
• jvm.options
    -Xms512m
    -Xmx12288m
    -verbose:gc
    -Xdump:heap
    -Xverbosegclog:logs/verbosegc.log
• OS tuning
    ulimit max files 20,000

Presenter
Presentation Notes
If the ulimit number is too low, the server logs will clearly indicate it. The heap of 12 GB was used for failover scenarios; 8 GB is observed to be sufficient in normal flows. Controller is the only process on the VM.

# ulimit -a
core file size (blocks, -c) 4096
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 1032388
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 80,000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 1032388
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
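The open-files limit discussed above can also be checked from a script before a controller is started, instead of waiting for the server logs to report it. A minimal sketch, assuming a Unix host where Python's standard `resource` module is available; the 20,000 threshold is the value from the slide, and the function names are illustrative:

```python
import resource

REQUIRED_OPEN_FILES = 20_000  # controller "ulimit max files" value from the slide


def limit_is_sufficient(soft_limit, required):
    """True when the soft open-files limit meets the requirement."""
    if soft_limit == resource.RLIM_INFINITY:
        return True  # an unlimited setting always satisfies the requirement
    return soft_limit >= required


def check_open_files(required=REQUIRED_OPEN_FILES):
    """Compare this process's soft RLIMIT_NOFILE against the requirement."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return limit_is_sufficient(soft, required)


print("open-files limit sufficient:", check_open_files())
```

The check reflects the limit of the process that runs it, so it is only meaningful when run as the same user and from the same shell environment that launches the server.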
Page 18: Was liberty at scale

Collective Member – configuration & tuning
• Use the collective configuration defaults
  − heartbeat interval (1m) & controller read timeout (5m)
• server.xml
    <!-- Dictated by your application -->
• jvm.options
    -Xms128m
    -Xmx256m
    -verbose:gc
    -Xdump:heap
    -Xverbosegclog:logs/verbosegc.log
• OS tuning
    ulimit max files 8192
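The member heap settings can be related to the member VM size given earlier in the deck (225 members per VM, 64 GB of RAM). The sketch below is a rough budget that counts Java heap only; native memory, JIT and class data, and the operating system are not included, and it treats 1 GB as 1024 MB. The function name is illustrative.

```python
MEMBERS_PER_VM = 225   # from the topology slide
MIN_HEAP_MB = 128      # -Xms128m
MAX_HEAP_MB = 256      # -Xmx256m
VM_RAM_GB = 64         # member VM size from the topology slide


def total_heap_gb(members, heap_mb):
    """Combined Java heap, in GB, for `members` JVMs of `heap_mb` each."""
    return members * heap_mb / 1024


initial_gb = total_heap_gb(MEMBERS_PER_VM, MIN_HEAP_MB)
worst_case_gb = total_heap_gb(MEMBERS_PER_VM, MAX_HEAP_MB)

print("heap at -Xms:", initial_gb, "GB")
print("heap at -Xmx:", worst_case_gb, "GB")
print("fits in VM RAM (heap only):", worst_case_gb < VM_RAM_GB)
```

Even if every member grew to its maximum heap, the combined heap would stay below the VM's RAM, although the remaining headroom for non-heap memory is small.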

Page 19: Was liberty at scale

Key configuration & tuning takeaways
• No WLP tuning configuration to handle large scale collective
• Controller requires JVM and OS tuning to accommodate large data set
• Modify timeouts if using long running operations
• Collective size does not impact application workload
• Best practice: no application workload on controllers
• Tune your servers as you would normally

Page 20: Was liberty at scale

System Test Environment

Page 21: Was liberty at scale

Unlike Rome, collectives can be built in a day
• In-house scripts built on standard Unix operations
• Time to build: 5-6 hours
• Jython scripting for MBean invocation
  − Generating plugin-cfg.xml for 2,000 cluster members takes time
  − Set the Jython script timeout appropriately (20 min for a 2,000-member cluster)
• Not using DevOps tools (yet)
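The timeout advice above can be illustrated with a generic wrapper that bounds a long-running call. This is a sketch in standard Python 3, not the Jython scripts used by the test team; `fake_generate_plugin_config` is a hypothetical stand-in for whatever call triggers plugin-cfg.xml generation, and the 20-minute value is the figure from the slide.

```python
import concurrent.futures
import time

PLUGIN_GEN_TIMEOUT_S = 20 * 60  # 20 min for a 2,000-member cluster, per the slide


def run_with_timeout(operation, timeout_s):
    """Run operation() in a worker thread and wait at most timeout_s seconds.

    Raises concurrent.futures.TimeoutError if the operation overruns.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(operation)
    try:
        return future.result(timeout=timeout_s)
    finally:
        # Do not block on a still-running worker; the thread is not killed.
        pool.shutdown(wait=False)


def fake_generate_plugin_config():
    """Hypothetical stand-in for the real plugin-cfg.xml generation call."""
    time.sleep(0.01)
    return "plugin-cfg.xml"


print(run_with_timeout(fake_generate_plugin_config, PLUGIN_GEN_TIMEOUT_S))
```

The wrapper only stops waiting; it does not cancel work already in progress on the server side, so a timed-out generation may still complete later.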

Page 22: Was liberty at scale

Many paths to the same result...

[Diagram: many ways to execute commands (Manual, Scripts, DevOps Tools such as UrbanCode Deploy, Chef, and Puppet, and Admin Center), through Liberty Commands, the Liberty Collective, or Admin Center, to yield the same results.]

Presenter
Presentation Notes
Reference other DevOps sessions.
Page 23: Was liberty at scale

Use in Continuous Persona (CP)
• WAS Liberty is executing continuous persona (system test)
  − New initiative
• CP uses collectives to run system-level tests across mixed runtime levels
  − Does not include 10k scale
• CP uses mixed-version runtimes in a collective for continuous update and test
  − Uses A/B testing practices to ensure newer versions do not regress behaviour or function

Page 24: Was liberty at scale

A work in progress
• Member failover to another controller
  − Jan'15 Beta: small-scale failover tested @ 600-member collective (125 members per controller)
  − Feb'15 Beta: 5,000-member collective failover tested @ 1,000 per controller
  − Failover does not impact application workload
• Multiple concurrent server joins can result in incomplete requests
• Member registration time increases as we approach very large scale

Presenter
Presentation Notes
This means a highly available replica set does not work for large topologies; HA works for small topologies. This can lead to a non-recoverable error that requires a collective rebuild.
Page 25: Was liberty at scale

In Summary
• Minimal tuning required to get to large scale
• Controller JVM and OS tuning required to accommodate large data set
• Collective size does not impact application workload
• Best practice: no application workload on controllers
• Large scale collective is stable for mixed management operations and application workload
• On-going improvements for management operations performance and failure scenarios
• Management failover does not impact application workload

Page 26: Was liberty at scale

Further Reference material
• Building a large scale WebSphere Application Server Liberty collective topology (white paper)
  http://www.ibm.com/developerworks/websphere/library/techarticles/1309_yu/1309_yu.html
• Tuning the Liberty profile (Knowledge Center)
  http://www-01.ibm.com/support/knowledgecenter/SSD28V_8.5.5/com.ibm.websphere.wlp.core.doc/ae/twlp_tun.html
• Best Practices for Large WebSphere Topologies
  http://www.ibm.com/developerworks/websphere/library/techarticles/0710_largetopologies/0710_largetopologies.html

Page 27: Was liberty at scale

Live from Raleigh, NC

Presenter
Presentation Notes
https://rsklnx46.rtp.raleigh.ibm.com:9443/adminCenter/#explore https://rsklnx47.rtp.raleigh.ibm.com:9443/adminCenter/#explore Login: a / a
Page 28: Was liberty at scale

Questions?

Page 29: Was liberty at scale

Related Sessions – Tuesday


AAI-3281 Smarter Production with WebSphere Application Server ND Intelligent Management Tues, 24-Feb 05:30 PM - 06:30 PM, Mandalay Bay - Surf Ballroom A

AAI-2827 Problem Determination Tools and Strategies for Liberty and Full Profile WAS Tues, 24-Feb 05:30 PM - 06:30 PM, Mandalay Bay - Mandalay Ballroom B

Page 30: Was liberty at scale

Related Sessions – Wednesday


AAI-1445 Managing Dynamic Workloads with WAS ND and in the Cloud Wed, 25-Feb 09:30 AM - 10:30 AM, Mandalay Bay - Reef Ballroom E

AAI-3228 DevOps Tools and WebSphere Application Server Wed, 25-Feb 09:30 AM - 10:30 AM, Mandalay Bay - Surf Ballroom A

AAI-3590 Best Practices for Configuring and Managing Large WebSphere Topologies Wed, 25-Feb 02:00 PM - 03:00 PM, Mandalay Bay - Reef Ballroom E

AAI-3218 Production Deployment Best Practices for the IBM WebSphere Liberty Profile Wed, 25-Feb 05:30 PM - 06:30 PM, Mandalay Bay - Surf Ballroom F

Page 31: Was liberty at scale

Related Customer Feedback Roundtables

AAI-3319 Shaping the Future of WebSphere Liberty Admin Center
• Tue, 24-Feb 05:30 PM - 06:30 PM, Mandalay Bay - Coral A
• Wed, 25-Feb 09:30 AM - 10:30 AM, Mandalay Bay - Coral A
• Thu, 26-Feb 09:00 AM - 10:00 AM, Mandalay Bay - Tropics B

AAI-2810 Problem Determination and Troubleshooting Full Profile and Liberty Servers
• Wed, 25-Feb 09:30 AM - 10:30 AM, Mandalay Bay - Tropics B
• Wed, 25-Feb 03:30 PM - 04:30 PM, Mandalay Bay - Tropics B

Page 32: Was liberty at scale

Notices and Disclaimers Copyright © 2015 by International Business Machines Corporation (IBM). No part of this document may be reproduced or transmitted in any form without written permission from IBM.

U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM.

Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for accuracy as of the date of initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to update this information. THIS document is distributed "AS IS" without any warranty, either express or implied. In no event shall IBM be liable for any damage arising from the use of this information, including but not limited to, loss of data, business interruption, loss of profit or loss of opportunity. IBM products and services are warranted according to the terms and conditions of the agreements under which they are provided.

Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice.

Performance data contained herein was generally obtained in controlled, isolated environments. Customer examples are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual performance, cost, savings or other results in other operating environments may vary.

References in this document to IBM products, programs, or services do not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business.

Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the views of IBM. All materials and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or other guidance or advice to any individual participant or their specific situation.

It is the customer’s responsibility to ensure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the identification and interpretation of any relevant laws and regulatory requirements that may affect the customer’s business and any actions the customer may need to take to comply with such laws. IBM does not provide legal advice or represent or warrant that its services or products will ensure that the customer is in compliance with any law.

Page 33: Was liberty at scale

Notices and Disclaimers (cont'd)

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products in connection with this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to interoperate with IBM’s products. IBM expressly disclaims all warranties, expressed or implied, including but not limited to, the implied warranties of merchantability and fitness for a particular purpose.

The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights, trademarks or other intellectual property right.

• IBM, the IBM logo, ibm.com, Bluemix, Blueworks Live, CICS, Clearcase, DOORS®, Enterprise Document Management System™, Global Business Services ®, Global Technology Services ®, Information on Demand, ILOG, Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON, OpenPower, PureAnalytics™, PureApplication®, pureCluster™, PureCoverage®, PureData®, PureExperience®, PureFlex®, pureQuery®, pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, SoDA, SPSS, StoredIQ, Tivoli®, Trusteer®, urban{code}®, Watson, WebSphere®, Worklight®, X-Force® and System z® Z/OS, are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at: www.ibm.com/legal/copytrade.shtml.

Page 34: Was liberty at scale

Thank You
Your Feedback is Important!

Access the InterConnect 2015 Conference CONNECT Attendee Portal to complete your session surveys from your smartphone, laptop or conference kiosk.