WebMD Health Corp.: agile system automation, integration and recovery using HP Server Automation and HP Operations Orchestration
©2010 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice
– Derek Chang, Manager, WebMD
– Roger Hsu, Manager, WebMD

Introduction & Agenda

Topics
Who we are and where we stand
Infrastructure Layout
Middleware Integration
HP OO Preparation
Application Administration Automation
Build Deployment Automation
Unattended WebMD Content Backup
Maintenance Free System
Results from HP SA/OO Implementation
Q/A

cmsops

Responsibility
– Provide maintenance and 24x7 support of CMS applications and their subsystems in the production environment
– Perform production system patches, bug fixes, software releases, and other build deployments
– Support ongoing releases and development in non-production environments
– Define and document production support requirements, escalation procedures, issue tracking, and guidelines for troubleshooting and build deployments
Resource: 4.5 headcounts*
Universe
– 300+ internal users
– SDLC environments: dev/devint/qa00/qa01/qa02/perf/production
– 130 servers
– 4.4 TB of NAS storage for raw contents and site contents
– Infrastructure: Zenoss, HPSA, HPOO, Serena TeamTrack, MOSS, MSSQL/Oracle
Core technology
– EMC Documentum
– Proprietary applications

Documentum

An enterprise content management platform, now delivered by EMC Corporation; also the name of the software company that originally developed the technology.
A flexible, versatile, and powerful, yet complex, platform
Implementation in WebMD
– 2 major portal sites
– 6 Documentum products
– Proprietary content editor for advanced features
– Proprietary page transformer
– Proprietary utilities: 15 applications

Challenges

Documentum is a new technology
Documentum expertise is rare
The CMS is complex
cmsops supports users within the company
WebMD is a fast-growing company

Life in cmsops

Sampling duration: Oct 11, 2007 – Jul 24, 2009
653 days/426 working days
Source: customized TeamTrack reports and emails
Summary
– 1772 TeamTrack requests
– 479 email requests*
– 5.3 tickets/working day

Our Approach

Develop and utilize process templates
Standardize and adopt the development model
Identify which processes to automate
– Routine/mundane activities
– Activities where human interaction causes errors/failures
– Processes whose lifecycle/service time is much longer than their development time

Infrastructure Layout

[Diagram: components of the automation infrastructure, grouped by system]
Opsware OO: Central and RAS, SAS OCLI client, Scheduler engine, NRAS, Web interface, Workflow engine, Repository, JRAS, SAS web services client
Build server: RHEL4u4_32BIT VM, Opsware agent, NAS/Build repository, OCLI 1.0, Web interface, Rpm/msi package tools
Opsware SAS: Central and RAS, Twister, OCLI engine, Web services engine, Web interface, Software repository, Server repository, JAVA API, Opsware agent engine
Middleware integration: JBoss 5.0, XML module, Email adapter, Email sender, Web interface, OO client, TeamTrack client, LDAP module, Data modeling
Corporate infrastructure: Active Directory, TeamTrack, Win2K3, Web interface, Business mashup engine, Web services
PAS LAB: RHEL4u6_64BIT VM, Opsware agent, App server(s), Code base
QA/DEV Clients
Exchange server

Middleware Integration

Description
– The core of the automation system
– Connects the ticketing, monitoring, and system administration tools within WebMD operations
– Provides operation tools without requiring users to access the underlying systems/tools

Middleware Integration

Ticketing system integration
– Uses web services to connect to Serena Business Mashup (TeamTrack)
– Pulls information from tickets and passes data to other systems such as HP OO
– Updates tickets after the automation operation

Middleware Integration

System Administration (HP SA/OO) integration
– A Java bean uses the OO library to trigger an OO workflow
– The workflow result (XML format) is parsed to get:
• OO flow id and report URL
• Start time and end time
• OO flow response and result

// Trigger an OO flow and capture its XML result
RSFlowInvoke rsf = new RSFlowInvoke();
rsf.setUrl(url + flowName + paraString);  // base URL + flow name + parameter string
rsf.setUsername(user);
rsf.setPassword(pw);
result = rsf.invoke();
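
The result-parsing step can be sketched in shell, the language the deck uses for its OCLI commands. This is only an illustration: the slides do not show the real OO result schema, so the XML layout and every element name below (flowResult, runId, reportUrl, startTime, endTime, response, result) are hypothetical. The sed expression handles only flat documents with one element per line; the actual middleware would use a proper XML parser.

```shell
#!/bin/sh
# Hypothetical OO flow result; the real schema is not shown in the slides.
sample_result() {
cat <<'EOF'
<flowResult>
  <runId>12345</runId>
  <reportUrl>https://oo.example.com/report/12345</reportUrl>
  <startTime>2009-07-25T01:00:00</startTime>
  <endTime>2009-07-25T01:02:30</endTime>
  <response>success</response>
  <result>all hosts restarted</result>
</flowResult>
EOF
}

# xml_field NAME: print the text of the first <NAME>...</NAME> element on stdin.
# Only suitable for flat XML with one element per line.
xml_field() {
    sed -n "s:.*<$1>\(.*\)</$1>.*:\1:p" | head -n 1
}

# Pull out the fields the slide lists: flow id, report URL, times, response.
flow_id=$(sample_result | xml_field runId)
report_url=$(sample_result | xml_field reportUrl)
start_time=$(sample_result | xml_field startTime)
end_time=$(sample_result | xml_field endTime)
response=$(sample_result | xml_field response)

echo "flow $flow_id ($start_time to $end_time): $response, report at $report_url"
```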

Middleware Integration

Web Application
– Lets users run the automation tools from a web browser over the network, so they do not access underlying systems/tools such as HP OO directly
– Uses Ajax and RichFaces to provide a dynamic and intuitive user experience
– Developed with the JBoss Seam framework
– Uses Hibernate as the database layer framework

Middleware Integration

Security and User Authorization
– Integrates with WebMD LDAP servers, allowing users to access the system with their WebMD id/password
– The JBoss Rules engine provides access control based on each user's WebMD LDAP groups

HP OO Preparation

Identify basic/out-of-the-box OO operations
– SSH
– Windows Remote Command Execution
– Change IIS status
– Change Windows service status
– OCLI to access HP SA
– Iterator, Email CDO, etc.
– Database operations (Oracle/MSSQL)
Modularization and utility workflows
– Use OO operations to build utility workflows that will be reused frequently

HP OO Preparation

HostsSSH: run Linux commands in a list of hosts
Given a list of hosts to Iterator (PAS out-of-box operation)
SSH Command (PAS out-of-box operation)
Call Error Notice flow
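
The HostsSSH pattern (iterate over hosts, run a command on each, raise an error notice on failure) can be sketched as a plain shell loop. This is not the OO flow itself: `SSH_CMD`, `error_notice`, and `hosts_ssh` are hypothetical names, and continuing to the next host after a failure is an assumption, since the slide does not say whether the flow stops or continues.

```shell
#!/bin/sh
# SSH_CMD stands in for the OO "SSH Command" operation; override it for dry runs.
SSH_CMD=${SSH_CMD:-ssh}

# Stand-in for the "Error Notice" flow, which sends a notice in the deck.
error_notice() {
    echo "ERROR: command failed on $1" >&2
}

# hosts_ssh "host1 host2 ..." "command"
# Runs the command on every host in turn (the Iterator step) and returns
# the number of hosts on which it failed.
hosts_ssh() {
    hosts=$1
    remote_cmd=$2
    failures=0
    for host in $hosts; do
        if ! $SSH_CMD "$host" "$remote_cmd"; then
            error_notice "$host"
            failures=$((failures + 1))
        fi
    done
    return "$failures"
}

# Example (not run here): hosts_ssh "web01 web02" "uptime"
```

Keeping the transport in a variable makes the loop easy to exercise without real hosts, which is also how the same skeleton could serve the Windows variant.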

HP OO Preparation

HostsWinCommand: run Windows commands in a list of hosts
Given a list of hosts to Iterator (PAS out-of-box operation)
SSH Command (PAS out-of-box operation)
Call Error Notice flow

HP OO Preparation

IIS flows:
– HostIISSites: control multiple IIS sites on a single host (given a list of sites)
– HostsIISSites: control multiple IIS sites on multiple hosts (given a list of hosts)

HP OO Preparation

Windows Services flows:
– HostWinSvcsCtrl: control multiple services on a single host
– HostsWinSvcsCtrl: control multiple services on multiple hosts

Application Administration Automation

Goal: Develop OO workflows to stop/start WebMD applications and sites
Workflow key features
– Identify target servers
– Windows: stop/start Windows services and IIS sites
– Linux: stop/start applications and run any script if needed
– Send error/success email notices

Application Administration Automation

Users pick from the host types and environments permitted for their LDAP groups
Example: logged in as a consumer QA user
– Consumer users are NOT allowed to pick professional hosts
– QA users control QA environments only
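
The two rules in this example can be written as a small permission check. The deck implements access control with the JBoss Rules engine; the shell below is a simplified stand-in, and the group names (consumer, professional, qa, production, dev) and the mapping from environment to group are assumptions made for illustration.

```shell
#!/bin/sh
# has_group "space separated groups" NAME: succeed if NAME is in the list.
has_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# env_group ENVIRONMENT: map an environment to the group that may control it.
# Hypothetical mapping: qa00/qa01/qa02 -> qa, production -> production, rest -> dev.
env_group() {
    case $1 in
        qa*) echo qa ;;
        production) echo production ;;
        *) echo dev ;;
    esac
}

# can_access "groups" HOST_TYPE ENVIRONMENT
# The user needs the group for the host type AND the group for the environment.
can_access() {
    has_group "$1" "$2" && has_group "$1" "$(env_group "$3")"
}

# A consumer QA user may pick consumer hosts in qa00 ...
can_access "consumer qa" consumer qa00 && echo "consumer/qa00: allowed"
# ... but not professional hosts, and not production.
can_access "consumer qa" professional qa00 || echo "professional/qa00: denied"
can_access "consumer qa" consumer production || echo "consumer/production: denied"
```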

Application Administration Automation

Users click one of the action buttons
Example: the user clicks “Query Servers”

Application Administration Automation

The web application then triggers the corresponding HP OO workflow
OO workflows connect to HP SA with OCLI
HP SA takes actions on target hosts

Application Administration Automation

The OO workflow sends the result back to the middleware in XML format
The middleware parses the XML and displays the result in the GUI
[Screenshot: dmas qa00 server]

Application Administration Automation

Users receive email notices

Application Administration Automation

Application Administration workflows:
– Documentum Content Servers
– Documentum Application Servers
– ATS: WebMD proprietary content transformer
– PATS: WebMD proprietary content transformer
– Page Builder: WebMD proprietary content editor

Application Administration Automation

WebMD Content Servers
Decision: start or shutdown
Stop when query servers only
OCLI Query Servers based on portal, product, host type, and environment
Initiate variables based on portal
Start/stop SCS (HostsSSH)
Start/stop JMS (HostsSSH)
Start/stop doc base (HostsSSH)
Clean up doc base (HostsSSH)
Send email notice when finishes

Application Administration Automation

WebMD Application Servers
OCLI: query the server list from SAS
HostsSSH: run commands on each host in the list

Filter String
{device_servergroup_name equal_to "${portal}"} & {device_servergroup_name equal_to "${product}"} & {device_servergroup_name equal_to "${hostType}"} & {device_servergroup_name equal_to "${environment}"}

OCLI command
for i in `/opsw/api/com/opsware/server/ServerService/method/.findServerRefs:i filter='${filterString}'`;
do /opsw/api/com/opsware/server/ServerService/method/getServerVO self:i="$i";
done

Build Deployment Automation

Goal: Develop an OO workflow to build an RPM and deploy it to target servers
Workflow key features:
– Identify target servers, software policy, and RPM in HP SA
– Build the RPM and upload it to HP SA
– Stop/start applications on target servers
– Detach/attach software policies and remediate target servers
– Update the RPM in software policies

Build Deployment Automation

Workflow inputs:
– Portal
– Product
– Host Type
– Application
– Environment
– Build Version

Build Deployment Automation

Identify target servers
– Set up server groups in HP SA: portal groups, product groups, host type groups, and environment groups; then assign servers to the appropriate groups
[Screenshot labels: Host type group, Product group, Portal group, Environment group]

Build Deployment Automation

Identify target servers (cont.)
– Use the OO SSH operation to execute an OCLI command to get the SAS server list
• OCLI: findServerRefs and getServerVO in the server service
• Filter: use the aforementioned server groups as the filter

Filter String
{device_servergroup_name equal_to "${portal}"} & {device_servergroup_name equal_to "${product}"} & {device_servergroup_name equal_to "${hostType}"} & {device_servergroup_name equal_to "${environment}"}

OCLI command
for i in `/opsw/api/com/opsware/server/ServerService/method/.findServerRefs:i filter='${filterString}'`;
do /opsw/api/com/opsware/server/ServerService/method/getServerVO self:i="$i";
done
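
As a minimal sketch of how the filter string could be assembled from the workflow inputs before it is handed to the OCLI loop: the helper name `build_filter` and the sample group names are hypothetical, and only the clause syntax is taken from the slide. In the deck the substitution is done by OO variables rather than by a shell function.

```shell
#!/bin/sh
# build_filter GROUP...
# Joins one {device_servergroup_name equal_to "GROUP"} clause per argument
# with " & ", so a server must belong to every listed group.
build_filter() {
    filter=""
    for group in "$@"; do
        clause="{device_servergroup_name equal_to \"$group\"}"
        if [ -z "$filter" ]; then
            filter=$clause
        else
            filter="$filter & $clause"
        fi
    done
    printf '%s\n' "$filter"
}

# Hypothetical portal, product, host type, and environment group names.
filterString=$(build_filter "Consumer" "Documentum" "AppServer" "qa00")
echo "$filterString"

# Inside the HP SA global shell, the result would then feed the OCLI loop
# shown on the slide (not run here, since /opsw exists only in that shell).
```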

Build Deployment Automation

Identify software policy & RPM
– Software policy naming in HP SA: {Application} – {Environment}
– Use the findSoftwarePolicyRefs OCLI command to identify the software policy
– Use the findRPMRefs OCLI command to identify the RPM

Build Deployment Automation

Build RPM and upload it to HP SA
– Required parameters: application and build version
– A Perl application on Apache builds the RPM
– The client sends an HTTP request with parameters to trigger the Perl application
– The RPM is uploaded to HP SA with OCLI 1.0
– The result is returned to the client
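
A rough sketch of the client side of this step, assuming the two required parameters travel as URL query parameters. The endpoint path, the parameter names `application` and `version`, and the helper names are all hypothetical; the slides only say that an HTTP request with parameters triggers the Perl application.

```shell
#!/bin/sh
# is_url_safe VALUE: accept only letters, digits, dot, underscore, and hyphen,
# so the value can be placed in a query string without URL-encoding.
is_url_safe() {
    case $1 in
        ''|*[!A-Za-z0-9._-]*) return 1 ;;
        *) return 0 ;;
    esac
}

# build_request_url BASE_URL APPLICATION BUILD_VERSION
build_request_url() {
    is_url_safe "$2" || { echo "bad application: $2" >&2; return 1; }
    is_url_safe "$3" || { echo "bad build version: $3" >&2; return 1; }
    printf '%s?application=%s&version=%s\n' "$1" "$2" "$3"
}

# Hypothetical build server endpoint and inputs.
url=$(build_request_url "http://buildserver.example.com/cgi-bin/build_rpm.pl" \
                        "pagebuilder" "1.2.3")
echo "$url"

# Sending the request would look like this (not run here):
#   curl -fsS "$url"
```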

Build Deployment Automation

Stop/start applications on target servers
– Use the “HostsSSH: run Linux commands in a list of hosts” utility workflow to run the stop/start command on target hosts
Detach/attach software policies and remediate target servers
– Use OO out-of-box operations
Update RPM in software policies
– Use the OCLI update command in the software policy service to replace the RPM in the target software policy

Build Deployment Automation

Put it all together!
Build and upload RPM
Identify SP, RPM, and target servers
Start/stop application in target servers
Detach/attach SP, replace RPM in SP, and Remediate
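
The combined flow can be sketched as a fail-fast sequence of steps. The exact ordering below (stop the application, swap the RPM in the software policy, remediate, then start again) is an assumption; the slide names the stages but the diagram's arrows did not survive. `run_step` is a hypothetical stub that only logs, standing in for the OO operations and OCLI calls.

```shell
#!/bin/sh
# Stub for a real OO operation or OCLI call; replace with actual work.
run_step() {
    echo "running: $1"
}

# deploy_build: run every stage in order and abort on the first failure,
# so a failed RPM swap never leads to an application start.
deploy_build() {
    for stage in \
        "build and upload RPM" \
        "identify SP, RPM, and target servers" \
        "stop application on target servers" \
        "detach SP" \
        "replace RPM in SP" \
        "attach SP and remediate" \
        "start application on target servers"
    do
        run_step "$stage" || { echo "aborted at: $stage" >&2; return 1; }
    done
}

deploy_build
```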

Unattended WebMD Content Backup

Goal: Develop two OO workflows:
1. Shut down all components and back up WebMD contents
2. Bring all components up
Workflow key features:
– Identify target servers
– Windows: stop/start Windows services and IIS sites
– Linux: stop/start applications and run any script if needed
– Send error/success email notices
– Use the OO scheduler to trigger the cold backup
– The first workflow sets up another schedule to trigger a second flow that brings all components back up

Unattended WebMD Content Backup

Workflows Overview
Flow 1:
1. Shut down all components
2. Run file backup
3. Run DB backup
4. Schedule another flow (flow 2) to start all components
Flow 2:
1. Check backup status
2. Start all components
60 min
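
The hand-off between the two flows can be sketched with a status file. Several things here are assumptions rather than facts from the deck: the status file and its values, the reading of the "60 min" label as the delay before flow 2, and the choice not to start components when the backup status is anything other than ok. `run_step` and `schedule_flow2` are hypothetical stubs for the OO operations and the OO scheduler.

```shell
#!/bin/sh
# Where flow 1 records the backup outcome for flow 2 (hypothetical location).
STATUS_FILE=${STATUS_FILE:-/tmp/cold_backup.status}

run_step() { echo "running: $1"; }                        # stub for an OO step
schedule_flow2() { echo "flow 2 scheduled in $1 min"; }   # stub for the OO scheduler

# Flow 1: shut down, back up files and DB, record the outcome, schedule flow 2.
flow1() {
    echo running > "$STATUS_FILE"
    if run_step "shut down all components" &&
       run_step "file backup" &&
       run_step "DB backup"
    then
        echo ok > "$STATUS_FILE"
    else
        echo failed > "$STATUS_FILE"
    fi
    schedule_flow2 60
}

# Flow 2: check the backup status first, then start all components.
flow2() {
    status=$(cat "$STATUS_FILE" 2>/dev/null || echo missing)
    if [ "$status" = ok ]; then
        run_step "start all components"
    else
        echo "backup status is '$status'; components not started" >&2
        return 1
    fi
}
```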

Maintenance Free System

Goal: Proactively maintain the health of our applications without shutting them down
Workflow key features:
– Automatically clears cache and stale data without shutting down or restarting applications
– Purges outdated publishing data and logs
– Ensures that the most relevant information is retained
– Improves both system-level and publishing performance
– Minimizes the need for frivolous restarts
– Keeps our applications online longer

Maintenance Free System

Workflow details
– Single SSH node
– Runs a script to purge data/log files older than 3 days
– Runs on the OO scheduler once a day
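
A purge script of the kind described might look like the sketch below. The actual script is not shown in the slides, so the function name and the use of `find` are assumptions; the directory in the usage comment is hypothetical.

```shell
#!/bin/sh
# purge_old_files DIR [DAYS]
# Deletes regular files under DIR that are older than DAYS days (default 3).
# Note: find's "-mtime +3" counts whole 24-hour periods and ignores any
# fraction, so it matches files last modified at least 4 x 24 hours ago.
purge_old_files() {
    dir=$1
    days=${2:-3}
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    find "$dir" -type f -mtime +"$days" -exec rm -f {} +
}

# Example (not run here): purge_old_files /var/log/publishing 3
```

In the deck the daily trigger is the OO scheduler running this through a single SSH node; the script itself stays independent of how it is scheduled.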

Results from HP SA/OO Implementation

Better Life in cmsops - 1

Sampling duration: Oct 11, 2007 – Jul 24, 2009
653 days/426 working days
Source: customized TeamTrack reports and emails
Summary
– 1772 TeamTrack requests
– 479 email requests*
– 5.3 tickets/working day

Sampling duration: Jul 25, 2009 – Dec 10, 2009
135 days/93 working days
Source: customized TeamTrack reports and emails
Summary:
– 248 TeamTrack requests
– 35 email requests (reduced by 35%)
– 3.1 tickets/working day
– 285 cmsai requests (self-service)

Better Life in cmsops - 2

Non-prod environments are self-serviceable
15% of build deployment is automated
Automatic/scheduled data/log purging
Scheduled/unattended cold backup*
Q/A
To learn more on this topic, and to connect with your peers after the conference, visit the HP Software Solutions Community: www.hp.com/go/swcommunity