
Page 1: Contemporary Report - NM DoIT SHARE... · Web viewCompletion of Installation and Configuration of Oracle 10g Products, including Oracle Partitioning on two tables and Oracle Tuning,

Attachment 1A

Infrastructure Optimization

Recommendations for:

SHARE: New Mexico Statewide Human Resource, Accounting, and Management REporting System


Table of Contents

Executive Overview
Introduction
Technology Recommendations
    Discussion of Current Environment
    Performance
    Business Continuity
    Manageability
    Information Lifecycle Management
    Data Security
Conclusion
Appendix A - Oracle Solutions and Services
Appendix B - Completed Discovery Questionnaire
Appendix C - Draft Database Architecture

Page 2 of 55 New Mexico and Oracle Confidential


Executive Overview

The State of NM (SoNM) SHARE project team and Oracle have jointly engaged in a technology assessment. This is a complimentary program designed to help our strategic customers fully leverage their investment in technology and utilize it according to industry best practices. The purpose of this document is to present findings and high-level recommendations to help the SoNM meet its business objectives.

In June and July of 2008, the Oracle and SoNM teams jointly analyzed the SHARE infrastructure in an effort to optimize the environment and create a short, medium, and long-term management strategy.

The Assessment focused on the following areas:

Performance
Business Continuity (High Availability & Disaster Recovery)
Manageability
Information Lifecycle Management
Data Security

Oracle performed a comprehensive assessment of the current environment, then identified and prioritized a list of actionable and executable recommendations for SoNM to improve their infrastructure strategy. This effort provides detailed technology and architecture recommendations, and an implementation roadmap.

The Assessment team spent one day conducting in-person interviews with SoNM’s subject matter experts to fully understand the architecture, technology, policies, and practices of the current PeopleSoft environment. Following the discovery workshop, the Oracle team, with the assistance of the SHARE team, spent the next 4 weeks analyzing the discovery findings, researching solutions, and formulating recommendations. Our approach to developing solutions had three parts:

Including a perspective on industry trends, peer best practices, and regulatory and compliance issues.

Leveraging SoNM’s currently owned products, existing systems and processes, and in-house technology skills.

Providing a phased implementation roadmap that allows for modular successes and sustainable long-term growth.

The following tables summarize the recommendations for each technology area, the pains they address, and some of the key business drivers and value gained.


SHARE Solution Recommendations

Performance

Current Pains:
Performance is unacceptable
Most calls come from users not being able to login – application is unavailable
NMDOT expansion concerns
Application Deadlocks are common
Storage configuration is a concern (currently RAID 5)
Configuration disparity between Development / Test / QA / Production
Processes that run efficiently in test take many hours in production
Dev / Test / QA / Production environments exist but they’re not used in the traditional way
A Training environment does not exist and is needed
No formalized Change Management Process
Database testing is problematic due to resource constraints
Batch processing takes long
NM owns Partitioning but it is currently not being used
Reports often snowball into a queue of long running reports that have to be manually canceled
Citrix environment is unstable, often not accessible or slow (being addressed by ABBA technologies)
Juniper load balancers can be, and have been, a bottleneck (being addressed by ABBA technologies)

Solution / Technology Recommendations:
Investigate and re-design storage architecture
Utilize holistic tools to proactively determine bottlenecks from the end-user desktop to the backend database
Formalize and document Service Level Agreements
Establish a Configuration Management Database and establish system baselines
Upgrade to PeopleTools 8.48 or above
Implement a complete Production Support infrastructure – dedicated Development / Test / QA / Training environments
Establish an Enterprise-wide Change Management Process including thorough testing
Implement Partitioning

Business Value:
SHARE is running at optimal rates increasing user productivity and end-user satisfaction
Ability to quickly respond and proactively fix problems, increasing user productivity
Enable SHARE to accommodate growth and new user requirements (NM DOT)

Business Continuity

Current Pains:
Most help desk calls come from users not being able to login, application is unavailable
No coordinated effort for backing up the complete system
A clear uptime SLA does not exist: The expectation is 24 x 7, 365, but realistically, the model is “6 – 16’s;” 6 a.m. to 10 p.m., Monday – Saturday
roswell-a was intended to be a Q/A server but is now serving a mission critical service as a file-based report repository
It is unknown how long it would take to recover from a complete service loss
Single Points of Failure (SPFs): DS4800 Roswell-a Process Schedulers Tucumcari
Failing process can cause orphan records which have to be manually deleted
Any major datacenter outages are long to recover
At one time, the system was so overloaded that the database crashed
Citrix environment is unstable, often not accessible or slow (The system is currently being redesigned to address these issues)

Solution / Technology Recommendations:
Formalize and document Service Level Agreements
Implement a modern grid architecture strategy
Utilize automatic data replication and data recovery tools
Eliminate single points of failure
Modernized DR plan with regular testing
Automate error prone and time consuming maintenance operations
Implement a Master Process Scheduler

Business Value:
Ability to meet future high availability business requirements (24/7) and service level agreements
Ensuring that SHARE will quickly be available after a major outage

Manageability

Current Pains:
No automated configuration management
No complete system central control
Developers doing their own code backup
A standard development methodology does not exist
No coordinated effort for backing up the system
Dev / Test / QA environments exist, but they are not used in the traditional way
Limited formalized Change Management Process
Oracle Databases are inconsistent across environments
People can change the system without a tracking mechanism in place
On the technical side, there are no formal documents that travel with the process
NMDOT’s system will impact size on Projects (Project tables are larger in general and can grow exponentially)
No monitoring of the Web / Application Tier
Database has not been patched for over a year due to Maximus requirements

Solution / Technology Recommendations:
Utilize DB automation tools that provide enterprise view of the system
Create a configuration management database & leverage current tools to automate configuration tracking
Design and implement a structured approach to software development and release management
Implement Partitioning
Have separate & dedicated environments closely reflect production
Utilize diagnostics and tuning tools that determine where bottlenecks are and proactively monitor performance

Business Value:
Freeing up highly skilled resources to work on higher value tasks
Prevention of user mistakes
Highly trained staff completing manual tasks faster
Proactively avoiding potential production problems
Ability to respond and fix problems more quickly increasing user productivity
Maintaining control over the software development lifecycle (SDLC) and ensuring consistency across applications
Enforcing standards results in reduced complexity and troubleshooting costs


Information Lifecycle Management

Current Pains:
No archival system in place
The complete database is replicated throughout SHARE’s different environments
1 – 5% a month data growth
NM owns Partitioning but it is currently not being used
NMDOT Project tables will add significantly more data
Future plans for Enterprise Performance Management (EPM) demand an archival strategy

Solution / Technology Recommendations:
Formally document data retention policies
Implement Partitioning to facilitate and automate archiving
Redesign storage tier to make use of lower cost storage devices for historical data
Define retention policies
Define the Data Classes
Create Storage Tiers for the Data Classes
Create Data Access and Migration Policies
Define and Enforce Compliance Policies

Business Value:
Ensure SHARE’s ability to scale and meet changing business requirements
Enable EPM reporting needs by keeping historical data close to production data but eliminating any performance impact
Lower the TCO of data retention while raising the ROI of historical data

Data Security

Current Pains:
SQL Developer and customer queries are often run directly against the production database without testing
Production data is replicated to all environments (Dev / Test / QA)
Developers develop in the production database

Solution / Technology Recommendations:
Implement sensitive information handling policies that include development, storage and transmission
Implement database encryption and backup tape encryption
Implement a masking policy to hide any sensitive data

Business Value:
Compliance with regulatory and privacy laws
Ensure protection of sensitive data

Performance Recommendations:
In-depth study of the i/o subsystem and storage design
In-depth study of the WebLogic thread issues
Implement a tool to measure real-user performance
Upgrade to PeopleTools 8.48 or greater
Formalize and Document Service Level Agreements
Establish holistic monitoring procedures
Establish a Configuration Management Database
Implement a complete Production Support infrastructure – dedicated Development / Test / QA / Training environments
Establish an Enterprise-wide Change Management Process including thorough testing
Implement Partitioning (you own it, use it)

Business Continuity Recommendations:
Formalize and Document Service Level Agreements
Design, Test, Implement a modern Business Continuity plan
Automate error prone and time consuming maintenance operations
Eliminate Single-Points-of-Failure, particularly at the storage tier
Implement a Master Process Scheduler
Maintain distinct support environments

Manageability Recommendations:
Utilize DB automation tools that provide enterprise view of the system
Create a configuration management database & leverage current tools to automate configuration tracking
Design and implement a structured approach to software development and release management
Implement Partitioning
Have separate & dedicated environments closely reflect production
Utilize diagnostics and tuning tools that determine where bottlenecks are and proactively monitor performance

Information Lifecycle Management Recommendations:
Formally document data retention policies
Implement Partitioning to facilitate and automate archiving
Redesign storage tier to make use of lower cost storage devices for historical data
Define the Data Classes
Create Storage Tiers for the Data Classes
Create Data Access and Migration Policies
Define and Enforce Compliance Policies

Data Security Recommendations:
Implement sensitive information handling policies that include development, storage and transmission
Implement database encryption and backup tape encryption
Implement a masking policy to hide sensitive data as it is moved from environment to environment

Based on SoNM’s aforementioned business priorities, prerequisites, typical implementation timelines, and SoNM’s bandwidth, the following roadmap is proposed.


High-Level Roadmap

Recommendations

Performance
1. Storage Assessment and Design
2. Investigate and fix WebLogic bottlenecks
3. Define SLAs & KPIs
4. PeopleTools Upgrade
5. Implement CMDB
6. End-to-End Monitoring
7. Partitioning
8. Maintain distinct support environments

Business Continuity
1. Design, Test, Implement BC Plan
2. Eliminate SPoF
3. Master Process Scheduler
4. Automate maintenance procedures

Manageability
1. Formalize Change Management

Information Lifecycle Management
1. Define Data Classes, Access, Migration and Retention Policies

Data Security
1. Implement sensitive info handling policies
2. Implement Database Encryption
3. Implement Database Masking

[Figure: High-Level Roadmap chart placing the numbered recommendations on axes labeled Complexity (Low to High) and Value (Low to High), with regions marked “Targets” and “Secondary Targets”. Items with a black border overlap in various areas.]

We recognize that implementing any of the recommendations discussed in this paper might only be possible after further analysis and support. The Oracle team stands ready to continue working with SoNM on developing specific project plans, performing ROI analysis, building proofs of concept, and continuing our partnership and mutual education around key IT areas.

We look forward to receiving your impressions and response to our findings and trust you will find the Assessment process valuable to you.


Introduction

The State of New Mexico (SoNM) and Oracle have jointly engaged in a technology assessment. This is a complimentary program designed to help our strategic customers fully leverage their investment in technology and utilize it according to industry best practices. The purpose of this document is to present findings and recommendations, high-level solutions and roadmaps to help the SoNM meet its business objectives.

The Oracle Team would like to mention that we recognize that vendors are not often invited to participate directly with the SoNM on a strategic level, and we very much appreciate this opportunity to assist with this critically important initiative.

The Oracle Team would like to thank all the employees of the SoNM who spent time with us and provided information necessary to create this document. We fully enjoyed sharing ideas and discussing IT needs in an open, collaborative environment.

State of New Mexico Team members

Name              Title / Role                                                         Email              Phone
John Prihoda      SHARE Development Director                                           [email protected]  505.476.7629
Kelly Mentzer     Deputy Director Enterprise Operations & Support Services Division    [email protected]  505.827.2105
George Robinson   PeopleSoft Development Manager                                       [email protected]  505.827.3917
David Emery       Database Administrator                                               [email protected]  505.827.2137
Alex Canett       PeopleSoft Administrator                                             [email protected]  505.827.3969
Karla Fernandez   PeopleSoft Technical Analyst                                         [email protected]  505.827.3686

Oracle Team members

Name              Title / Role                        Email              Phone
Chris LeBaron     Account Manager                     [email protected]  801.803.1602
Mike Baraiolo     Regional Vice President, Midwest    [email protected]  702.592.1525
Jeremy Forman     Principal Solution Architect        [email protected]  602.333.9037
Chris Wilson      Consulting Sales Representative     [email protected]  916.716.6900
Stephen Stewart   Consulting Technical Director       [email protected]  865.805.7067
Byron Pearce      Consulting Solution Manager         [email protected]  817.271.4664

Technology Recommendations

This section gives detailed analysis and recommendations on how to best address the technological focus areas and topics that lead to a comprehensive approach to SHARE infrastructure optimization. Through the assessment discovery and interview process, the Oracle team captured and analyzed SoNM’s current capabilities, challenges, goals and visions. The recommendations discussed in this section include both technology and policy initiatives. It is important to highlight that investments in technology are not enough, and a robust strategy for infrastructure optimization includes People, Process, and Technology.

Discussion of the current environment:

From the SHARE web site:

The SHARE project is the State of New Mexico's PeopleSoft enterprise resource planning (ERP) system implementation.  SHARE will combine the State's accounting, human resource, payroll, and purchasing functions into an integrated statewide system.  The Project's goal is to develop processes for the State's basic financial and human resource functions that are economical, efficient, and effective.

SHARE will enable the State of New Mexico to achieve more timely accountability to the public, more strict compliance with statutes and laws related to fiscal matters, reduction of duplicate effort, and improve relations with those supplying the State with goods and services.

The SHARE system is an implementation of the PeopleSoft Finance (FIN) and Human Capital Management (HCM) modules. Within the HCM module, Payroll and Time and Labor are the mission critical modules, with Recruitment being extremely important but not considered mission critical. The mission critical function within the Finance module is the complete process of paying vendors. Although this is a mission critical function, it does have a little more leeway in terms of downtime than does Time and Labor or Payroll.

The SHARE system is founded on the PeopleSoft Pure Internet Architecture (PIA):


[Figure: PeopleSoft Pure Internet Architecture (PIA). Labeled components: browsers (HTML, WML, XML/SOAP, SOAP Sync); HTTP; WebLogic Server with a Java servlet engine (portal, presentation, integration, and sync servlets); TUXEDO business logic server (services, security, rules, integration, presentation, directory services); Oracle server (instance, application tables, metadata, SQL); background Process Scheduler.]

The SHARE system is owned by the Department of Finance and Administration (DFA), but infrastructure and database support are provided by the Department of Information Technology (DOIT). Clear Service Level Agreements do not exist for system performance, business continuity, or the SHARE end-user experience; however, the original project plan does call for an SLA requiring 99.7% System Availability for data center network, power, and environmental controls. This SLA is open to interpretation because it does not specify the PeopleSoft system components themselves, but it does imply that the SHARE system will be available for just under 364 days per year (about 363 days, 21.7 hours). This SLA is between DOIT and DFA, but, again, nothing is specific to the SHARE end user and their experience.
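The availability arithmetic behind the 99.7% figure can be checked with a short calculation. The following is a minimal sketch, assuming a 365-day year and that the percentage applies to the whole year; the function name `availability_budget` is illustrative and not part of any SHARE tooling.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365  # assumption: non-leap year


def availability_budget(sla_percent: float, days: int = DAYS_PER_YEAR):
    """Return (whole available days, remaining hours, allowed downtime hours)."""
    total_hours = days * HOURS_PER_DAY
    downtime_hours = total_hours * (1 - sla_percent / 100)
    available_hours = total_hours - downtime_hours
    whole_days, remainder_hours = divmod(available_hours, HOURS_PER_DAY)
    return int(whole_days), remainder_hours, downtime_hours


if __name__ == "__main__":
    days, hours, downtime = availability_budget(99.7)
    print(f"Available: {days} days, {hours:.1f} hours")
    print(f"Allowed downtime: {downtime:.1f} hours per year")
```

Expressed this way, the SLA leaves a downtime budget of a little over one day per year, which is a useful number to carry into any discussion of an end-user-facing SLA.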

At present, the Department of Transportation has requested specific changes and augmentations to the SHARE system to support their unique business requirements. At the time of this writing, it is believed that the SHARE system will acquire and incorporate a PeopleSoft solution developed for and used by the Wyoming Department of Transportation to support specific transportation business processes. The criticality of this project is extremely high due to the Federal dollars associated with the success of the provided functionality.

The current SHARE environment, including all nodes, physically consists of approximately 17 IBM servers divided into 22 logical servers, one IBM DS4800 Storage System (other storage infrastructure was not provided), and four Juniper DX 3250-S-2C load balancers.

The production PeopleSoft server environment is as follows:

2 IBM p5 550Q – Application & Web Servers (taos & gallup)
1 IBM x336 – File Server (nambe)
2 IBM x336 – Citrix Servers (cerrillos & chama)
2 IBM p5 570 with 4 Logical Partitions (LPAR)
    2 Database Servers (santafe-a, santafe-b)
    1 Report Repository (roswell-a)
    1 Enterprise Manager Grid Control (hatch)


The Development / Test / QA PeopleSoft environments are as follows:

2 IBM p5 550Q – Application & Web Servers (clovis & deming)
2 IBM x336 – Non-Production File Servers (pecos & pojoaque)
2 IBM x336 – Non-Production Citrix Servers (chimayo & cordova)
1 IBM x336 – UPK Server (alamos)
1 IBM x336 – Upgrade Server (vegas)
2 IBM p5 570 with 3 Logical Partitions (LPAR)
    3 Database Servers (roswell-a, roswell-b, hobbs)

Recommendations

1) Performance (& Scalability)

The SHARE system supports approximately:

24000 Time Entry Users
4000 Full-time Financial (FIN) Users
1000 Full-time Human Resource (HR) Users

It is projected that over time, the number of HR users will remain flat but the number of FIN users will increase; considerable growth is not expected. In general, aside from a couple days a week, FIN experiences a higher level of concurrency than does HR.

Performance bottlenecks have been an ongoing issue. Since the go-live date in July of 2006, performance has been, according to end users and SHARE staff, decidedly unacceptable. The majority of help desk calls originate from users who are not able to log in to the system due to system overload. Throughout the assessment, Performance has been the major focus area.

The team took a holistic approach to identifying potential problems and areas of concern. Every tier of the system, except for the client, was considered, including the Application/Web Servers, the Database Servers, and the Storage system. Although an exhaustive hands-on evaluation was not part of this assessment’s scope, we are confident we have identified crucial bottlenecks across the system.

Disk Utilization Analysis

The SHARE team is currently in the process of redesigning the storage subsystem and was not able to provide detailed storage configuration information on the current or proposed architectures. Since the storage subsystem is a critical piece to any IT system and considered by both SHARE and Oracle to be a likely culprit of the ongoing performance issues, the Oracle team is concerned with the fact that the old and new architectures were not part of this assessment.

With the support of the DOIT database administration staff, we were able to run some utilities, specifically sar reports, in an attempt to identify disk problem areas. The sar disk utilization data from santafe-a indicates that there are, indeed, storage design issues. Because specific disk configuration data was not supplied, these issues may not be straightforward to diagnose and resolve. Figure 1 charts the sar data for hdisk 11 around the 3:15 pm time frame (similar issues were found on other disks, specifically hdisk 2 and hdisk 12):


Notice from the chart that the disk utilization is always upwards of 90%, with an average of 91% over this time frame. Also, note that the average service time is very small, in the range of 4-15 milliseconds. This alone indicates that the i/o subsystem is being forced to perform some type of overhead operation that drives the utilization so high.

Additionally, the number of Reads/Writes/sec and Blocks/sec is not sufficient to drive the disk utilization to 90% without some obvious storage design issues. There are numerous possible causes of this design issue, from stripe size design to operating system, RAID, and SAN configuration. These issues require an in-depth study of the i/o subsystem and storage design by a qualified technical resource; part of any study should include a detailed plan to resolve these issues.
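One way to read this argument is through the utilization law from queueing theory: a device's busy fraction should roughly equal its I/O rate multiplied by its average service time. The sketch below applies that rule as a sanity check; the sample I/O rate is hypothetical (the report does not list the measured Reads/Writes/sec), and the function names are illustrative only.

```python
def expected_utilization(ios_per_sec: float, avg_service_ms: float) -> float:
    """Utilization law: busy fraction = throughput * mean service time."""
    return ios_per_sec * (avg_service_ms / 1000.0)


def utilization_gap(observed_busy_pct: float, ios_per_sec: float,
                    avg_service_ms: float) -> float:
    """Percentage points of %busy not explained by the measured I/O rate."""
    expected_pct = 100.0 * expected_utilization(ios_per_sec, avg_service_ms)
    return observed_busy_pct - expected_pct


if __name__ == "__main__":
    # Hypothetical sample: 91% busy, 40 I/Os per second, 10 ms service time.
    gap = utilization_gap(91.0, 40.0, 10.0)
    print(f"Unexplained utilization: {gap:.0f} percentage points")
```

A large positive gap, as in this made-up sample, would suggest the disk is busy with work that the reported reads and writes do not account for, which is consistent with the storage design concerns raised above.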

Application & Web Servers

An Oracle PeopleSoft expert was brought into the Oracle team to review the SHARE Application and Web Server configurations and identify any irregularities or areas of concern. For instance, lack of memory resources on the application server is the most common cause of serious online performance issues. If users are experiencing more than 4 seconds of response time for every web page refresh, it is very likely a problem with application server memory. These problems are most likely caused by "memory swapping". These are the same problems you experience when trying to run Microsoft PowerPoint, Word and Excel on a PC with too little available memory.

After reviewing the files provided by the SHARE team, it was concluded that the overall configuration and setup of the Application and Web Servers appear to be appropriate and in line with other PeopleSoft implementations of like size. However, a major problem was identified in the web server logs. The log files contain evidence that the WebLogic threads (processes) are getting stuck, most likely consuming most, if not all, of the CPU and memory resources on the server; this can have numerous causes, and further research is required. The fact that these threads are getting stuck leads the Oracle team to believe that this could be a major contributing factor to users not being able to log in to the SHARE system and to an overall poor end-user experience.
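As a starting point for that further research, the web server logs can be scanned to see how often, and on which threads, stuck-thread messages appear. The following is a rough sketch only: the matching patterns and the sample log lines are assumptions for illustration and should be checked against the actual message format in the SHARE WebLogic logs.

```python
import re
from collections import Counter

# Assumption: stuck-thread messages contain "STUCK" or "has been busy for".
# Confirm the wording against the real log files before relying on this.
STUCK_PATTERN = re.compile(r"STUCK|has been busy for", re.IGNORECASE)
THREAD_PATTERN = re.compile(r"ExecuteThread: '(\d+)'")


def count_stuck_threads(lines):
    """Count suspected stuck-thread messages per execute-thread number."""
    counts = Counter()
    for line in lines:
        if STUCK_PATTERN.search(line):
            match = THREAD_PATTERN.search(line)
            thread_id = match.group(1) if match else "unknown"
            counts[thread_id] += 1
    return counts


if __name__ == "__main__":
    sample = [  # hypothetical log lines, not taken from the SHARE servers
        "<Error> <ExecuteThread: '14' for queue: 'default' has been busy for \"620\" seconds>",
        "<Info> <HTTP> <request completed>",
        "<Error> <ExecuteThread: '14' for queue: 'default' has been busy for \"1240\" seconds>",
        "<Error> <ExecuteThread: '3' for queue: 'default' has been busy for \"610\" seconds>",
    ]
    for thread_id, hits in count_stuck_threads(sample).most_common():
        print(f"ExecuteThread {thread_id}: {hits} stuck-thread message(s)")
```

Correlating the timestamps of such messages with help desk login complaints would help confirm or rule out the suspected link between stuck threads and failed logins.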

Database Tier

Based on the supplied Remote Diagnostic Assistant (RDA) output that was shared with the Oracle team, there were a number of areas where Oracle found opportunities for increasing performance:

A number of kernel parameters require fine-tuning to enhance system performance. For instance, tcp_sendspace, tcp_recvspace, and minperm% are among the parameters that are set incorrectly, and further analysis is needed to establish the optimal settings.

The FIN and HCM databases are averaging around 40% of their time waiting for disk I/O operations to complete. This is a substantial number and is very alarming.

Row-level lock contention is occurring. There are a few possibilities here, but there is a distinct possibility that this is a result of the application layer and the hanging WebLogic threads as well as service times on the I/O layer not allowing a lock to be freed up by process A before process B needs it. This is occurring about 2% of the time but could be serializing numerous transactions, forcing transactions to run slower by single-threading.

SQL commands were found that are consuming a fair amount of database time, 5+% on a single execute, that should be examined to determine if there are tuning opportunities in their operations.

Wait on the network interconnect between santafe-a and santafe-b is occurring to a limited extent; this needs to be researched to determine whether there are genuine interconnect issues or whether this Wait is a side effect of the aforementioned Application and disk I/O issues.

Performance Monitoring

Despite the aforementioned performance issues across the system’s multiple tiers, little monitoring is being conducted. Although the database administration team is proactively monitoring the database environment, it appears that the Application and Web Server tier as well as the disk tier are left mostly unmonitored. The ability to perform root-cause analysis across the tiers, gain real insight into the cause of a problem or bottleneck, and resolve it quickly does not exist. SoNM currently lacks a consolidated, single view of the SHARE ecosystem. Database Administrators at SoNM currently use tools that manage individual databases, but the tools necessary to manage the total environment are not currently in place.

It is recommended that SoNM implement a centralized management environment, and in particular a Configuration Management Database, where all components of the PeopleSoft architecture are readily accessible for monitoring. A Configuration Management Database will form the centerpiece of SoNM’s ability to manage configurations and automate IT processes. It captures and centralizes the information about all hardware and software resources in the enterprise, thereby facilitating the diagnosis of problems, automation of processes and compliance with regulatory and industry standards. The proactive evaluation of configuration against best practices, aided by comprehensive reporting and powerful analytics, ensures SoNM’s efficient use of IT resources, faster problem resolution and, ultimately, an improved Quality of Service.
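To illustrate the kind of check a configuration baseline makes possible, the sketch below compares the recorded settings of one environment against another and reports every difference. It is a simplified illustration, not a description of any particular CMDB product: the setting names and values are hypothetical, and in practice the inputs would come from the configuration repository.

```python
def config_drift(baseline: dict, candidate: dict) -> dict:
    """Return {setting: (baseline_value, candidate_value)} for every mismatch."""
    drift = {}
    for key in sorted(set(baseline) | set(candidate)):
        base_value = baseline.get(key, "<missing>")
        cand_value = candidate.get(key, "<missing>")
        if base_value != cand_value:
            drift[key] = (base_value, cand_value)
    return drift


if __name__ == "__main__":
    # Hypothetical settings; real values would come from the CMDB.
    production = {"db_version": "10.2.0.3", "partitioning": "off", "sga_mb": 8192}
    test = {"db_version": "10.2.0.1", "partitioning": "off"}
    for setting, (prod, other) in config_drift(production, test).items():
        print(f"{setting}: production={prod} test={other}")
```

Run regularly, a comparison like this would surface the configuration disparity between Development, Test, QA, and Production that was reported during discovery.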

Diagnosing a slowly performing system, such as a payroll report, is a time consuming task often surrounded by myths and legends, few of them based on fact. Being able to answer questions such as “Why is the system slower today than it was last week or last hour?” is difficult when only database statistics are available. A tool that provides SoNM administrators with a self-diagnostic engine to automatically diagnose performance problems will liberate the administrators from this complex and arduous task. Once performance issues are detected by the monitoring tool, it is important that the findings be presented to SoNM administrators in a convenient and intuitive fashion, and guide them step-by-step to quickly resolve performance problems.

Another feature of a comprehensive diagnostics and tuning tool is the ability to provide SoNM with advanced event notification. Early problem detection is the key to ensuring timely resolution. When administrators are away from their desks, the tool needs to be able to send event alerts via modern communication channels (e.g., email or pager). Once such an infrastructure is in place, SoNM’s IT staff will have at their disposal a wealth of tools and procedures that will help them manage the SHARE environment end to end.

Measuring Real-User Performance

Typically, in order to control the performance and success of a PeopleSoft implementation like SHARE, a number of Key Performance Indicators (KPIs) are defined. The most commonly used KPIs are:

- End-to-end response times of pages and individual objects (URLs)
- Server and internet/network response time
- Error messages (both technical and functional)
- End-user behavior, number of aborts, and so on

To follow up on the above mentioned KPIs, and to obtain a realistic view of the status of the Web applications, SoNM needs to measure real-user experience. To make this possible, a number of requirements can be identified.


[Figure: Real User Insight]

First, traffic from all users must be monitored. This is very important because the information is used to cross-reference SHARE performance data with user groups which, in turn, can be grouped by location, department, type, status, and so on. Second, the data collection method must be 100% non-intrusive. It is not acceptable to disturb the current service by adding extra load on a Web server, or by installing software agents that will affect performance. Furthermore, SHARE does not want to change the current application or infrastructure. When an application change is deployed, or when an additional Web server is added, there must be no (or very limited) changes to the monitoring environment.
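As an illustration of how the KPIs listed earlier could be cross-referenced with user groups, the following minimal Python sketch summarizes passively captured request records. The record layout `(user_group, url, seconds, ok)` and the function name `kpis_by_group` are assumptions made for this example; they do not describe any particular monitoring product.

```python
from collections import defaultdict
from statistics import mean


def kpis_by_group(records):
    """Summarize captured requests per user group.

    records: iterable of (user_group, url, seconds, ok) tuples, a
    hypothetical layout for passively captured traffic.
    Returns {group: {"requests", "avg_seconds", "error_rate"}}.
    """
    grouped = defaultdict(list)
    for group, _url, seconds, ok in records:
        grouped[group].append((seconds, ok))

    summary = {}
    for group, rows in grouped.items():
        times = [seconds for seconds, _ in rows]
        errors = sum(1 for _, ok in rows if not ok)
        summary[group] = {
            "requests": len(rows),
            "avg_seconds": mean(times),
            "error_rate": errors / len(rows),
        }
    return summary
```

Because the summary is computed from captured traffic only, nothing in this sketch requires an agent on the Web server, which is consistent with the non-intrusive requirement above.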

Process Scheduler Performance

Master Scheduler for Load Balancing:

A Master Scheduler enables load balancing of workload by automatically routing requests to available Process Scheduler servers, ensuring the maintenance of optimal processing at all times. In addition, an active Master Scheduler manages and controls all Process Scheduler server domains that are on the same PeopleSoft database. It enforces all of the rules that are specified in either the process or job definitions, and monitors the running of all processes. It becomes the centralized control as it checks the Process Request table looking for any queued requests to run, and then dispatches them to an appropriate available Process Scheduler server.
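The dispatching behavior described above can be sketched conceptually in Python. This is not PeopleSoft code: the server names, the `active`/`max` fields, and the least-loaded selection rule are assumptions chosen only to illustrate centralized routing of queued requests to available servers.

```python
def dispatch(queued_requests, servers):
    """Route each queued request to the least-loaded available server.

    servers: dict mapping a (hypothetical) server name to
             {"available": bool, "active": int, "max": int}.
    The "active" counts in `servers` are updated in place.
    Returns (assignments, still_queued).
    """
    assignments = []
    still_queued = []
    for request in queued_requests:
        candidates = [
            name for name, s in servers.items()
            if s["available"] and s["active"] < s["max"]
        ]
        if not candidates:
            # No capacity anywhere: the request stays queued for a later pass.
            still_queued.append(request)
            continue
        # Fewest active jobs wins; ties are broken by server name.
        target = min(candidates, key=lambda n: (servers[n]["active"], n))
        servers[target]["active"] += 1
        assignments.append((request, target))
    return assignments, still_queued
```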

PeopleTools Upgrade

There are many new performance-enhancing features in PeopleTools 8.48 or above that can greatly enhance the SHARE system. Specifically, there are three that would provide SHARE immediate performance improvements:

Jolt Session Pooling

Jolt is a Java class library and API that provides an interface to Tuxedo and WebLogic from remote Java clients and is an integral part of the SHARE infrastructure. In 8.48, Jolt Session Pooling is enabled by default. Previously, each user session required a dedicated Jolt session between web server and application server, consuming valuable system resources. Under high-load conditions, this could exhaust the system. With Jolt Session Pooling, the Jolt sessions are shared among the users, reducing system resource usage. In an internal test on a 2-CPU Windows server, the system was able to run 500 users with Jolt Session Pooling versus only 280 users without Jolt Session Pooling.


[Figure: Average Response Time (s) for Various Searches (CustomerSearch, VendorSearch, OrderSearch), With Pooling vs. No Pooling. Data labels as extracted: 3.88, 3.10, 2.96, 4.21, 3.50, 3.19.]

Strict Failover and Weighted Load-Balancing

Prior to 8.48, when a host failed, the load was redirected to the next host on the server list. This could cause system load to increase significantly on that host and make it unstable. With 8.48, you can specify a failover target for each host so SHARE can distribute the failover load. SHARE administrators can also specify a weight for each host so more load will be directed to more powerful machines.
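The two ideas in this feature, per-host weights and an explicit failover target per host, can be illustrated with a small Python sketch. The host names, weights, and the `failover_map` structure are hypothetical; the actual PeopleTools configuration syntax is not shown here.

```python
import random


def pick_host(hosts, rng=random):
    """Weighted random choice among healthy hosts.

    hosts: dict mapping a host name to {"weight": int, "up": bool}.
    A host with weight 3 receives roughly three times the load of a
    host with weight 1.
    """
    healthy = [(name, h["weight"]) for name, h in hosts.items() if h["up"]]
    if not healthy:
        raise RuntimeError("no application server host is available")
    names, weights = zip(*healthy)
    return rng.choices(names, weights=weights, k=1)[0]


def failover_target(failed_host, failover_map, hosts):
    """Return the explicitly configured failover host, if it is up.

    failover_map: dict mapping each host to its designated target, so
    the load of a failed host does not simply spill onto the next host
    in the list.
    """
    target = failover_map.get(failed_host)
    if target is not None and hosts[target]["up"]:
        return target
    return None
```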

Application Server Dynamic Recycle

Application server restarts are a leading cause of inconsistent response times. If recycle count is set too low, then there will be too many psappsrv restarts. If set too high, there could be instability or crashes as the processes grow too big. In 8.48, the new Dynamic Recycling feature allows the application server to monitor its own memory growth to recycle when it grows above a given threshold.

Once dynamic recycle is enabled, current 8.48 releases of tools get the virtual size of the PSAPPSRV process every 100 service requests. If any new metadata had been loaded since the last check then the maximum memory threshold is reset to be N% larger than current memory usage. Otherwise a check is done to compare the process's current usage to the threshold. When the threshold is exceeded a check is then done to be sure the process has completed at least as many requests as the "recycle count" setting. If enough requests have not been processed then a message about "delaying recycle" is printed, otherwise the server recycles.
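The decision sequence in the paragraph above can be restated as a short Python sketch. It follows only what the text describes; the function names, the `state` dictionary, and the memory units are assumptions for illustration, and the value of N is left as a parameter because the text does not state it.

```python
CHECK_INTERVAL = 100  # the text says memory is sampled every 100 service requests


def should_check(requests_served):
    """True when the process has reached a sampling point."""
    return requests_served > 0 and requests_served % CHECK_INTERVAL == 0


def recycle_decision(state, current_mem, new_metadata_loaded,
                     growth_pct, recycle_count):
    """Perform one periodic check for a single server process.

    state: dict with "threshold" (memory ceiling) and "requests_served".
    growth_pct: the N% headroom applied when the threshold is reset.
    Returns "reset-threshold", "ok", "delaying recycle", or "recycle".
    """
    if new_metadata_loaded:
        # New metadata legitimately grows the process, so move the ceiling.
        state["threshold"] = current_mem * (1 + growth_pct / 100)
        return "reset-threshold"
    if current_mem <= state["threshold"]:
        return "ok"
    # Threshold exceeded: recycle only after enough requests were served.
    if state["requests_served"] < recycle_count:
        return "delaying recycle"
    return "recycle"
```

The final check is what keeps a process that grows quickly from restarting more often than the configured recycle count allows.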

Lack of Consistency between Support Environments and Production

Currently, SoNM’s production and support environments (Development / Test / QA / Training) are significantly different in configuration, hardware and computing power. As upgrades, patches, and new features are applied on the support systems and accepted, it is difficult to determine how they will perform in the production environment. When users are experiencing performance degradation working in the production system, it is difficult to replicate the issue on the test system due to the discrepancy in computing power and overall configuration. Although different environments do exist, it became clear from our discovery session that these environments have drifted in configuration and scope and are now being used for many other purposes besides their intended design.

Therefore, it is advised that separate and distinct support environments for Development, Test, QA, and Training be established that closely align with the production environment, including a Real Application Cluster configuration. Following a grid approach, if the servers are the same in test and production, a test server could be easily moved into the production system for added computing resources during high demand times, such as when SoNM’s payroll is run, and then moved back into the test environment when the demand diminishes, ultimately adding to SoNM’s lower TCO in a Grid architecture.

Change Management

Wikipedia describes change management as follows: "The objective of Change Management is to ensure that standardized methods and procedures are used for efficient and prompt handling of all changes to controlled IT infrastructure, in order to minimize the number and impact of any related incidents upon service." The importance of Change Management cannot be overstated as the SHARE system evolves to accommodate new business requirements and growth.

As can be seen in the discrepancy between support systems and production on all layers, performance problems can be difficult if not impossible to determine when systems and processes get out-of-sync. Also, development and ad-hoc user queries are being run against the production environment without the requisite systematic and rigorous testing that is needed before moving to production.

It is, therefore, recommended that SoNM implement a formalized enterprise-wide change management system that can capture code changes, evaluate the impact of the changes, and propagate the changes to the appropriate system, ultimately facilitating the software development lifecycle.

Data growth impact on Performance

SHARE data growth is estimated to be approximately 1-5% per month. Although exact disk capacity and utilization figures were not provided, it is believed that anywhere from 70 to 90% of current production storage is already utilized. With the added capacity needs that will be introduced from the NM DOT enhancements and specifically the large size of Project tables, a modern and flexible approach to storage that offers a lower TCO is highly recommended.
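To put these estimates in perspective, the sketch below computes how long the remaining headroom would last. It assumes that used storage grows at the same compound monthly rate as the data and that no capacity is added; both are simplifications, and the function name is hypothetical.

```python
import math


def months_until_full(utilization, monthly_growth):
    """Months until used storage reaches 100% of capacity.

    utilization: fraction of capacity in use today (for example 0.7).
    monthly_growth: compound growth per month (for example 0.05 for 5%).
    Solves utilization * (1 + monthly_growth) ** m = 1 for m.
    """
    if not 0 < utilization < 1:
        raise ValueError("utilization must be between 0 and 1")
    if monthly_growth <= 0:
        raise ValueError("monthly_growth must be positive")
    return math.log(1 / utilization) / math.log(1 + monthly_growth)
```

Under these assumptions, the worst case in the stated ranges (90% utilized, 5% monthly growth) leaves a little over two months of headroom, while the best case (70% utilized, 1% monthly growth) leaves roughly three years.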

[Figure labels: All Payroll; Archive Payroll; FY05 Payroll; FY06 Payroll; FY07 Payroll; FY08 Payroll]

A complete discussion of Information Lifecycle Management can be found in the Technology Recommendations Section 4.


Partitioning for Performance

Basics of Partitioning: Partitioning allows a table, index or index-organized table to be subdivided into smaller pieces. Each piece of the database object is called a partition. Each partition has its own name, and may optionally have its own storage characteristics. From the perspective of a database administrator, a partitioned object has multiple pieces that can be managed either collectively or individually. This gives the administrator considerable flexibility in managing a partitioned object. However, from the perspective of the application, a partitioned table is identical to a non-partitioned table; no modifications are necessary when accessing a partitioned table using SQL DML commands.

By limiting the amount of data to be examined or operated on, partitioning provides a number of performance benefits. These features include:

Partition Pruning: Partition pruning (also known as partition elimination) is the simplest and also the most substantial means to improve performance using partitioning. Partition pruning can often improve query performance by several orders of magnitude. Looking at the HCM database, suppose an application contains a PAYROLL table containing an entry for each employee across the state, and that this table has been partitioned by PAID date. A query requesting Payroll for a single month would only access a single partition of the PAYROLL table. If the table holds millions of records spread across hundreds of partitions, this query would access one partition instead of all of them. This query could potentially execute 100x faster simply because of partition pruning. Partition pruning works with all of Oracle's other performance features. Oracle will utilize partition pruning in conjunction with any indexing technique, join technique, or parallel access method.
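The effect of pruning can be shown with a toy model in Python. This is a conceptual sketch, not Oracle behavior or syntax: the class `PartitionedPayroll` and its methods are invented for illustration, and "rows scanned" stands in for the I/O a real query would perform.

```python
from collections import defaultdict
from datetime import date


class PartitionedPayroll:
    """Toy table with one partition per (year, month) of the PAID date."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, employee_id, paid, amount):
        self.partitions[(paid.year, paid.month)].append(
            (employee_id, paid, amount)
        )

    def rows_for_month(self, year, month):
        """Pruned access: only the matching partition is read."""
        rows = self.partitions.get((year, month), [])
        return rows, len(rows)  # matching rows, rows scanned

    def rows_for_month_full_scan(self, year, month):
        """Unpruned access: every row in every partition is examined."""
        scanned = 0
        hits = []
        for rows in self.partitions.values():
            for row in rows:
                scanned += 1
                if (row[1].year, row[1].month) == (year, month):
                    hits.append(row)
        return hits, scanned
```

With twelve monthly partitions of equal size, the pruned query examines one twelfth of the rows that the full scan does while returning the same result.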

Partition-wise Joins: Partitioning can also improve the performance of multi-table joins, by using a technique known as partition-wise joins. Partition-wise joins can be applied when two tables are being joined together, and at least one of these tables is partitioned on the join key. Partition-wise joins break a large join into smaller joins of 'identical' data sets for the joined tables. 'Identical' here is defined as covering exactly the same set of partitioning key values on both sides of the join, thus ensuring that only a join of these 'identical' data sets will produce a result and that other data sets do not have to be considered. Oracle either takes advantage of tables that are already physically equi-partitioned for the join, or transparently redistributes (“repartitions”) one table at runtime to create equi-partitioned data sets matching the partitioning of the other table, completing the overall join in less time. This offers significant performance benefits both for serial and parallel execution.
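A partition-wise join of two equi-partitioned tables can be sketched in a few lines of Python. The data layout (a dictionary from partition key to rows) and the function name are assumptions for illustration only; the point is that each partition is joined solely with its counterpart.

```python
def partition_wise_join(left_parts, right_parts):
    """Join two equi-partitioned tables one partition pair at a time.

    Each argument maps a partition key to a list of (join_key, payload)
    rows. Because both tables are partitioned on the join key, rows in
    one partition can only match rows in the same partition of the
    other table, so no cross-partition comparison is needed.
    """
    result = []
    for pkey in left_parts.keys() & right_parts.keys():
        # Build a small lookup for this partition only.
        lookup = {}
        for key, payload in right_parts[pkey]:
            lookup.setdefault(key, []).append(payload)
        for key, payload in left_parts[pkey]:
            for other in lookup.get(key, []):
                result.append((key, payload, other))
    return result
```

Each partition pair is an independent unit of work, which is why the technique lends itself to parallel execution as well as serial execution.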

Performance Recommendations:

- In-depth study of the I/O subsystem and storage design
- In-depth study of the WebLogic thread issues
- Upgrade to PeopleTools 8.48 or greater
- Formalize and Document Service Level Agreements
- Establish holistic monitoring procedures
- Establish a Configuration Management Database
- Implement a complete Production Support infrastructure: dedicated Development / Test / QA / Training environments
- Establish an Enterprise-wide Change Management Process including thorough testing
- Implement Partitioning (you own it, use it)

2) Business Continuity

Business Continuity means many things to many people. For the purposes of this document and the Assessment, the working definition for Business Continuity will be:

The ability of SHARE to maintain operations and services in the face of a disruptive event.

As mentioned in the Current Environment section, clear Service Level Agreements (SLA) do not exist in terms of Business Continuity. However, the original project plan does call for an SLA requiring 99.7% System Availability for data center network, power, and environmental controls. This SLA is open to interpretation because it does not specify the PeopleSoft system components themselves, but it does imply that the SHARE system will be available for just less than 364 days per year (approximately 363 days, 21.7 hours). This SLA is between DOIT and DFA, but, again, nothing is specific to the SHARE end user and their experience.
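The availability arithmetic can be checked with a short Python sketch. It assumes a 365-day year measured around the clock; the function name is hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours in a non-leap year


def availability_budget(sla_percent):
    """Translate an availability percentage into yearly figures.

    Returns (uptime_days, uptime_hours_remainder, downtime_hours).
    """
    uptime_hours = HOURS_PER_YEAR * sla_percent / 100
    downtime_hours = HOURS_PER_YEAR - uptime_hours
    days, remainder_hours = divmod(uptime_hours, 24)
    return int(days), remainder_hours, downtime_hours
```

For 99.7%, this works out to 363 days and about 21.7 hours of uptime, which leaves about 26.3 hours of permitted downtime per year.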

Currently, the SoNM is at risk due to many single points of failure and the absence of a modern business continuity plan. Presently, the SHARE environment is not protected through redundancy on many layers. From a single-point-of-failure (SPoF) perspective, there are three areas of concern:

- Storage System
- Production server roswell-a (designed only as a test server)
- Process Schedulers

If any of the above components were to fail, the supported business will experience costly downtime ranging anywhere from hours to days. In the event of a power outage or a major disaster in the data center, SHARE would not be able to provide continuous service; robust technology to ensure business continuity is presently not part of the architecture.

Disaster Recovery

Lack of a modern Business Continuity Plan

Downtime that affects SoNM could be either unplanned or planned. Unplanned downtime may be due to hardware or system failures, data/storage failures, human errors, computer viruses, software glitches, natural disasters and malicious acts. SoNM may also have to undergo planned downtimes because of scheduled maintenance such as system upgrades.

[Figure: causes of downtime, with labels Unplanned Downtime (System Failures; Data Failures: Storage Failures, Human Errors, Corruptions, Site Failures) and Planned Downtime (System Changes, Data Changes)]

In designing its business continuity strategy, SoNM must create a business continuity plan (BCP) that can effectively deal with these challenges. One of the critical requirements of the BCP is that it must protect business data, because data is one of SoNM’s most critical assets – whether it is payroll/employee information, historical data, or generated from the Finance system. If SoNM loses its data, it is not easily replaced and rebuilding or regenerating that data will likely be an extremely expensive, if not impossible task, critically affecting SoNM’s ability to perform business functions.

Some of the steps involved in a complete and modern Business Continuity Plan are:

Management Awareness: Management Awareness is the first and most important step in creating a successful disaster recovery plan.

- Identify Possible Disaster Scenarios
- Build Management Awareness
- Obtain Management Sign-Off and Funding

Disaster Recovery Planning Process: In the disaster recovery planning stage, identify the mission-critical, important, and less-important processes, systems, and services in the SHARE environment and put in place plans to ensure these are protected against the effects of a disaster.

- Establish a Planning Group
- Perform Risk Assessments and Audits
- Establish Priorities for Systems
- Develop Resiliency Design and Recovery Strategy
- Prepare Up-to-Date Inventory and Documentation of the Plan
- Develop Verification Criteria and Procedures
- Implementation


Lack of Automated DR Strategy

In order to protect SoNM’s systems from disaster, an automated standby database solution is recommended. A standby database solution will provide the management, monitoring, and automation software infrastructure that creates, maintains, and monitors one or more standby databases to protect enterprise data from failures, disasters, errors, and data corruptions.

This scenario maintains standby databases as transactionally consistent copies of the production database. These standby databases can be located at remote disaster recovery sites thousands of miles away from SoNM’s production data center, or they may be located in the same city, same campus, or even in the same building. If the production database becomes unavailable because of a planned or an unplanned outage, the automated standby database solution can switch any standby database to the production role, thus minimizing the downtime associated with the outage and preventing data loss. Also note that in such an architecture every tier of the SHARE system is redundant, including storage. The capacity at the Disaster Recovery site does not need to be exactly the same as that of the production site. Once SoNM identifies its mission-critical business functions and defines SLAs for these processes, the capacity and service level required of the DR site can be determined. The following is an example of such an architecture:

[Figure: example standby database architecture; labels: Application Servers (twice), Transactions]

Untested DR Plan

Testing is essential to the success of any Disaster Recovery/Business Continuity plan. Once a draft plan has been created, SoNM should create a verification process to prove the disaster recovery strategy and, if the strategy is already implemented, review and test the implementation.

It is important that SoNM test and review the plan frequently. It is recommended that SoNM document the verification process and procedures, and design a proof-of-concept process. The verification process should include an experience cycle; disaster recovery is based on experience and each disaster has different rules.


Single Point of Failure: Storage

The Storage sub-system is an area of Business Continuity concern. Almost all tiers of the SHARE system depend on the IBM DS-4800 for service. In the event of an outage to the storage system, the complete SHARE system would be unavailable. Please see the above disaster recovery discussion. With a fully redundant Disaster Recovery site in place, the SHARE system would maintain continuous operations even in the event of a storage failure.

Single Point of Failure: roswell-a

Roswell-a was designed to be a test database server. Over time, this test server has become a reports repository, providing a mission critical function to the SHARE system. It is highly recommended that this functionality be moved to a production server, like Nambe, and that roswell-a be returned to its original role as a test database server.

Single Point of Failure: Process Schedulers

A Master Process Scheduler offers fault tolerance to SHARE’s batch environment. In the event of a server failure, a Master Scheduler can redistribute queued requests among the remaining active Process Scheduler servers. In addition, an active Master Scheduler manages and controls all Process Scheduler server domains that are on the same PeopleSoft database. It enforces all of the rules that are specified in either the process or job definitions, and monitors the running of all processes. It becomes the centralized control as it checks the Process Request table looking for any queued requests to run, and then dispatches them to an appropriate available Process Scheduler server.
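The fault-tolerance behavior described above, redistributing a failed server's queued requests among the survivors, can be sketched conceptually in Python. The server names and the shortest-queue rule are assumptions for illustration and do not describe the PeopleSoft implementation.

```python
def redistribute(queue_by_server, failed_server):
    """Move a failed server's queued requests onto the remaining servers.

    queue_by_server: dict mapping a (hypothetical) server name to its
    list of queued requests. The dict is updated in place and returned.
    Each orphaned request goes to the server with the shortest queue;
    ties are broken by server name.
    """
    orphaned = queue_by_server.pop(failed_server, [])
    if not queue_by_server:
        raise RuntimeError("no remaining Process Scheduler servers")
    for request in orphaned:
        target = min(
            queue_by_server,
            key=lambda name: (len(queue_by_server[name]), name),
        )
        queue_by_server[target].append(request)
    return queue_by_server
```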

User Error

According to many studies, 40% of application outages are caused by operator or user errors. Part of being human is making mistakes. But these errors are extremely difficult to avoid and can be particularly difficult to recover from without advance planning and the right technology. Furthermore, the SHARE team identified in our discovery session that failed batch processes can cause orphaned records in the database which have to be manually deleted. Such errors can result in "logical" data corruption, or cause downtime of one or more components of the SHARE infrastructure. While it is relatively simple to rectify the failure of an individual component, detection and repair of logical data corruption, such as accidental deletion of valuable data, is a time consuming operation that could cause SoNM an enormous loss of business productivity. Typical user-errors may include accidental deletion of valuable data, deleting the wrong data, and dropping the wrong table.

Flashback Technology addresses database recovery from human errors. It provides a set of features to view and rewind data back and forth in time. The Flashback features offer SoNM the capability to query historical data, perform change analysis, and perform self-service repair to recover from logical corruptions while the database is online.

Planned Downtime: Patching

The SHARE system production databases have not been patched for over a year due to restrictions from the system implementer. Patch management alone can be a time consuming process. Identifying patches relevant to SoNM’s environment, applying patches, and keeping track of what patches have been applied when and where is a difficult process that can be simplified with the appropriate tools.


Patch management is part of a broader set of requirements for managing the entire life cycle of a system. With the appropriate tools, not only is the patching process made simpler, but so is the process of configuring and cloning systems as depicted in the above diagram.

One of SHARE’s main challenges with patching is the system downtime involved to apply the patch. The SHARE production database presently runs in a Grid architecture, with all database nodes actively processing transactions on behalf of database clients. To take advantage of the flexibility of this architecture, it is recommended that SHARE automate the provisioning of patches as can be seen in the figure above. Step 1 of the patch application procedure is to quiesce the first instance to which the patch is to be applied. In step 2 a patch tool is used to apply the patch to the quiesced instance. In step 3 the patched instance is reactivated and rejoins the Grid. The system is now running with one instance at a higher maintenance level than the other nodes in the cluster. The SHARE system can run in this mixed mode for an arbitrary period to ensure the patch corrects the original problem, and has not introduced additional problems. This procedure is then repeated for the remaining nodes in the cluster. When all nodes in the cluster have been patched the rolling patch update is complete and all nodes are running the same database version.
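The three-step rolling procedure can be expressed as a simple control loop. The Python sketch below is conceptual only: `apply_patch` and `healthy` are caller-supplied placeholders standing in for the real patch tool and the verification performed during the mixed-mode period, and the node records are hypothetical.

```python
def rolling_patch(nodes, apply_patch, healthy):
    """Patch cluster nodes one at a time while the others keep serving.

    nodes: list of dicts with "name", "version", and "state" keys.
    apply_patch(node): placeholder for the real patch tool.
    healthy(node): placeholder for the checks run in mixed mode.
    Returns the names of nodes patched and verified; stops at the first
    node that fails verification, leaving later nodes untouched.
    """
    patched = []
    for node in nodes:
        node["state"] = "quiesced"   # step 1: quiesce the instance
        apply_patch(node)            # step 2: apply the patch
        node["state"] = "active"     # step 3: rejoin the cluster
        if not healthy(node):
            # Remaining nodes stay at the old level until this is resolved.
            break
        patched.append(node["name"])
    return patched
```

Stopping at the first failed verification is a design choice in this sketch; it keeps the unpatched nodes available as a known-good fallback.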

“To RAC or not to RAC”

There has been much discussion within the state regarding Oracle Real Application Clusters (RAC). The benefits of RAC are numerous, and tens of thousands of customers across the globe are successfully running this technology, including many PeopleSoft implementations. The Oracle team’s assessment found no RAC-specific issues. The difficulty in answering such a question, though, stems directly from the lack of clear Service Level Agreements. If SHARE is truly to supply 99.7% availability, either on a 24 x 7 basis or “6 – 16’s,” Real Application Clusters provides an industry-leading solution that eliminates single points of failure due to planned or unplanned downtime and adds the benefits of flexible workload management. It should also be noted that when one of the production servers crashed a few months ago, the RAC system performed its duty by providing continuous service to end users. With proper design of other components (storage, process schedulers, and fixing application server issues) it is believed that the RAC design and implementation will provide SHARE with seamless operations.

Business Continuity Recommendations:

- Formalize and Document Service Level Agreements
- Design, Test, Implement a modern Business Continuity plan
- Automate error prone and time consuming maintenance operations
- Eliminate Single-Points-of-Failure, particularly at the storage tier
- Implement a Master Process Scheduler
- Maintain distinct support environments
- Incorporate Disaster Recovery capabilities within chosen Management Tool

3) Manageability

The SoNM currently lacks a consolidated, single view of the SHARE ecosystem across all of its tiers. The SHARE DBA staff currently use tools that manage individual Oracle databases, but the tools necessary to manage the total SHARE environment are not currently in place.

It is recommended that SHARE implement a centralized management environment, and in particular a Configuration Management Database (CMDB), where all components of the SHARE system’s ecosystem are readily accessible for monitoring. The diagram below shows how such an environment would be structured, with a centralized repository, agents on each host machine being monitored, and access to this information readily available via secure inter/intranet connections.

Once such an infrastructure is in place, the SHARE system’s IT administrators will have at their disposal a wealth of tools and procedures that will help them manage SHARE in a common and secure environment across all tiers.

A Configuration Management Database will form the centerpiece of SHARE’s ability to manage configurations and automate IT processes. It will capture and centralize the information about all hardware and software resources in the enterprise, thereby facilitating the diagnosis of problems, automation of processes and compliance with regulatory and industry standards. The proactive evaluation of configuration against best practices, aided by comprehensive reporting and powerful analytics, ensures SHARE’s efficient use of IT resources, faster problem resolution and an improved Quality of Service.


Patch management can be a time consuming process. Identifying patches relevant to SHARE’s environment, applying the patches, and keeping track of what patches have been applied when and where is currently a manual process that can be simplified with the appropriate tools.

Key components of a Configuration Management Database are:

1) The ability to manage complex software configurations:

A configuration management solution will provide SHARE a real-time or near-real-time view of configuration items, services and their dependencies within and across each system. SHARE will be able to manage configuration drift through comparison with “gold configurations” and saved baselines. SHARE will also be able to track, analyze and report on configurations while capturing configuration data that is used for the administration of the entire change management process, including change automation and active system diagnostics.

2) Faster problem resolution and root cause analysis:

SHARE will be able to detect, document, alert on and continuously manage system configuration “shift and drift” resulting from planned and unplanned events. The CMDB will reduce SHARE’s risks involved in rolling out changes to production environments by identifying the impact of changes on deployed applications and users. SHARE will experience faster mean-time-to-repair through root cause analysis by isolating and correlating problems to the exact infrastructure or application component that is causing failure and by auditing change history for all targets and parameters.

3) Provides demonstrable control over the IT environment for IT governance and compliance
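The comparison against a “gold configuration” mentioned in item 1 can be illustrated with a minimal Python sketch. The setting names and values used with it are placeholders, not actual SHARE parameters, and a real CMDB would track far more than a flat dictionary.

```python
def config_drift(gold, actual):
    """Report settings that differ between a gold baseline and a host.

    gold, actual: dicts mapping a setting name to its value.
    Returns {setting: (expected, found)}; None marks a setting that is
    missing on one side.
    """
    drift = {}
    for key in gold.keys() | actual.keys():
        expected = gold.get(key)
        found = actual.get(key)
        if expected != found:
            drift[key] = (expected, found)
    return drift
```

Running such a comparison for each support environment against production would make the configuration differences between environments visible in one report.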

Difficult to troubleshoot bottlenecks

Proactively diagnosing a slowly performing system, such as a payroll report, is a time consuming task often surrounded by myths and legends, few of them based on fact. Being able to answer questions such as “Why is the system slower today than it was last week?” is difficult when only raw database statistics are available. A tool that provides the SHARE system administrators with a self-diagnostic engine to automatically diagnose performance problems will liberate the administrators from this complex and arduous task. Once performance issues are detected by the self-diagnostic engine, it is important that the findings be presented to SHARE administrators in a convenient and intuitive fashion, and guide them step-by-step to quickly resolve performance problems.

Many DBA Tasks are manual:

In order to maximize SHARE’s potential investments in enterprise compute grids, SHARE should implement a holistic tool to help the DBA staff and IT managers deliver higher service levels to end users on their databases while helping reduce management costs.

The key benefits of a holistic tool for database administrators are:

* Self-tuned and self-managed databases operating at optimal performance
* Automation of complex or routine manual tasks, which are often error-prone
* Rapid root cause analysis through established standards and compliance enforcement

Manage many as one: Through standardized configuration, deployment, monitoring and tuning of databases, DBAs are able to achieve unprecedented availability, reliability and performance.

Proactive management: With a holistic management interface, the SHARE DBA staff will be able to receive alerts about potential problems before they cause serious downtime or performance degradation. Through monitoring templates that are configurable to the SHARE environment, DBAs are able to respond quickly and take action before operating conditions threaten service levels. Patch management is part of a broader set of requirements for managing the entire life cycle of a system. With the appropriate tools, not only is the patching process made simpler, but so is the process of configuring and cloning.

Partitioning for Manageability

The Oracle Partitioning option allows tables and indexes to be partitioned into smaller, more manageable units, providing SHARE database administrators with the ability to pursue a "divide and conquer" approach to data management. With partitioning, maintenance operations can be focused on particular portions of tables. For example, a SHARE database administrator could back up a single partition of a table, rather than backing up the entire table. For maintenance operations across an entire database object, it is possible to perform these operations on a per-partition basis, thus dividing the maintenance process into more manageable chunks.

A typical usage of partitioning for manageability is to support a 'rolling window' load process. Suppose that a DBA loads new data into a table on a weekly basis. That table could be range-partitioned so that each partition contains one week of data. The load process is then simply the addition of a new partition. Adding a single partition is much more efficient than modifying the entire table, since the DBA does not need to modify any other partitions. The same is true for purging data from a partitioned table. The DBA simply drops a partition, a very cheap and quick data dictionary operation, rather than issuing a DELETE command, which uses lots of resources and touches all the data to be deleted.
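The rolling window described above can be sketched as a small Python helper that generates the two Oracle DDL statements involved in each weekly cycle. This is only an illustration under stated assumptions: the table name `LEDGER_HISTORY`, the `P_YYYYMMDD` partition naming scheme, and the retention of four weeks are hypothetical and not taken from the SHARE environment.

```python
from datetime import date, timedelta


def partition_name(week_start: date) -> str:
    """Hypothetical naming scheme: one partition per week, named by its start date."""
    return f"P_{week_start:%Y%m%d}"


def add_partition_sql(table: str, week_start: date) -> str:
    """DDL that adds the range partition holding one week of data."""
    upper_bound = week_start + timedelta(days=7)
    return (
        f"ALTER TABLE {table} ADD PARTITION {partition_name(week_start)} "
        f"VALUES LESS THAN (TO_DATE('{upper_bound:%Y-%m-%d}', 'YYYY-MM-DD'))"
    )


def drop_partition_sql(table: str, week_start: date) -> str:
    """DDL that purges one week of data by dropping its partition."""
    return f"ALTER TABLE {table} DROP PARTITION {partition_name(week_start)}"


def roll_window(table: str, newest_week: date, weeks_to_keep: int) -> list:
    """Return the statements for one cycle: add the newest week, drop the expired one."""
    expired_week = newest_week - timedelta(weeks=weeks_to_keep)
    return [
        add_partition_sql(table, newest_week),
        drop_partition_sql(table, expired_week),
    ]


if __name__ == "__main__":
    for statement in roll_window("LEDGER_HISTORY", date(2008, 3, 10), 4):
        print(statement)
```

Neither statement touches the rows in the other partitions, which is the point of the rolling window: the purge is a dictionary operation rather than a row-by-row DELETE.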

There are many meaningful benefits of partitioning to enhance manageability in a relational database solution. These benefits include the ability to:

* Back up data on a partition-by-partition basis
* Restore data on a partition-by-partition basis
* Load a partition at a time
* Store data in different tablespaces on a partition-by-partition basis
* Rebuild indexes on a partition-by-partition basis
* Store indexes in different tablespaces on a partition-by-partition basis
* Merge and split partitions to consolidate or further break apart partitions
* Exchange partitions with tables
* Exchange partitions and their indexes with tables and their indexes

Manageability Recommendations:

* Utilize DB automation tools that provide an enterprise view of the system
* Create a configuration management database & leverage current tools to automate configuration tracking
* Design and implement a structured approach to software development and release management
* Implement Partitioning
* Have separate & dedicated environments that closely reflect production
* Utilize diagnostics and tuning tools that determine where bottlenecks are and proactively monitor performance

4) Information Lifecycle Management

An approach to the end-to-end data lifecycle for the SHARE project currently does not exist. It was noted throughout our discovery session that an archiving strategy is needed to efficiently handle the 1-5% per month data growth. Data within the SHARE system has not been archived. In light of the upcoming NM DOT extension and the Project team’s awareness of the potential for large jumps in data growth and table size, a strategy for Information Lifecycle Management is needed.

Information Lifecycle Management (ILM) is concerned with everything that happens to data during its lifetime. Although most organizations have long regarded their stores of data as one of their most valuable enterprise assets, how this data is managed and maintained varies enormously. Originally, data was used to help achieve operational goals, run the business and help identify the future direction and success of the company. However, new government regulations and guidelines in the US, such as Sarbanes-Oxley, HIPAA, and DOD5015.2-STD, are a key driving force in how and why data is being retained, as they now require organizations to retain and control information for very long periods of time.

The challenge now before the SoNM is to understand how their data evolves, determine how it grows, monitor how its usage changes over time, and decide how long it should survive. Today there are two additional objectives the SHARE team needs to satisfy: to store vast quantities of data for the lowest possible cost; and to meet the new regulatory requirements for data retention and protection.

As SHARE’s database footprint grows and the amount of data increases over time, it is essential that an approach to Information Lifecycle Management be incorporated into the future direction of database administration, including a comprehensive data archival strategy.

The SHARE Database is ready today for Business ILM offering key features to enable an ILM solution:

* Fine-grained: managing data at individual row level
* Application Transparency: data classification is transparent
* Low-Cost: uses low cost storage to reduce costs
* Enforceable Compliance Policies

Implement ILM in 4 Steps

ILM can be implemented using the SHARE Database by following these four steps:

1. Define the Data Classes
2. Create Storage Tiers for the Data Classes
3. Create Data Access and Migration Policies
4. Define and Enforce Compliance Policies

Once data is classified, e.g. by frequency of usage or by date range, storage tiers can be created for the identified data classes. During the lifetime of the data, it will migrate between data classes and access to the data will be transparently controlled. Eventually, based on data classification, the data may be archived or, possibly, remain inside the database indefinitely, residing on an appropriate storage tier.
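As a minimal sketch of the first two steps, the Python fragment below classifies a record by age and maps each data class to a storage tier. The class names, age thresholds, and tier names are hypothetical placeholders; the actual values would come from SHARE's documented retention policies.

```python
from datetime import date

# Hypothetical data classes: (class name, maximum age in days, storage tier).
# A maximum age of None means "everything older than the classes above".
DATA_CLASSES = [
    ("active", 90, "high-performance"),
    ("less_active", 365, "low-cost"),
    ("historical", None, "archive"),
]


def classify(record_date: date, today: date) -> tuple:
    """Return (data class, storage tier) for a record, based only on its age."""
    age_in_days = (today - record_date).days
    for class_name, max_age, tier in DATA_CLASSES:
        if max_age is None or age_in_days <= max_age:
            return class_name, tier
    raise ValueError("DATA_CLASSES must end with a catch-all class")


if __name__ == "__main__":
    today = date(2008, 2, 1)
    for record_date in (date(2008, 1, 1), date(2007, 6, 1), date(2005, 1, 1)):
        print(record_date, classify(record_date, today))
```

As a record ages, repeated classification moves it from one class to the next, which is the migration between data classes that the steps describe.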

ILM Assistant

The ILM Assistant is a GUI based tool for managing the SHARE ILM environment. It provides the ability to create lifecycle definitions, which are assigned to tables in the database. Then based on the lifecycle policy, the ILM Assistant advises when it is time to move, archive or delete data. It will also illustrate the storage requirements and cost savings associated with moving the data. Other capabilities include the ability to show how to partition a table based on a lifecycle definition and to simulate the events on a table if it were partitioned.

Information Lifecycle Management with Oracle Partitioning

Today's challenge of storing vast quantities of data for the lowest possible cost can be optimally addressed using Oracle Partitioning. Once it is understood how data is accessed, the independence of individual partitions becomes the key enabler for addressing the online portion of a “tiered archiving” strategy. Specifically, in tables containing historical data, partitioning enables individual partitions (or groups of partitions) to be stored on different storage tiers, providing different physical attributes and price points.

For example, an Orders table containing two years' worth of data could keep only the most recent quarter on an expensive high-end storage tier and the rest of the table (almost 90% of the data) on an inexpensive low-cost storage tier. Through Oracle Partitioning, storage costs can be reduced substantially (cost savings of 50% or more are not uncommon) without impacting end-user access, thus optimizing the cost of ownership for the stored information.
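The arithmetic behind the Orders example can be worked through with a short Python sketch. The per-gigabyte prices below are hypothetical and chosen only to illustrate the calculation; one quarter out of eight is 12.5% of the data, leaving 87.5% for the low-cost tier.

```python
def tiered_savings(hot_fraction: float, high_cost: float, low_cost: float) -> float:
    """Fractional saving of a two-tier layout versus keeping all data on the high-end tier.

    hot_fraction: share of the data kept on the high-end tier (0.0 to 1.0)
    high_cost, low_cost: cost per unit of storage on each tier (same unit)
    """
    single_tier_cost = high_cost
    tiered_cost = hot_fraction * high_cost + (1.0 - hot_fraction) * low_cost
    return 1.0 - tiered_cost / single_tier_cost


if __name__ == "__main__":
    # Most recent quarter of a two-year table stays on the high-end tier.
    hot_fraction = 1 / 8
    # Hypothetical prices: low-cost storage at 40% of the high-end price.
    saving = tiered_savings(hot_fraction, high_cost=10.0, low_cost=4.0)
    print(f"Storage cost reduction: {saving:.1%}")
```

Under these assumed prices the blended cost is a little under half of the single-tier cost, which is consistent with the 50% figure quoted above; the actual saving depends entirely on the real price gap between tiers.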

The Oracle Database is ideal for implementing ILM because there are no specialized data stores to manage and it operates independent of any hardware. It has proven performance; therefore all information can be quickly retrieved. The security features in the database ensure that data is secure from unauthorized access and data is always consistent across transactions. The Oracle Database provides total flexibility, therefore it can rapidly adapt to any change in requirements, which is extremely important due to the continuing evolution of emerging business requirements and government regulations.

Information Lifecycle Management Recommendations

* Formally document data retention policies
* Implement Partitioning to facilitate and automate archiving
* Redesign storage tier to make use of lower cost storage devices for historical data
* Define the Data Classes
* Create Storage Tiers for the Data Classes
* Create Data Access and Migration Policies
* Define and Enforce Compliance Policies

5) Data Security

Sensitive information in database is not encrypted

A major concern of the Oracle team is the need to safeguard sensitive data. The reason that this topic was raised within Oracle team internal meetings is the fact that production data is replicated multiple times throughout the SHARE environment. Furthermore, with developers coding directly against the production system, the system becomes exposed, with no malicious intent, to security vulnerabilities. A key strategic aspect is the ability of a database to protect private data in its most common storage unit, columns within database tables, via encryption techniques.

There are two basic ways to encrypt data at rest in a database. One technique is to programmatically encrypt the data via a series of triggers and stored procedures. This is a code-based approach with manual encryption key management in a private database keystore. The encryption is performed via database-resident SQL & PL/SQL code contained in record INSERT and UPDATE triggers and decryption is performed in View objects. The data is encrypted within the database, which prevents unauthorized individuals from viewing sensitive data in clear text in the event a database backup tape is lost in transit. It relies on a manual process for effective and secure key management.

There are disadvantages with programmatically encrypting the data. First, the encrypt/decrypt routines must be written. Second, there are some limitations on the nature of the data to be encrypted that make using these packages difficult or impossible. Finally, it is not an easy matter to retrofit this encryption technology into existing business applications. All application code that writes data to be encrypted, or that expects to view decrypted data, may need to be modified.

What is needed is an encryption process that occurs at the database level, not the application level. Such encryption techniques could be easily implemented without changing any lines of business code.


This better approach provides the ability to selectively encrypt individual data attributes within the database on a declarative basis. There are no coding changes necessary to the application; the encryption is done declaratively and decryption is done transparently when the data is accessed. Encrypted data remains in encrypted format on backup tapes. Encryption keys are stored and managed external to the database in a secure encrypted data file called a Wallet. A password is necessary to unlock the Wallet in order to allow access to encrypted data. Even if the wallet and database were both lost in transit, an unauthorized individual would still need to unlock the wallet in order to access the encrypted data. Since the encryption/decryption process is integrated with the database, the performance implications introduced with code-based approaches are greatly reduced.
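To illustrate what "declarative" means here, the Python sketch below builds the single DDL statement that marks an existing column as encrypted under Oracle Transparent Data Encryption. The table and column names are hypothetical, and the sketch assumes the wallet and master key have already been set up by the DBA; no triggers, views, or application changes are involved.

```python
# Algorithms accepted by the ENCRYPT USING clause for column encryption.
SUPPORTED_ALGORITHMS = {"3DES168", "AES128", "AES192", "AES256"}


def encrypt_column_sql(table: str, column: str, algorithm: str = "AES256") -> str:
    """Build the DDL that declares an existing column as encrypted."""
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    return f"ALTER TABLE {table} MODIFY ({column} ENCRYPT USING '{algorithm}')"


if __name__ == "__main__":
    # Hypothetical table and column holding a sensitive identifier.
    print(encrypt_column_sql("EMPLOYEE_DEMO", "NATIONAL_ID"))
```

Compared with the trigger-and-view approach, the application's INSERT, UPDATE, and SELECT statements stay exactly as they are.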

[Figure: database data viewed in a hex editor prior to encryption and after encryption was applied]


Backup data is not encrypted

After discussing the backup process with the SHARE team, another security concern was raised internally to the Oracle team in regards to backup data not being encrypted. It is essential to backup security that, when performing a backup, encryption is performed before the data leaves the database. This eliminates any chance of data being stolen while in-transit to tape.

Additionally, the data on tape should be stored in encrypted form. The database should transparently manage the encryption keys and automatically decrypt backups during the restore process. The secure backup implementation should also feature certificate-based authentication of host systems participating in a backup or restore to ensure that outside parties cannot impersonate an authorized host.


Data Security Recommendations:

* Implement sensitive information handling policies that include development, storage and transmission
* Implement database encryption and backup tape encryption
* Implement a masking policy to hide any sensitive data

High-Level Implementation Plan & Conclusions


High-Level Roadmap

Items with a black border overlap in various areas

Recommendations

Performance
1. Storage Assessment and Design
2. Investigate and fix WebLogic bottlenecks
3. Define SLAs & KPIs
4. PeopleTools Upgrade
5. Implement CMDB
6. End-to-End Monitoring
7. Partitioning
8. Maintain distinct support environments

Business Continuity
1. Design, Test, Implement BC Plan
2. Eliminate SPoF
3. Master Process Scheduler
4. Automate maintenance procedures

Manageability
1. Formalize Change Management

Information Lifecycle Management
1. Define Data Classes, Access, Migration and Retention Policies

Data Security
1. Implement sensitive info handling policies
2. Implement Database Encryption
3. Implement Database Masking

[Figure: High-Level Roadmap chart placing the numbered recommendations on axes of Complexity (Low to High) and Value (Low to High), with regions labeled “Targets” and “Secondary Targets”]

The above chart is meant as a guideline for prioritizing and planning SHARE system improvements across the five focus areas. Many recommendations, such as End-to-End Monitoring and a Configuration Management Database (CMDB), enable improvements across all focus areas and are highlighted by a black outline.

The infrastructure technologies discussed in this paper are all foundational, and will be reused throughout future State of New Mexico SHARE initiatives, including Enterprise Performance Management and the Department of Transportation expansion. Therefore, the return on the investment will compound as more functionality and applications take advantage of the common infrastructure. Oracle is available to assist SoNM in determining the selection of technologies that would be most advantageous to pursue for specific near, medium, and long-term initiatives.

We further realize implementation of any recommendations discussed in this paper might only be possible in an extended timeframe and after further analysis. The Oracle team stands ready to continue working with the State of New Mexico in developing specific project plans, providing ROI analysis, jointly defining architecture, building proof of concepts, and sharing best practices and other information as chartered through the Insight program.

On behalf of the entire Oracle Assessment Team, we thank you for the opportunity to work with you on the State of New Mexico SHARE Infrastructure Optimization project.

Respectfully,

The Oracle Assessment Team

Appendix A

Oracle Enterprise Manager Real User Experience Insight

Oracle Real User Experience Insight will enable SHARE to maximize the value of their business critical applications by delivering insight into real end user experiences. It can help identify lost productivity from frustrated users, reduce support costs by lowering call center volumes, accelerate problem resolution of poorly performing applications, and help the SoNM adapt to changing needs by providing insight into business trends and user preferences. It integrates performance analysis and usage analysis into a single offering, enabling business and IT stakeholders to develop a shared understanding of their application users’ experience.

Oracle Enterprise Manager PeopleSoft Management Pack

The Application Management Pack for PeopleSoft Enterprise delivers comprehensive management capabilities for the PeopleSoft environment. By combining PeopleSoft management and monitoring with Oracle’s rich and comprehensive systems management solution, Enterprise Manager Grid Control significantly reduces the cost and complexity of managing PeopleSoft environments.

Manage PeopleSoft Targets In A Centralized, Browser-Based Interface

The Application Management Pack integrates seamlessly with Oracle Enterprise Manager and provides the most advanced PeopleSoft management solution available. From a central, browser-based console, PeopleSoft administrators can now monitor and manage PeopleSoft components in a manner far superior to previously offered solutions. Support is provided for the following: Application Server Domains, Process Scheduler Domains, Search Server Domains, PIAs and Web Sites, and Application Databases. In addition to Enterprise Manager monitoring capabilities that can determine the operational status of the specified target, the solution provides administrators with a comprehensive set of management tasks.

Domain Management: Users can now perform domain management actions such as start, stop, and purge cache as well as configuration management from the centralized interface.

Advanced Configuration Management: Configuration management is much more user friendly with a meta-data driven user interface that guides the administrator as configuration attributes are defined.

Centralized Log Management Capabilities: Administrators have the ability to quickly view log files based on a chosen target with the option of searching, viewing or exporting them to the local machine. Logs may be purged or archived either individually or via a scheduled job on an administrator-defined interval.

System Level Management Capabilities: The Application Management Pack can automatically create a graphical topology display of a PeopleSoft environment that clearly displays the relationships between the various PeopleSoft targets. Context-sensitive actions can also be performed on specific targets from within the topology viewer to help streamline management tasks.

Service Level Management: Through Service Level Management, administrators can monitor services from the end-users’ perspective using service tests or synthetic transactions, model relationships between services and underlying IT components, and report on achieved service levels.

Oracle Enterprise Manager Change Management Pack

Oracle Change Management Pack for Oracle Database 10g provides an integrated solution for database administrators and application developers to manage database changes. Using the data dictionary metadata management APIs in Oracle Database 10g, Oracle Change Management Pack allows developers and administrators to rapidly compare schema objects before and after an application upgrade, identify out-of-band changes made to the database as well as track changes to database initialization, authorization and storage settings.

There are many causes of downtime that SHARE database administrators have to deal with, ranging from hardware failures to performance problems. However, one of the hardest problems that DBAs face is to identify out-of-band changes made to production environments, such as initialization parameters, or ad hoc additions of new access structures, such as indexes or materialized views. These changes are more insidious because they are often made by well-meaning but inexperienced administrators in order to fix production problems.

However, the net outcome to end-users is more harm than good, such as slow application performance. The troubleshooting DBA needs to make a rapid assessment of the database changes to find the problem and restore the application quickly to its optimal operating levels. The Change Management Pack for Oracle Database 10g is specifically designed to support today’s complex enterprise applications like SHARE. With a few mouse clicks, SHARE DBAs can easily capture and version gold definitions of large application schemas, such as PeopleSoft, which can contain thousands of database objects. Compare this with the number of steps needed to capture and maintain the object definitions manually. These gold definitions captured by the Change Management Pack, also called baselines, represent the ideal schema, authorization and initialization settings for the database.

SHARE Database administrators can now quickly and easily identify out-of-band changes made to application schema objects, such as a new index, or database initialization parameter changes or user-object privileges, by comparing the current database with the previously captured gold baseline versions. Change Management Pack allows SHARE DBAs to analyze the root cause of this type of application downtime and identify the changes needed to restore the application to expected service levels.
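The idea of comparing a current configuration against a gold baseline can be sketched generically in Python. This is not the Change Management Pack's API; it is a plain dictionary comparison, and the setting names and values below are hypothetical examples of an initialization parameter and an index definition.

```python
def diff_against_baseline(baseline: dict, current: dict) -> dict:
    """Report settings that were added, removed, or changed relative to the baseline."""
    added = {key: current[key] for key in current.keys() - baseline.keys()}
    removed = {key: baseline[key] for key in baseline.keys() - current.keys()}
    changed = {
        key: (baseline[key], current[key])
        for key in baseline.keys() & current.keys()
        if baseline[key] != current[key]
    }
    return {"added": added, "removed": removed, "changed": changed}


if __name__ == "__main__":
    # Hypothetical gold baseline and current state of one database.
    baseline = {
        "parameter:open_cursors": "1000",
        "index:LEDGER_IDX1": "(ACCOUNT, FISCAL_YEAR)",
    }
    current = {
        "parameter:open_cursors": "300",
        "index:LEDGER_IDX1": "(ACCOUNT, FISCAL_YEAR)",
        "index:LEDGER_ADHOC": "(DEPTID)",
    }
    print(diff_against_baseline(baseline, current))
```

In this example the report would surface one changed parameter and one ad hoc index, which are exactly the kinds of out-of-band changes described above.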

Oracle Enterprise Manager Provisioning Pack

The Provisioning Pack automates the deployment of software, applications, and patches. It makes critical data center operations easy, efficient and scalable, resulting in lower operational risk and cost of ownership. The ability to provision the entire software stack, including the operating system, the middleware and the database, supplemented by comprehensive reporting tools, makes the Provisioning Pack a significant component of the overall systems management space.

Oracle Enterprise Manager Data Masking Pack

SHARE has the need to share production data with internal and external users for a variety of business purposes, such as application testing. With the Data Masking Pack, the SoNM can now share information securely internally and with business partners and stay compliant with government regulations. SHARE security and database administrators can improve their productivity by eliminating manual processes and by applying consistent information security policies.

SHARE has sensitive, confidential or personally identifiable information stored in a variety of columns in database tables. These sets of information are widely different and yet they are all candidates for data masking. By having a central definition for common masking formats in the Data Masking Pack, SoNM can apply data privacy rules consistently to all production data and thus, ensure compliance with regulations. The Data Masking Pack can accommodate a variety of data through the use of masking formats.

The Data Masking Pack provides out-of-the-box mask primitives for various types of data, such as random numbers, random digits, random dates, constants. SoNM can use other built-in masking routines, such as shuffling, which shuffles the value in a column across different rows. This is useful when the range of values in a column is often not known and the shuffling of values in the same table provides a sufficient degree of privacy protection. For departments that require the masked value be realistic but not based on the original data, the Data Masking Pack can replace the original data, such as names and addresses, with data containing fictitious names and addresses derived from external data sources.
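Two of the masking techniques named above, shuffling and random digits, can be illustrated with a short Python sketch. This is a generic illustration of the techniques and not the Data Masking Pack's implementation; the row layout and column names are hypothetical.

```python
import random


def shuffle_column(rows: list, column: str, rng: random.Random) -> list:
    """Return copies of the rows with one column's values shuffled across rows."""
    values = [row[column] for row in rows]
    rng.shuffle(values)
    return [{**row, column: value} for row, value in zip(rows, values)]


def random_digits(length: int, rng: random.Random) -> str:
    """Return a string of random digits, e.g. to replace an identifier."""
    return "".join(rng.choice("0123456789") for _ in range(length))


if __name__ == "__main__":
    rng = random.Random()
    # Hypothetical production rows to be masked before use in a test environment.
    rows = [
        {"emplid": "000101", "salary": 52000},
        {"emplid": "000102", "salary": 61000},
        {"emplid": "000103", "salary": 47500},
    ]
    masked = shuffle_column(rows, "salary", rng)
    masked = [{**row, "emplid": random_digits(6, rng)} for row in masked]
    for row in masked:
        print(row)
```

Shuffling keeps the overall distribution of values in the column intact, which is why it suits columns whose valid range is not known in advance.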


Oracle Database Advanced Security Option

Oracle Advanced Security is a security option for the Oracle Database 11g. Introduced in Oracle8i, Oracle Advanced Security combines network encryption, database encryption and strong authentication together to help customers address privacy and compliance requirements:

1. Transparent Data Encryption
2. Network encryption and data integrity
3. Strong authentication

Transparent Data Encryption provides an easy and effective protection of stored data by transparently encrypting data (using 3DES or AES with up to 256 bits) at the column or tablespace level.

Network encryption and data integrity: When information travels to and from the Database, Oracle Advanced Security Option provides a high level of security by offering support for the following encryption standards:

* RC4 (40, 56, 128, and 256 bits)
* DES (40 and 56 bits)
* 3DES (2 and 3 keys)
* AES (128, 192, and 256 bits)

Oracle Advanced Security also protects the integrity of information, making sure the message hasn't been modified since it left the source by adding an encrypted digest to the message, using either SHA-1 or MD5.
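The general idea of protecting a message with a keyed digest can be sketched with Python's standard library. This uses HMAC with SHA-1 as a stand-in technique to show how tampering is detected; it is not a description of the wire protocol used by Oracle Advanced Security, and the key and message below are hypothetical.

```python
import hashlib
import hmac


def make_digest(message: bytes, key: bytes) -> bytes:
    """Compute a keyed SHA-1 digest that the sender attaches to the message."""
    return hmac.new(key, message, hashlib.sha1).digest()


def is_unmodified(message: bytes, digest: bytes, key: bytes) -> bool:
    """Recompute the digest on receipt and compare it in constant time."""
    expected = hmac.new(key, message, hashlib.sha1).digest()
    return hmac.compare_digest(expected, digest)


if __name__ == "__main__":
    key = b"hypothetical-shared-session-key"
    message = b"UPDATE payroll SET amount = 100 WHERE id = 1"
    digest = make_digest(message, key)
    print(is_unmodified(message, digest, key))
    print(is_unmodified(message.replace(b"100", b"900"), digest, key))
```

A receiver that recomputes the digest will reject any message altered in transit, because the attacker cannot produce a matching digest without the key.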

Strong authentication: Passwords alone are often not secure enough, and are known for their high TCO. Combined with the weak security they provide, they should not be used on their own. Two-factor (or "strong") authentication is based on something the user has (a smart card, token, etc.) and something she knows (a PIN or passcode). Oracle Advanced Security supports the following industry-standard authentication methods:

* Kerberos
* RADIUS (Remote Authentication Dial-In User Service)
* Secure Sockets Layer (with digital certificates)
* PKI


Oracle Data Guard

Oracle Data Guard offers SHARE the management, monitoring, and automation software infrastructure that creates, maintains, and monitors one or more standby databases to protect SHARE data from failures, disasters, errors, and corruptions, thus enabling a modern and automatic Business Continuity plan.

Data Guard Key Benefits:

Disaster recovery and high availability

Data Guard provides SHARE an efficient and comprehensive disaster recovery and high availability solution. Automatic failover and easy-to-manage switchover capabilities allow quick role reversals between primary and standby databases, minimizing the downtime of the primary database for planned and unplanned outages.

Complete data protection

Standby databases also provide an effective safeguard against data corruptions and user errors. Storage level physical corruptions on the primary database do not propagate to the standby database. Similarly, logical corruptions or user errors that cause the primary database to be permanently damaged can be resolved. Finally, the redo data is validated at the time it is received at the standby database and further when applied to the standby database.

Efficient utilization of system resources

A physical standby database can be used for backups and read-only reporting, thereby reducing the primary database workload and saving valuable CPU and I/O cycles. This feature would greatly enhance and standardize SHARE’s currently fragmented scenario. In Oracle Database 10g Release 2, a physical standby database can also be easily converted back and forth between being a physical standby database and an open read/write database. A logical standby database allows its tables to be simultaneously available for read-only access while they are updated from the primary database. A logical standby database also allows SHARE users to perform data manipulation operations on tables that are not updated from the primary database. Finally, SHARE can create additional indexes and materialized views in the logical standby database to optimize reporting performance.

Flexibility in data protection to balance availability against performance requirements

Oracle Data Guard offers the maximum protection, maximum availability, and maximum performance modes to help SHARE balance data availability against system performance requirements.

Protection from communication failures

If network connectivity is lost between the primary and one or more standby databases, redo data cannot be sent from the primary to those standby databases. Once connectivity is re-established, the missing redo data is automatically detected by Data Guard and the necessary archive logs are automatically transmitted to the standby databases. The standby databases are resynchronized with the primary database, with no manual intervention by the SHARE database administrators.
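The gap detection described above can be pictured with a simplified Python model in which archive logs are identified by increasing sequence numbers. This is a conceptual sketch only and does not reflect how Data Guard is implemented internally; the function name and the sequence numbers are hypothetical.

```python
def find_missing_sequences(applied_through: int, received: list, primary_current: int) -> list:
    """Return the log sequence numbers the standby still needs from the primary.

    applied_through: last sequence number already applied on the standby
    received: sequence numbers that have arrived at the standby but are not yet applied
    primary_current: latest sequence number archived on the primary
    """
    already_have = set(received)
    return [
        sequence
        for sequence in range(applied_through + 1, primary_current + 1)
        if sequence not in already_have
    ]


if __name__ == "__main__":
    # The network dropped while logs 103 and 104 were in flight; 106 is newly archived.
    missing = find_missing_sequences(applied_through=100, received=[101, 102, 105], primary_current=106)
    print("Logs to request from the primary:", missing)
```

Once the missing logs arrive, the standby can apply everything in order and is again synchronized with the primary.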

Centralized and simple management

Data Guard Broker automates the management and monitoring tasks across the multiple databases in a Data Guard configuration. SHARE administrators may use either Oracle Enterprise Manager Grid Control or the Data Guard Broker’s own specialized command-line interface (DGMGRL) to take advantage of this integrated management framework.

Integrated component of the Oracle database

Data Guard is available as a feature of the Enterprise Edition of the Oracle Database. SHARE can use it in combination with other Oracle High Availability solutions such as Real Application Clusters, Oracle Flashback, and Oracle Recovery Manager.

Oracle Data Guard Architecture

[Figure: Oracle Data Guard Architecture. The production database ships redo (sync or async) over the network to a physical standby database (Redo Apply; used for backup) and to a logical standby database (SQL Apply transforms redo to SQL; open for reports, with additional indexes and materialized views). The Broker spans the configuration.]

Description of Services and Deliverables

A. Services

1) Stabilization Assessment Review
   a) Analyze the current 10g RAC on AIX configuration, including clusterware, storage, interconnect, and database systems
   b) Prepare and execute operational ("resiliency and outage") testing to identify instabilities in the clusterware, database, operating system, and network
   c) Provide a Stabilization Assessment Report based on the above activities
   d) Provide up to 40 hours of assistance in resolving the identified areas of instability as described in the Stabilization Assessment Report

2) Installation & Configuration of Oracle 10g Products
   a) Install and configure Oracle Partitioning on two tables
   b) Install and configure Oracle Tuning, Diagnostics, and Configuration Packs on the Management Server

3) Enterprise Technical Architecture Review
   a) Perform a high-level workload analysis of the current system
   b) Identify possible database and applications tier configurations and performance issues
   c) Document system baseline configuration and determine the necessary architectural design requirements to improve performance and stability
   d) Provide a Technical Architecture Report based on the above activities
   e) Provide up to 40 hours of assistance in resolving the identified areas of contention as described in the Technical Architecture Review Report

4) Performance Assessment Review
   a) Analyze the operations of the Oracle database and its interaction with existing applications
   b) Identify and document workload characteristics in an effort to understand the database, network, and client baseline user workload
   c) Analyze WebLogic/WebSphere configuration settings and logs
   d) Analyze servers: memory, processors, OS-level settings
   e) Analyze domain-level settings and Tuxedo log
   f) Analyze Process Scheduler
   g) Analyze network, file servers, batch servers, and reporting servers
   h) Analyze end user usage patterns
   i) Analyze the current hardware and software technical architecture for performance constraints
   j) Identify and document additional technology areas of Oracle 10g that may benefit service levels; areas include high availability, disaster recovery, application and data connectivity to Oracle
   k) Identify and document the application and technical architecture operational and management procedures
   l) Provide a Performance Assessment Report based on the above activities, highlighting findings that include strengths and weaknesses in proposed areas, recommended technologies, and proposed future work plans
   m) Provide up to 40 hours of assistance in resolving identified areas of performance constraints as described in the Performance Assessment Report

Deliverables and Applicable Acceptance Procedures, if any.

Deliverable 1: Stabilization Assessment Report

Completion of stabilization assessment review as outlined in Section 1.A.1) above.

Acceptance Criteria for Deliverable 1: Delivery of Stabilization Assessment Report.

Deliverable 2: Installation and Configuration of Oracle 10g Products

Completion of Installation and Configuration of Oracle 10g Products as outlined in Section 1.A.2) above.

Acceptance Criteria for Deliverable 2: Completion of Installation and Configuration of Oracle 10g Products, including Oracle Partitioning on two tables and Oracle Tuning, Diagnostics, and Configuration Packs on the Management Server.

Deliverable 3: Enterprise Technical Architecture Report

Completion of Enterprise Technical Architecture Review as outlined in Section 1.A.3) above.

Acceptance Criteria for Deliverable 3: Delivery of Enterprise Technical Architecture Report.

Deliverable 4: Performance Assessment Report

Completion of Performance Assessment Review as outlined in Section 1.A.4) above.

Acceptance Criteria for Deliverable 4: Delivery of Performance Assessment Report.


Appendix B

NM SHARE Technical Architecture/Infrastructure Questionnaire

1. Architecture Foundation

1. Do you have the conceptual Architecture Diagram?
Maximus Diagram (sent from Bryan Emory)
(John Prihoda to review any diagrams)

2a. Can you provide a diagram describing the current application architecture? (George to provide)

2b. Can you provide diagrams describing the network and hardware infrastructure for the enterprise? Supplied by Kelly

3. Are detailed layouts and specifications for each system available for review?
*** This is key. Any detail about the WebLogic and Tuxedo systems will be helpful (LPAR sizes, RAM, versions, etc.).

4. How many data centers are you currently utilizing? What are their locations and primary function?

1 - Physically at DOIT

5. Are the operations of any of these data centers outsourced? If so, to which organizations?
No (as of June 30th). Maximus is currently doing very little database work and is mainly supporting the PeopleSoft application. Minimal database tuning is happening.

6. Do you have any outsourcing arrangements for items such as departmental functions or web hosting? Maximus for just application. Will end on June 30th.

7. What are the mission critical applications (Which PeopleSoft Modules)?
The NM SHARE PeopleSoft implementation consists of two main modules, HR and Financials. Within those two modules, the mission critical functions are broken out as follows:
HR (paying employees is the most critical function SHARE performs):
   Payroll
   Time and Labor
   Recruitment (not necessarily mission critical, but important to consider in terms of a holistic high availability plan. Every state hire goes through this system; ideally 24 x 7)
FIN: The complete process of paying vendors (invoice to check is essential). A little more leeway in downtime than HR, but still mission critical.


8. Do you have a standard toolset, platform and methodology for development? Describe.
Operating Systems are AIX and Windows
A standard development methodology does not exist
DOIT has a development team and uses a small number of consultants (ACRO, Accutech)
PeopleTools 8.46
Citrix sits behind a firewall and is used as a gateway for the developers to access common tools (PeopleTools, SQL Developer) in a secure and controlled fashion; this also alleviates maintenance issues on individual developer machines. Citrix is also used for some offsite people
The Citrix farm consists of two servers used by 6 users.
The Citrix environment is unstable, often not accessible or slow. The system is currently being redesigned to address these issues.
SQL Developer is used to build custom queries and perform routine application development like adding indexes.
SQL Developer and custom queries are often run directly against the production database.
Prakash Bhakta from SPO mentioned problems between HR and FIN and the Integration Broker, but I didn't capture detail.

9. Are backup operations for distributed functions centrally controlled?
Nothing centrally controlled.
Developers doing their own code backup.
No coordinated effort for backing up the system.
Centrally controlled from DOIT, but performed by different groups
   Neil is responsible for all Windows Server backups (I might need clarification on whether Neil is part of Kelly's team)
   Kelly Mentzer's team is responsible for UNIX server backups

10. What is the timeframe that IT services and applications are required to be online during a given day?
(John to look into the SLAs)
The expectation is 24 x 7, 365
Realistically, the model is closer to "6 – 16's": 6 a.m. to 10 p.m., Monday – Saturday
The SLA specifies that the system can be taken down for maintenance and backups once every two weeks
Alex bounces the Financial web servers once a week on Saturday night to clear the cache (HCM is challenging due to continuous processes)
(Please clarify whether or not this is only for Financials and not inclusive of the HCM servers)
Gary Graves schedules the HCM batch processes
Alex schedules the Financials batch processes

11. Who are the primary server vendors (manufacturers)?
IBM; Juniper; Cisco Switches 3Com; SAN is an IBM DS4800


12. Approximately how many servers in production are mission critical (database and application tier)?
Database (3):
   i. santafe-a
   ii. santafe-b
   iii. roswell-a (maintains a Reports Repository)
Application/Web Tier (2):
   i. taos
   ii. gallup
NFS Server (1):
   i. tucumcari
File Servers (2):
   i. nambe
   ii. alamos (was training - UPK)
      a. both servers run psnt for FIN and HCM

13. Do all of the mission critical servers have redundant or backup power to meet availability requirements? Describe.

Yes. Redundant power with a generator backup.

14. Do you have a documented set of application development standards? Describe
Standards were part of the implementation.
Bruce Walker and his team use them for documentation.
There is a migration process, but not from a database perspective.
Most updates are extracts (DBAs do not do this)
Methodology: "DR" -> Development Request.
   a. Initially outlines the user request
   b. A Technical assessment and a Business Function assessment are conducted, which also includes an impact assessment.
   c. Requires signoff at the business and technical level.
   d. There is a signoff before any change is moved to production.
   e. The strategy falls down at the Q/A step, as resources to actually perform the Q/A are scarce and many changes go into production without thorough quality controls
   f. A methodology exists; execution is problematic

15. Do you maintain separate development, test, QA, and production environments? Describe
The environments exist, but they're not used in the traditional way (Mike Neitzey)
A Training environment does not exist and is needed (the state owns UPK)
The one consistency between the development, training, and production databases is that the data is the same
Development database (Hobbs) is 10.2.0.3
Production and Q/A databases (santafe-a, santafe-b, roswell) are 10.2.0.2
Testing is problematic because developers do not run the programs. It would be nice if DBAs could perform this themselves.
Q/A servers (roswell-a, roswell-b) run RAC; development (hobbs) does not.
Development: Prod and Demo are accurate, the rest of the systems are development
HR Environments:
   Prod, Qa, Test, Dev
   qb, qc, qz, qo (qo is for demo – vanilla PeopleSoft HR)
FIN Environments:
   Prod, Qa, Test, Dev
   qc, qz, qo (qo is for demo – vanilla PeopleSoft Financials)
George will track down the rest

16. Is the test environment capable of providing test scenarios that accurately represent production?
No. Processes that run efficiently in test take many hours in production
roswell-a was intended to be a Q/A server but is now serving a mission critical role as a report repository for production
The data in all environments is a full copy of the production database (Is using a subset of production an option for certain development or test environments?)
Oracle databases are inconsistent across environments. Initialization parameters across instances are different.

17. What operating system(s) do these applications and databases run on?
AIX 5.3; different servers run different patchset levels, but the base on all is 5.3
File Servers are Windows

18. Are formal project management processes used to manage operations? Describe.
3 Project Mgrs -> Mike Neitzey PMP for SHARE, Kelly Mentzer is the operations manager responsible for infrastructure

19. Describe the change management system?
No formalized Change Management Process.
People can change the system without a tracking mechanism in place.
Application Change Management is controlled through the DR process.
Leo Hunt approves any changes in Financials and Payroll before they move to production
Rick Byrnes approves any changes in HR
Kelly Mentzer approves any technical changes.
On the technical side, there are no formal documents that travel with the process.
The Technical Review team reviews any changes and then they go to the steering committee.

20. What is the current data storage capacity? What percentage of that capacity is currently being utilized?
? 2TB total (should be more like 20TB)
1 TB for production
70% of production storage is being used
< 10% of all storage is left? (Chris Rodriguez to supply further details)

21. What best describes the current mission critical disk storage (i.e. JBOD, RAID, SAN/NAS, etc.)?
RAID 5 on IBM DS4800 SAN.
Oracle Automatic Storage Management is used for all databases except for the Oracle Enterprise Manager repository on the hatch server

21a. Please describe the storage architecture (vendor, number of disks, number of HBAs per server) and firmware patch levels.
IBM DS4800 SAN
Production has ≈ 27 disks involved
2 HBAs per server.
P570s are up to the latest firmware patch.
Brocade switches and SAN box were to be upgraded this weekend, but this will now probably happen in July

22. Describe the company's current backup and recovery process (i.e. server based or network based, media, frequency, etc.) Issues:
RMAN - Weekend Level 0 (full) to an NFS shared filesystem, 1 a.m.
RMAN - Weekdays Level 1 / 2 (incremental), 1 a.m.
NetBackup copies the RMAN backups to tape on a daily basis
Daily tapes are picked up and taken to the Records Library nightly
No cold backups except for maintenance
Full RMAN Level 0 takes around 4 hours
Level 1 & 2 take around 1 to 2 hours
Once in a while there are production impacts, but this is rare.
Every 2nd Wednesday morning (1 a.m.) the backups are turned off so they will not interfere with Payroll.
All backups are managed by Grid Control
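The backup calendar in the answer above can be written down as a short Python sketch. Several points are assumptions rather than facts from the questionnaire: "weekend" is taken to mean Saturday and Sunday, "every 2nd Wednesday" is read as every other Wednesday counted from a known payroll Wednesday, and the function name and the sample dates are hypothetical.

```python
from datetime import date

WEDNESDAY = 2  # date.weekday(): Monday is 0


def backup_plan(day, payroll_wednesday):
    """Return the RMAN activity expected for the 1 a.m. run on `day`.

    payroll_wednesday: any Wednesday on which backups are turned off;
    every 14th day from it is treated as a payroll Wednesday (assumption).
    """
    is_payroll_day = (
        day.weekday() == WEDNESDAY
        and (day - payroll_wednesday).days % 14 == 0
    )
    if is_payroll_day:
        return "skipped (payroll)"
    if day.weekday() >= 5:  # Saturday or Sunday
        return "level 0 (full)"
    return "level 1/2 (incremental)"


if __name__ == "__main__":
    anchor = date(2008, 1, 2)  # hypothetical payroll Wednesday
    for d in (date(2008, 1, 2), date(2008, 1, 5), date(2008, 1, 9)):
        print(d.isoformat(), backup_plan(d, anchor))
```

Making the rule explicit like this makes it easier to see which nights have no recovery point newer than the previous day's backup.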

23. Does the backup and recovery process support full and incremental file and database backups?

Yes

24. Describe the archival system and processes for inactive data or aged data access?
No archival system in place
1 – 5% a month data growth.
NMDOT's system will impact size on Projects because Project tables are larger in general and can grow exponentially.
Project tables are often compressed, but currently no data is compressed
NM owns Partitioning but it is currently not being used
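To put the growth rate in perspective, the following Python sketch estimates how many months of compound growth fit into the remaining space. It assumes the tentative figures from question 20 (1 TB of production storage, 70% used) and that growth compounds monthly at a constant rate; the function name is hypothetical, and the result is an order-of-magnitude estimate rather than a capacity plan.

```python
import math


def months_until_full(used_tb, capacity_tb, monthly_growth):
    """Months of compound growth until used space reaches capacity.

    Solves used * (1 + g)^n >= capacity for the smallest whole n.
    """
    if used_tb >= capacity_tb:
        return 0
    months = math.log(capacity_tb / used_tb) / math.log(1 + monthly_growth)
    return math.ceil(months)


if __name__ == "__main__":
    for rate in (0.01, 0.05):
        print(f"{rate:.0%} per month:", months_until_full(0.7, 1.0, rate), "months")
```

Under these assumptions the two ends of the stated 1 – 5% range differ by a factor of more than four in available runway, which is one reason an archival and partitioning strategy matters.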

25. How often do you archive data?
No archival system in place
It is important to consider future plans for the PeopleSoft EPM solution and the need for an archival strategy (EPM will be used for Financials and Supply Chain)

26. Does adequate off-site storage for backup and archival media exist?
All tapes get shipped nightly to the Records and Archive building.
DOIT has a successful method for moving hard copy of tape backups.
Tape backups are kept for one month.
Design needs to be approved to go back 13 months.

2. High Availability

1. How critical is high-availability in regard to the line of business systems? Describe for critical systems:
A - Not critical; downtime of 24 hours or more is of little concern.
B - Low criticality; downtime of 4-24 hours may affect productivity, but will not have a major business impact.
C - Moderate criticality; downtime of 1-4 hours could result in unacceptable costs, deferred or lost sales, etc.
D - High criticality; any downtime could lead to significant financial harm, exposure to litigation, customer loss, etc.

In practice it is C. SLA may say D (John Prihoda is looking into this)
Realistically, there are certain times throughout processing that are more mission critical than others.
If the system is down for ½ an hour it is acceptable, which goes back to the 6am – 10pm processing time by 6 days. Saturday 8 – 5.
"16 x 6" is the operating window as opposed to 24 x 7.
If FIN is down, it is not a disaster as long as data is not lost and recovery is not necessary
Anthony Armijo says 1 week is an acceptable window for Financials downtime
An SLA does not exist from DOIT to the end user
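The difference between the "16 x 6" operating window and a 24 x 7 expectation is easy to quantify. The Python sketch below is simple arithmetic under one assumption: every one of the six days is treated as a full 16-hour day, so the shorter Saturday (8 – 5) mentioned above is not modeled. The function names are hypothetical.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def weekly_window_hours(hours_per_day, days_per_week):
    """Hours per week during which the system is expected to be online."""
    return hours_per_day * days_per_week


def window_share_of_week(hours_per_day, days_per_week):
    """Fraction of the calendar week covered by the operating window."""
    return weekly_window_hours(hours_per_day, days_per_week) / HOURS_PER_WEEK


if __name__ == "__main__":
    hours = weekly_window_hours(16, 6)
    share = window_share_of_week(16, 6)
    print(f"16 x 6 window: {hours} hours per week ({share:.1%} of the week)")
```

The hours outside the window are what is available each week for maintenance, backups, and batch work without affecting online users.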

2. Based upon the current configurations, how quickly could you realistically recover from a mission critical system failure: A - More than 24 hours. B - Within 4-24 hours. C - Within 1-4 hours. D - In less than 1 hour

santafe-b went down for a week, but santafe-a stayed running
Impact to user was slower response.
Junipers handled balancing users to santafe-a
ABBA Technology is reconfiguring Juniper, SAN Storage, and the Citrix Servers
It is unknown how long it would take the state to recover from a complete service loss
A complete system outage has never happened
An RFP is currently in process to evaluate Business Continuity solutions like SunGard to provide Cold and Hot Disaster Recovery
The Business Continuity Unit is running the RFP
When santafe-b went down, it took about 2 hours to move the Process Scheduler over to santafe-a
The answer to the above question is ultimately "B" at this time. A major outage would take about 24 hours to recover. With the enterprise SAN this would take about 1-4 hours.
Single Points of Failure (SPFs):
   DS4800
   roswell-a
   tucumcari

In the current production RAC environment, OLTP users are load balanced and protected from SPFs, but the Process Schedulers are dedicated to either santafe-a or b and are not protected from SPFs.

3. What technology is the company currently utilizing for system high availability (clusters, standby servers, etc., manufacturers)?

RAC
Oracle Clusterware
Juniper
Mirrored web/app servers

4. Which statement best describes the current use of high-availability technology: A - None; if a server fails, a new one must be purchased or an existing one redeployed. B - Cold failover; standby server must be booted, volumes loaded, database mounted, etc. C - Warm failover; new server automatically brought online, users must reconnect. D - Hot failover; immediate failover, users transferred to new server automatically with no downtime.


Interactive/OLTP users (time entry) are load balanced across santafe-a and santafe-b
Batch processing:
   HCM runs on santafe-a
   Financials runs on santafe-b

On a daily basis, FIN is heavier use, but every two weeks HCM (payroll) is heavier.

Juniper load balancers can, and have been, a bottleneck at times.

5. Describe your database archiving strategy.
Covered in section 1
Archivelog on production.

6. Describe your backup/ recovery strategy. Covered in section 1

7. How often has your backup/recovery plan been tested?
Database has been tested but the system as a whole has not.

8. Describe the database patching strategy. Database has not been patched for over a year due to Maximus requirements

9. Which statement best describes NM’s current degree of disaster recovery capability: A - None; if we suffer a catastrophic site failure, it's likely we're out of business. B - Slow recovery; databases, etc would be reconstructed from tape. We would be up in a few days. C - Quick recovery; systems would need to be brought online, IP addressing changed, etc. Most likely up the same day. D - Immediate recovery; our DR site would be immediately online with little or no interruption

B. Call Records to bring over the tapes. Any major datacenter outage takes a long time to recover from.

10. Based upon the above criteria, what is the desired state of disaster recovery capability?
Desired state is C, affordable
Minimal data loss is a concern.

11. Briefly describe the technologies or vendors used in the DR strategy.
NetBackup
Tape drive is IBM (2 drives but will hold about 40 tapes)

12. How often do you test DR procedures? If partial tests, describe the differences that are not tested.

DR procedures are not tested

13. Do you have redundant Internet and/or WAN connections? Describe. Yes, both redundant Internet and WAN

14. Do you have Service Level Agreements for availability, performance, and capacity in place? Please describe in detail and any penalties if possible.

The SLAs exist (John Prihoda will look into specifics).
DOIT supports the SLA to DFA
An SLA from DOIT to the end user does not exist

3. Capacity, Scalability & Performance

1. What is the current user population? Describe the capacity plan that outlines expected growth of the business and impact to IT systems over the next 18 months, 3 years, 5 years?
Current:
   24000 time entry users
   4000 FT financial users
   1000 HR users
HR will stay fairly level
FIN will increase but uncertain how much

2. What are the key factors to the IT systems for this growth, user population, web transactions, increased sales, new stores, etc….?  

Considerable growth is not expected

3. Can the systems accommodate this change in their current state?
Actual signon is probably down to about 18000 on the HR side and 6-7000 on the FIN side.
FIN is higher concurrency than HR except for a couple days a week.

4. What are the common user issues with IT and system performance and availability? Is this limited to a particular time or location?  

Performance problems:
   o Banking system, auto-reconciliation/semi-manual are not working properly, cannot reconcile
   o Budget checking
   o Financials, generally, has more performance issues than HR
Most calls come from users not being able to log in; application is unavailable
Batch processing takes a long time
Reports -> snowball into a queue of long running reports that Alex has to manually cancel
Failing processes can cause orphan records, which have to be manually deleted
Application deadlocks
At one time, the system was so overloaded that the database crashed

5. What are the IT department's complaints about system performance?
See above

6. What tools do you use to monitor the system performance and system resource utilization? Grid control

7. What tools do you use for system performance modeling? Not used

8. What is the current system utilization and peak processing periods for the critical systems?
Every two weeks for payroll
FIN monthly and yearly processes
   o End of month and year.
   o Beginning of year for fiscal processing.
Peak time for both FIN and HR is 8-4.

9. Do you utilize load balancing on mission critical systems to maintain performance? If so, please describe.

Two levels of load balancing:
   Database Level: Oracle Automatic Workload Management
   Web/Application Level: Juniper Redline Load Balancers

10. What are the existing network and system architecture growth limitations?
Growth is not a concern.

11. Do you conduct regular capacity planning meetings for both the systems and the network? No

12. Who is primarily responsible for performance and capacity planning? Elisa Storie

13. What is the expected response time from the user community? (John Prihoda to see if this is called out in the Project documentation)

16. Are the current performance levels within an acceptable range? Are you meeting all the SLA’s?

No, performance is not acceptable

17. Are key performance parameters tracked and monitored? How frequently? No

18. Is comprehensive historical performance data maintained? Describe how this is used in capacity planning.

No

19. Does heterogeneous system and application performance data exist in a common normalized format?

No

21. Are there any special circumstances (weekly, monthly seasonal activities, etc.) that highly impact system utilization? If yes, what systems are affected and when?

Explained above in question 8

22. How frequently do you receive ad-hoc requests from the business that consume IT staff and compute resources? Describe the prioritization and procedure for managing these requests.

Developers will go to Fernando to run a query or DDL/DML for them
Alex, the PS Administrator, will do similar.
DBAs will get permission (privilege) requests

23. Have the business units detailed critical business functions, schedules and performance expectations? Describe.

FIN – Budgets, fiscal year open and close.
Purchase Req, Order, Voucher and the underlying processes that support that -> budget approval, timely and accurate reports – SOX compliance is a driving force
HR – Payroll, Time and Labor, bi-weekly and annual
Recruitment – SPO is charged with a rule to have equipment up and running

24. Which sites/locations generate the majority of usage on your system? Not currently tracked

25. What drives the usage from these sites/locations (geographic location)?


26. What is your daily global/regional window of on-line operations? Realistically, 6a.m. to 10p.m.

27. What is your daily global/regional window for batch processing?
Batch processing occurs all day
Budget checking is a batch that runs daily, every hour or so between 8 and 5 (this consumes fewer resources than doing it on demand)
Every day between 3:30 and 6:00 p.m. checks are cut

4. Security

1. How important is security to SHARE?
Security is very important to the SHARE Project
Security is handled by the PeopleSoft system

2. What concerns do you have about the security?

3. Are security monitoring capabilities centralized?
Monitored and controlled from PeopleSoft

4. Is security compliance a standard requirement for new applications?
Yes. Not specific to industry standards necessarily, but to the PeopleSoft environment.

5. Have security procedures and measures been reviewed? What is the procedure supporting user maintenance?

Again, defaulted to the PeopleSoft software

6. Has developer’s access been removed to protect the production instance from unauthorized access?

No, Developers can still do development in prod.

7. Which of the statements below best describes the NM's current strategy for user authentication: A - User ID & static passwords. B - User ID & dynamic passwords (password expiration). C - Software based tokens. D -Hardware based tokens, digital certificates, or biometrics

B
Security roles are decentralized
New Users are centralized
Central security team audits the security list

5. Systems Management

1. Are the current systems adequate to support the help desk mission?
TeamTrack is the current HelpDesk system for both DFA and DOIT
Currently, the system monitoring tools do not integrate with TeamTrack

2. Do you have defined system monitoring procedures? Only for the database (not documented)

Automatic Database Diagnostic Monitor (ADDM) runs at an hourly interval.
Statistics are run as a cron job every week.


3. Is system monitoring done proactively to anticipate potential problems? Describe in detail.

Kind of, using Grid Control to keep an eye on things.
Thresholds and notifications have been established to alert DBAs in a proactive manner
Nothing implemented on the Web/Application Tier

4. How quickly are you notified of system problems (performance, errors, etc)? Database notifications are in near Real-time for certain events

5. How do you track and manage changes made to the systems including patches, upgrades, databases, applications, etc…? Describe the tools and CM process.

A formal Change Management process does not exist
DR system is used for Application changes

6. What tools are used to detect and analyze problems on the systems?
Oracle Enterprise Manager Grid Control
UNIX topas

7. How do you track and manage system problems that have occurred on the systems and provide historical trend reporting?

Not currently using

8. Do you currently have a system management suite in place (Grid Control, Unicenter, Tivoli, OpenView, etc.)? Which one?

Oracle Enterprise Manager Grid Control (OEM GC) is in place
The state owns many components of the above OEM GC tool that are not being utilized

9. Describe the level of integration between system monitoring tools, for OS, application, db, hardware, and problem and help desk management systems.

These systems are not currently integrated, but the infrastructure is there with OEM GC

10. What capabilities exist to manage systems remotely? Remote management is available in the current infrastructure (OEM GC)

11. Describe the level of centralization of the system management functions.
OEM GC monitors the AIX systems and the databases
The WebLogic Application Servers and Tuxedo are not being monitored

12. Do backup or redundant system management capabilities exist? No

13. Are the system management tools able to evaluate events to facilitate early warning of critical situations?

For the database, Yes
For the rest of the system, No

14. Are there user complaints related to system management (i.e. system changes take too long, etc.)?

No

15. What are the IT department's complaints about system management?

System management does not exist outside of the database

16. Are all calls for end user support logged at the Help Desk? Yes, using TeamTrack

17. How are end user and system problems escalated to resolution? Describe the prioritization schema for problems i.e. urgent, critical, important, and how the SLA’s relate to the prioritization.

TeamTrack is used for the Help Desk
A formalized procedure for database and system problems does not exist once they are submitted past the Help Desk.

18. How much time is spent on day-to-day management of production, test, and development environments? On a scale of 1 to 10, how automated are these day-to-day functions?

5 on a scale of 1 – 10 most of the time

19. How much time is spent troubleshooting day-to-day for production, test, and development environments? On a scale of 1 to 10, how automated are these day-to-day functions?

5 on a scale of 1 – 10 most of the time.
Maximus still does the majority of the troubleshooting

Appendix C

Draft Database Architecture
