Capricorn - One of a series of white papers on the role of IBM eServer
January 2003
Capricorn
Six Case Studies for Priority Investments in the
Corporate Server Infrastructure
David G. Heap
Principal IT Consultant
IBM Enterprise Server Group
Somers, NY, [email protected]
Contents
Executive Summary
Introduction
The Six Case Studies
Case Study 1
Case Study 2
Case Study 3
Case Study 4
Case Study 5
Case Study 6
Next Actions...

Executive Summary
This white paper describes six recent, real case studies that show the current
environment, alternative outline solutions and the financial benefits of consoli-
dating and integrating UNIX® and Microsoft® Windows® servers.
These cases include:
1: Microsoft Windows NT® infrastructure to Linux® on IBM eServer™ zSeries™
2: Solaris Web portal to Linux on IBM eServer xSeries™
3: Oracle data marts to AIX® on IBM eServer pSeries™
4: Windows intranet to Windows on IBM eServer xSeries
5: Solaris Web trading to Linux on IBM eServer zSeries
6: Windows to VMware on IBM eServer xSeries
They average a net 3-year saving in IT expense of over 50% compared with the
current “continue as usual” strategy. This saving includes the additional capital
investment required for the target solution.
This does not include any migration project costs. However, in all these cases
the transition cost is relatively modest or the target IT cost point is extremely
compelling.
One major lesson learned from these cases is that the types of savings achievable
vary considerably by target platform. In some cases, the hardware is dramatically
reduced, in others the software, in others the people productivity is significantly
increased.
The business cases are based on actual client financial data and do vary signifi-
cantly. But this only serves to emphasize the need to plug in your own actual data
on hardware, software and people costs.
It is also clear from these cases, which were deliberately chosen to be typical and
representative, that there is no “one size fits all” answer. Most organizations will
need a blend of 3 or 4 of these solutions.
This is likely to be a combination of IBM eServer products, Linux and
WebSphere® software, as well as sound IT management process and tools. This
will help these organizations achieve a simpler, optimized IT infrastructure and
provide a sound foundation for future growth.
Introduction
Two years ago, the first “Scorpion”1 white paper outlined a method of server data
collection and IT metrics that can be used to establish an initial health check
of the IT server infrastructure. This still provides a valid and robust baseline
for initial analysis and this evolving method is described in the latest “Scorpion
Update” white paper2.
This white paper, “Capricorn,” includes new findings and outlines solutions and
case studies that show the current, powerful business case for investment in major
consolidation and integration of the server infrastructure.
Pressure on IT expense budgets has intensified. Most user executives are
looking for substantial business cases that show tangible business and IT savings.
This makes it harder to acquire the investment funding for very necessary IT
infrastructure projects.
In addition, many IT budgets have been cut back in relative if not absolute terms,
and the IT skills shortage remains acute. These factors have all increased the
need for automation, simplification and, above all, IT server solutions that reduce
overall service cost while retaining high quality service levels.
The methods of server demographic analysis have become much more focused,
and areas of solution opportunity can now be identified fairly easily by broad type
of server function. In addition, several new technology solutions have emerged,
including Linux, blades, partitioning and virtualization, which make the business
case for change even more compelling.
These new technologies can now enable solutions at a radically lower cost. The
3-year business case solutions that follow have much lower ongoing net IT
expense profiles, typically in the range of 30% to 60% lower, than following
today’s “current course and speed.”
This paper describes six real, but disguised, case studies completed in the last
18 months that show how major savings can be achieved in specific situations. It
is worth noting that all these cases are based on real financial data and realistic
cost allocations.
1 Scorpion - Simplifying the Corporate IT Infrastructure (October 2000), GF22-5168, on ibm.com
2 Scorpion Update - An Evolving Method of Analyzing and Optimizing the Corporate IT Server Infrastructure (January 2003), GM13-0189-00, on ibm.com
However, readers are strongly advised to look very carefully at their own actual
costs, charges and allocation methods. One of the primary findings of these
studies is that the current cost allocation model in many organizations does
not reflect the “true” economic cost of various solutions, and consequently is
distorting decision making.
The approach first identifies a “continue as usual” scenario, and defines what
hardware, software and people costs will be incurred over the next 3 years if the
current technology solution is pursued.
Alternative, technically viable, outline scenarios are then constructed and costed
for a variety of new technical solutions.
Technical Solutions and Strategies
There are four major technical strategies illustrated in these cases.
Virtualization is a proven technique that has been available in the form of VM
on the mainframe for many years. It is now also emerging as a powerful solution
on Intel® servers. This strategy is especially powerful when lightly loaded virtual
servers can be combined on the same physical server to achieve much higher
load factors. This approach is shown in cases 1, 5 and 6.
Many benefits can also be achieved by moving to the latest hardware technology,
whether these are Pentium 4 chips, blades, POWER4 or zSeries 900 (z900) servers.
This strategy exploits the continuing price and performance improvements in
system capability. In addition, new server availability features and tools help
increase IT staff productivity and improve service quality. These advanced
technology strategies are exploited in all these cases.
Third, there are manual techniques that often require a redesign of the applica-
tion, Web or data mart hosting strategy. Other examples include performance
tuning or workload management. Cases 1, 2 and 3 illustrate some of these
techniques.
Finally, there is a powerful move toward more portable code, through Java
2 Enterprise Edition (J2EE) WebSphere, encapsulation or open source Linux
applications. This strategy is illustrated in cases 1, 2 and 5.
The Six Case Studies
The six case studies have been selected as a representative and typical set of cases
that illustrate the technical solution areas.
Figure 1: 6 Case Studies: from Solaris or Windows to...
The chart shows the characteristics of each case. The second column shows the
type of application or function being considered for consolidation or integration.
The third column indicates the “current state,” for example, 39 Solaris servers.
The fourth column shows the target environment by platform and operating
system.
The final column shows the potential 3-year cost savings. These savings include
the investment cost in the target platform but not the costs of the actual
migration.
These case studies provide a useful checklist of practical technical strategies for
various target server platforms. In any one enterprise, you would expect to find
that three or four of these six cases present credible technical solutions with
a positive business case for a significant proportion of the current UNIX and
Windows server estate.
Case Study 1
Windows NT Infrastructure to Linux on IBM eServer zSeries
This case study illustrates not only an actual solution, but also highlights many of
the considerations in analyzing over 1100 Windows NT servers.
IBM was asked by this enterprise to identify specific target servers that might be
migrated to Linux on zSeries, to build an outline financial case and recommend
a course of action.
Initially it was difficult to get an accurate inventory of the Windows servers.
An historic listing proved to be very inaccurate, showing 1800 server names. A
sample audit estimated the total to be nearer 1100. Some servers had been retired
and others had been renamed and redeployed resulting in many “duplicate
names” for some servers.
This alone illustrates a problem in the IT management process and is an indicator
of the need to tighten the IT management processes for Windows servers.
A key part of the business case was an estimate of the NT support “people
productivity.” The 1100 servers were supported by 68 people. This is 16.2 servers
per person. In this enterprise, a system administrator is also costed at a fully
burdened rate of about $100,000 per year.
Of these 1100 servers, approximately 300 were targets for consolidation to a
central mainframe. These servers were either located in the central data center
or performed NT infrastructure functions such as Web serving (60), campus file
and print serving (140) and database, mail and other functions (100) that could be
moved easily to a more centralized environment.
The 300 target NT servers needed about 18 people to support them, or to put this
another way, each server will cost about $6,000 per year in people support cost.
Other ongoing costs (such as depreciation and maintenance) took the total cost
per NT server to $8,000 per year.
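The support-cost arithmetic above can be reproduced in a few lines. This is a minimal sketch that uses only the figures quoted in this case; the function and variable names are illustrative and are not part of the study's own tooling.

```python
def servers_per_person(servers: int, staff: int) -> float:
    """People productivity: how many servers one administrator supports."""
    return servers / staff


def people_cost_per_server(staff: int, servers: int, burdened_rate: float) -> float:
    """Annual people support cost attributed to each server."""
    return staff * burdened_rate / servers


# Figures quoted in the case study.
ratio = servers_per_person(1100, 68)          # roughly 16.2 servers per person
staff_for_targets = 300 / ratio               # roughly 18.5, quoted as "about 18"
cost_per_server = people_cost_per_server(18, 300, 100_000)  # 6000.0 dollars per year
```

Plugging in your own headcount, server count and burdened rate gives the equivalent per-server figure for your organization.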
A sizing exercise showed that these 300 physical NT servers could be replaced by
about 100 “virtual” Linux servers running under z/VM™ on a zSeries mainframe
equipped with 8 dedicated Linux engines (IFLs).
The number of NT servers was expected to continue growing at about 2-3 per
week in the “continue as usual” case. The Linux case also reflects this growth
by including additional engines as additional capacity is required. In addition,
because the 100 virtual servers can be administered by z/VM more efficiently, the
Linux case shows a significant reduction in people support.
The Business Case for Change
The business case compared the current “continue as usual” Windows NT
scenario with the Linux for zSeries solution. This case is worth considering in
detail because a number of techniques are used to simplify and speed up the
evaluation process.
The following chart illustrates many different points. The horizontal axis shows
time, in 6-month periods, from one year ago, to four years from now, the period
of the business case. Note that we are “now” at the point 1hy1. The vertical axis
shows the actual IT expense (in thousands of dollars) in each 6-month period.
Figure 2: IT Expense over duration of business case
The red line at the top shows the “continue as usual” case. It shows the current
“expense run rate” which grows as the NT server farm grows over time.
The most interesting line is the green line, which we call the “magic wand” line.
This shows the costs that could be incurred if you could wave a magic wand and
replace the existing servers with the new Linux for zSeries solution “overnight.”
This line shows the ongoing costs (people support, hardware maintenance,
software maintenance and any facilities costs).
This green line is very important because it gives an immediate sense of the
“prize” to be gained in terms of a radically new operational cost model. Clearly
if the magic wand line is not very attractive, it is unlikely the final business case
for change will fly.
Obviously in the real world, additional costs are incurred. The most significant
are the investment required in new hardware and software (in this case the
Integrated Facility for Linux (IFL) engines, and z/VM software) and any book
value write-off of any discontinued NT servers. This cost is shown by the orange
line, which also shows the additional investments required in years 3 and 4 to
accommodate growth.
The blue line shows the raw transition expense, as the NT infrastructure is
migrated over an 18-month period to the zSeries solution. This blue line is the
sum of the rapidly reducing support costs of the NT infrastructure and the startup
investment and ongoing support costs of the replacement solution. Usually this
raw cash flow spike is smoothed by leasing or by depreciating the newly acquired
assets over 3 or more years.
Business Case Conclusion
The Linux for zSeries solution shows an overall reduction in the planned 4-year
business case from a total of $14.8 million to $6.1 million, a saving of $8.7 million
or 59%. These costs include a total investment of about $2 million in the IFLs.
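The headline saving can be checked with a small helper. This is a sketch only; the function name is hypothetical, and the inputs are the two totals quoted above, in millions of dollars.

```python
def net_saving(as_usual_musd: float, target_musd: float) -> tuple[float, float]:
    """Return (absolute saving in $ million, saving as a percentage of the as-usual cost)."""
    saving = as_usual_musd - target_musd
    return saving, 100 * saving / as_usual_musd


# Case Study 1: 4-year totals quoted in the text.
saving, pct = net_saving(14.8, 6.1)   # about 8.7 and about 58.8
```

The same helper can be applied to the totals in the other cases, or to your own “continue as usual” and target figures.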
Remember that these numbers do not include migration costs. This is deliberate
at this stage. Estimating migration costs takes time, so we are using this “gross”
business case as an initial filter to decide whether to launch a more detailed
migration analysis.
However, at this stage we can make an estimate as to whether the migration is
likely to be “easy,” “moderate” or “hard.” In this case, “easy” means a relatively low
impact on the overall business case.
“Moderate” would typically make a significant dent on the gross savings, but the
expectation is that the project is still viable.
“Hard” often implies a major redesign or major migration. These migration costs
will usually make a significant impact on the short-term business case, but if this
project has a compelling longer term business case or is likely to be of strategic
impact, it may be the best course of action overall.
In this case study, the migration was assessed as fairly easy, and a
pilot project was initiated to validate the technical solutions, business case and
migration plan.
Case Study Conclusion
The final outcome of this project is very interesting. The Linux for zSeries case
covered only 300 of the Windows servers. What about the other 800? Four
hundred of the other Windows NT servers, dispersed across about 40 geographi-
cally remote locations, were running simple infrastructure functions such as
remote file/print, as well as some simple applications. These 400 servers had
very low utilization. Linux and some Windows on about 100 local servers (with
VMware on IBM eServer xSeries servers) was an excellent solution for these
remote locations.
The remaining 400 Windows NT servers were a mix of application and database
servers, in production, testing and development modes. These 400 servers were
targets for consolidating to 100 servers using the VMware and technology refresh
strategies that will be covered in later case studies in this paper.
The conclusion is that several strategies, described in these case studies, are likely
to be required to optimize the management of the total server real estate in any
organization. In this case, these combined strategies reduce about 1100 actual
servers to about 200.
Case Study 2
Solaris Web Portal to Linux on IBM eServer xSeries
This next case study is very different and illustrates a compelling case to move a
Web portal from Solaris to Linux on xSeries.
Current Configuration
The current environment at each site comprises 13 servers. At the front end, in
the public domain, Web serving is handled by 3 servers, with a total of 12 CPs
(central processors).
Figure 3: Large Web Portal: Solaris Production
There are 8 servers (36 CPs) in the trusted domain, which handle real-time data
feeds and Web application serving.
The data domain is handled by 2 servers (12 CPs) which are then linked into the
“back-end” transaction handling systems.
For contingency reasons, this configuration is replicated across 3 sites, making a
production total of 39 servers and 180 CPs. In addition, there are many additional
development, QA, test, and pre-production servers not included in this case.
Most of the existing servers are in the 400-500 MHz range, with only 9 servers (36
CPs) in the 750 MHz range. The average prime shift processor utilization across
all 39 servers is 13%, and the maximum is 29%.
Here the “maximum” is the sum of the maximum loads on each server, over
a 14-day period, divided by the total capacity of all servers. Clearly individual
servers can have maxima higher than this. For example, one of the 2-ways has
a 14-day peak of 45% and one of the 4-ways has a peak of 55%. This is a tiny
illustration of the complexity of capacity planning for such multi-server UNIX
environments. The bottom line is that these UNIX servers are relatively under-
utilized.
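The definition of the aggregate “maximum” can be made concrete with a short sketch. The fleet below is hypothetical (it is not the client's data), and each server's capacity is expressed in an arbitrary but consistent unit; only the method of summing per-server peaks and dividing by total capacity comes from the text.

```python
def aggregate_peak_utilization(servers):
    """Sum of per-server peak loads divided by total capacity.

    servers: list of (capacity, peak_utilization) pairs, where
    peak_utilization is each server's own 14-day maximum (0.0 to 1.0).
    """
    total_capacity = sum(capacity for capacity, _ in servers)
    summed_peak_load = sum(capacity * peak for capacity, peak in servers)
    return summed_peak_load / total_capacity


# Hypothetical four-server fleet, for illustration only.
fleet = [(0.8, 0.45), (1.6, 0.55), (3.2, 0.20), (3.2, 0.15)]
aggregate = aggregate_peak_utilization(fleet)
highest_single_peak = max(peak for _, peak in fleet)
```

In this made-up fleet the aggregate figure is well below the highest individual peak, which is the effect the text describes.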
The issue of using common, comparable metrics for measuring server capacity
and load in the UNIX and Windows world occurs time and again. This whole area
can be made very complicated and, in some cases, unnecessarily complicated to
get an initial “first-pass” solution.
Server performance, throughput and capacity depend on many factors such as
workload type (e.g. transaction, Web, mail, database serving etc.), architecture or
operating system (e.g. Solaris or Windows), gross server characteristics (e.g. MHz,
CPs) and actual server configuration (e.g. SMP, Bus speed, RAM, IO bandwidth).
However, for a given workload type (e.g. Web portal) and architecture (e.g.
Solaris), and for reasonably well-configured systems, a relatively simple, very
crude first pass at server capacity is CPs * GHz.
So, for example, an 8-way 400 MHz server would be 3.2 units of capacity.
Although crude, this is a very quick way of getting an initial estimate of capacity
and, when multiplied by actual server utilization, the processor load. Using this
crude measure, the installed capacity of the current Web portal is 31 units and
the load about 4 units.
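The crude metric is simple enough to express directly. The sketch below is illustrative only; the function names are hypothetical, and the inputs are the figures quoted above.

```python
def capacity_units(cps: int, ghz: float) -> float:
    """Very crude first-pass capacity: number of processors times clock speed in GHz."""
    return cps * ghz


def load_units(capacity: float, utilization: float) -> float:
    """Processor load: installed capacity multiplied by measured utilization."""
    return capacity * utilization


example = capacity_units(8, 0.4)       # the 8-way 400 MHz server: about 3.2 units
portal_load = load_units(31, 0.13)     # 31 units at 13% utilization: about 4 units
```

As the text notes, this is only valid within one workload type and architecture; a better performance metric, where available, can replace the `cps * ghz` product.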
Clearly if better comparative server performance metrics are available, such as
vendor performance claims, industry benchmarks or actual benchmarks, these
can be plugged into the above formula. This, of course, is essential when translat-
ing to different architectures or operating systems (such as Sparc/Solaris to
Intel/Linux), especially for large SMP servers.
In this case study, the enterprise was interested in identifying the outline costs
of an alternative Linux solution on IBM eServer xSeries servers. The object of
this exercise was not to make a radical change in portal design, or to audit the
existing design. It was merely to identify comparable costs of the current actual
configuration and the outline cost of an alternative configuration based on IBM
eServer xSeries servers running WebSphere and Oracle.
The first step in this process turned out to be relatively straightforward: a
box-for-box replacement of the existing 39 servers by 39 xSeries 1.6 GHz Intel
servers.
The suggested alternative is shown in the next chart. Using the very crude metric
described previously, this alternative configuration has an aggregate capacity of
about 64 units, about twice that of the existing configuration.
Figure 4: Large Web Portal: Linux for xSeries Production
Could this be optimized further? The answer is “Yes,” but the outline business
case is so compelling that it was not necessary to add any further detail, other
than to ensure that realistic software stacks and tools on each xSeries server were
available, and that there were no other readily identifiable technical bottlenecks.
Business Case
To make a valid business case, the comparisons were made with
WebSphere on both configurations. The chart shows the comparable 3-year cost
case.
This is an overall reduction in the planned 3-year server IT expense from $10.7
million to $2.8 million, a potential saving of $7.9 million or 74%. These expenses
include the investment of about $0.9 million in new xSeries servers. This would
be a project of “moderate” migration complexity.
The most important factors that drive the (high) cost of the current configuration
are:
a. The current chargeout rates and cost allocations that make the costs incurred
by users substantially higher than a realistic alternative cost allocation base
b. The very high hardware cost of the existing servers in terms of depreciation and
maintenance (all depreciation comparisons assume a “rolling” refresh using
current equivalent technology)
c. The reduction in software costs in this case occurs principally because of the
significant reduction in the number of CPs
d. And the reduction in “other” costs occurs principally because of the substantial
reduction in floorspace
Conclusion
This case highlights the reason for the current intense interest in replacing Web
portals and Web application serving with Linux on xSeries servers. It raises two
important observations.
First, the case is especially compelling when compared with many current user
chargeout models. This highlights why many IT organizations are now moving
toward a utility-based model for service delivery.
Second, although Linux is relatively new, and carries some technical risks and
costs of change, it is such a “game-changing” strategy with such attractive cost
characteristics, that many organizations have now implemented substantial pilot
projects.
Figure 5: Large Web Portal: 3 Year Business Case
Case Study 3
Oracle and Sybase Data Marts to AIX on IBM eServer pSeries
The next case study illustrates a compelling case to move Oracle (and Sybase)
data marts from Solaris E450s to AIX on pSeries 680s.
This is a very important illustration of technical consolidation considerations and
the implementation of a financial utility cost model in a data mart environment.
Current Environment
At the time of the case study (early 2001), this organization had about 100 Sun
E450 4-way servers being used to run about 35 Oracle data marts. These data
marts were all relatively small, less than 80 GB each, with 40-50 concurrent users
using each mart. Although not in the scope of this case study, there were also
many Sybase marts.
These are all important business-critical databases and each primary production
server requires hot failover and a remote “contingency” server. The primary
site is in a major financial district, with expensive floor-space, and the second
contingency site is less than 40 km away.
Based on historical growth, the expected increase in data marts was about 36 data
marts per year. Using the current implementation model, this would require an
additional 108 actual servers per year. With a projected 3-year total cost of $55
million for this “continue as usual” scenario, there was a significant incentive to
find alternative solutions.
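The “continue as usual” server growth follows directly from the figures above. This is a minimal sketch; the constant and function names are illustrative, and the three servers per data mart are the primary, hot failover and remote contingency servers described in the text.

```python
MARTS_PER_YEAR = 36     # expected growth quoted in the text
SERVERS_PER_MART = 3    # primary + hot failover + remote contingency


def projected_growth(years: int) -> tuple[int, int]:
    """Return (new data marts, new servers) after the given number of years."""
    marts = MARTS_PER_YEAR * years
    return marts, marts * SERVERS_PER_MART


one_year = projected_growth(1)      # (36, 108)
three_years = projected_growth(3)   # (108, 324)
```

The three-year result matches the 108 data marts and 324 servers used in the business case that follows.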
In addition, it is also worth noting that this base scenario would require an
additional 72 rack spaces, about 150 square meters of expensive floor space; that
the provisioning of a new data mart took several weeks; and there were significant
issues of high chargeout rates for these services with a desire to move to a more
“utility-based” service model.
Alternative Technical Solutions
Two major separate technical considerations are combined in the alternative IBM
solution to make very substantial savings and service improvements.
The first step is to define a “cluster of 6.” This single cluster locates 5 production
images at the primary site and one contingency image at the second site. (Other
service levels can be met by dividing a 6-way cluster into 3 production and 3
contingency images, and then by mirroring the entire cluster.) These strategies
can provide a robust primary production site with contingency capacity ranging
from 20% - 100% at the second site.
Clustering was achieved by using the IBM High Availability solution on pSeries
(HACMP). The net effect is to reduce the number of required servers from 324
to 189 (a 42% reduction) with a corresponding reduction in 3-year cost from $55
million to $32 million.
Data Mart Aggregation
The second step was to aggregate multiple Oracle data marts on a single AIX
image running on a large 24-way p680 server.
Data marts traditionally have very “lumpy” workloads. Averaged across most of
the day (e.g. 8am to 8pm) many data marts have an overall utilization of less than
5%. Although 90% of the queries to a data mart are usually “pre-canned” reports
or relatively simple queries, 10% of the queries are often very heavy and require
substantial computing capacity to be available “on demand.”
Putting this another way, a typical user may require “100%” of the server for 1
minute every 20 minutes, but use nothing for the remaining 19 minutes. This
gives an average utilization of 5%. The key question in capacity planning for
these marts is how often do these user requests for peak “minutes” coincide
(and does it matter)?
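One way to get a feel for how often peaks coincide is a simple binomial estimate. The sketch below rests on assumptions that are not in the case study: each data mart is treated as a single source that is busy 1 minute in 20, the marts are assumed to be statistically independent, and 18 marts share one server (the aggregation ratio discussed in this case). Real workloads are often correlated, so treat the result as a rough guide only.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """Probability that at least k of n independent sources are busy at once."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


DUTY_CYCLE = 1 / 20   # busy 1 minute in every 20, i.e. 5% average utilization
MARTS = 18            # assumed number of marts sharing one server

expected_busy = MARTS * DUTY_CYCLE                       # about 0.9 marts busy on average
p_three_or_more = prob_at_least(3, MARTS, DUTY_CYCLE)    # chance of 3+ simultaneous peaks
```

Under these assumptions, three or more simultaneous peaks occur only a few percent of the time, which suggests why a server with many times the capacity of a single E450 can absorb them.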
In this case, an initial, deliberately very conservative technical evaluation deter-
mined that aggregating 18 database instances to one powerful AIX p680 24-way
was feasible and also provided enough flexibility to allocate sufficient RAM to
individual more heavily-used databases.
This approach was possible for several reasons. The p680 24-way is substantially
more powerful than the E450 4-way (each p680 engine is 2-3 times more
powerful than an E450 engine and there are 6 times as many).
Second, the total and average processor and I/O load on the p680 can now be
higher. The p680 needs to be able to handle the “average” load of each mart
(less than 5% of an E450 for each mart) plus some estimate of the simultaneous
coincident user peaks.
The conclusion of this technical modelling was that it was very prudent to start
with a “rule of thumb” of 18 data marts per p680.
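The headroom behind this rule of thumb can be estimated from the ratios quoted above. This is a rough sketch, not the study's sizing model; the function name is hypothetical, and the 5% figure is the upper bound on average utilization given in the text.

```python
def e450_equivalents(engine_ratio: float, per_engine_factor: float) -> float:
    """Capacity of one p680 24-way expressed in E450 4-way servers."""
    return engine_ratio * per_engine_factor


engine_ratio = 24 / 4                         # 6 times as many engines
low = e450_equivalents(engine_ratio, 2)       # 12.0 E450 equivalents
high = e450_equivalents(engine_ratio, 3)      # 18.0 E450 equivalents

# 18 marts, each averaging less than 5% of an E450.
average_load = 18 * 0.05                      # under about 0.9 of one E450
headroom_for_peaks = low - average_load       # at least about 11 E450 equivalents
```

Even on the lower estimate, most of the p680's capacity remains available for coincident user peaks, which is consistent with the text calling the 18:1 ratio deliberately conservative.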
3-Year Business Case
Figure 6: IT Expense and Stages of growth
The red line shows the “continue as usual” case. This results in a total of 108 data
marts and 324 servers being installed over 3 years. Note that this total does not
include any Oracle software costs, which are substantial and usually based on the
number of servers or engines installed.
The blue line shows the total cumulative cost of following the “6:1 cluster”
approach. This reduces the number of failover and contingency servers consider-
ably (by 42%). The “upticks” in the cost line in Stages 3 and 5 are associated with
the provision of additional contingency capacity in the second site that needs to
keep pace with the growth in the production data marts at the primary site.
The green line shows the total cumulative cost of the IBM p680 solution with 18
data marts per p680 and 6:1 clustering.
Business and IT Benefits
The bottom line of the business case is a substantial saving of 60% in the
projected IT expense, dropping from $55 million to $22 million. This is based
principally on two changes in the technical strategy: failover and contingency by
6:1 clustering, and 18 data mart instances on one large p680.
This initial business case was compelling enough to trigger a more detailed
technical design and business case. This was followed by a decision in mid-2001
to implement three p680 servers with a target of 18 data marts on the production
p680.
The Outcome
There had been an early concern that mixing 18 data marts on 1 (very powerful
24-way) server was too optimistic. At the end of the first rollout phase, over 50
data marts were easily handled on one p680. At this time, 18 months later, there
are now well over 200 data marts on one p690 server.
This highlights the difficulty of capacity planning in the data mart environment.
It is very important to understand the 90:10 rule, and the big difference between
spiky peaks and average utilization in these marts.
Now the revised business case to the end of stage 3 (18 months into this project),
with the huge benefit of hindsight, shows a stunning difference between the “as
usual” case at $24.7 million and the actual p680 costs at $5.8 million (a 76% cost
avoidance). It is worth restating that this does not include Oracle costs on 648
engines vs. 72 on three p680s.
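The engine counts and the cost avoidance quoted above can be cross-checked. The decomposition of 648 engines as 54 data marts (18 months of growth at 36 per year) times 3 servers times 4 engines is one reading that reproduces the figure; the paper does not spell it out, so treat it as an assumption.

```python
months = 18
marts = 36 * months // 12            # 54 data marts after 18 months (assumed reading)
e450_engines = marts * 3 * 4         # 3 E450 servers per mart, 4 engines each: 648
p680_engines = 3 * 24                # three 24-way p680 servers: 72

# Cost avoidance at the end of stage 3, in $ million, from the quoted totals.
avoidance_pct = 100 * (24.7 - 5.8) / 24.7   # about 76.5, quoted as 76%
```

Because Oracle software is usually licensed by server or engine, the ninefold difference in engine count would add to the saving shown here.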
In addition, end users now see much faster provisioning of data marts, now
measured in hours rather than weeks, and a utility pricing model that enables
them to buy server capacity (in easy units of server engines and disk GB) at a
standard and very attractive cost.
Case Study 4
Windows Intranet Portal to Windows on IBM eServer xSeries
An important class of solutions not considered so far is a technology refresh
within a specific architecture. For example, by carefully replacing existing 900
MHz Intel servers in an intranet portal with more powerful 2.4 GHz Intel servers,
significant savings can be achieved.
Current Environment
The current application is an intranet which enables employees to access company
policies, information, news and simple HR applications. The data is largely
read-only and the servers are relatively underutilized. Currently there are 81
servers: 59 Windows 2000 production servers split across two sites and 22
nonproduction NT servers.
Figure 7: Windows Intranet to Larger Windows Servers
The current servers are shown above, and are typically 2-way Intel Pentium®
III processors running at 700-900 MHz. This chart also shows the level of
initial detail needed for each major server function type; in this case the Web,
application, SQL, file, backup and systems management servers.
Outline Solution
In this case the target solution is a simple technology refresh and consolidation of
various functions. Using IBM 2-way 2.4 GHz Pentium 4 servers we can shrink the
number of servers in each category by a ratio which varies between 5:1 and
2:1. It is obviously important to analyze each server category (e.g. the 10 SQL
servers), the current utilization, geographic separation needs, and service quality
requirements in order to develop a credible target solution.
Web servers can be reduced substantially, the file servers will be replaced by
a NAS solution, and some “infrastructure” can be retained. As a result, the
total number of production servers can be reduced from 59 to 21, which drives a
significant reduction in cost.
3-Year Business Case
Figure 8: Intranet Services: x-Windows Solution
This chart shows the impact of this simple technology refresh:
a. The ongoing hardware costs (depreciation and maintenance) are significantly
reduced as 40 “old” Web, application and SQL servers are replaced by 15
“new” ones
b. The ongoing software costs are also significantly reduced because of fewer
server footprints and CPs
c. Ongoing people costs are reduced because there are significantly fewer server
images, applications and database instances to support.
The bottom line is an IT expense saving of approximately 40%-45%
over 3 years. The next action was to initiate a Proof of Concept to establish the
migration costs, which were expected to be small.
This is an overall reduction in the
planned 3-year server IT expense
from $1.71 million to $970 thousand,
a potential saving of $740 thousand
or 43%. These expenses include the
investment of about $400 thousand
in new xSeries servers.
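The arithmetic behind this summary can be checked with a short sketch. The function name `three_year_saving` is an assumption introduced purely for illustration; the dollar figures are the ones quoted above, and the same calculation could be applied to the other case studies by plugging in their own totals.

```python
def three_year_saving(current_cost, target_cost):
    """Return (absolute saving, saving as a percentage of the current cost)."""
    saving = current_cost - target_cost
    return saving, 100.0 * saving / current_cost


# Case Study 4 figures quoted above, in thousands of dollars
saving, pct = three_year_saving(1710, 970)
print(f"Saving: ${saving} thousand ({pct:.0f}%)")
```

Because the target cost already includes the new hardware investment, the percentage is a net figure, as the text describes.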
Case Study 5
Solaris Web Trading to Linux on IBM eServer zSeries
This next case study compares three alternative scenarios for the same WebLogic
and Oracle-based Web trading application: on Solaris, on Linux on xSeries, and on
Linux on zSeries. The study requirement was for a realistic cost comparison of these three
very different server platforms.
Current Environment
Figure 9: Web Trading Application
This application is an online Web trading system handling about 300 concurrent
users at peak. The application outline is shown in the chart, and comprises four
primary servers, each duplicated locally with active load sharing and failover.
At the time of the case study (early 2002) the application was running on eight
domains of two Sun E10000 servers, each domain having six 400 MHz CPs and
8 GB RAM.
The first server handles the session with the external user-trader. Each domain
has one instance of WebLogic and the third-party application. These two domains
have an average weekday utilization of 35%, and a 30-day peak of 75%.
The second server handles “real-time” data feeds and stock analysis for the user.
Each domain has an instance of the third-party application. These two domains
have average weekday utilization of 5%, and a 30-day peak of 15%.
The third server, an Oracle database, stores customer profile data and triggers
for outbound email. Each domain has an instance of Oracle 8.1 HA and Veritas
cluster server. These two domains have average weekday utilization of 7%, and
a 30-day peak of 20%.
Finally, the fourth server connects trading requests to the backend mainframe.
Each domain has an instance of WebLogic, Apache and Oracle 8.1. The two
domains have average weekday utilization of 9% and a 30-day peak of 20%.
The first point to note is the tradeoff between a “standard” server (e.g. a 6-CP
domain), which reduces server support costs, and the individual workloads,
which vary considerably by function, with some servers carrying very
low loads.
This means that all the domains have to be sized to meet the highest peak (in
server #1) as no truly dynamic resource sharing is possible between the domains.
This is true of all systems which do not have true virtualization or dynamic
LPAR capabilities.
It is also worth noting that the average utilization across all eight production
domains is 14%, a very typical load factor. In this enterprise, the mainframe
average daytime load factor is 72%, about five times higher.
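The utilization figures quoted for the four logical servers can be combined in a few lines to reproduce the 14% average and the comparison with the mainframe. This is only an illustrative sketch: the dictionary labels are shorthand introduced here, and because each logical server runs on two identically loaded domains, averaging over the four servers gives the same result as averaging over all eight domains.

```python
# (average weekday utilization %, 30-day peak %) per logical server,
# as quoted in the text; labels are shorthand, not names from the study.
servers = {
    "1: user session": (35, 75),
    "2: data feeds": (5, 15),
    "3: Oracle database": (7, 20),
    "4: mainframe gateway": (9, 20),
}

avg_util = sum(avg for avg, _ in servers.values()) / len(servers)
highest_peak = max(peak for _, peak in servers.values())
mainframe_load = 72  # mainframe average daytime load factor (%)

print(f"Average utilization across domains: {avg_util:.0f}%")
print(f"Highest 30-day peak (drives domain sizing): {highest_peak}%")
print(f"Mainframe load factor is {mainframe_load / avg_util:.1f}x higher")
```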
Alternative 1: Linux on xSeries
It was relatively simple to identify an alternative configuration based on Linux on
xSeries. Each of the four logical servers would be handled by three IBM xSeries
(350) 900 MHz 4-way Pentium III servers with the same basic software stack.
For example, server #1 would run on three x350s, each with Linux 2.4, WebLogic
and the third-party application, instead of two Solaris 2.7 domains running the
same software.
The resulting configuration was 12 x350 servers, each an identical 4-way
900 MHz machine with 2 GB RAM, running the same software stack as the current
configuration.
This is a total of 48 xSeries CPs. It is important to note the software stacks are a
very significant driver in the total business case, and that specifically the number
of CPs on which Oracle and WebLogic run is a major cost driver.
Alternative 2: Linux on the Mainframe
Figure 10: 8 Virtual Servers, 2 CPs, 1 Location Linux on z900 Configuration
Again it was relatively simple to design the above functional configuration that
maps the current servers into virtual servers running as VM guests in a separate
partition on the mainframe.
Although each of these Linux virtual servers runs the same software stack as the
Linux on xSeries solution, the big differences are:
a. Only two physical engines (IFLs) are required to run the whole application,
because resource virtualization, using z/VM resource sharing, means that the
utilization factor can be much higher
b. Communication between Linux virtual servers and with the z/OS™ applications
can be through HiperSockets™. This means inter-server communication at
memory speeds and better response time
c. In practice this system would be implemented with one IFL on each of two separate
z900s. This provides failover and access to IMS™ data in the sysplex locally
or remotely
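Since software is typically priced per engine, the engine counts of the three scenarios are worth setting side by side. The sketch below uses only the counts given in the text; it deliberately does not model which engines actually carry Oracle or WebLogic licenses, or any per-engine price, since neither is stated here.

```python
# Physical engine (CP) counts for the three scenarios, from the text.
configs = {
    "Solaris on E10000": 8 * 6,  # 8 domains x 6 CPs each
    "Linux on xSeries": 12 * 4,  # 12 x350 servers x 4 CPs each
    "Linux on zSeries": 2,       # 2 IFLs shared by all virtual servers
}

baseline = configs["Solaris on E10000"]
for name, engines in configs.items():
    print(f"{name}: {engines} engines ({engines / baseline:.0%} of current)")
```

The xSeries alternative leaves the engine count unchanged at 48, while the zSeries alternative reduces it to 2.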
3-Year Business Case
Figure 11: Web Trading Application Costs
The 3-year business case is shown in this chart. Each of the major cost categories
shows significant reduction in both Linux solutions, with Linux on zSeries
delivering the most cost-effective solution in this case.
The major cost drivers are as follows:
First, there are substantial hardware cost savings. The current hardware and
its maintenance are very expensive. Both target solutions, especially the xSeries
solution, reduce the hardware cost considerably.
Second, one of the major cost drivers is software, especially the Oracle and
WebLogic costs per engine. The xSeries solution makes no significant change
in the software costs (there are the same number of engines), but the zSeries
solution gives a significant software cost reduction.
Third, people costs are also significantly reduced. A significant number of people
are required to support the current application, with an inefficient IT management
process. Productivity can be significantly improved in the Linux on xSeries
environment, and the zSeries environment already has the IT process and
administration tools to run very efficiently.
The bottom line: after considerable “what-if” analysis, the conclusion for this
situation was that the current Solaris environment is approximately 3-4 times the
3-year cost of the Linux on zSeries solution.
This is an overall reduction in the
planned 3-year server IT expense
from $7.4 million to $1.5 million, a
potential saving of $5.9 million or
79%. These expenses include the
investment of about $0.5 million
in new zSeries IFL hardware. This
would be a project of “moderate”
migration complexity.
Case Study 6
Non-Production Windows Servers to VMware ESX on IBM eServer xSeries
The next study considers the case for virtualization in the Windows environment.
Current Environment
The focus of this solution is 144 non-production Windows servers, specifically
development, test, QA and less critical infrastructure, shown in this chart.
Figure 12: Windows Non-Production Infrastructure
The average current server is a 2-way, 600 MHz Pentium III with 900 MB RAM.
The typical average weekday processor utilization is less than 5%, with peaks
around 10%. None are business critical and a recovery time of 2-3 hours is
acceptable.
VMware ESX
VMware ESX is a software hypervisor which runs “on the metal” of an Intel
server. Multiple “virtual guests” (such as instances of Windows 2000, Windows
NT and Linux) run on VMware which acts as a resource scheduler and can
improve server utilization.
VMware is ideal for low utilization functions, such as file/print, domain, Web,
development and test servers, which individually require less than one CP of
compute power. (Guests using 2 CPs will be supported by VMware in 1Q 2003.)
For these functions, it is reasonable to assume around 4 guests per CP. In other
words, a 2-way Intel server can typically support 8 guests, and an xSeries 440
(x440) 8-way can typically support around 32 guests.
Alternative: VMware on x440 8-way
Using VMware ESX, the current 144 2-way 900 MHz servers, each running one
function, can be moved to 4 x440 8-ways running 144 guests. This is a significant
footprint and floorspace reduction, although there is no change in the number
of server images.
An additional x440 8-way was included to provide additional contingency and
backup in the event of significant hardware or VMware failure. Although very
unlikely, the impact of such an outage would be felt across 32 guests. Although
none of the individual guests is business critical, running 32 guests on a single
server increases the criticality of the server itself.
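The guests-per-CP rule of thumb and the density implied by the proposed configuration can be sketched as follows. The function name and constant are illustrative assumptions; the inputs are the figures quoted in the text.

```python
GUESTS_PER_CP = 4  # rule of thumb from the text for low-utilization functions


def guests_supported(cps, guests_per_cp=GUESTS_PER_CP):
    """Approximate number of guests a server with `cps` processors can host."""
    return cps * guests_per_cp


print(guests_supported(2))  # 2-way Intel server
print(guests_supported(8))  # x440 8-way

# Density implied by the proposed configuration: 144 guests on 4 x440 8-ways
guests, hosts, cps_per_host = 144, 4, 8
density = guests / (hosts * cps_per_host)
print(f"{density:.1f} guests per CP")
```

The proposed configuration works out at 4.5 guests per CP, in line with the "around 4" planning figure; the additional contingency x440 is not counted in this density.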
3-Year Business Case
Figure 13: VMware ESX Consolidation
The primary cost driver here is people productivity. VMware stores the operating
system as a file, which can be easily copied, moved and restored. This means
that developers and testers can keep a copy of the server guest image on a
server and only activate it when needed. This significantly reduces the need for
dedicated servers.
But more importantly, standard images can be developed, stored and used
throughout the development, test and preproduction process — increasing staff
productivity significantly.
Conclusion
These six case studies illustrate the significant cost savings achievable using a
variety of different technical solution strategies.
“Server Consolidation” is often used very simplistically to imply a mere technology
refresh to more powerful servers on the same operating system. While this is
viable for some simple classes of function, even this approach requires a wider
view to be taken of the technical context of the application and the actual current
costs. Case study 4 (Intranet migration) is an example of how to address this.
But, as we have shown in the five other case studies, the biggest savings typically
involve a shift to what have been called ‘game-changing’ strategies, such as
virtualization, Linux, blades, data mart consolidation and so on.
These new strategies have compelling financial cases and many other major
user benefits, such as speed of deployment. However, they also raise very important
considerations of technical risk and migration costs that also need to be
thoroughly addressed, typically in a pilot or proof of concept project.
Having said this, many organizations have now completed their pilot evaluations
and are very actively pursuing several of these strategies to reduce the cost and
complexity of their server infrastructure.
For Case Study 6, this is an overall reduction in the
planned 3-year server IT expense
from $8.8 million to $5.4 million, a
potential saving of $3.4 million or
39%. These expenses include the
impact of a 20% improvement in
people productivity. This would be
a project of “easy” migration
complexity.
Next Actions…
We strongly recommend a starting point based on a thorough review of the
current UNIX and Windows server environment.
Step one is an “eyes wide open” review of the actual current server demographics,
sample server utilization measurements, the quality of service delivered and the
effectiveness of the IT management process.
One of the most significant components of such a review is a real understanding
of the current actual hardware, software and people cost base, and a realistic cost
allocation to various services and servers.
In particular we suggest an objective analysis of the incremental costs of deployment
on various platforms and a mapping of these costs into the current
chargeout algorithms. Many of these issues and methods of handling them are
addressed in the companion white papers.
Step two is to carve out the most likely areas of potential solutions. This can often
be quite tricky as many applications and functions are linked in a spaghetti-bowl
of interfaces and data transfers.
But in most organizations it is relatively easy to identify the top 10% of servers
which usually account for more than 70% of the processing load, and the top 10%
of the major applications or functions which often account for more than 75% of
the servers. These metrics provide valuable pointers to areas that may be good
targets for integration and cost reduction.
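As a rough illustration of this kind of concentration analysis, the sketch below ranks servers by processing load and reports the share carried by the busiest 10%. The load values are entirely hypothetical and the function name `top_share` is an assumption; real input would come from the utilization measurements gathered in step one.

```python
def top_share(loads, fraction=0.10):
    """Share of the total load carried by the busiest `fraction` of servers."""
    ranked = sorted(loads, reverse=True)
    n_top = max(1, round(len(ranked) * fraction))
    return sum(ranked[:n_top]) / sum(ranked)


# Hypothetical processing load per server (arbitrary units), 20 servers
loads = [400, 350, 40, 35, 30, 25, 20, 20, 15, 15,
         10, 10, 10, 5, 5, 3, 3, 2, 1, 1]

share = top_share(loads)
print(f"Top 10% of servers carry {share:.0%} of the load")
```

The same function could be applied to server counts per application to find the small set of applications that account for most of the servers.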
Step three is to follow the approach suggested by this white paper: identify
alternative solutions, build the business investment case for a change in strategy
and obtain the investment funding…
IBM IT Consultants can assist in this objective review and outline solution
process through the use of this method in a short intensive joint project.
This can usually complete the whole of the review and outline solutioning in an
elapsed time of about 8 weeks.
© Copyright IBM Corporation 2003
IBM Corporation
Software Communications
Route 100
Somers, NY 10589
U.S.A.
Produced in the United States of America
1/03
All Rights Reserved
The e-business logo, IBM, the IBM logo, IBM eServer, HiperSockets, IMS, VM/ESA, WebSphere, xSeries, z/OS, z/VM and zSeries are trademarks or registered trademarks of International Business Machines Corporation of the United States, other countries or both.
Java and all Java-based trademarks and logos are trademarks of Sun Microsystems, Inc. in the United States, other countries or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Intel is a trademark of Intel Corporation in the United States, other countries or both.
Linux is a registered trademark of Linus Torvalds.
Other company, product and service names may be trademarks or service marks of others.
Information concerning non-IBM products was obtained from the suppliers of their products or their published announcements. Questions on the capabilities of the non-IBM products should be addressed with the suppliers.
IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.
IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area.
All statements regarding IBM’s future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user’s job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here.
GM13-0190-00