Data Center Relocation…Take One!
Joseph E. Ford, RCDD
Craig A. Lowe, RCDD/OSP, LEED AP
Robert G. Hall, MCSD
Bala Consulting Engineers, Inc.



Target Enterprise
• One or more of the following:
– Data center over about 100 devices
– Enterprise with 24x7 operations
– Dealing with life safety or finances
– Complex applications
• Growing well and improving IT management to match
– Setting goals
– Measuring progress
– Managing to expectations
• Smaller DCs need less time but the same steps

Project Plan

Benchmark View
• Start the Project
• Build New DC Facility
• Information Transport System
• Establish LAN / WAN / SAN
• Move Equipment / Server Waves
• Decommission Old DC
• Close the Project

Project Management View
• Charter
• Discover Requirements
• Design (and Budget)
• Validation
• Build / Execute
• Test
• Review and Close

5%±

Project Charter

• Establish relationship with champion and PM
• Select initial core team
• Kickoff meeting
• Logistics and meeting schedule
• Start Project Plan and Presentation

Prepare for Data Center Relocation
• As you do discovery, you will find systems that cannot be moved as they are
• This is a placeholder for projects that must be monitored to assure they are on track for the move
• Examples:
– Eliminate equipment that is too fragile to move (but probably running very important programs that nobody living knows how to support)
– Systems that depend on hard coded IP addresses that will not transfer to the new Data Center
– Applications that must be virtualized so they can move electronically

Sidebar Projects
• These are subprojects that may develop in addition to moving equipment
• These are usually separate because they are funded separately
• Examples:
– Confirming that the Operation Center will function as expected
– Projects to design future changes to be compatible with the new data center

Discovery for Facility
• Establish Inventory of hardware units
• Extend into power, cooling, and cabling estimate
• Extrapolate technology and growth changes
• Add summary to Presentation
• Client signs off Basis of Design

Discovery for Facility
• Assist with floor plan for cabinets and racks
• Add floor plan to Presentation
• Monitor MEPS design process of DC, MDF, IDF, NOC
• Review fire suppression, security, and environmental design
• Review new building access for services and docks
• Add summary to Presentation
• Help with budget approval(s)
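The step of extending the hardware inventory into a power and cooling estimate can be sketched as a simple roll-up. This is only an illustration: the `Device` fields, the sample wattages, and the compound growth model are assumptions, not figures from the presentation. The conversion factors (1 W is about 3.412 BTU/hr, and 12,000 BTU/hr per ton of cooling) are standard.

```python
from dataclasses import dataclass

WATTS_TO_BTU_PER_HOUR = 3.412   # 1 W is about 3.412 BTU/hr
BTU_PER_HOUR_PER_TON = 12_000   # one ton of cooling

@dataclass
class Device:
    name: str
    watts: float      # hypothetical field: measured or nameplate draw
    rack_units: int   # hypothetical field: height in rack units

def facility_estimate(devices, annual_growth=0.10, years=5):
    """Roll an inventory up into power, cooling, and space figures.

    annual_growth and years are planning assumptions supplied by the caller.
    """
    watts_now = sum(d.watts for d in devices)
    watts_future = watts_now * (1 + annual_growth) ** years
    btu_future = watts_future * WATTS_TO_BTU_PER_HOUR
    return {
        "kw_now": watts_now / 1000,
        "kw_future": watts_future / 1000,
        "btu_per_hour_future": btu_future,
        "cooling_tons_future": btu_future / BTU_PER_HOUR_PER_TON,
        "rack_units_now": sum(d.rack_units for d in devices),
    }

# Made-up sample inventory; zero growth makes the arithmetic easy to check.
inventory = [Device("db01", 800, 4), Device("web01", 350, 1), Device("web02", 350, 1)]
estimate = facility_estimate(inventory, annual_growth=0.0, years=0)
```

A real Basis of Design would also cover cabling and redundancy, which this sketch leaves out.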

Build and Commission Facility
• Monitor construction schedule
• Guest internet at new DC
• Confirm walls, floors, ceilings, electrical, cooling, etc.
• Commissioning
• Add summary to Presentation

Information Transport System
• Discovery
– Add connection information to inventory
• Design
– Design prototype cabinet(s) for topology modeling
– Assist in topology decisions
– Design patch and switch elevations
– Add prototype and elevations to Presentation
– Manage bidding and leveling
– Help with budget and purchase orders

Single Line Diagram
• Pull Schedule
• Plan View

Data Center Plan
• Tray Plan
• Cabinet Elevations

Information Transport System
• Build
– Coordinate schedules for cabinets, racks, trays, UTP and fiber
– Connect PDU to power strips
– Confirm demarcation points
• Test
– Test inventory and labeling of cables
– Test switch patching

Design WAN LAN SAN & Storage
• WAN connects other business locations and the two Data Centers
• LAN and SAN must be operational at both locations
– If refreshing equipment or changing technology, you may buy new for the new data center
– Otherwise, you may rent equipment for the old data center so you can move your equipment to the new data center
• Long lead times for WAN LAN & SAN Communication services
• New storage equipment may be burned in with the network
• Design must allow a couple of devices to provide a period of solid service before real hardware waves

OK Where Are We?
• HW inventory is verified with power and service connection fields noted
• Location is chosen and facility changes are underway
• Service connections are enumerated and ordered
• Network topology has been selected
• Information Transport System is at least being designed and is probably out to bid
• WAN, LAN and SAN design is settled and equipment ordered
• Cabinets and racks are ordered along with power strips, security and KVM equipment


What Can Go Wrong From Here?
• You can shut down critical systems in error
• You can forget part of a system
• A piece of equipment may not come up
• You can lose a truck
• You can move too much at one time
• You must know how to connect each device
• Networks must be ready for applications
• We must finish discovery and plan a safe move during facility outfitting …


Discovery – Beyond Hardware
• Get application blueprinting
• Match applications to inventory
• Identify ‘Fragile Artifacts’
• Identify unique parts risks
• Map telephony
• Check both the routes and detours
– Walk and ride and walk

What’s an Application Blueprint?
• An application centered view of how devices connect and service business transactions
• Includes:
– Devices (on inventory)
– IP addresses and network rules
– OS/DB versions and patch levels
– Virtual and physical attributes
– Recovery and DR concepts
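One way to picture the blueprint contents listed above is as a small record per application. This is a sketch under assumptions: the class names, field names, and sample values are hypothetical and not taken from the presentation. It captures the listed items and checks that every blueprint device appears on the hardware inventory.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class BlueprintDevice:
    inventory_id: str          # should match an entry on the hardware inventory
    ip_addresses: List[str]
    os_version: str
    db_version: str = ""
    patch_level: str = ""
    is_virtual: bool = False   # virtual vs. physical attribute

@dataclass
class ApplicationBlueprint:
    application: str
    owner: str                 # the AppOwner
    devices: List[BlueprintDevice] = field(default_factory=list)
    network_rules: List[str] = field(default_factory=list)
    recovery_notes: str = ""   # recovery and DR concepts

    def missing_from_inventory(self, inventory_ids):
        """Return blueprint devices that the hardware inventory does not list."""
        return sorted(d.inventory_id for d in self.devices
                      if d.inventory_id not in inventory_ids)

# Made-up example data.
payroll = ApplicationBlueprint(
    application="payroll",
    owner="payroll AppOwner",
    devices=[
        BlueprintDevice("SRV-001", ["10.0.0.5"], "example-os 1.0"),
        BlueprintDevice("SRV-999", ["10.0.0.6"], "example-os 1.0", is_virtual=True),
    ],
)
gaps = payroll.missing_from_inventory({"SRV-001", "SRV-002"})
```

Here `gaps` lists `SRV-999`, a device the blueprint mentions that discovery has not yet inventoried.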

Collecting Application Blueprints
• AppOwner: The Application Owner
– May be in one of several places in IT or be an internal customer depending on the enterprise
– Think about who approves scope changes in the application
• BPT: The BluePrint Team
– Members from Architecture, Application Development, DBA, SA, networking and business relationship
– Think about all groups that make a change to install or change an application
– Think about who will be the scribe

Prepare
• About 2 weeks
• Identify AppOwner
• Request an interview
• AppOwner agrees on resources
• Invite the team and work out invitation counters

Capture
• Interview (1-4 hr)
• White board
• Capture data
• Identify gaps
• Assign tasks
• Follow-up (1-2 days)
• BPT documents & formats data
• BPT creates diagrams

Validate
• BPT sends to interviewee
• Interviewee reviews diagrams and data sheets (2-3 days)
• Interviewee & BPT reconcile changes

SignOff

• BPT sends proposed diagram and data sheets to AppOwner

• AppOwner reviews for accuracy (1 week)

• AppOwner signs off

Distribute

• BPT provides copies to Operations, Application Development, and Architecture teams

Iterations
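The durations quoted for the blueprint stages can be added up to give a rough length for one pass through the cycle. The sketch below assumes an 8-hour working day and a 5-day working week. It counts only the stages that carry a stated duration, so the reconcile and distribute steps are not included.

```python
HOURS_PER_DAY = 8   # assumption: length of a working day
DAYS_PER_WEEK = 5   # assumption: working days per week

# (stage, minimum working days, maximum working days), from the stated durations
STAGES = [
    ("Prepare", 2 * DAYS_PER_WEEK, 2 * DAYS_PER_WEEK),             # about 2 weeks
    ("Capture: interview", 1 / HOURS_PER_DAY, 4 / HOURS_PER_DAY),  # 1-4 hr
    ("Capture: follow-up", 1, 2),                                  # 1-2 days
    ("Validate: review", 2, 3),                                    # 2-3 days
    ("SignOff: AppOwner review", DAYS_PER_WEEK, DAYS_PER_WEEK),    # 1 week
]

def cycle_length(stages=STAGES):
    """Return (shortest, longest) working days for one blueprint iteration."""
    shortest = sum(low for _, low, _ in stages)
    longest = sum(high for _, _, high in stages)
    return shortest, longest

shortest, longest = cycle_length()
print(f"One iteration: {shortest} to {longest} working days")
```

Under these assumptions a single application takes roughly 18 to 21 working days from first contact to sign-off, so blueprinting needs to start well before the first wave.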

Applications Move Together
• Cross reference devices and applications
– Every device should have known applications
– Every application’s devices should be known
• Reduce unnecessary cross dependency
– Migrate storage: over time an enterprise tends to develop spider webs of applications sharing resources with each other
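The cross reference described above can be sketched in two steps: first find devices with no known application and applications with no known devices, then group applications that share any device so they can be planned to move together. The function names and sample data are hypothetical. The grouping is a plain connected-components search, offered as one possible approach rather than the presenters' method.

```python
from collections import defaultdict

def cross_reference(inventory, app_devices):
    """Return (devices with no application, devices not on inventory, apps with no devices)."""
    used = {d for devices in app_devices.values() for d in devices}
    orphan_devices = sorted(set(inventory) - used)
    unknown_devices = sorted(used - set(inventory))
    empty_apps = sorted(app for app, devices in app_devices.items() if not devices)
    return orphan_devices, unknown_devices, empty_apps

def move_groups(app_devices):
    """Group applications that share at least one device, directly or indirectly."""
    device_to_apps = defaultdict(set)
    for app, devices in app_devices.items():
        for device in devices:
            device_to_apps[device].add(app)

    seen, groups = set(), []
    for start in sorted(app_devices):
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:                      # depth-first walk over shared devices
            app = stack.pop()
            if app in seen:
                continue
            seen.add(app)
            group.add(app)
            for device in app_devices[app]:
                stack.extend(device_to_apps[device] - seen)
        groups.append(sorted(group))
    return groups

# Made-up example data.
inventory = ["db01", "app01", "app02", "web01", "fax01"]
app_devices = {
    "payroll": ["db01", "app01"],
    "hr": ["db01", "app02"],      # shares db01 with payroll
    "intranet": ["web01"],
    "legacy": [],                 # no devices known yet
}
orphans, unknown, empty = cross_reference(inventory, app_devices)
groups = move_groups(app_devices)
```

In this example `fax01` has no known application, `legacy` has no known devices, and `payroll` and `hr` fall into the same move group because both depend on `db01`.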

Design the Actual Relocation
• Decide on optimal wave size
• Plan physical and virtual servers in waves
• Publish the wave plans
• Arrange for smart hands and movers
• Assign locations in new DC
• Assign all patching
• Print initial copies of documents for waves

Design the Waves
• Rehearsal
– Make sure players understand the process
– Include essential services: domain controller, DNS, etc., and base equipment for a VM farm
• Low hanging fruit
– Power and IP (UTP or Fiber) only
– Test case on each service
• Complex
– SAN connections, telephony, unique parts
• The rest of the story
– Fragile artifacts, redundant gear, retries
– But do NOT save the worst for last
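The four wave types above suggest a first-pass sorting rule for the inventory. The sketch below is only one reading of the slide: the dictionary keys (`essential_service`, `fragile`, `redundant`, `unique_parts`, `connections`) and the order of the checks are assumptions, and a real plan would still be reviewed by hand.

```python
SIMPLE_CONNECTIONS = {"power", "ip"}   # "Power and IP (UTP or Fiber) only"

def wave_category(device):
    """Suggest a wave type for one device record (a plain dict)."""
    if device.get("essential_service"):          # domain controller, DNS, VM farm base
        return "rehearsal"
    if device.get("fragile") or device.get("redundant"):
        return "the rest"
    extra = set(device["connections"]) - SIMPLE_CONNECTIONS
    if extra or device.get("unique_parts"):      # SAN, telephony, unique parts
        return "complex"
    return "low hanging fruit"

# Made-up example data.
devices = [
    {"name": "dns01", "connections": ["power", "ip"], "essential_service": True},
    {"name": "web01", "connections": ["power", "ip"]},
    {"name": "db01", "connections": ["power", "ip", "san"]},
    {"name": "pbx01", "connections": ["power", "ip", "telephony"]},
    {"name": "old01", "connections": ["power", "ip"], "fragile": True},
]
plan = {d["name"]: wave_category(d) for d in devices}
```

The category is a label, not a schedule: per the last bullet, fragile items should be spread across the plan instead of being left for the final wave.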

Preparing for a Wave
• Confirm pre-test compliance
• Schedule a fix and test
– OS patches and software upgrades
– IP address and telephony changes
– Power down and power up
• Freeze machine
• Schedule backup, SA, DBA, testers
• Confirm core and edge patching
• Update, print and post (communicate!)
– Machine sheets, posters, elevations

Doing a Wave
• Day of move:
– Team checks
– Go/no-go meeting: weather, emergencies, road blocks, illness, etc.
– Food
• Zero hour:
– Manage “war” rooms
– Shared video and telephone conference lines
– Track every system

Doing a Wave - Continued
• Confirm backups
• Management approval to power down
• Ping to confirm down
• Uncable, unrack, shock sensor, load
• Loadmaster confirms each truck
• Unload, check for shocks, rack, patch
• Power up in sequence
• Ping to confirm up
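Tracking every system through the move-day steps can be sketched as a small checklist that refuses out-of-order updates. This is a minimal illustration, assuming every system follows the same eight steps in the order listed. The `WaveTracker` class is hypothetical, and the pings themselves are left to whoever reports each step.

```python
STEPS = [
    "confirm backups",
    "management approval to power down",
    "ping to confirm down",
    "uncable, unrack, shock sensor, load",
    "loadmaster confirms truck",
    "unload, check for shocks, rack, patch",
    "power up in sequence",
    "ping to confirm up",
]

class WaveTracker:
    """Record which move-day step each system has reached."""

    def __init__(self, systems):
        self.next_step = {name: 0 for name in systems}   # index into STEPS

    def complete(self, system, step):
        index = self.next_step[system]
        if index == len(STEPS):
            raise ValueError(f"{system}: all steps already complete")
        if step != STEPS[index]:
            raise ValueError(f"{system}: expected '{STEPS[index]}', got '{step}'")
        self.next_step[system] = index + 1

    def status(self, system):
        index = self.next_step[system]
        return "done" if index == len(STEPS) else f"next: {STEPS[index]}"

    def outstanding(self):
        """Systems that have not yet been confirmed up at the new site."""
        return sorted(s for s, i in self.next_step.items() if i < len(STEPS))

# Made-up example: web01 finishes the whole sequence, db01 has only been backed up.
tracker = WaveTracker(["db01", "web01"])
for step in STEPS:
    tracker.complete("web01", step)
tracker.complete("db01", "confirm backups")
```

Rejecting a step that arrives out of order is a deliberate choice: it makes a skipped approval or a missed ping visible in the war room immediately.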

Wave Need: Machine Sheets
• One sheet for each device
– Tape to device early on move day
– Travels with device
• Border indicates wave and truck
• Large print has
– Make, model and unique identifier
– Old rack and RU number
– New rack and RU number
– Date, truck and bin number
• Accurate rendition of attachment points
• Graphic and table of patch information
– IP patches with color, length and port
– ILO & console patches with same
– Fiber patches
– Power cord length, receptacles and ends
– Telephone lines
• Text boxes for overweight or special connections like USB or heartbeats
• Verified by smart-hands before and after
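A machine sheet is easiest to keep accurate if it is generated from the inventory instead of typed by hand. The sketch below renders the text portion of a sheet from one device record. All key names and sample values are hypothetical, and the border color and attachment-point graphic are left out.

```python
def render_machine_sheet(device):
    """Build the text portion of a machine sheet from one device record (a dict)."""
    lines = [
        f"WAVE {device['wave']} / TRUCK {device['truck']} / BIN {device['bin']}",
        f"{device['make']} {device['model']}  ID: {device['unique_id']}",
        f"OLD: rack {device['old_rack']} RU {device['old_ru']}",
        f"NEW: rack {device['new_rack']} RU {device['new_ru']}",
        f"MOVE DATE: {device['move_date']}",
        "",
        f"{'TYPE':<10}{'COLOR':<8}{'LENGTH':<8}PORT",   # patch table header
    ]
    for patch in device["patches"]:
        lines.append(
            f"{patch['type']:<10}{patch['color']:<8}{patch['length']:<8}{patch['port']}"
        )
    if device.get("notes"):   # overweight, USB, heartbeat cables, and so on
        lines.append(f"NOTES: {device['notes']}")
    return "\n".join(lines)

# Made-up example record.
sheet = render_machine_sheet({
    "wave": 2, "truck": "A", "bin": 7,
    "make": "ExampleCo", "model": "X100", "unique_id": "SRV-001",
    "old_rack": "R12", "old_ru": 20, "new_rack": "C04", "new_ru": 11,
    "move_date": "move day 1",
    "patches": [
        {"type": "IP", "color": "blue", "length": "7ft", "port": "sw1/24"},
        {"type": "Console", "color": "gray", "length": "10ft", "port": "con2/5"},
    ],
    "notes": "heartbeat cable to SRV-002",
})
print(sheet)
```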

Wave Need: Posters
• Pull order at old site
• Mount order at new site
• Red to green criticality
• Group lines
• Post at least two copies
• “Red” systems out last, in first, for shortest downtime
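The pull and mount orders on the posters follow from the criticality colors: the most critical systems leave the old site last and are mounted first at the new site. A minimal sketch follows. The intermediate "yellow" level and the sample systems are assumptions, since the slide names only red and green.

```python
# Higher rank means more critical. "yellow" is an assumed middle level.
CRITICALITY_RANK = {"green": 0, "yellow": 1, "red": 2}

def pull_order(systems):
    """Order for removal at the old site: least critical first, red last."""
    return sorted(systems, key=lambda s: CRITICALITY_RANK[s[1]])

def mount_order(systems):
    """Order for mounting at the new site: red first, least critical last."""
    return sorted(systems, key=lambda s: CRITICALITY_RANK[s[1]], reverse=True)

# Made-up example data: (system name, criticality color)
systems = [("db01", "red"), ("web01", "green"), ("app01", "yellow"), ("web02", "green")]
pull_names = [name for name, _ in pull_order(systems)]
mount_names = [name for name, _ in mount_order(systems)]
```

Both sorts are stable, so systems of the same color keep the order in which they were listed.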

Wave Need: Elevations
• New Data Center only
• Elevations by row posted on each row at new site
– Color coded by waves (avoid red)
• Single cabinet strips posted front and rear
– Current wave white

Testing Behind a Wave
• Confirmation from application test team
– Regular tests approach 95% accuracy
– A passed test means no errors were found
– A failed test might not be a new target flaw
• Be prepared to think
• Morning after: walk both DCs
– Check power
– Check patching
– Check placement
• Add summary of the wave to Presentation

Decommission Old DC and Wrap
• Return leased and rented equipment
• Dispose of other equipment
• Get lessor sign-off

Final Thoughts
• Use external communications: you will move the systems that do your communication
• Statistics show you will still have about a 1% hardware failure rate; reserve your focus for those
• Testing can give false positives
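To put the roughly 1% hardware failure rate in perspective, the sketch below works out what it implies for a move of a given size. It assumes each device fails independently with the same probability, which is a simplification and not a claim from the presentation.

```python
def failure_outlook(device_count, failure_rate=0.01):
    """Return (expected failures, probability of at least one failure).

    Assumes independent failures with the same rate for every device.
    """
    expected = device_count * failure_rate
    p_at_least_one = 1 - (1 - failure_rate) ** device_count
    return expected, p_at_least_one

expected, p_any = failure_outlook(100)
print(f"Expected failures: {expected:.1f}; chance of at least one: {p_any:.0%}")
```

For a 100-device data center this gives about one expected failure and a better-than-even chance of seeing at least one, which is a reason to have spares and vendor contacts lined up before the first truck leaves.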

www.bala.com