Interim major incident management process
IT Helpdesk responsibilities
RMIT University©2011 Information Technology Services 2
Responsibilities
Objective
•Restore service with minimal downtime
•Effective communication and co-ordination
Task | Is new?
Create an incident; identify patterns/multiple calls as a potential major incident | Current
…or… create a suspected major incident as reported by the Support Team (via phone) | New
Categorise incident(s) as major (MI) by comparing with the priority matrix and critical services list (Gold/Silver/Bronze) | New
Apply Solution Template “MAJOR INCIDENT” and assign the MI to the Major Incident Manager (MIM) via phone (+61 3 992 52777) and infra call ticket (…. also notify the primary support team to begin investigation while the MIM is engaged) | New / Current
Co-ordinate with the Support Team to get updates | New
Update the MIM on progress via infra call updates and phone (+61 3 992 52777) | New
Hours of coverage
•Major Incident Manager operates from 8am to 8pm (core business/IT Helpdesk hours)
•Quality Assurance Services (QAS) and IT Helpdesk will share responsibility to provide 12 hour coverage
•Use +61 3 992 52777 to contact the Major Incident Manager (note: call may divert to IT Helpdesk depending on time of day)
•Out of hours, current “as is” process for each support team remains:
– Technical “on call” person notified by alerts
– “On call” person fixes
– Escalate to line manager if necessary
Note: Out of hours “as is” process varies between teams (to be standardised at a later date)
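The coverage rule above can be sketched as a small routing check. This is only an illustration: the function and route names are hypothetical, the 8pm boundary is assumed to be exclusive, and every day is treated the same because the slide does not distinguish weekdays from weekends.

```python
from datetime import datetime, time

# Core hours stated on the slide: 8am to 8pm.
MIM_START = time(8, 0)
MIM_END = time(20, 0)  # assumption: 8pm itself counts as out of hours


def contact_route(now: datetime) -> str:
    """Return who should handle a suspected major incident at local time `now`.

    The route names are hypothetical labels, not identifiers from the process.
    """
    if MIM_START <= now.time() < MIM_END:
        # Within core hours the Major Incident Manager is contacted.
        return "major_incident_manager"
    # Out of hours the existing "as is" on-call process applies.
    return "support_team_on_call"
```

For example, `contact_route(datetime(2011, 5, 2, 21, 15))` falls outside core hours, so the sketch returns the on-call route.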
Process flow diagram
Major Incident Management - Team Roles & Responsibilities
[Swimlane flow diagram. Box text is listed below; the grouping of boxes under each phase is inferred from their wording, and the connecting arrows are not reproduced.]
Roles (columns): User Call | IT Helpdesk | MI Manager (MIM) | Support Team
Phases (rows): Incident | Incident Categorisation | MIM Team & Response Plan | Plan Execution | MI Closure
Incident
•Start
•<<Trigger>> Monitoring Tool, etc.
•Incident Found
•Report Incident
•Create Incident
Incident Categorisation
•Is it a Major? <<use priority matrix>>
•NO: Continue with Standard Incident Management & fix process
•YES: <<Preliminary Assessment>> Phone
•<<Apply Solution Templates>> Email + Phone
MIM Team & Response Plan
•Conduct initial assessment to determine impacted services and user groups
•MIM to establish Major Incident response
•MI Team to conduct thorough assessment and develop response
•Get response approval from MIM
Plan Execution
•MI Team to execute the response
•Progress update to IT Helpdesk/MIM at a regular interval as agreed with MIM
•Update progress to MIM; repeat at a frequency of 30 minutes or as agreed with MIM
•Monitor the impacted services
•<<ServiceRestored>>
•Update Incident Record
MI Closure
•Prepare root cause and incident closure report
•Prepare report within 5 days from the major incident resolved date
•Get Approval
•Problem management (notify, if applicable)
•Initiate follow up with appropriate teams (if applicable)
•Closure
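The two timing rules in the diagram (progress updates to the MIM every 30 minutes or as agreed, and a closure report within 5 days of the resolved date) could be computed as in this minimal sketch. The function names are hypothetical, and the 5 days are assumed to be calendar days because the slide does not say whether business days are meant.

```python
from datetime import datetime, timedelta

# Default from the diagram; the MIM may agree a different interval.
DEFAULT_UPDATE_INTERVAL = timedelta(minutes=30)
# Assumption: "5 days" means calendar days, not business days.
REPORT_WINDOW = timedelta(days=5)


def next_update_due(last_update: datetime,
                    interval: timedelta = DEFAULT_UPDATE_INTERVAL) -> datetime:
    """Time by which the next progress update to the MIM is due."""
    return last_update + interval


def closure_report_due(resolved_at: datetime) -> datetime:
    """Deadline for the root cause and incident closure report."""
    return resolved_at + REPORT_WINDOW
```

Passing a different `interval` covers the "or as agreed with MIM" case, for example `next_update_due(t, timedelta(hours=1))`.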
Evaluation
Urgency x Impact = Priority
P1 = Major Incident
Impact
•University wide
•Campus wide
•Building (50+)/Faculty
•Floor/Lab (10+)/Course
•Individual
Priority Matrix (rows: impact; columns: urgency)
Impact | Low | Medium | High
Individual | P4 | P4 | P3
Floor/Lab/Course | P4 | P3 | P2
Building/Faculty | P3 | P2 | P1
Campus Wide | P3 | P2 | P1
University Wide | P3 | P1 | P1
Urgency
•High: Gold - Top "10" services or core
•Medium: Silver - Mid tier services
•Low: Bronze - Other services
* Refer to the “Critical Services List” at http://www.rmit.edu.au/its/majorincident for the latest information
Top 10 *
•Peoplesoft
•SAP
•myRMIT / Learning Hub
•Blackboard
•Google Mail
•Staff Groupwise Mail
•VOIP
•Teaching Spaces (AV)
•Teaching Spaces (IT)
•EOL / STS
Core
•Network
•Load Balancers
•Storage
•NDS/AD
•DNS
•DHCP
•Firewalls
•etc…
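The evaluation rule (Urgency x Impact = Priority, with P1 treated as a major incident) amounts to a table lookup. The sketch below encodes the matrix values from the slide; the dictionary keys and function names are hypothetical, chosen only for illustration.

```python
# Priority matrix from the slide: impact level -> urgency -> priority.
PRIORITY_MATRIX = {
    "individual":       {"low": "P4", "medium": "P4", "high": "P3"},
    "floor_lab_course": {"low": "P4", "medium": "P3", "high": "P2"},
    "building_faculty": {"low": "P3", "medium": "P2", "high": "P1"},
    "campus_wide":      {"low": "P3", "medium": "P2", "high": "P1"},
    "university_wide":  {"low": "P3", "medium": "P1", "high": "P1"},
}

# Service tier -> urgency, as listed under "Urgency" on the slide.
TIER_TO_URGENCY = {"gold": "high", "silver": "medium", "bronze": "low"}


def priority(impact: str, tier: str) -> str:
    """Look up the priority for an impact level and a service tier."""
    urgency = TIER_TO_URGENCY[tier]
    return PRIORITY_MATRIX[impact][urgency]


def is_major_incident(impact: str, tier: str) -> bool:
    """P1 = Major Incident, per the slide."""
    return priority(impact, tier) == "P1"
```

For instance, a Silver service affected university wide gives `priority("university_wide", "silver") == "P1"`, so it would be categorised as a major incident, while the same service affected campus wide gives P2.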