Major Incident and Problem Manager

📌 LOCATION

Pretoria (onsite), Monday to Friday, 07:30 to 17:00 (after-hours availability during Major Incidents required per SLA)

📌 REPORTS TO

Head of Service Delivery

📌 START DATE

01 December 2025

📌 SEND ALL APPLICATIONS TO:

info@purpleblue.co.za

1. PURPOSE OF THE ROLE

The Major Incident & Problem Manager is responsible for leading, coordinating, and managing all Major Incidents and Problem Management activities to ensure rapid restoration of services, minimise business disruption, identify root causes, and implement long-term corrective actions in alignment with contractual SLA requirements.

2. KEY RESPONSIBILITIES

A. Major Incident Management

  • Ensure 100% of Major Incident calls are logged within 30 minutes of declaration.
  • Lead and chair all Major Incident war-room/SWAT calls.
  • Provide stakeholder updates every 2 hours until resolution.
  • Ensure escalation, coordination, and technical team participation achieve 90–100% compliance with the SLA.

B. Problem Management

  • Conduct proactive problem management by analysing incident trends.
  • Ensure Root Cause Analysis reports are completed within SLA timeframes.
  • Implement approved RCA recommendations within agreed timelines.

C. Reporting & Governance

  • Produce all required reports including:
    • Number of major incidents
    • Trend analysis
    • Open problems
    • Service performance insights
  • Publish meeting minutes and action logs within one hour after SWAT sessions.

D. Knowledge & Configuration Management

  • Update Knowledge Base with known errors, solutions, and lessons learned.
  • Ensure CMDB entries are updated for every Major Incident/Problem closure.

E. Third-Party & Vendor Management

  • Manage and coordinate third-party supplier activities related to Major Incidents and Problems.

3. MEASURABLE PERFORMANCE INDICATORS (SLAs)

Deliverable | Target
Call logging of Major Incidents | 100% within 30 minutes
SWAT Team Mobilisation | Within 1 hour of incident declaration
Communications Updates | Every 2 hours until full resolution
RCA delivery | 100% within SLA
Implementation of RCA Recommendations | 100% within agreed timelines
Stakeholder Action Item Lists | Distributed within 1 hour after SWAT meetings

Source: SLA performance requirements.

4. REQUIRED EXPERIENCE & QUALIFICATIONS

✔ Minimum 5–7 years' experience in:

  • Major Incident Management
  • Problem Management
  • ITIL-aligned environments
  • High-availability ICT operational environments

✔ Experience coordinating cross-functional technical teams and war-room scenarios.

✔ ITIL certification preferred.

5. KEY SKILLS & COMPETENCIES

Category | Skills Needed
Technical | ITIL v4, CMDB usage, ServiceNow/Remedy or equivalent
Communication | Executive-level reporting, crisis communication
Leadership | SWAT chairing, escalation authority, decision making
Analytical | Trend analysis, RCA methodologies
Governance | SLA enforcement, compliance tracking, reporting

6. BEHAVIOURAL EXPECTATIONS

  • Calm under pressure and confident decision-maker
  • Strong leadership and stakeholder influence
  • Ability to drive accountability across multiple teams
  • Focus on continuous improvement and prevention culture

7. WORKING CONDITIONS

  • Must be available for after-hours escalation for Priority 1 (Major) incidents.
  • May be required to work extended hours during critical service disruptions.
