RBC: 2004 Computer Outage: Corey Chamberlain
Corey Chamberlain
Background
June 2, 2004 system wide failure
Software upgrade blamed
Bank accounts not reflecting transactions
2.5 million customers impacted
Class action lawsuit mounting
Experts peg lawsuit to be worth $1 billion
Branches extend service
Marketing apologies country wide in newspapers
245 IT staff members working to correct glitch
Executive Summary
Perform SWOT analysis to determine current state of IT department
Analyze the risk at hand
Develop DAP
Perform SWOT analysis on DAP to ensure it meets criteria
IT Department SWOT Analysis
Strengths
◦ Staff
◦ Equipment Quality
Weaknesses
◦ Upgrade Policy
◦ Mirrored System
◦ Response Time
◦ Senior Management Support
◦ Interdepartmental communication
Opportunities
◦ Education
◦ Technology
◦ Skilled Workforce
Threats
◦ Industry Regulations
◦ Crackers/Black Hats
◦ Customer Volume
Business Impact Analysis
Recognition
Software Failure
◦ Can we function without services?
Yes, but limited functionality
◦ Will we lose customers?
Without a doubt
◦ Will our reputation be affected?
Certainly
Critical Risk
Human Risk
Classification
Joint Ownership
◦ ICT owns action
◦ Retail unit owns need
Costing
Cost of Downtime
◦ Customer Service Fees/Interest
$500 Million
◦ Lawsuits
Quebec lawyers suggest customers should each receive $250 in damages.
$250 Million
◦ Branch staff overtime
Extra hours to accommodate customers
$350,000/Day
◦ ICT staff overtime
Round-the-clock work by a team of 245
$130,000/Day
◦ Total $152 Million/Day
No opportunity to start from scratch; must repair
No opportunity to substitute system
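The costing items above can be combined in a short calculation. This is a minimal sketch, not the presenter's method: the dollar figures come from the slide, but the slide does not state how the one-time items are spread across days, so the `outage_days` parameter and both function names are assumptions made for illustration.

```python
# Figures from the costing slide. The outage length is NOT given in the
# slide; `outage_days` is a hypothetical input supplied by the caller.
ONE_TIME_COSTS = {
    "customer_service_fees_interest": 500_000_000,
    "lawsuits": 250_000_000,
}
DAILY_COSTS = {
    "branch_staff_overtime": 350_000,
    "ict_staff_overtime": 130_000,
}


def total_cost(outage_days: int) -> int:
    """One-time costs plus overtime accrued over `outage_days` days."""
    if outage_days <= 0:
        raise ValueError("outage_days must be positive")
    one_time = sum(ONE_TIME_COSTS.values())
    per_day = sum(DAILY_COSTS.values())
    return one_time + outage_days * per_day


def cost_per_day(outage_days: int) -> float:
    """Total cost averaged over the assumed outage length."""
    return total_cost(outage_days) / outage_days
```

Calling `cost_per_day` with different values of `outage_days` shows how sensitive a per-day figure is to the assumed length of the outage, since the one-time items dominate the overtime costs.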
Timeline
◦ Immediate action required
◦ Temporary solution might allow some systems to be restored
◦ Permanent solution required in the long term
Fit
◦ Front line staff depend on technology for day-to-day activities
◦ Current upgrade policy does not fit business needs
Implementation
◦ Divide and conquer to find problem
◦ Converge on defined problem, brainstorm
◦ Communicate with application stakeholders
Testing
◦ Test before implementing in production
◦ Isolated lab
◦ Standardize upgrade policy
Duration
◦ Until new technology is available
◦ Constant review of alternatives
DAP
Hotsite
◦ Supports continuous backup, mirrored
◦ Real-time synchronization
◦ Expensive
Policy
◦ Plan upgrades at appropriate times
◦ Use transaction history metrics to find traditionally busy periods
◦ Test upgrades in isolated environment
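The policy points above on using transaction history to find busy periods could be sketched as follows. This is an illustration under stated assumptions, not RBC's actual tooling: it assumes the transaction history is available as a list of `datetime` timestamps, groups them into (weekday, hour) slots, and treats the top `n` slots as "traditionally busy". The function names and the default `n` are hypothetical.

```python
from collections import Counter
from datetime import datetime


def slot_counts(timestamps):
    """Count transactions per (weekday, hour) slot; weekday 0 is Monday."""
    return Counter((ts.weekday(), ts.hour) for ts in timestamps)


def busiest_slots(timestamps, n=3):
    """Return the n slots with the most transactions, busiest first."""
    counts = slot_counts(timestamps)
    # Sort by descending count, then by slot so that ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [slot for slot, _ in ranked[:n]]


def avoids_busy_periods(planned, timestamps, n=3):
    """True if the planned upgrade time is outside the n busiest slots."""
    return (planned.weekday(), planned.hour) not in busiest_slots(timestamps, n)
```

A planned upgrade window would be checked with `avoids_busy_periods(planned_time, history)` before it is scheduled. In practice the slot size and the cut-off for "busy" would be tuned to the bank's real traffic.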
Training/Practice
◦ Mock disasters
◦ “All hands on deck” testing; get as many stakeholders involved as possible
◦ Continuous education for staff
◦ Detailed outage handbook
DAP SWOT Analysis
Strengths
◦ Interdepartmental communications
◦ Improved application testing
◦ Redundancy
◦ Backup
◦ Prevents extended outage
Weaknesses
◦ Policy slows implementation of new patches
◦ Expensive
◦ More equipment to maintain
Opportunities
◦ Technological Advancement
◦ Public Perception
Threats
◦ Human Error
◦ Government Regulations
Lesson Learned?
Video
http://www.wkrg.com/alabama/article/banking_blunder/23636/Feb-13-2009_6-35-pm/
Thank You
Questions?