INTRODUCTION

Audit Objective

The Office of Audits & Advisory Services (OAAS) completed an audit of Information Technology (IT) Disaster Recovery (DR). The objective of the audit was to provide reasonable assurance that the management control framework in place to support disaster preparedness for information technology systems is adequate and effective.

Background

The County of San Diego (County) Information Technology and Telecommunications Service Agreement (IT Agreement), signed in April 2011, assigns Hewlett Packard Enterprise Services (HP) responsibility for providing disaster recovery management services to the County. HP prepared the CoSD-T407 County of San Diego Disaster Recovery Management Plan (DR Plan), dated December 15, 2011, and provided the DR Plan to the County Technology Office (CTO) for review and approval. This plan defines the recovery strategy and the high-level procedures necessary to recover the County's IT technical environments at HP, and outlines the roles and responsibilities assigned to HP and the County to ensure rapid recovery of the County's IT environment.

HP maintains critical County application portfolio information in a centralized database called Apps Manager, which is the system of record to support IT DR planning and recovery. County departments assign priority classifications to applications in Apps Manager based on criticality and time sensitivity. The application priority determines the recovery time objective (RTO) [1] and recovery point objective (RPO) [2] for each application as follows:

- Priority 1 (P1) applications affect Life, Safety and/or Health and must be recovered within 48 hours following a disaster.
- Priority 2 (P2) applications are Mission Critical, affecting critical services provided to other County departments and/or the public, and must be recovered within 72 hours following a disaster.
- Priority 3-5 (P3-P5) applications are recovered on a best-effort basis.
- Priority 1 and 2 applications must have an RPO (restored data) no older than 28 hours prior to the disaster.

Audit Scope & Limitations

The scope of the audit focused on evaluating whether key controls are designed and operating effectively to support disaster preparedness for information technology systems at the County as of August.

[1] Recovery Time Objective (RTO) is the maximum tolerable length of time that a business process can be down after a disaster.
[2] Recovery Point Objective (RPO) is the maximum tolerable period in which data might be lost from an IT service due to a major event.
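The priority classifications described in the Background section can be summarized as a small lookup table. The following is a minimal sketch only, not a representation of how Apps Manager stores this information: the names `RecoveryTarget`, `RECOVERY_TARGETS`, and `meets_targets` are hypothetical, and `None` is an assumed way to represent "best effort" (P3-P5 have no stated RTO or RPO in the report).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RecoveryTarget:
    """Recovery objectives for one priority class (hypothetical structure)."""
    rto_hours: Optional[int]  # None = best effort, no fixed recovery deadline
    rpo_hours: Optional[int]  # None = no RPO stated in the report


# Values for P1 and P2 come from the report; P3-P5 are "best effort".
RECOVERY_TARGETS = {
    "P1": RecoveryTarget(rto_hours=48, rpo_hours=28),
    "P2": RecoveryTarget(rto_hours=72, rpo_hours=28),
    "P3": RecoveryTarget(rto_hours=None, rpo_hours=None),
    "P4": RecoveryTarget(rto_hours=None, rpo_hours=None),
    "P5": RecoveryTarget(rto_hours=None, rpo_hours=None),
}


def meets_targets(priority: str, hours_to_recover: float, data_age_hours: float) -> bool:
    """Check a recovery result against the RTO and RPO for its priority."""
    target = RECOVERY_TARGETS[priority]
    if target.rto_hours is not None and hours_to_recover > target.rto_hours:
        return False  # took longer to recover than the RTO allows
    if target.rpo_hours is not None and data_age_hours > target.rpo_hours:
        return False  # restored data is older than the RPO allows
    return True


if __name__ == "__main__":
    # Illustrative inputs only; these are not results from the audit.
    print(meets_targets("P1", hours_to_recover=40, data_age_hours=24))
    print(meets_targets("P2", hours_to_recover=60, data_age_hours=30))
```

Under these assumptions, a P1 application restored in 40 hours from data 24 hours old would meet both objectives, while a P2 application restored from data 30 hours old would miss the 28-hour RPO even though it met the 72-hour RTO.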

The audit was limited to testing DR controls and processes covered in the IT Agreement Schedule 4.3 Section 7.8 Disaster Recovery Management Services. This review focused on the primary HP managed data centers in Tulsa, OK and Plano, TX and the AT&T Point of Presence (POP) data center in San Diego. OAAS also based its assessment on recommended DR controls and on compliance with standards and guidelines from the following:

- IT Governance Institute's Control Objectives for Information and related Technology 5 (COBIT 5).
- National Institute of Standards and Technology (NIST) Contingency Planning Guide for Federal Information Systems, Special Publication Rev. 1.

The audit was conducted in conformance with the International Standards for the Professional Practice of Internal Auditing prescribed by the Institute of Internal Auditors as required by California Government Code, Section.

Methodology

OAAS performed the audit using the following methods:

- Interviewed County and HP stakeholders.
- Reviewed industry frameworks and best practices guidance (COBIT 5; NIST).
- Reviewed the County's DR Plan and the IT Agreement Schedule 4.3 Operational Services to understand County policies, requirements, and processes.
- Assessed the risks to achieving key DR control objectives independently and with management.
- Identified, reviewed, and tested DR controls for design and operating effectiveness to verify that:
  - Organizational oversight and governance is adequate.
  - The HP Apps Manager and Application Run Books [3] are complete and accurate and provide information needed to recover critical applications for business continuity.
  - The DR Plan sufficiently documents plan details, recovery procedures, communications/network environment, hardware configuration, software applications and supporting platforms, data recovery, facilities, staff, and third-party vendors.
  - The DR Plan is distributed to key stakeholders and updated regularly.
  - The DR Plan testing and training is administered annually, test results are reviewed and approved by County management, and corrective action is implemented in a timely manner according to the IT Agreement Schedule 4.3.

[3] As outlined in the CoSD-T407 DR Plan, Application Run Books serve as an application's full operations support manual. Run Books outline all operational and physical requirements in the application environment that are needed to meet the goals of the Application services agreements, including hardware, software and configuration. The Run Books stand to support the operations of the environment in the event that an emergency occurs.

AUDIT RESULTS

Summary

The management control framework to support disaster preparedness for information technology systems needs improvement. Opportunities for improvement were identified in areas related to:

- Compliance with DR standards and County requirements.
- IT vendor DR risk management.
- DR system of record.

To strengthen current controls and improve the effectiveness of DR controls and processes, OAAS presents the following findings and recommendations.

Finding I: Compliance with DR Standards and County Requirements Needs Improvement

A review of the management control framework in place to support DR identified issues related to compliance with DR standards and County requirements as described below.

DR Plan for the AT&T POP is Not Fully Completed. The DR Plan provided by HP to the County on May 13, 2013 does not include recovery of the AT&T POP data center. At the time of the audit, a plan to create redundancy for the AT&T POP was in progress, but not fully completed. Since 2008, the County and HP have been researching a feasible DR solution for the AT&T POP. The IT contract transition from Northrop Grumman to HP in April 2011 further delayed the remediation. Lack of a complete and tested DR Plan for the AT&T POP increases the risk of loss of network connectivity if a disruptive event occurs at the AT&T POP, potentially disrupting network communications and preventing County end-users from accessing the network and required information and applications.

Inconsistent DR Plan Approval.
County approval of the DR Plan is not consistently retained. The CTO did not retain the conditional acceptance sent to HP evidencing its review and approval of the December 15, 2011 DR Plan. COBIT 5 DSS04.03 states that executive business approval of the DR Plan should be obtained.

Undefined DR Test Plan. HP has not developed a DR test plan or performed a comprehensive test of the County DR Plan. The CTO sent a request to HP on April 10, 2013 to provide a DR test plan, initiating this process; however, at the time of the audit, there was no estimated time of completion. HP performed an application recovery exercise from backup media for two County applications on December 5. One of the applications tested, JCATS, is not a P1/P2 application. The County was not involved in the recovery exercise and there was no evidence that test results were reported to or approved by County management.

Per the IT Agreement Schedule 4.3, HP is responsible for annually producing and submitting a DR test plan, performing DR testing, submitting DR test results, and performing corrective action identified during testing. The County is responsible for annually reviewing and approving the DR test plan and test results, and following up to ensure that all corrective action is performed. Per the County's DR Plan, this process should be performed at regular intervals not to exceed 12 months. Also, periodic testing of recovery from backup media is an ongoing critical deliverable in the IT Agreement.

All elements of the DR Plan need to be tested periodically to ensure that gaps in the plan or issues resulting from the test can be identified and corrected in a timely manner. If elements of the DR Plan are left untested, the systems covered by the disaster recovery arrangements on which the County relies may not be recovered completely or in a timely manner.

Undefined DR Training Plan. DR Plan training has not been administered to key HP and County stakeholders involved in the IT recovery process. Per the County's DR Plan, each framework leader is responsible for reviewing the recovery plans with their employees on a regular basis. Training should be conducted so that members of the application and infrastructure teams can execute the plans if necessary. Without periodic DR training, recovery personnel may not be prepared to execute recovery procedures quickly in a disaster situation.

Recommendation: To improve compliance with DR standards and County requirements, the CTO should work with HP to:

1. Complete an approved and tested DR Plan for the AT&T POP.
2. Ensure the County DR Plan approval process is formalized and documentation is adequately retained.
3. Ensure DR readiness and effectiveness by putting in place DR testing that covers all elements of system recovery as set out in the IT Agreement, and by administering DR training regularly, as follows:
   a. Establish a timeline for developing a DR Test Plan and at a minimum perform annual testing to ensure successful coordination and execution of DR procedures among key stakeholders.
   b. Review and approve DR test results to ensure objectives were adequately met. If not met, implement corrective actions in a timely manner and update the DR Plan and source documents.
   c. Perform periodic application recovery from backup media for qualifying P1/P2 applications. Involve the County in the exercise during the application selection process and the review and approval of test results.
   d. Develop and administer mandatory annual DR training to all County and HP personnel who will be directly involved in and responsible for executing the DR Plan.

Finding II: HP Apps Manager and Application Run Books are Not Complete and Accurate

DR-related information documented in Apps Manager and Application Run Books maintained by HP is not complete or accurate as described below.

Apps Manager. OAAS tested the completeness and accuracy of critical information maintained in Apps Manager for 92 P1/P2 applications supported by HP.

- Three P1/P2 applications (PA2468, PA2237 and PA1058) had missing or inappropriate priorities. PA2468 is a P2 dependency application but is assigned an UNK priority, and the remaining two applications have no assigned priority.
- Of 92 P1/P2 applications, 11 did not have critical information such as security classification, application platform, operating system, database platform or vendor documented.

Application Run Books. OAAS sampled 10 of the 92 (11%) P1/P2 servers listed on HP's Application Server Report and obtained Application Run Books for each server. Of the 10 Run Books, 4 (40%) did not document the production server sampled.

Per the CTO and HP, Apps Manager and Application Run Books are the systems of record containing County application system configurations, calling trees, dependencies and priority classification. To facilitate successful DR, these documents should be complete and accurate. The application priority rating determines the recovery priority requirements as outlined in the IT Agreement Schedule 4.3 and the DR Plan. Incomplete or inaccurate source information required for DR may adversely impact the County's ability to prepare for and perform essential DR activities.

The CTO indicated that the application information was never properly collected and recorded in Apps Manager and that the Application Run Books were not up to date.

Recommendation: To support the effectiveness of the DR Plan, the CTO should work with HP to ensure that critical application information needed for recovery is accurately and completely recorded in Apps Manager and updated in the Run Books.
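The completeness test described in Finding II can be illustrated with a short sketch. This is not the procedure OAAS used, and Apps Manager's actual schema is not given in the report: the field names, the `find_issues` and `percent` helpers, and the sample records are all hypothetical. Only the list of attributes checked and the sample counts (10 of 92, 4 of 10) come from the finding.

```python
from typing import Dict, List, Optional

# Hypothetical field names; the report names the attributes but not a schema.
REQUIRED_FIELDS = (
    "security_classification",
    "application_platform",
    "operating_system",
    "database_platform",
    "vendor",
)
VALID_PRIORITIES = {"P1", "P2", "P3", "P4", "P5"}


def find_issues(record: Dict[str, Optional[str]]) -> List[str]:
    """Return the completeness problems found in one application record."""
    issues = []
    priority = record.get("priority")
    if priority not in VALID_PRIORITIES:
        issues.append(f"priority missing or invalid: {priority!r}")
    for field in REQUIRED_FIELDS:
        if not record.get(field):  # absent, None, or empty string
            issues.append(f"{field} not documented")
    return issues


def percent(part: int, whole: int) -> int:
    """Share of `part` in `whole`, rounded to the nearest whole percent."""
    return round(100 * part / whole)


# Hypothetical records for illustration only; not actual County data.
complete = {
    "priority": "P1",
    "security_classification": "Confidential",
    "application_platform": "Web",
    "operating_system": "Linux",
    "database_platform": "Oracle",
    "vendor": "ExampleVendor",
}
unknown_priority = dict(complete, priority="UNK")
no_vendor = dict(complete, vendor=None)

if __name__ == "__main__":
    for name, rec in [("complete", complete),
                      ("unknown_priority", unknown_priority),
                      ("no_vendor", no_vendor)]:
        print(name, find_issues(rec))
    print(percent(10, 92))  # 11, the sample share stated in the finding
    print(percent(4, 10))   # 40, the Run Book exception rate stated in the finding
```

A check of this kind could be run whenever records are added or changed, so that an "UNK" or blank priority is flagged before it affects recovery sequencing.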
