1 Overview

This guide provides an overview of considerations for data recovery and developing a Relativity backup strategy. Backup strategies are often inconsistent, poorly documented, and misunderstood. This misunderstanding stems from widespread expectations that backups must exist, must be readily available at a moment's notice, and must be restored immediately, regardless of size. When backup and DR practices are not documented, management will assume not only that they exist, but that they are good, consistent, and current (possibly up to the minute). It is therefore critical to document your organization's SQL backup strategy and practices.

This guide assumes an understanding of the following technologies: redundant arrays of independent/inexpensive disks (RAID), tape backup, error checking, hash algorithms, and general security strategies.

To deliver in-depth support for litigated matters, Relativity uses a highly compartmentalized approach to database storage. Each individual workspace is set up with a dedicated database. For strategic performance reasons, these databases are often moved between servers. Relativity also provides processing functionality, which ingests native files into databases. With processing installed, each workspace that uses processing has a sister store database. These sister store databases begin with the letters INV and exist on a separate database server.

1.1 Additional resources

The following guides provide the information needed to successfully administer a Relativity environment. With them, you can perform most tasks in a linear manner. They're deliberately designed to address various what-if situations and provide you with a single course of action.
In addition to outlining a strong foundation for your backup strategy, this guide provides information on when to use each of the following documents (you can find these guides on the Relativity Customer Portal):

- Relativity - DBMT - Workspace Migration
- Relativity - Setting up a Distributed SQL Server
- Relativity - Batch DELETE Script (available by request only)
- Relativity - Migrating the Invariant and Relativity Imaging Database (available by request only)
- Relativity - EDDS Database Migration Guide
- Relativity - Configuring Relativity after a Disaster Recovery Failover
- Relativity - Managing Relativity SQL Log Files

1.2 Protecting backed up data

After data has been backed up, some system administrators don't worry about data maintenance. They assume that, once backed up, data remains consistent and complete. However, experienced administrators know that databases, and data stored on disk in general, can become
Relativity Backup and Data Management Best Practices - 4

corrupted just from sitting on disk. Further risk is introduced when third-party data synchronization components and snapshotting technologies are used. For this reason, you must adopt in-depth backup strategies for all business-critical data. All Relativity backups, especially those taken offline, should have multiple BAK copies in different locations, and they should be periodically checked for consistency. For SQL backup strategies, see Selecting a backup approach.

Ultimately, you should protect the data according to business demands. However, consistency checking puts additional strain on both infrastructure and personnel. You must scale these resources according to the acceptable level of data loss based on business requirements. First, determine the mean time to failure (MTTF) of a system. Next, based on the business requirements, determine the acceptable data loss tolerances. Once you fully understand these factors, you can implement the appropriate number of redundant disks to provide a first layer of defense against data loss.

1.2.1 First line defense

Storage redundancy in RAID is your first line of defense. More disks mean greater redundancy: the ability to stripe data across disks improves redundancy, and in RAID 1+0, every disk you add increases redundancy. Depending on the RAID level, data may be striped at the bit, byte, or block level across multiple disks. When a hard drive starts to become corrupt, the storage controller can take it offline and, once it is replaced, rebuild its contents using the non-corrupt contents of the other drives. This occurs automatically, online.

A successful data loss prevention (DLP) strategy weighs MTTF against mean time to repair (MTTR) and uses an understanding of disk striping to maximize reliability at the online storage layer. For more information on establishing backup maintenance procedures, see Selecting a backup approach. Backups, or offline data, are another way to mitigate risk.
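The trade-off between MTTF and MTTR can be made concrete. As a back-of-the-envelope sketch (the figures below are hypothetical, not Relativity measurements), the steady-state availability of a single component is MTTF / (MTTF + MTTR):

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability of a single component."""
    return mttf_hours / (mttf_hours + mttr_hours)

def expected_annual_downtime_hours(mttf_hours: float, mttr_hours: float) -> float:
    """Hours per year the component is expected to be unavailable."""
    return (1.0 - availability(mttf_hours, mttr_hours)) * 365 * 24

# Illustrative figures: a drive with a 100,000-hour MTTF that takes
# 8 hours to replace and rebuild.
print(f"availability: {availability(100_000, 8):.6f}")
print(f"downtime: {expected_annual_downtime_hours(100_000, 8):.2f} h/year")
```

Redundant disks raise the effective MTTF of the array, which is why striped, redundant storage is the first line of defense rather than backups alone.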
An additional strength of the strategy comes from having multiple copies of a file and the ability to detect a failure of the data before it's lost. Other countermeasures, such as non-volatile RAM (NVRAM) cache, also help prevent data loss, though such solutions haven't become mainstream yet. Microsoft SQL Server always keeps the most recent data in RAM and writes it to disk as load permits. If SQL Server crashes and fails over to a clustered server, the data in RAM on the downed server can't be recovered. If necessary, assess and adopt NVRAM technology.

This guide outlines several approaches to backing up data and provides information on redundancy maintenance practices. Ultimately, data retention is a business decision: the cost of doing business, profit requirements, and the potential damage of data loss are all business concerns. This guide doesn't cover disaster recovery options.

Note: For disaster recovery (DR) options, see the Configuring Relativity after a Disaster Recovery Failover document on the Customer Portal. There are many options for replication or mirroring of a site for failover in a DR situation. That document outlines the necessary steps to take after a failover in order to return Relativity to an operating state.

1.2.2 Second line defense

The second line of defense is nearline data storage. Data is nearline if it can be brought online quickly. For example, your most recent backup file saved on a SAN is "nearline" and immediately accessible. It should be free of corruption.

1.2.3 Third line defense

The third line of defense is offline data. Offline data can be removed from nearline data in both time and space. For large data sets, you can manually ship an offline data backup or move it over the Internet over a period of several days. The time differential depends on the following factors:

1. Cost of storage
2. The value to the business of maintaining large amounts of recent data, offline and far away
3. Logistics

For highly mission-critical data, you must synchronize the data on a daily, if not hourly, basis. For example, a disaster recovery data center may hold third-line data. To establish a DR site, you can use technologies such as log shipping over high-speed Internet or mirroring. SAN and virtual technologies also provide mechanisms to keep data that's far away current almost to the minute. This is very expensive, requires tremendous expertise to set up and maintain, and will almost inevitably impact production environment performance.

2 Identifying data to back up

The following sections provide a comprehensive list of all possible locations of relevant Relativity files. There may be other areas that relate to custom applications or ISV products. You should also consider these areas when creating a backup inventory.
2.1 Database files

A backup of SQL Server 2008 R2/2012 includes the data file, the log file, and any additional data files, such as the full-text index catalog. You should preserve and thoroughly document maintenance plans and other SQL configurations to help restore service after an outage. There are many configuration options available with the SQL backup command, and understanding them is critical to properly maintaining backup continuity. Both Relativity workspace and processing databases should be backed up.
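One way to keep those backup options consistent across many workspace databases is to generate the T-SQL statement from a single helper. A minimal sketch in Python (the database name and path are placeholders; CHECKSUM, COMPRESSION, and DIFFERENTIAL are documented WITH options of the T-SQL BACKUP command):

```python
def build_backup_command(database: str, backup_path: str,
                         differential: bool = False) -> str:
    """Build a T-SQL BACKUP DATABASE statement with consistency options.

    CHECKSUM makes SQL Server validate page checksums as it writes the
    backup; COMPRESSION reduces the size of the .bak file.
    """
    options = ["CHECKSUM", "COMPRESSION"]
    if differential:
        options.insert(0, "DIFFERENTIAL")
    return (f"BACKUP DATABASE [{database}] "
            f"TO DISK = N'{backup_path}' "
            f"WITH {', '.join(options)}")

# Hypothetical workspace database and backup path.
print(build_backup_command("EDDS1234567", r"F:\Backups\EDDS1234567.bak"))
```

Generating the statement this way makes it easy to apply the same options to every workspace database enumerated from the EDDS database.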

2.2 Relativity file repositories

Relativity file repositories are locations that Relativity owns. That is, Relativity creates and deletes files in these locations when requested. The locations of Relativity file repositories are stored as choices in the EDDS database CodeArtifact table in MS SQL. You can find these locations by running this query:

SELECT [Name] FROM [EDDS].[eddsdbo].[Code] WHERE CodeTypeID = ' '

2.3 File stores

Natives and images loaded with pointers are treated differently. Relativity reads files from file store locations but never deletes from them, even if the document is deleted from Relativity. Locating all file store locations is complicated: file path locations are stored in the File table and must be parsed out from any Relativity file repository paths. If you need assistance identifying these locations, contact Relativity Support.

2.4 Relativity Analytics

The Relativity Analytics server contains critical information about some configurations and may also be the default location for Analytics indexes. Take care to ensure that this server can be completely restored.

2.5 Configurations

The following sections cover possible data locations for configuration information in Relativity.

2.5.1 Agents

Relativity retains agent configurations in the database. Backing up the EDDS database will effectively preserve all agent configuration data.

2.5.2 Web

You can customize certain aspects of the Relativity web application. For this reason, you should back up the website files as needed. You can also customize certain configurations in IIS; IIS provides a way to export and save its configuration.

2.5.3 Ancillary indexes (dtSearch, Analytics)

The index configuration specifies the locations of the dtSearch and Analytics indexes. Capture all folders and subfolders.

2.5.4 Full-text catalogs

In SQL Server 2008, the full-text catalog may reside within the database .mdf file, in a separate .ndf file, or in a mixture of both. The location of the full-text catalog may vary depending on the age of the instance as well as the original version of SQL Server. SQL Server 2005 stored the full-text catalog as separate files, and depending on the upgrade path from 2005 to 2008, some systems may not have upgraded properly. Between SQL Server 2005 and 2008, there are differences in how SQL Server manages the catalog during backup and restore. Due to these differences, and depending on when the workspace was created or converted, Relativity may have ignored the default location and allowed SQL Server 2008 to merge the full-text catalog into the .mdf data file. In such cases, a 1024 KB pointer file (named after the workspace, with no extension) may reside in the full-text catalog folder. These files are critical and should be backed up. They should never be deleted.

3 Selecting a backup approach

You can implement one of the following approaches to backing up your data online:

- Active databases
- Inactive databases

3.1 Active databases

Relativity databases may experience a very high number of inserts and updates and can become corrupt at any time. An "active database" experiences moderate to heavy use, and for this type of database, data loss would be catastrophic to the business. For your active databases, we recommend using one of the following backup strategies.

3.1.1 Nightly full database backup

Follow these steps to perform nightly full database backups with log backups for point-in-time recovery.

Data file:

1. Back up the database nightly.
2. Mirror to a remote disaster recovery site. ("Mirror" can mean log shipping, SAN snapshots, or replication. Follow the established strategies of your business.)
3. Restore the nightly backup to inexpensive equipment each day.
4. Run DBCC CHECKDB each day, or as often as possible, to meet the business requirements for data loss prevention.
5. Complete the DBCC check before the next backup occurs.
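Steps 3 through 5 lend themselves to automation. A hedged sketch that only builds the sqlcmd invocations for a restore-and-check cycle (the server name, paths, and the presence of the sqlcmd utility are assumptions about your environment; nothing is executed here):

```python
def verification_commands(database: str, bak_path: str,
                          verify_server: str) -> list[str]:
    """Return sqlcmd invocations for a nightly restore-and-check cycle.

    The restore targets an inexpensive verification server; DBCC CHECKDB
    then validates logical and physical integrity of the restored copy.
    """
    restore = (f"RESTORE DATABASE [{database}] "
               f"FROM DISK = N'{bak_path}' WITH REPLACE")
    checkdb = f"DBCC CHECKDB ([{database}]) WITH NO_INFOMSGS"
    return [
        f'sqlcmd -S {verify_server} -Q "{restore}"',
        f'sqlcmd -S {verify_server} -Q "{checkdb}"',
    ]

# Hypothetical verification server and backup file.
for cmd in verification_commands("EDDS1234567",
                                 r"F:\Backups\EDDS1234567.bak",
                                 "VERIFYSQL01"):
    print(cmd)
```

Scheduling this loop so the DBCC check finishes before the next backup begins satisfies step 5 above.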

Log file:

Follow best practices for managing log files as outlined in the Managing Relativity SQL Log Files guide. Understand the best approach for timely restoration to meet your recovery time objectives (RTO) and recovery point objectives (RPO).

Note: A Relativity database log file may occasionally experience a high amount of growth. If the log files fill the log drive, those workspaces become inaccessible. If this happens on the SQL Server that contains the EDDS database, the entire environment becomes inaccessible.

3.1.2 Weekly full database backup

Follow these steps to perform weekly full database backups with nightly differentials and log backups.

Data file:

1. Back up the database weekly.
2. Mirror to a remote disaster recovery site.
3. Restore the nightly backup to inexpensive equipment each day.
4. Run DBCC CHECKDB each day.
5. Complete the DBCC check before the next backup occurs.

Log file:

1. Follow best practices for managing log files as outlined in the Managing Relativity SQL Log Files guide.
2. Back up log files as dictated by business needs for point-in-time recovery. The backup schedule should overcome any possibility of filling the log drives.

Relativity may write a large amount of data to the log files at times. If the business only requires a four-hour increment for point-in-time recovery, but the system could write enough data to the log files in four hours to fill the drives, then you must either run log backups more frequently or increase the size of the drives. Be sure that you understand how to restore log files; document the procedure for restoring log files and practice doing it.

The following formula determines the upper bound on the interval between log backups. If there are x GB of free space on the log drive and the system writes log data at a rate of y GB/hr, then the maximum interval between log backups is T = b(x/y) hours, where b is a safety factor less than one, determined by the capability of the system to move data.
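The bound above can be computed directly. A minimal sketch, treating b as a safety factor below one (the figures are illustrative, not recommendations):

```python
def max_log_backup_interval_hours(free_space_gb: float,
                                  write_rate_gb_per_hour: float,
                                  b: float = 0.5) -> float:
    """Upper bound on the interval between log backups: T = b * (x / y).

    free_space_gb: free space x on the log drive.
    write_rate_gb_per_hour: rate y at which the system writes log data.
    b: safety factor reflecting the system's capacity to move backup
       data off the log drive; must stay strictly below 1.
    """
    if not 0 < b < 1:
        raise ValueError("b must be between 0 and 1")
    return b * (free_space_gb / write_rate_gb_per_hour)

# Illustrative: 200 GB free, 25 GB/hr of log writes, b = 0.5
# -> back up the log at least every 4 hours.
print(max_log_backup_interval_hours(200, 25, 0.5))
```

If the computed interval is shorter than what the business requires for point-in-time recovery, either run log backups more often or enlarge the log drives, as described above.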
If b cannot be maintained at some value below 1, then additional bandwidth from production disk storage to backup storage is required. This factor represents the ratio of synchronous read/write demand actually placed on the system, not merely the system's theoretical capability: the assumption is that, during normal production activity, the system's ability to read data from the log disks is not impeded by its write activity. For example, if during a normal hour of production 100 GB are written to the system and 100 GB are read from the disks by production systems, and the remaining capacity is only an additional 50 GB of sequential reads, then b is effectively 2 rather than a value below 1. In that case, log backups may not complete before the next round is scheduled, or production system performance will suffer because you are operating at the upper limit of what your system can handle. Whereas the lower limit on the frequency of log backups is set by the business need for point-in-time recovery, more frequent log backups may be required to prevent a drive from becoming full.

3.1.3 No log backups

Follow these steps to perform weekly full database backups with nightly differentials and no log backups.

1. Back up the database weekly.
2. Mirror to a remote disaster recovery site.
3. Restore the nightly backup to inexpensive equipment each day.
4. Run DBCC CHECKDB each day.
5. Complete the DBCC check before the next backup occurs.
6. Set logging to SIMPLE.

Log file:

In this configuration, set the recovery model to SIMPLE and size the log files appropriately, as outlined in the Managing Relativity SQL Log Files guide; you can find this guide on the Relativity Customer Portal. At a minimum, set the log files to the approximate size of the Document table. Point-in-time recovery is limited to a 24-hour granularity (the most recent nightly backup).

Note: If you suspect excessive logging at any time, report it to Relativity Support to help identify or determine a root cause.

3.2 Inactive databases

Inactive databases also require attention. You may not routinely back up inactive databases, but a stored backup may become corrupted over time for various reasons, such as head crashes, aging, and wear in mechanical storage devices. Data can become corrupted just sitting on disk. This is called silent data corruption. No backup can prevent silent data corruption; you can only mitigate the risk.

3.2.1 Preventing silent data corruption

The consequences of silent data corruption may lie dormant for a long time. Many technologies have been implemented over the years to ensure data integrity during data transfer. Server memory uses error-correcting code (ECC), and cyclic redundancy checks (CRCs) protect file transfers to an extent.
So, how can you maintain the integrity of data at rest on disk systems when it isn't accessed for a long period of time?

Without a high degree of protection, data corruption can go unnoticed until it's too late. For instance, a user attempting to access the database may receive the following error when running certain queries:

Error 605, Severity Level 21: "Attempt to fetch logical page %S_PGID in database '%.*ls' belongs to object '%.*ls', not to object '%.*ls'."

Then, while running DBCC to try to repair it, the database administrator receives this error:

"System table pre-checks: Object ID 7. Could not read and latch page (1:3523) with latch type SH. Check statement terminated due to unrepairable [sic] error."

After this occurs, the database administrator checks for backups, and hopefully recovery happens quickly and inexpensively. But what if the backup is also corrupt? Sometimes there's no way to recover missing data; for example, the database can't be repaired if complete tables have been destroyed. When DBCC checks are not run against backup files, a corruption such as the previous example may go unnoticed for weeks.

For business-critical databases, the following steps should occur after completing every backup:

1. Restore the backup to inexpensive hardware running MS SQL Server.
2. Run DBCC CHECKDB.
3. Create a hash of the database BAK file.

3.2.2 Example strategy

Follow these guidelines to prevent data corruption of inactive databases:

- Perform DBCC CHECKDB against business-critical databases.
- Keep one week's worth of backups.
- Run a file hash each night.
- Ensure that no file hash has changed. (A file that becomes corrupt will have a different hash.)

Freeware tools, such as ExactFile, include features that make it easy to test hash values by creating a digest of a directory of files. This digest can be tested and rebuilt daily, or on a sensible schedule that meets business obligations.

4 Assessing risk and cost concerns

When it comes to backup verification, IT management has two competing concerns:

1. Management of risk, including cost-of-failure and mean time to failure (MTTF) estimates provided to the business owner.
2. The cost to the business of maintaining backups.
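The nightly digest check described in the example strategy above can also be built with Python's standard library instead of a freeware tool. A hedged sketch (the directory layout and .bak extension are assumptions about your backup store):

```python
import hashlib
from pathlib import Path

def file_digest(path: Path, algorithm: str = "md5") -> str:
    """Hash one file in 1 MB chunks so large .bak files don't exhaust RAM."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()

def build_digest(backup_dir: Path) -> dict[str, str]:
    """Map each .bak file (relative path) to its current hash."""
    return {str(p.relative_to(backup_dir)): file_digest(p)
            for p in sorted(backup_dir.rglob("*.bak"))}

def changed_files(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Files whose hash differs from the stored digest: possible silent
    corruption, to be investigated before the backups rotate out."""
    return [name for name, h in new.items() if old.get(name) != h]
```

Run build_digest after each backup cycle, store the result, and alert on any non-empty changed_files output on the next run.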

If you know the cost of maintaining backups exceeds the value of the data being maintained, then you can relax retention rules; the business should adjust any contractual obligations, real or assumed, accordingly. If the loss of data would significantly compromise the business, or would be a business-ending event, then analysts should carefully weigh the cost of retention and retention maintenance against contractual obligations to the owner or customer.

4.1 Considerations

Consider the following when assessing these concerns:

- The cost of rebuilding the data set from scratch.
- The cost of silent failure prevention.
- The overall revenue directly and indirectly generated by the data.
- The profit margin of the data.
- The cost to the business of lost reputation in the event of an unrecoverable failure.
- Any legal obligations regarding data retention and reasonable retention efforts.
- Insurance coverage; policies may cover loss of IP, and all good backup strategies should consider the amount of coverage provided by insurance.

In addition, ask yourself these questions:

- If the data becomes irrecoverably corrupted, can it be rebuilt?
- Is the data irreplaceable, such as photographs or videos?
- Are you dealing with real-time decision data?
- How many hours of human decision-making are invested in creating the data?
- Assume the worst-case scenario: you have corrupt, irreparable data that can't be rebuilt. What is the cost to the business of that data loss?

4.2 Summary and conclusion

Establishing reasonable practices is key to any successful backup and data management effort. First, develop practices, document them, and ensure that IT personnel follow them. Then, understand MTTF and maintain a tolerance for it. To achieve the highest level of reasonable data retention and silent failure prevention, consider retaining two copies of the data at two geographically distinct locations. Perform weekly MD5 hash checks on the primary data set and monthly checks on the secondary. You can automate these tests and run them in addition to the initial consistency checks performed when the data was created.

Developing sound backup and data retention procedures facilitates the safety of data, compliance with business needs for point-in-time recovery, and potentially regulatory compliance. Most database technology includes the ability to restore after a disaster, but the performance of the database depends on understanding how logging and backups operate and how they impact performance.

Improperly configured backups can result in unanticipated outages and even loss of data if backups are corrupt. For assistance with any of these configurations, contact Relativity Support.

Proprietary Rights

This documentation ("Documentation") and the software to which it relates ("Software") belongs to kCura LLC and/or kCura's third party software vendors. kCura grants written license agreements which contain restrictions. All parties accessing the Documentation or Software must: respect proprietary rights of kCura and third parties; comply with your organization's license agreement, including but not limited to license restrictions on use, copying, modifications, reverse engineering, and derivative products; and refrain from any misuse or misappropriation of this Documentation or Software in whole or in part. The Software and Documentation is protected by the Copyright Act of 1976, as amended, and the Software code is protected by the Illinois Trade Secrets Act. Violations can involve substantial civil liabilities, exemplary damages, and criminal penalties, including fines and possible imprisonment.

© kCura LLC. All rights reserved. Relativity and kCura are registered trademarks of kCura LLC.
