A brand-new point of view on backup scheduling strategy

Every sysadmin knows the classical way to set up a backup strategy: schedule different kinds of copies (full, incremental, differential, …) at specific times of the day, week, and month, so as to obtain a regular cycle producing daily, weekly, and monthly backups, according to the desired RPO and RTO and depending on the criticality of the data and the time needed to copy it.

This is fine, but we believe there is a different, far more dynamic and efficient approach that can meet modern backup needs. AiRE shifts the point of view from a calendar-oriented strategy to one that always starts from “now”, letting you choose different backup frequencies for the last few minutes or hours as well as for data older than ten years.

In this scenario you need to stop referring to backup copies as “the last full”, “the month of July”, “Friday the 2nd”, and start thinking of “the one from 10 minutes ago”, “the one from 7 hours ago”, “the one from 3 months ago”, and so on. All copies are always relative to “now”, and “the one from 7 hours ago” changes every hour, regardless of the date and time it was actually taken.

How is it possible?

The AiRE file system lets you take instant snapshots of a file system. A snapshot takes no time to create, it is immediate, and the data within it is always in a consistent state. Each snapshot is completely independent (even though snapshots can be copied incrementally, each one remains self-contained and consistent, with no consolidation step as in VMware environments), so you can delete any snapshot in the sequence without invalidating the remaining ones, and each snapshot you keep occupies on the storage only the space of the data that differs between it and the other snapshots of the same file system.
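The block sharing that makes this independence possible can be sketched in miniature. The following is a simplified Python model, not AiRE internals: snapshots are sets of block IDs, and a block is reclaimed only when no remaining snapshot references it.

```python
from collections import Counter

class SnapshotStore:
    """Toy copy-on-write store: each snapshot is a set of block IDs;
    a block is freed only when no snapshot references it anymore."""
    def __init__(self):
        self.refs = Counter()      # block ID -> reference count
        self.snapshots = {}        # snapshot name -> frozenset of block IDs

    def take(self, name, blocks):
        self.snapshots[name] = frozenset(blocks)
        self.refs.update(blocks)   # shared blocks cost no extra space

    def delete(self, name):
        for b in self.snapshots.pop(name):
            self.refs[b] -= 1
            if self.refs[b] == 0:
                del self.refs[b]   # only now is the block reclaimed

    def allocated_blocks(self):
        return len(self.refs)

store = SnapshotStore()
store.take("10-minutes-ago", {1, 2, 3, 4})
store.take("5-minutes-ago",  {1, 2, 3, 5})  # differs by a single block
print(store.allocated_blocks())             # 5, not 8: blocks are shared
store.delete("10-minutes-ago")              # dropping one snapshot...
print(store.allocated_blocks())             # 4: the other one stays valid
```

Note how deleting either snapshot leaves the other fully usable, which is exactly what lets any snapshot in the sequence be pruned independently.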

The DPE (Data Protection Engine) exploits this feature to implement a tier system: you define several tier periods, each with its own retention number, deciding how long each tier lasts and the snapshot retention frequency within it.
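The tier logic can be sketched as a pruning rule: each tier covers an age window measured back from “now”, and within that window one snapshot per frequency interval is kept. The sketch below is illustrative Python, not AiRE’s actual configuration format; the tier values mirror the anti-ransomware example discussed in this article.

```python
from datetime import datetime, timedelta

# Illustrative tier definitions (not AiRE's real configuration syntax):
# each tier spans an age window back from "now" and keeps one snapshot
# per `frequency` interval inside that window.
TIERS = [
    {"length": timedelta(hours=4),  "frequency": timedelta(minutes=5)},
    {"length": timedelta(hours=20), "frequency": timedelta(hours=1)},
]

def snapshots_to_keep(snapshot_times, now):
    """Return the set of snapshot timestamps retained by the tier policy."""
    keep = set()
    start = timedelta(0)
    for tier in TIERS:
        end = start + tier["length"]
        slot = start
        while slot < end:
            # keep the newest snapshot falling inside this age slot
            in_slot = [t for t in snapshot_times
                       if slot <= now - t < slot + tier["frequency"]]
            if in_slot:
                keep.add(max(in_slot))
            slot += tier["frequency"]
        start = end
    return keep

now = datetime(2024, 1, 1, 12, 0)
snaps = [now - timedelta(minutes=m) for m in range(240)]  # one per minute, 4 h
print(len(snapshots_to_keep(snaps, now)))  # 48: one kept per 5-minute slot
```

Because the windows are anchored to “now”, re-running the same pruning rule an hour later keeps a different set of snapshots, which is precisely the “always relative to now” behaviour described above.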

For example, you may want to protect yourself from threats such as CryptoLocker by setting up a tier covering the last 4 hours that creates one snapshot every 5 minutes (12 snapshots per hour), followed by a 20-hour tier that retains only one snapshot per hour. This way you can always roll back your data to five, ten, fifteen minutes ago and so on up to 240 minutes, and then to 300, 360, 420 minutes ago and so on up to 24 hours.

On top of this you can define a third, daily tier and a fourth, weekly tier for the ordinary backup strategy: for example, one snapshot a day for 27 days (plus the 28th day covered by the first two tiers, making exactly four weeks, roughly a month), and then one snapshot a week for 48 weeks (plus the four extra weeks covered by the first three tiers, making roughly a year).

Finally, you can define a last long-term retention tier of one snapshot a year for as many years as you want to keep your historical data (e.g. 15 years).
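Tallying the worked example above (the per-tier counts are simple arithmetic derived from the stated frequencies, not figures from the product) shows how few snapshots this whole strategy actually requires:

```python
# Snapshot count per tier for the example in the text:
# frequency and duration exactly as stated above.
tiers = {
    "tier 0: 4 h  @ 5 min":  4 * 60 // 5,  # 48 snapshots
    "tier 1: 20 h @ 1 h":    20,
    "tier 2: 27 d @ 1 day":  27,
    "tier 3: 48 w @ 1 week": 48,
    "tier 4: 15 y @ 1 year": 15,
}
total = sum(tiers.values())
print(total)  # 158 snapshots cover roughly 16 years of history
```

Thanks to the differential space allocation described earlier, those 158 snapshots do not mean 158 full copies on disk.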

The image below shows how to set up this tier strategy in the AiRE graphical interface.

You can also copy snapshots to a backup target, even a remote one, or clone them onto another file system. No matter how long the copy takes, snapshots are never modified after being taken, so you have all the time you need to copy them while users keep working on the file system; moreover, several copy techniques are available to save time, including various levels of data compression and deduplication.

Once the first snapshot has been copied, you can keep the target up to date with subsequent snapshots using differential copy, so that only the data actually modified since the previous snapshot is transferred, saving a great amount of bandwidth and time.
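Conceptually, a differential copy boils down to comparing two snapshot manifests and transferring only what changed. The Python sketch below is a deliberately simplified file-level model (real engines typically diff at the block level, and the manifest format here is invented for illustration):

```python
def differential_copy(prev_manifest, curr_manifest):
    """Toy differential copy: manifests map path -> content hash.
    Returns which paths must be transferred and which were deleted."""
    changed = [path for path, digest in curr_manifest.items()
               if prev_manifest.get(path) != digest]
    deleted = [path for path in prev_manifest
               if path not in curr_manifest]
    return changed, deleted

# Hypothetical manifests of two consecutive snapshots
prev = {"/etc/hosts": "a1", "/var/log/syslog": "b2", "/home/u/doc": "c3"}
curr = {"/etc/hosts": "a1", "/var/log/syslog": "b9", "/home/u/new": "d4"}
changed, deleted = differential_copy(prev, curr)
print(changed)  # ['/var/log/syslog', '/home/u/new'] — only these move
print(deleted)  # ['/home/u/doc'] — removed on the target, nothing copied
```

The unchanged file never crosses the wire, which is where the bandwidth and time savings come from.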

In the case of data replication, the AiRE Data Protection Engine interface lets you apply to the replicated data either the same tier and retention strategy as the original data or a different one. There is no need, for instance, to keep as many recent snapshots as tiers 0 and 1 above prescribe against CryptoLocker attacks: that applies only to the live original data. Conversely, you may need to keep the historical data of the last tier for longer, or retain only monthly copies instead of weekly ones, and so on.

Based on our experience, we strongly believe this new approach is far better than the traditional one once you become familiar with its underlying principles, so much so that we decided to implement it despite the initial difficulty users may experience in understanding it. Are we wrong? What do you think?