Disaster Recovery
Bare Metal Disaster Recovery:
Bare Metal Recovery (BMR) is the process of taking a
low-level snapshot of a machine's operating system partition and storing it
where it can be quickly and easily accessed when required. A BMR solution
has two parts.
- The first is a program that is set up to periodically
snapshot an OS partition using image backup technology. This is
installed as a service and comes with a scheduler. The scheduler is then
programmed to take backups of the live machine without any requirement
to shut down services, close applications, or go offline. Image backups
are normally stored to a UNC path, SAN, or NAS device for online storage
and quick access when needed.
- The second part of a BMR solution is the process used
to boot a dead machine. This enables users to connect to the online
location where the image backups have been stored and initiate a
restore. Once the OS partition has been restored (which can take between
5 and 30 minutes), the only remaining steps necessary to complete the
disaster recovery are to remove the boot media and reboot the machine.
This latter phase takes approximately two minutes before the machine is
back to the exact state at which the image backup was performed.
Bare Metal Recovery (BMR) is most often considered a
supplemental layer of protection that can help insulate an organization
against unnecessary downtime. While file-by-file backup and restore software
is excellent at protecting against data loss, it is inherently slow at
returning an unbootable machine to a fully operational state. Its
shortcomings are the many steps required to perform a file-by-file recovery
and the lack of any guarantee that every operating system (OS) change has
been reinstated after the restore. Thus, BMR should be a vital part of any
company's disaster recovery plan, not just an afterthought.
Buying and implementing a BMR solution has become a
priority for many organizations – and it should be. BMR is a key part of any
formal disaster recovery plan. It not only offers a fast means of restoring
a failed server, but also offers extraordinary benefits to expedite
recovering from a catastrophic event. With the ability to recover to
dissimilar hardware and/or virtual environments, organizations can provide a
clear path to recovering lost servers by taking off-site backups to any
number of service companies who can provide temporary equipment. Rather than
attempt to locate exact hardware matches or conduct laborious file restores
to new equipment, users can restore an image of a Dell server to an HP or
IBM server. With the right BMR solution, companies can also restore
multiple physical servers to a VMware ESX host machine and be up and
running in minutes.
With the technology available today, it is no longer acceptable to have a
file-by-file backup solution as the only means of protecting data. Whether
an organization has a single server, or over a thousand, a bare metal
recovery solution is a necessary preventative measure against expensive and
unnecessary downtime. BMR should be an integral part of every disaster
recovery plan.
BMR - Time Is Money:
While the definition (and monetary value) of a timely recovery of a
failed machine can vary from organization to organization, one
unarguable fact is that downtime costs money. The actual cost of system
downtime is an expense that most organizations do not understand well,
and it can even vary by the time of day. Downtime for
Company A might cost $5,000 an hour while the cost for Company B could
be $100,000 an hour. Even the rate between individual servers within a
company can be vastly different depending on the critical nature of the
applications being run. Here is a very simple formula to estimate the
cost of downtime:
(Employee costs per hour) x (Fraction of employees affected by outage) +
(Average income per hour) x (Fraction of income affected by outage) =
Estimated average cost of one hour of downtime*
Example: $45 x 50 = $2,250 (labor) + $7,000 = $9,250
*A Simple Way to Estimate the Cost of Downtime – David A. Patterson,
Computer Science Division, UC Berkeley
Downtime costs fall into two broad categories: tangible and intangible.
Calculating tangible costs such as employee wages, operating costs, and
office expenses is straightforward, and they can be estimated with great
accuracy using a simple formula like the one provided above. The
difficulty lies in factoring in all of the potential intangible costs such
as lowered employee morale, missed opportunities, forgone sales, and
loss of customer goodwill. It is hard to assign accurate costs to these.
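The tangible part of the formula above can be sketched in a few lines of Python. This is only an illustration: the function and parameter names are invented here, and reading the example figures as 50 employees at $45 an hour, all idle, with $7,000 an hour of income entirely lost, is an assumption rather than something the formula's author states.

```python
def downtime_cost_per_hour(employee_cost_per_hour, fraction_employees_affected,
                           income_per_hour, fraction_income_affected):
    """Estimated average cost of one hour of downtime (tangible costs only)."""
    labor = employee_cost_per_hour * fraction_employees_affected
    income = income_per_hour * fraction_income_affected
    return labor + income


# Assumed reading of the example: 50 employees at $45/hour, all affected,
# and $7,000/hour of income, all of it lost during the outage.
total_employee_cost = 45 * 50  # $2,250 of labor per hour
print(downtime_cost_per_hour(total_employee_cost, 1.0, 7000, 1.0))  # 9250.0

# The same company with only half of the staff and income affected.
print(downtime_cost_per_hour(total_employee_cost, 0.5, 7000, 0.5))  # 4625.0
```

Intangible costs are deliberately left out of the sketch, since they cannot be reduced to a simple product of rates and fractions.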
The bottom line is that all companies recognize that computer downtime
means lost money. Regrettably, most don't realize how much it truly costs.
A Money Saver:
Every minute of machine downtime costs an organization
time and money. Therefore, everyone should be able to agree that limiting
downtime is highly desirable, particularly if it is reasonably affordable.
To demonstrate the return on investment (ROI), here is a
BMR scenario:
If the national average cost of Windows server downtime is $15,000 an hour
(and this is a fairly modest sum), then every minute of downtime costs
$250. If a standard bare metal disaster recovery solution takes
approximately 20 minutes, as opposed to 40 minutes using file-by-file
backup and restore, the 20 minutes saved by the BMR solution equate to
$5,000 in avoided downtime cost with its first use.
Expanding on this, if the price of a premium BMR solution is $1,000 per
server, an organization could subtract the price of the BMR software from
the money saved on restore times. The company would still be left with a
$4,000 net savings. Not many products offer an ROI like this, particularly
after just one use. In a real production environment, the time savings
ratio is closer to 6-to-1, leading to even greater savings than the 2-to-1
ratio used in this example.
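The arithmetic in this scenario can be checked with a short Python sketch. The function name is hypothetical, and the 6-to-1 case assumes the BMR restore still takes 20 minutes, which implies a 120-minute file-by-file restore; that pairing is an illustration, not a figure from the scenario.

```python
def first_use_savings(cost_per_hour, file_restore_min, bmr_restore_min,
                      license_cost):
    """Return (gross, net) downtime savings from a single BMR restore."""
    cost_per_minute = cost_per_hour / 60
    minutes_saved = file_restore_min - bmr_restore_min
    gross = cost_per_minute * minutes_saved
    return gross, gross - license_cost


# The 2-to-1 scenario: $15,000/hour, 40 vs. 20 minutes, $1,000 license.
print(first_use_savings(15000, 40, 20, 1000))   # (5000.0, 4000.0)

# Assumed 6-to-1 case: a 120-minute file restore vs. a 20-minute BMR restore.
print(first_use_savings(15000, 120, 20, 1000))  # (25000.0, 24000.0)
```

Because the license is paid once, every later restore on the same server contributes its full gross savings.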
The BMR Process:
To give organizations a better understanding of how the
two backup methods differ, we have provided a procedure comparison between
using file-based backups and restores versus image-based backups and
restores.
- File-by-file restore example:
1. Install EISA Partition (53 minutes)
2. Install Windows OS (45 minutes)
3. Install Backup Software (5 minutes)
4. Create Data Partitions (10 minutes)
5. Restore System 4GB drive (35 minutes)
6. Restore System State/Registry (1 hour)
7. Reboot Server (2 minutes)
Total Restore Steps = 7
Restore Time = 3 ½ hours
- Bare Metal Recovery example using UltraBac Software's
UBDR Gold:
1. Boot server using UBDR Gold Restore Media (5 min)
2. Connect to a UNC path and initiate a 10GB OS partition restore with a
conservative 2GB/minute transfer rate (8 minutes)
3. Reboot Server (2 min)
Total Restore Steps = 3
Restore Time = 15 minutes
As the example demonstrates, a BMR solution can
easily restore a failed machine's 10GB OS partition in 15 minutes using
a conservative 2GB/minute restore speed on a Gigabit network connection.
Fast systems can experience over 5GB/minute restore speed. Organizations
using the BMR process now "complain" that the machine boot time takes
longer than the physical restore. When comparing file-by-file methods
with BMR, there simply is no comparison.
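The totals in the two procedures can be reproduced with a small Python sketch. The step names and durations are copied from the lists above; the helper names are invented for illustration, and the gap between the raw transfer time and the 8 minutes quoted for step 2 is assumed to be connection and setup overhead.

```python
# Step durations in minutes, taken from the two restore examples.
FILE_BY_FILE_STEPS = [
    ("Install EISA partition", 53),
    ("Install Windows OS", 45),
    ("Install backup software", 5),
    ("Create data partitions", 10),
    ("Restore system 4GB drive", 35),
    ("Restore system state/registry", 60),
    ("Reboot server", 2),
]
BMR_STEPS = [
    ("Boot server from restore media", 5),
    ("Restore 10GB OS partition from UNC path", 8),
    ("Reboot server", 2),
]


def total_minutes(steps):
    """Sum the duration of every step in a restore procedure."""
    return sum(minutes for _, minutes in steps)


def raw_transfer_minutes(size_gb, rate_gb_per_min):
    """Pure data-transfer time, ignoring connection and setup overhead."""
    return size_gb / rate_gb_per_min


file_total = total_minutes(FILE_BY_FILE_STEPS)
bmr_total = total_minutes(BMR_STEPS)
print(f"File-by-file: {file_total} min ({file_total / 60} hours)")  # 210 min (3.5 hours)
print(f"BMR: {bmr_total} min")                                      # 15 min
print(f"Speedup: {file_total / bmr_total}x")                        # 14.0x
print(raw_transfer_minutes(10, 2))                                  # 5.0
```

At the faster 5GB/minute rate mentioned above, the same 10GB transfer would take about 2 minutes, which is why boot time can end up dominating the restore.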