Issue link: https://insights.oneneck.com/i/1186400
DISASTER RECOVERY GUIDE – POWERED BY ZERTO

Business continuity

Many companies have a disaster recovery (DR) site, preferably at a remote location, to which data is replicated continuously so that it is ready for use in the event of an outage. If the DR site is at a remote location, it can also provide business continuity (BC): the ability of a business to continue operating after a major disaster such as a fire, power outage or natural disaster. If the original site is down, the services of the production site can be run on the DR site. This switching process is called failover. Once the production site is back up and running, the work done at the DR site must be replicated back so that it is not lost. The ability to fail back applications and data from the DR site to the production site is a critical attribute of a solid disaster recovery solution.

DR sites used to be a copy of the production site in another office location, but nowadays they are often located at the datacenter of a cloud service provider or in the public cloud.

High availability

A concept that is often confused with disaster recovery and business continuity is high availability (HA). HA is functionality that helps avoid downtime caused by hardware issues. It involves technologies like Redundant Array of Independent Disks (RAID) and redundant parts such as power supplies and cabling, but it can be applied within virtualized environments as well. HA technologies are necessary to keep systems running, but they will not help recover after a disaster. High availability is usually expressed as a percentage, typically 99% or higher. But don't forget that 99.9% uptime still means that a system can have almost 9 hours (8.76 hours) of unplanned downtime in a year.

RTO and RPO

When business needs are translated into Service Level Agreements, recovery is usually expressed in two types of objectives: RTO and RPO.
The Recovery Time Objective (RTO) is the amount of time the business can be without the service that needs to be recovered, without significant losses or risks. The Recovery Point Objective (RPO) is the most recent point in time from which data can be recovered; everything written after that point is lost. Traditional backup or snapshot technologies have RPOs ranging from 15 minutes to 24 hours. In modern, ubiquitously digital enterprise environments both RTO and RPO need to be as low as possible, no longer expressed in hours but in minutes or even seconds. Though many organizations focus on RTO to get the business up and running as soon as possible, it is the inability to reproduce lost data (RPO) that will haunt an organization for a long time after any disaster.

RTO AND RPO USE CASES

RPO

Any enterprise listed on a stock exchange has to comply with rules regarding data security; loss of data will result in loss of revenue, reputation and shareholder value.

An online business that loses 4 hours of business data might end up with angry customers wondering when the goods they bought and paid for are coming.

RTO

If a transport company's systems are down for a few hours, it is impossible to plan deliveries and pick-ups efficiently, which has an enormous effect on revenues that are already under pressure.

Complex robotized production processes that are down after a hardware or software failure cause enormous loss of productivity and revenue.

[Figure: timeline from 00:00 to 24:00 marking the last recovery copy, the data lost since the last replication and the time lost due to recovery time; labeled durations: 4H, 3H, 12H+]

HIGH AVAILABILITY IN % AND IN TIME (source: Wikipedia)

Availability %             Downtime per year    Downtime per week
90% ("one nine")           36.5 days            16.8 hours
99% ("two nines")          3.65 days            1.68 hours
99.9% ("three nines")      8.76 hours           10.1 minutes
99.99% ("four nines")      52.56 minutes        1.01 minutes
99.999% ("five nines")     5.26 minutes         6.05 seconds
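
The downtime figures in the availability table follow from simple arithmetic: the unavailable fraction (100% minus the availability) multiplied by the length of the period. The Python sketch below is only an illustration of that calculation, assuming a 365-day year as the table appears to; the function name `allowed_downtime_hours` and the constants are hypothetical and not part of the guide.

```python
# Illustrative sketch: convert an availability percentage into allowed downtime.
# Assumes a 365-day year (8,760 hours); names are hypothetical.

HOURS_PER_YEAR = 365 * 24
HOURS_PER_WEEK = 7 * 24


def allowed_downtime_hours(availability_percent: float, period_hours: float) -> float:
    """Return the downtime (in hours) implied by an availability percentage."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * period_hours


if __name__ == "__main__":
    for availability in (90, 99, 99.9, 99.99, 99.999):
        per_year = allowed_downtime_hours(availability, HOURS_PER_YEAR)
        per_week = allowed_downtime_hours(availability, HOURS_PER_WEEK)
        print(f"{availability}%: {per_year:.2f} h/year, {per_week * 60:.2f} min/week")
```

For 99.9% availability this gives roughly 8.76 hours per year, which is why "three nines" still leaves room for most of a working day of unplanned downtime.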