Disasters, whether naturally occurring or maliciously imposed, rarely strike when you expect them. In a disaster recovery (DR) situation, you must restore your digital infrastructure quickly and efficiently. Failover and failback are business continuity tools that help sustain normal operations even when your primary production site is disabled.
Think of failover and failback processes as important complementary elements of a robust DR framework. The failover operation switches production from a primary site to a backup (recovery) site. A failback returns production to the original (or new) primary location after a disaster (or a scheduled event) is resolved.
In the event of a catastrophic outage, you can quickly restore any affected system by “failing over” to a copied version. In this context, failover is the transfer of business-critical workloads away from a compromised primary production system to a designated recovery site, thereby restoring production system operations. Failover mitigates the effects of a disruption by sustaining operations in the face of a potentially debilitating system failure.
The remote (off-site) system copy is then initialized during failover to replace the original system. Depending on the nature of the failure event, you can fail over to the latest system image or to a specific, selected recovery image. Copying system images frequently ensures that you retain multiple system versions and minimizes data loss. Failover to a curated copy of your system is a cost-effective way to protect against IT system failures.
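The choice between the latest image and a specific recovery image can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `RecoveryImage` and `pick_recovery_image` are hypothetical names, not part of any product API, and each image is assumed to carry a timestamp.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Sequence


@dataclass(frozen=True)
class RecoveryImage:
    """Hypothetical record describing one replicated system image."""
    image_id: str
    taken_at: datetime


def pick_recovery_image(
    images: Sequence[RecoveryImage],
    not_after: Optional[datetime] = None,
) -> RecoveryImage:
    """Return the newest image, or the newest one taken at or before `not_after`.

    Passing `not_after` models failing over to a point in time that predates
    the failure event (for example, before data was corrupted).
    """
    candidates = [
        img for img in images
        if not_after is None or img.taken_at <= not_after
    ]
    if not candidates:
        raise ValueError("no recovery image satisfies the requested point in time")
    return max(candidates, key=lambda img: img.taken_at)
```

With frequent copies, the gap between the failure and the newest usable image stays small, which is why copy frequency bounds how much data can be lost.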
Following primary site recovery, and once any associated security risks or other failure-related issues are resolved, you can return business operations to your production system. Failback lets you recover the pre-disaster image at the original production system (or another selected production location) and move workloads from the copied system back to the designated production system. Incremental changes will likely occur in the recovery system after failover, however, so you must synchronize the restored or new production system with the copied system before failback to avoid losing business-critical information. When executing a failback, only the interim (altered) data retained in the recovery system should be returned to the new or restored production system.
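A minimal sketch of that synchronization step, in Python, might look like the following. It assumes each site's data can be modeled as a simple key-to-version mapping; the function names and the mapping are illustrative assumptions, not a description of how any particular DR product tracks changes.

```python
from typing import Dict, Optional

# In a delta, a value of None means "this key was deleted on the recovery site".
Delta = Dict[str, Optional[str]]


def compute_failback_delta(baseline: Dict[str, str], recovery: Dict[str, str]) -> Delta:
    """Return only the entries that changed on the recovery site since failover.

    `baseline` is the state both sites shared at failover time;
    `recovery` is the current state of the recovery site.
    """
    delta: Delta = {}
    for key, value in recovery.items():
        if baseline.get(key) != value:  # new or modified entry
            delta[key] = value
    for key in baseline:
        if key not in recovery:  # entry removed while running on recovery
            delta[key] = None
    return delta


def apply_delta(production: Dict[str, str], delta: Delta) -> Dict[str, str]:
    """Apply the interim changes to a copy of the restored production state."""
    restored = dict(production)
    for key, value in delta.items():
        if value is None:
            restored.pop(key, None)
        else:
            restored[key] = value
    return restored
```

Transferring only the delta, rather than the full recovery copy, keeps the failback window short because unchanged data never crosses the wire.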
When an error is detected, a failover workflow switches data sources to a recovery system. A failback workflow then restores data to its original state after a ransomware event or other corporate data loss.
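The two workflows can be pictured as a small state machine. The Python sketch below is a toy model: `DrController`, its method names, and the consecutive-failure threshold are all assumptions introduced for illustration, and the text above does not specify how errors are detected or counted.

```python
from enum import Enum


class Site(Enum):
    PRIMARY = "primary"
    RECOVERY = "recovery"


class DrController:
    """Toy failover/failback controller; not modeled on any vendor's API."""

    def __init__(self, failure_threshold: int = 3) -> None:
        # Assumption: fail over only after several consecutive failed checks,
        # so that a single transient error does not trigger a site switch.
        self.active = Site.PRIMARY
        self.failure_threshold = failure_threshold
        self._consecutive_failures = 0

    def record_health_check(self, primary_healthy: bool) -> Site:
        """Record one health check of the primary site and fail over if needed."""
        if primary_healthy:
            self._consecutive_failures = 0
        else:
            self._consecutive_failures += 1
            if (self.active is Site.PRIMARY
                    and self._consecutive_failures >= self.failure_threshold):
                self.active = Site.RECOVERY
        return self.active

    def failback(self, primary_synchronized: bool) -> Site:
        """Return production to the primary site once it has been synchronized."""
        if self.active is not Site.RECOVERY:
            raise RuntimeError("failback only applies while running on the recovery site")
        if not primary_synchronized:
            raise RuntimeError("synchronize the primary site before failing back")
        self.active = Site.PRIMARY
        self._consecutive_failures = 0
        return self.active
```

In this sketch failover is automatic while failback is an explicit call. A healthy primary alone does not move production back, because the interim changes must be synchronized first.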
Here’s what to expect with failover/failback:
Our replication and disaster recovery solution integrates failover and failback operations into a seamless and comprehensive management framework. Rubrik automates non-disruptive failover testing and application cloud migration to mitigate risk and meet compliance. Rubrik DR software monitors replication tasks and DR failover status, tracks replication policy compliance, and accumulates proactive error/warning notifications. We deliver simplified DR orchestration with failover/failback, testing, and cloud migration.
Learn how Rubrik can help protect your data with class-leading replication and disaster recovery solutions.