Maintaining business continuity is critical for enterprises. Downtime of key business systems can cause irreversible damage to reputation, customer satisfaction, and financial performance. Building highly available and disaster-resilient IT systems is therefore essential.
When selecting a DR solution, enterprises must first define two key metrics: RTO (Recovery Time Objective) and RPO (Recovery Point Objective). Based on these metrics, IT teams can design the appropriate application architecture, backup strategy, and DR framework.
This article introduces the fundamental concepts of disaster recovery, then uses a WordPress template to simulate a classic web hosting architecture and presents four AWS-based backup/DR solutions with a comparison of key metrics.
Key Metrics: RPO & RTO
- RTO (Recovery Time Objective): The target maximum time between the moment a disaster takes the IT system down and halts the business, and the moment the system is restored well enough to support business operations again. For example, if service must be restored within half a day of a disaster, the RTO is 12 hours.
- RPO (Recovery Point Objective): The point in time to which restored data must correspond for business operations to resume; in effect, the maximum acceptable window of data loss. If backups run once a day at midnight, the restored system holds only the data as of the last midnight before the disaster, so up to 24 hours of data could be lost.
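The two definitions above can be checked against a concrete incident timeline. The following is a minimal sketch, not part of the original series: the function names, the sample timestamps, and the 24-hour/12-hour objectives are all illustrative assumptions.

```python
from datetime import datetime, timedelta


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    """With periodic backups, the worst case is a disaster that strikes
    just before the next backup runs, losing one full interval of data."""
    return backup_interval


def data_loss_window(last_backup: datetime, disaster: datetime) -> timedelta:
    """Data written after the last backup and before the disaster is lost."""
    return disaster - last_backup


def recovery_time(outage_start: datetime, service_restored: datetime) -> timedelta:
    """Elapsed time from the start of the outage until service is usable again."""
    return service_restored - outage_start


def meets_objective(measured: timedelta, objective: timedelta) -> bool:
    """An objective is met when the measured value does not exceed it."""
    return measured <= objective


# Hypothetical objectives and incident timeline (assumed values).
rpo = timedelta(hours=24)
rto = timedelta(hours=12)

last_backup = datetime(2024, 1, 2, 0, 0)   # nightly backup at midnight
disaster = datetime(2024, 1, 2, 18, 30)    # outage begins
restored = datetime(2024, 1, 3, 3, 0)      # service is back

lost = data_loss_window(last_backup, disaster)
downtime = recovery_time(disaster, restored)

print("Data loss window:", lost, "| within RPO:", meets_objective(lost, rpo))
print("Recovery time:", downtime, "| within RTO:", meets_objective(downtime, rto))
```

In this assumed timeline the incident loses 18.5 hours of data and takes 8.5 hours to recover, so both objectives are met. A daily backup can never guarantee an RPO tighter than 24 hours, which is what `worst_case_data_loss` expresses.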
Comparison of Four DR Strategies
| Strategy | RTO | RPO | Cost | Use Case |
| --- | --- | --- | --- | --- |
| Cold Backup | 24 hours | 24 hours | Lowest | Non-critical systems |
| Pilot Light | 4 hours | 4 hours | Low | General business systems |
| Warm Standby | Minutes | Minutes | Medium | Important systems |
| Multi-Site Active/Active | Seconds | Near zero | Highest | Mission-critical systems |
This blog series covers the first two strategies (Cold Backup and Pilot Light) in detail, providing solution architecture, cost estimates, step-by-step implementation guides, and automation scripts for each.
Enterprises should choose the DR strategy that best matches their RTO/RPO requirements and budget.
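One way to make this selection concrete is to encode the comparison table and pick the cheapest strategy whose RTO and RPO both satisfy the requirement. This is a rough sketch under stated assumptions: the table gives only qualitative values for "Minutes", "Seconds", and "Near zero", so the numbers used for them below (10 minutes, 30 seconds, 1 second) are placeholders, and `Strategy`, `cost_rank`, and `choose_strategy` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class Strategy:
    name: str
    rto: timedelta
    rpo: timedelta
    cost_rank: int  # 1 = lowest cost, 4 = highest cost


# Values for the first two rows come from the comparison table.
# The last two rows use ASSUMED placeholders for "Minutes", "Seconds",
# and "Near zero"; replace them with figures measured for your own system.
STRATEGIES = [
    Strategy("Cold Backup", timedelta(hours=24), timedelta(hours=24), 1),
    Strategy("Pilot Light", timedelta(hours=4), timedelta(hours=4), 2),
    Strategy("Warm Standby", timedelta(minutes=10), timedelta(minutes=10), 3),
    Strategy("Multi-Site Active/Active", timedelta(seconds=30), timedelta(seconds=1), 4),
]


def choose_strategy(required_rto: timedelta, required_rpo: timedelta) -> Optional[Strategy]:
    """Return the lowest-cost strategy meeting both requirements, or None."""
    candidates = [
        s for s in STRATEGIES
        if s.rto <= required_rto and s.rpo <= required_rpo
    ]
    return min(candidates, key=lambda s: s.cost_rank) if candidates else None


choice = choose_strategy(timedelta(hours=12), timedelta(hours=12))
print(choice.name if choice else "No listed strategy meets the requirement")
```

Ranking by cost after filtering on RTO and RPO mirrors the trade-off in the table: tighter objectives rule out the cheaper options, and a `None` result signals that the requirement is stricter than any listed strategy can deliver.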