Network Disaster Recovery: RPO, RTO, Site Failover, Backup Strategy, DR Testing, and Business Continuity
Network Disaster Recovery prepares an organization for events that take the network down. RPO (Recovery Point Objective) defines how many hours of data loss are acceptable; RTO (Recovery Time Objective) defines how many hours the organization has to bring service back; Site Failover moves systems to the DR site; Backup Strategy plans how data is backed up; DR Testing exercises the DR plan; and Business Continuity keeps the business operating throughout.
Network disaster recovery is something every organization needs, yet many never test it: 75% of organizations have a DR plan, but only 25% test it regularly (Zerto Survey). The result: when a real disaster strikes, the DR plan does not work, downtime drags on, and revenue is lost at an average of $5,600 per minute (Gartner). Causes of disaster include natural disasters (floods, fires), ransomware, hardware failure, human error, and power outages. Every organization will face a disaster at some point; the question is "when", not "if".
RPO vs RTO
| Metric | Definition | Example | Impact |
| --- | --- | --- | --- |
| RPO (Recovery Point Objective) | Maximum acceptable data loss, measured in time | RPO = 1 hour → lose at most 1 hour of data | Determines backup frequency: RPO 1h = backup every hour |
| RTO (Recovery Time Objective) | Maximum acceptable downtime | RTO = 4 hours → systems must be up within 4 hours | Determines DR architecture: RTO < 1h = active-active, RTO 4h = warm standby |
| RPO = 0 | Zero data loss | Synchronous replication: every write goes to both sites | Most expensive, limited by distance (latency) |
| RTO = 0 | Zero downtime | Active-active with automatic failover, always running | Most expensive, requires full duplicate infrastructure |
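The two metrics can be checked against a real or simulated incident. The following is a minimal sketch, assuming hourly backups and hypothetical failure and restore timestamps; the function names `achieved_rpo` and `meets_objective` are illustrative and not from any particular tool.

```python
from datetime import datetime, timedelta


def achieved_rpo(backup_times, failure_time):
    """Return the data-loss window: time from the last backup before the failure to the failure."""
    earlier = [t for t in backup_times if t <= failure_time]
    if not earlier:
        raise ValueError("no backup exists before the failure")
    return failure_time - max(earlier)


def meets_objective(actual, objective):
    """An objective is met when the measured duration does not exceed it."""
    return actual <= objective


# Hypothetical incident: hourly backups, failure at 10:45, service restored at 13:15.
backups = [datetime(2024, 1, 1, hour) for hour in range(24)]
failure = datetime(2024, 1, 1, 10, 45)
restored = datetime(2024, 1, 1, 13, 15)

data_loss = achieved_rpo(backups, failure)  # time since the 10:00 backup
downtime = restored - failure               # measured recovery time

rpo_ok = meets_objective(data_loss, timedelta(hours=1))
rto_ok = meets_objective(downtime, timedelta(hours=4))
```

With these assumed timestamps the data-loss window is 45 minutes and the downtime is 2 hours 30 minutes, so both the 1-hour RPO and the 4-hour RTO from the examples above would be met.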
DR Architecture Tiers
| Tier | Architecture | RPO | RTO | Cost |
| --- | --- | --- | --- | --- |
| Tier 1: Backup Only | Tape/disk backup, offsite storage | 24 hours | Days-weeks | Lowest |
| Tier 2: Pilot Light | Core infra running at DR, other components off | Hours | Hours | Low |
| Tier 3: Warm Standby | Scaled-down copy running at DR site | Minutes-hours | Minutes-hours | Medium |
| Tier 4: Hot Standby | Full duplicate running, async replication | Minutes | Minutes | High |
| Tier 5: Active-Active | Both sites serving traffic, sync replication | Zero | Zero (automatic) | Highest |
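One way to read the tier table is as a lookup: pick the cheapest tier whose worst-case RPO and RTO still satisfy the requirement. The sketch below assumes concrete numeric bounds for each tier (for example, 8 hours for "Hours" and 30 minutes for "Minutes"); those numbers are illustrative assumptions, not values from the table, and should be replaced with figures measured in your own environment.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Tier:
    name: str
    worst_rpo: timedelta  # assumed worst-case data loss for this tier
    worst_rto: timedelta  # assumed worst-case recovery time for this tier


# Ordered cheapest first. All numeric bounds are illustrative assumptions.
TIERS = [
    Tier("Tier 1: Backup Only", timedelta(hours=24), timedelta(days=14)),
    Tier("Tier 2: Pilot Light", timedelta(hours=8), timedelta(hours=8)),
    Tier("Tier 3: Warm Standby", timedelta(hours=4), timedelta(hours=4)),
    Tier("Tier 4: Hot Standby", timedelta(minutes=30), timedelta(minutes=30)),
    Tier("Tier 5: Active-Active", timedelta(0), timedelta(0)),
]


def cheapest_tier(required_rpo, required_rto):
    """Return the first (cheapest) tier whose assumed bounds meet both objectives."""
    if required_rpo < timedelta(0) or required_rto < timedelta(0):
        raise ValueError("objectives must be non-negative durations")
    for tier in TIERS:
        if tier.worst_rpo <= required_rpo and tier.worst_rto <= required_rto:
            return tier
    # Unreachable for non-negative objectives, because Tier 5 is zero/zero.
    raise RuntimeError("no tier satisfies the objectives")


choice = cheapest_tier(timedelta(hours=1), timedelta(hours=4))
```

Under these assumed bounds, a 1-hour RPO rules out warm standby even though its RTO would be acceptable, so the selection lands on hot standby. This shows why RPO and RTO have to be evaluated together rather than one at a time.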
Network DR Components
| Component | Primary Site | DR Strategy |
| --- | --- | --- |
| WAN Connectivity | Primary ISP + MPLS/SD-WAN | Diverse ISPs at DR, different physical paths, SD-WAN failover |
| DNS | Primary DNS servers | GSLB/GeoDNS: automatic failover to DR site IPs, low TTL for fast cutover |
| Load Balancers | Active at primary | Standby LBs at DR; GSLB monitors health and redirects on failure |
| Firewalls | Active at primary | Mirrored config at DR: automated config sync, same policy |
| Routing | BGP/OSPF at primary | BGP failover: the DR site announces the same prefixes with AS-path prepending so the primary is preferred; when the primary withdraws its routes, traffic shifts to DR |
| Config Backup | Running configs on devices | Automated config backup with Oxidized, RANCID, or Ansible, version controlled in Git |
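The BGP failover behaviour in the Routing row can be illustrated with a toy path-selection model. This is a simplified sketch: real BGP compares several attributes (such as local preference) before AS-path length, and the model assumes those all tie. The `Announcement` class, the site names, and the documentation-range ASN 64500 are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Announcement:
    site: str
    as_path: tuple  # sequence of AS numbers as seen by the upstream router


def best_path(announcements):
    """Prefer the shortest AS path; assumes all earlier BGP tie-breakers are equal."""
    if not announcements:
        return None
    return min(announcements, key=lambda a: len(a.as_path))


ORG_AS = 64500  # ASN reserved for documentation, used here as a placeholder

primary = Announcement("primary", (ORG_AS,))
# The DR site prepends its own AS twice, making its path look longer.
dr = Announcement("dr", (ORG_AS, ORG_AS, ORG_AS))

normal = best_path([primary, dr])   # both sites announcing the prefix
after_withdrawal = best_path([dr])  # primary has withdrawn its route
```

While both sites announce the prefix, the shorter path through the primary wins; once the primary withdraws, the prepended DR announcement is the only candidate and traffic follows it without any change at the DR site.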
Backup Strategy (3-2-1 Rule)
| Rule | Meaning | Implementation |
| --- | --- | --- |
| 3 Copies | Keep 3 copies of data (production + 2 backups) | Production data + local backup + offsite/cloud backup |
| 2 Media Types | Store on 2 different media types | Disk (fast restore) + tape/cloud (long-term, air-gapped) |
| 1 Offsite | At least 1 copy offsite (different location) | DR site, cloud storage (S3, Azure Blob), or tape vault |
| Network Configs | Back up device configurations automatically | Git repo: Oxidized/RANCID push configs to Git daily, giving version history |
| Network Diagrams | Keep topology diagrams, IP plans, credentials updated | Netbox, draw.io, documentation wiki, stored offsite and accessible during DR |
| Immutable Backup | Protect against ransomware: backups can't be modified/deleted | WORM storage, air-gapped tapes, immutable cloud storage |
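The 3-2-1 rule lends itself to an automated check against a backup inventory. The sketch below is one possible way to express it; the `BackupCopy` class, its field names, and the sample inventory are hypothetical and would need to be mapped onto whatever your backup tooling actually reports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str
    media: str        # e.g. "disk", "tape", "object-storage"
    offsite: bool
    immutable: bool = False


def check_3_2_1(copies):
    """Evaluate each part of the 3-2-1 rule, plus the immutable-backup recommendation."""
    return {
        "3_copies": len(copies) >= 3,
        "2_media_types": len({c.media for c in copies}) >= 2,
        "1_offsite": any(c.offsite for c in copies),
        "immutable_copy": any(c.immutable for c in copies),
    }


# Hypothetical inventory: production data, a local backup, and an immutable cloud copy.
inventory = [
    BackupCopy("production", media="disk", offsite=False),
    BackupCopy("local-backup", media="disk", offsite=False),
    BackupCopy("cloud-backup", media="object-storage", offsite=True, immutable=True),
]

report = check_3_2_1(inventory)
compliant = all(report.values())
```

Returning a per-rule dictionary rather than a single boolean makes it clear which part of the rule is failing, for instance when the offsite copy is dropped and three of the four checks fail at once.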
DR Testing
| Test Type | Scope | Frequency |
| --- | --- | --- |
| Tabletop Exercise | Walk through DR plan on paper: discuss scenarios, identify gaps | Quarterly |
| Component Test | Test individual components: restore from backup, failover single service | Monthly |
| Partial Failover | Failover subset of services to DR site and verify functionality | Semi-annually |
| Full Failover | Complete site failover with all services running at DR; the most realistic test | Annually |
| Chaos Engineering | Randomly inject failures in production to test resilience continuously | Continuous (Netflix Chaos Monkey approach) |
| Post-Test Review | Document findings, update DR plan, fix gaps identified during test | After every test |
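The testing cadence above is easy to let slip, so it can help to track it programmatically. The following sketch flags overdue tests, assuming day counts of 31, 91, 183, and 365 for monthly, quarterly, semi-annual, and annual intervals; the test keys and the sample dates are hypothetical.

```python
from datetime import date, timedelta

# Assumed day counts for the frequencies in the table.
INTERVALS = {
    "tabletop": timedelta(days=91),           # quarterly
    "component": timedelta(days=31),          # monthly
    "partial_failover": timedelta(days=183),  # semi-annually
    "full_failover": timedelta(days=365),     # annually
}


def overdue_tests(last_run, today):
    """Return test types that were never run or whose interval has elapsed."""
    overdue = []
    for test, interval in INTERVALS.items():
        last = last_run.get(test)
        if last is None or today - last > interval:
            overdue.append(test)
    return overdue


# Hypothetical history: no full failover has ever been performed.
history = {
    "tabletop": date(2024, 5, 1),
    "component": date(2024, 5, 15),
    "partial_failover": date(2024, 2, 1),
}

pending = overdue_tests(history, today=date(2024, 7, 1))
```

A test that has never been run is treated as overdue, which matches the point that an untested plan cannot be assumed to work.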
Business Continuity Plan (BCP)
| Phase | Action |
| --- | --- |
| 1. Business Impact Analysis | Identify critical systems, calculate cost of downtime per system, define RPO/RTO per system |
| 2. Risk Assessment | Identify threats: natural disaster, cyber attack, hardware failure, human error; score each as probability × impact |
| 3. Strategy Design | Choose DR tier per system based on RPO/RTO and budget; not everything needs active-active |
| 4. Plan Documentation | Step-by-step runbook: who does what, contact lists, escalation, vendor contacts |
| 5. Implementation | Build DR infrastructure, configure replication, automate failover where possible |
| 6. Testing | Regular testing (tabletop → component → partial → full); a DR plan that isn't tested doesn't work |
| 7. Maintenance | Update plan when infrastructure changes: new systems, new staff, new risks |
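The first two phases are largely arithmetic. As a rough sketch, downtime cost is the per-minute cost multiplied by the outage length, and each threat's risk score is probability × impact. The example below reuses the $5,600 per minute average quoted earlier and assumes a 1-5 scoring scale; the individual probability and impact scores are hypothetical placeholders, not assessed values.

```python
def downtime_cost(cost_per_minute, outage_minutes):
    """Cost of an outage: per-minute cost multiplied by its duration."""
    return cost_per_minute * outage_minutes


def rank_threats(threats):
    """Sort threats by risk score (probability x impact), highest first."""
    scored = {name: prob * impact for name, (prob, impact) in threats.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Worst-case exposure if an outage runs for a full 4-hour RTO.
exposure = downtime_cost(cost_per_minute=5600, outage_minutes=4 * 60)

# Hypothetical scores on a 1-5 scale: (probability, impact).
threats = {
    "natural disaster": (1, 5),
    "cyber attack": (3, 5),
    "hardware failure": (4, 3),
    "human error": (4, 2),
}

ranking = rank_threats(threats)
```

With these placeholder scores, a 4-hour outage costs $1,344,000 and cyber attack ranks highest at 15. Numbers like these are what justify, or rule out, a more expensive DR tier for a given system.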
Closing Thoughts: DR = Hope for the Best, Plan for the Worst
Network Disaster Recovery in brief:
RPO/RTO: RPO = maximum acceptable data loss, RTO = maximum acceptable downtime; together they drive the DR architecture design.
Tiers: backup only (days) → pilot light (hours) → warm standby (minutes-hours) → hot standby (minutes) → active-active (zero).
Network DR: WAN diversity, GSLB/DNS failover, mirrored firewall configs, BGP failover, automated config backup.
Backup: 3-2-1 rule (3 copies, 2 media, 1 offsite), immutable backup (ransomware protection), Git for configs.
Testing: tabletop (quarterly) → component (monthly) → partial failover (semi-annual) → full failover (annual).
BCP: BIA → risk assessment → strategy → document runbook → implement → test → maintain.
Key: a DR plan that isn't tested is just a wish. Test regularly, update continuously, and automate everything possible.