Hadoop Backup and Disaster Recovery 101
Any production Hadoop deployment must protect its data from threats. Threats to data integrity can be human-generated (malicious or unintentional) or site-level (power outage, flood, etc.). Once you have identified the threats that apply to you, the next step is to develop a backup or disaster-recovery solution for Hadoop.
In this class, you will learn the unique considerations for Hadoop backup and disaster recovery, as well as how to navigate the common issues that arise when architects and developers look to protect the data.
• How to model your backup/disaster-recovery solution, considering your threat model and specifics around data integrity, business continuity, and load balancing.
• Best practices and recommendations: how Hadoop differs from traditional SAN/DB systems; replication versus "teeing" models for ensuring DR; replication scheduling; Hive; HBase; managing bandwidth; monitoring replication; using the secondary cluster for more than replication; and a survey of existing tools and products that can be used for backup and DR.
After taking this class, you should be able to explain to your organization the right way to implement a backup or disaster-recovery solution for Hadoop.
Level: Intermediate