Some quick thoughts about backup
This is a summary I wrote for someone else rather than my usual blog entry, but it does encapsulate my thoughts on the benefits of NetApp’s implementation of replication-based backup. I’ll try to get to a more technically focussed version soon.
Exponential growth in data, combined with increased storage density techniques, means that traditional “bulk copy” methods of backup are no longer able to address the growing backup challenges of a modern IT environment. Even backup architectures built on an “incremental forever” basis may find that the time it takes to move whole files from remote locations over slow WAN links hits scalability limits as the amount of data at the remote sites increases.
Ultimately, the most promising technology to resolve this involves replicating only changed data blocks from primary data sources to secondary arrays in other physical locations. These secondary arrays are equipped with high-density, low-cost disk drives to provide large amounts of raw capacity in the densest possible footprint. Once the data has been sent to the remote array, the solution can then perform various kinds of data manipulation to store multiple recovery points within a small data storage footprint. This class of technology generally requires that the primary storage is re-hosted on intelligent storage arrays, or that new agents, secondary storage arrays, and backup systems are implemented that support this advanced functionality.
This has the advantage that only changed blocks are moved from primary storage through to the secondary storage on the NearStore. This both reduces the amount of storage that needs to be provisioned and allows the data to be sent over low-bandwidth, high-latency network connections. As a result, secondary copies of the data can be stored offsite automatically, without the separate two-step process commonly required with tape backups.
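To make the changed-block idea concrete, here is a minimal sketch in Python. It is purely illustrative and is not NetApp’s actual SnapVault implementation: it checksums fixed-size blocks against the previous baseline and yields only the blocks that differ, which are the only data that would cross the WAN.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def block_checksums(data: bytes) -> list:
    """Checksum each fixed-size block of a volume image."""
    return [hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
            for i in range(0, len(data), BLOCK_SIZE)]


def changed_blocks(old_sums, new_data):
    """Yield (index, block) pairs for blocks whose checksum differs
    from the previous baseline -- only these need replicating."""
    for i, s in enumerate(block_checksums(new_data)):
        if i >= len(old_sums) or old_sums[i] != s:
            yield i, new_data[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]


# Example: a three-block "volume" where only the middle block changed
baseline = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"C" * BLOCK_SIZE
current  = b"A" * BLOCK_SIZE + b"X" * BLOCK_SIZE + b"C" * BLOCK_SIZE
delta = list(changed_blocks(block_checksums(baseline), current))
# delta holds one block out of three -- the transfer is a third of a full copy
```

In a real array the block map is tracked natively rather than recomputed by reading everything back, which is what keeps the approach efficient at scale.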
With robust data storage architectures, multiple logical recovery points, and offsite copies, these technologies can provide almost all the benefits of a tape-based backup solution.
Customer Backup Considerations
While a number of companies have been adapting their traditional backup engines to leverage the benefits of replication-based storage, NetApp pioneered this technology with the release of the industry’s first ATA-based backup-to-disk appliance in 2002. Since then, NetApp has deployed this technology in thousands of locations worldwide, many of which protect critical data estates measured in petabytes.
Centralised Policy Based Protection
The advanced data protection capabilities provided by NetApp require a correspondingly advanced set of management methodologies and tools to fully exploit the benefits of replication-based data protection. Protection Manager was designed for replication-based backup, and provides an integrated way of managing both backup and disaster recovery in a single pane of glass through the following functions and features:
- Discovery. Detects new unprotected volumes and presents them as “unprotected data” in the Protection Manager UI.
- Policy creation. Creates policies for data protection in a wizard-driven graphical process and then calls lower level NetApp tools for execution of the replication process
- Monitoring. Monitors the whole replication process, watching capacity and performance against policy, and ensures that protection policies do not fall out of compliance
- Visualization. Provides discovery and mapping views including drilldown and management by exception
- Reporting. Offers status and health reporting such as a “data transfer report” to identify transfer amount, performance metrics, and duration of transfer for replication processes
- Virtual machine support. Support through Open Systems SnapVault includes VMware ESX, Microsoft Hyper-V, and Citrix XenServer
- Application integration. Integrates with SharePoint, SQL Server, Microsoft Exchange, Oracle, and SAP via NetApp SnapManager
- DR task automation. Automates tasks, leverages templates, and provides ongoing monitoring with subsequent reporting to those in authority
- DR readiness. Monitors resources for changes that could compromise a disaster recovery and proactively communicates them to administrators for remediation
- One-button failover. Provides continued data access to users, even in the event of a disaster
Unlike typical backup applications, SnapVault always keeps the data in its original usable format, accessible by open industry-standard protocols and methods. Files can be accessed using CIFS, NFS or HTTP, and LUNs can be accessed via iSCSI or Fibre Channel, all without having to restore the data back to the original location (which may destroy good data) or find alternate space to recover the file or data object.
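Because the secondary copy is an ordinary file tree, a single-file restore can be as simple as copying out of a snapshot directory on an NFS-mounted volume. The sketch below simulates that layout with a temporary directory; the `.snapshot` directory name follows NetApp’s convention for exposing point-in-time copies over NFS, while the mount point, snapshot name and file names are hypothetical.

```python
import pathlib
import shutil
import tempfile


def restore_file(mount_point, snapshot, rel_path, dest_dir):
    """Copy a single file out of a point-in-time copy into an
    alternate location, leaving the live data untouched."""
    src = pathlib.Path(mount_point) / ".snapshot" / snapshot / rel_path
    dest = pathlib.Path(dest_dir) / pathlib.Path(rel_path).name
    shutil.copy2(src, dest)
    return dest


# Simulate an NFS-mounted backup volume holding one snapshot
mount = pathlib.Path(tempfile.mkdtemp())
snap = mount / ".snapshot" / "nightly.0"
snap.mkdir(parents=True)
(snap / "report.doc").write_text("yesterday's version")

restored = restore_file(mount, "nightly.0", "report.doc", tempfile.mkdtemp())
```

The point of the sketch is that no backup application or media server sits between the user and the data: any tool that can read a file can perform the restore.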
Ease of Restore
Usable copies also provide a self-service restore capability that reduces recovery times (RTO), decreases helpdesk calls, and increases end-user faith in the backup process. This in turn reduces counterproductive end-user-driven backup strategies and lowers both infrastructure and business costs. Usable copies also allow backups to be verified for correctness, and provide easy ways of performing deep content searches of backups for legal and other data discovery requests.
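Content searching follows the same logic: because the backup is a normal file tree, a discovery request needs no restore step or proprietary reader. A hypothetical sketch using only the standard library, run here against a simulated backup mount:

```python
import pathlib
import tempfile


def search_backup(root, term):
    """Search every readable file under a mounted backup for a term,
    returning relative paths of matches -- no restore step required."""
    hits = []
    for path in sorted(pathlib.Path(root).rglob("*")):
        if path.is_file():
            try:
                if term in path.read_text(errors="ignore"):
                    hits.append(path.relative_to(root))
            except OSError:
                continue  # skip unreadable files rather than abort the search
    return hits


# Demo against a simulated backup mount with two files
root = pathlib.Path(tempfile.mkdtemp())
(root / "finance").mkdir()
(root / "finance" / "q3.txt").write_text("acquisition draft")
(root / "notes.txt").write_text("lunch menu")

matches = search_backup(root, "acquisition")
```

A production discovery tool would add indexing and file-type handling, but the underlying access pattern is just ordinary filesystem reads against the backup copy.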
Because the backup data remains in the same format used for primary storage, the high-speed NDMP-based dump and mirror-to-tape options used by thousands of companies around the world to protect their NetApp primary storage remain available. These long-term archival copies can be sent to tape under the control of traditional backup systems such as NetBackup, TSM or CommVault, leveraging existing know-how and infrastructure while minimising the costs associated with tape management and off-siting.