
Backups in Action

Designing a Comprehensive Backup Strategy for Our Client

As part of our client’s move to AWS, we needed to develop a backup solution that would encompass their entire AWS organisation.

The solution needed to be managed by their AWS operations team, with a way to segregate duties among team members. It also had to be maintainable across all AWS accounts. Different groups of resources and environments required backups on varying schedules.

Beyond meeting specific RPO (Recovery Point Objective) and RTO (Recovery Time Objective) targets, the backups had to be protected and immutable to guard against threats such as bad actors, corrupted files, or account-level disasters.

What was our backup approach?

We devised a centralised backup design that enables rapid recovery from local accounts while providing strong disaster recovery capabilities through a centralised vault. This vault securely stores copies of the backups in case of a disaster and is accessible only to a limited operations team, while local backups remain fully available for immediate restores.

In an organisational setup with multiple accounts per environment, and several policies per account for different resource groups, backup policies are managed from AWS Backup’s delegated admin account and applied to the attached accounts. This approach aligns with AWS best practices for implementing backup strategies.
 

To enhance security, the central backup strategy was divided into two isolated accounts:

  1. Centralised Prod Backup: A backup vault in a dedicated Prod Central Backup account stores copies of all local vault backups from production accounts.
  2. Centralised Non-Prod Backup: A backup vault in a dedicated Non-Prod Central Backup account stores copies of all local vault backups from non-production and lower environment accounts.

Application accounts contain resources with the appropriate backup tags applied, allowing the backup policy to identify and back up resources based on those tags.
Tags are associated with policies in the backup policy account, which serves as the delegated admin account. There is also a shared services account where other shared resources are managed.
 
Backups are taken locally in the application account and then copied to the centralised account once the local backup job completes.
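The flow described above (select resources by tag, back up to a local vault, then copy to a central vault) can be sketched as an AWS Organizations backup policy document. The Python below only builds and prints the JSON; it is a minimal illustration rather than the policy used for the client. The plan name, tag key and value, vault names, role name, region, retention periods, and the account ID 111122223333 are all hypothetical placeholders.

```python
import json


def build_backup_policy(plan_name, tag_key, tag_value, schedule,
                        local_vault, local_days,
                        central_vault_arn, central_days,
                        role_name, regions):
    """Build an AWS Organizations backup policy as a plain dict.

    Every argument value used below is an illustrative assumption.
    """
    rule = {
        "schedule_expression": {"@@assign": schedule},
        # Step 1: the backup lands in the local vault of the application account.
        "target_backup_vault_name": {"@@assign": local_vault},
        "lifecycle": {"delete_after_days": {"@@assign": str(local_days)}},
        # Step 2: after the local job completes, copy to the central vault.
        "copy_actions": {
            central_vault_arn: {
                "target_backup_vault_arn": {"@@assign": central_vault_arn},
                "lifecycle": {
                    "delete_after_days": {"@@assign": str(central_days)}
                },
            }
        },
    }
    selection = {
        # $account is resolved to each member account when the policy applies.
        "iam_role_arn": {
            "@@assign": f"arn:aws:iam::$account:role/{role_name}"
        },
        "tag_key": {"@@assign": tag_key},
        "tag_value": {"@@assign": [tag_value]},
    }
    return {
        "plans": {
            plan_name: {
                "regions": {"@@assign": list(regions)},
                "rules": {f"{plan_name}-rule": rule},
                "selections": {"tags": {f"{plan_name}-selection": selection}},
            }
        }
    }


# Hypothetical values for illustration only.
CENTRAL_VAULT_ARN = (
    "arn:aws:backup:eu-west-2:111122223333:backup-vault:central-prod-vault"
)

policy = build_backup_policy(
    plan_name="daily-35d",
    tag_key="backup",
    tag_value="daily-local35d-central90d",
    schedule="cron(0 2 ? * * *)",
    local_vault="local-vault",
    local_days=35,
    central_vault_arn=CENTRAL_VAULT_ARN,
    central_days=90,
    role_name="backup-service-role",
    regions=["eu-west-2"],
)

print(json.dumps(policy, indent=2))
```

In a setup like the one described, a document of this shape would be created in the delegated admin account and attached to the relevant accounts or organisational units; in practice that would be done through Terraform rather than an ad hoc script.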
 
The tag values are designed so that each one clearly indicates the retention policy to be applied both locally and centrally, ensuring a consistent approach across all tags. The values are human-readable rather than ambiguous codes, so engineers and application teams can easily understand the expected behaviour and backup frequency if a restore is needed.
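To illustrate what a self-describing tag value might look like, the sketch below parses a hypothetical format such as daily-local35d-central90d into its frequency, local retention, and central retention. The naming scheme is an assumption made for this example; the actual tag values used for the client are not shown here.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupTag:
    frequency: str      # how often the backup runs
    local_days: int     # retention in the local vault
    central_days: int   # retention in the central vault


# Hypothetical naming scheme: <frequency>-local<N>d-central<M>d
_TAG_PATTERN = re.compile(
    r"^(hourly|daily|weekly|monthly)-local(\d+)d-central(\d+)d$"
)


def parse_backup_tag(value: str) -> BackupTag:
    """Parse a readable tag value, rejecting anything off-scheme."""
    match = _TAG_PATTERN.match(value)
    if match is None:
        raise ValueError(f"unrecognised backup tag value: {value!r}")
    frequency, local_days, central_days = match.groups()
    return BackupTag(frequency, int(local_days), int(central_days))


tag = parse_backup_tag("daily-local35d-central90d")
print(tag.frequency, tag.local_days, tag.central_days)
```

A validation step like this could run in a pipeline to catch mistyped tag values before they silently leave a resource without a matching policy.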
 
Each application has different backup requirements, and each tag links to its corresponding policy. These policies are attached to accounts as needed and are readily available whenever a new attachment is required.

Vault locks are applied at the central level to protect backups in the central location, whilst application account vaults do not have a vault lock unless one is requested to meet the disaster recovery requirements of the service level agreement.

Local vault locks are applied individually to each application based on business requirements. When RPO and RTO are critical, the local vault may be prioritised due to limitations in the central backup.
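As a rough sketch of how vault lock settings could differ between central and local vaults, the helper below assembles the parameters accepted by the AWS Backup PutBackupVaultLockConfiguration API. It only builds and validates a dictionary and makes no AWS calls. The vault names and retention figures are hypothetical, and the choice of lock mode per vault is an assumption, since the mode used for the client is not stated above.

```python
from typing import Optional


def vault_lock_request(vault_name: str,
                       min_retention_days: int,
                       max_retention_days: Optional[int] = None,
                       changeable_for_days: Optional[int] = None) -> dict:
    """Assemble parameters for PutBackupVaultLockConfiguration.

    Omitting changeable_for_days gives a governance-mode lock, which
    suitably privileged users can still remove. Supplying it gives a
    compliance-mode lock that becomes immutable once the grace period ends.
    """
    if min_retention_days < 1:
        raise ValueError("min_retention_days must be at least 1")
    if max_retention_days is not None and max_retention_days < min_retention_days:
        raise ValueError("max_retention_days must not be below the minimum")
    if changeable_for_days is not None and changeable_for_days < 3:
        # AWS Backup requires a grace period of at least 3 days.
        raise ValueError("changeable_for_days must be 3 or greater")

    request = {
        "BackupVaultName": vault_name,
        "MinRetentionDays": min_retention_days,
    }
    if max_retention_days is not None:
        request["MaxRetentionDays"] = max_retention_days
    if changeable_for_days is not None:
        request["ChangeableForDays"] = changeable_for_days
    return request


# Hypothetical central vault: compliance mode with a 3-day grace period.
central_lock = vault_lock_request("central-prod-vault", 35, 365, 3)

# Hypothetical local vault, locked on request: governance mode.
local_lock = vault_lock_request("local-vault", 7)

print(central_lock)
print(local_lock)
```

With boto3 installed and credentials configured, a dictionary of this shape could be passed as keyword arguments to the Backup client's put_backup_vault_lock_configuration call, although in this solution the equivalent settings would normally be declared in Terraform.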

Segregated Access:

  • Application accounts do not have access to policies in the shared resources account.
  • Application accounts cannot access centralised backup accounts, but the centralised accounts can access application accounts.

What are the benefits of our backup approach?

  • Centralised backup accounts for both higher and lower environments are secured by restricting access.
  • While some AWS resources have their own backup features, these backups are deleted when the resource is removed. In contrast, our backup approach ensures that backups are retained even after the resource is deleted.
  • Vault locks make backups immutable: while a lock is in force, recovery points cannot be deleted and their retention cannot be altered, which protects them from accidental or malicious changes.

Cloud Elemental delivers this solution as Infrastructure as Code (IaC) in Terraform, in line with AWS best practices. Resource deployment is managed via CI/CD pipelines, and policies can be updated through Terraform as needed, keeping all accounts and environments up to date and synchronised. This approach makes onboarding quick and efficient, so the solution can be used almost immediately.
