Please note that this post, first published over a year ago, may now be out of date.
Many SaaS companies want to be indispensable to their customers. They want to be trusted with business-critical tasks and to have valuable information stored on their platform. But with great power comes great responsibility. Customers want to know their data will be there when they need it, and so more requirements are added to RFPs.
Data loss can be a disaster for client relations and your reputation. To protect your business from the risks of data loss you almost certainly take backups, the last line of defence. They are only needed when something else has gone badly wrong. When that happens, will your backups stand up to the pressure?
What do your current backups protect you from?
AWS makes it incredibly easy to create backups for RDS. Automated backups are the default: a daily snapshot, plus transaction logs that keep the restore point up to date between snapshots. Job done? It depends on what risks you are trying to mitigate.
A database fails
If you have automatic snapshots enabled you should be able to restore with only a few minutes of data loss.
An update needs to be rolled back
An upgrade failed and the data needs to be reverted to its previous state. Automatic backups allow you to choose a point in time to recover to. As long as you know when the bad change was made, you can restore to just before it happened.
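As a rough sketch of what "just before that happened" means in practice, the helper below builds the arguments for a point-in-time restore. The instance names, the timestamps and the one-minute safety margin are all hypothetical. The keys match the parameters of boto3's `restore_db_instance_to_point_in_time`, which restores into a new instance rather than overwriting the original.

```python
from datetime import datetime, timedelta, timezone


def build_pitr_request(source_id, target_id, bad_change_at, latest_restorable,
                       safety_margin=timedelta(minutes=1)):
    """Build arguments for a restore to shortly before a bad change."""
    if bad_change_at.tzinfo is None or latest_restorable.tzinfo is None:
        raise ValueError("timestamps must be timezone-aware")
    # Aim slightly before the bad change (the margin is an assumption).
    restore_time = bad_change_at - safety_margin
    # A restore cannot target a moment after the latest restorable time.
    restore_time = min(restore_time, latest_restorable)
    return {
        "SourceDBInstanceIdentifier": source_id,
        "TargetDBInstanceIdentifier": target_id,
        "RestoreTime": restore_time,
    }


# Hypothetical example values.
request = build_pitr_request(
    "orders-db",
    "orders-db-restored",
    bad_change_at=datetime(2024, 3, 1, 14, 30, tzinfo=timezone.utc),
    latest_restorable=datetime(2024, 3, 1, 15, 0, tzinfo=timezone.utc),
)
# With boto3 installed and credentials configured, this could be sent as:
# boto3.client("rds").restore_db_instance_to_point_in_time(**request)
```

The sketch does not check the earliest restorable time, which depends on your backup retention period.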
Someone accidentally deleted a database
This is easier to do than you would expect. Unfortunately, it is also easy to delete the automatic snapshots along with it. In fact, if you use Terraform to manage your databases, this is the default behaviour.
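One way to make deletion less destructive is to be explicit about what happens to the backups. The sketch below builds arguments for boto3's `delete_db_instance` that always take a final snapshot and keep the automated backups unless told otherwise. The helper name, the instance name and the `-final` suffix are hypothetical. In Terraform, the equivalent setting is the `delete_automated_backups` argument on `aws_db_instance`.

```python
def build_delete_request(db_id, keep_automated_backups=True, final_snapshot_id=None):
    """Build arguments for a DB instance deletion that preserves backups."""
    if final_snapshot_id is None:
        # Hypothetical naming convention for the final snapshot.
        final_snapshot_id = f"{db_id}-final"
    return {
        "DBInstanceIdentifier": db_id,
        # Always take one last snapshot before the instance disappears.
        "SkipFinalSnapshot": False,
        "FinalDBSnapshotIdentifier": final_snapshot_id,
        # Retain automated backups unless the caller explicitly opts out.
        "DeleteAutomatedBackups": not keep_automated_backups,
    }


delete_request = build_delete_request("orders-db")
# With boto3 installed and credentials configured, this could be sent as:
# boto3.client("rds").delete_db_instance(**delete_request)
```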
The data has been maliciously deleted or ransomed
If your account is breached your data is a key target. AWS’s number one recommendation for mitigating ransomware is to be able to recover from backups.
If your backups are in the same account as the compromised infrastructure, an attacker has a much better chance of deleting them or otherwise stopping you from restoring.
Your primary region is down and you need to fail over to another one
Failing over to a different region is complex, and the investment required puts it out of reach for most businesses. If you have made that investment and are planning to restore from backup to a different region, you may be surprised to learn that AWS Backup doesn’t copy incremental backups to different regions. You will be restoring from the snapshot taken last night and losing any changes to data since then.
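To put a number on "losing any changes to data since then", the sketch below computes the data-loss window for a restore from the last copied snapshot and compares it with a target. The timestamps and the one-hour recovery point objective are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def data_loss_window(failure_at, last_copied_snapshot_at):
    """Return how much data history is lost by restoring from the snapshot."""
    if last_copied_snapshot_at > failure_at:
        raise ValueError("snapshot cannot be newer than the failure")
    return failure_at - last_copied_snapshot_at


# Hypothetical example: nightly snapshot at 02:00, region failure at 17:30.
snapshot_at = datetime(2024, 3, 1, 2, 0, tzinfo=timezone.utc)
failure_at = datetime(2024, 3, 1, 17, 30, tzinfo=timezone.utc)

lost = data_loss_window(failure_at, snapshot_at)
# Hypothetical recovery point objective of one hour.
meets_rpo = lost <= timedelta(hours=1)
```

Here the window is fifteen and a half hours, so a one-hour objective would not be met by nightly cross-region copies alone.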
Defence in depth
This is a common strategy in a security setting. Having multiple levels of protection around critical infrastructure means that if a breach does occur, the damage is limited. Requiring multi-factor authentication, granting least-privilege access and segmenting your network are all common examples.
Backups are much less likely to be defended in depth. The vast majority of backups we see are:
- In the same region as the original
- In the same account as the original
- Managed by the same team as the original
- Using the same deletion protection settings as the original
Unfortunately this means there is a good chance that whatever caused a problem with the original database will affect the integrity of the backup as well.
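The first two items on that list can be checked mechanically, because an ARN has the form `arn:partition:service:region:account-id:resource`. The sketch below reports which of the two a backup shares with its source. The ARNs are made-up examples and the function name is hypothetical.

```python
def shared_failure_domains(source_arn, backup_arn):
    """List which of region and account a backup shares with its source."""
    def parse(arn):
        # arn:partition:service:region:account-id:resource
        parts = arn.split(":", 5)
        if len(parts) != 6 or parts[0] != "arn":
            raise ValueError(f"not an ARN: {arn!r}")
        return {"region": parts[3], "account": parts[4]}

    source, backup = parse(source_arn), parse(backup_arn)
    return [key for key in ("region", "account") if source[key] == backup[key]]


# Made-up ARNs for illustration.
database = "arn:aws:rds:eu-west-1:111111111111:db:orders-db"
local_snapshot = "arn:aws:rds:eu-west-1:111111111111:snapshot:orders-db-nightly"
isolated_copy = "arn:aws:backup:eu-west-2:222222222222:recovery-point:example"

same_place = shared_failure_domains(database, local_snapshot)
elsewhere = shared_failure_domains(database, isolated_copy)
```

An empty result means the backup lives in a different region and a different account. Who manages it and how it is protected from deletion still need a human review.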
Layers of protection
To add more layers of protection to your customers’ data, consider:
Store backups in a different account
Copy your backups to a different account and tighten security controls. The new account provides two benefits:
- The tighter controls don’t add friction to your team’s work in the primary account.
- If the first account is breached the backups are still available for restore.
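If you use AWS Backup, a cross-account copy can be expressed as a copy action on a backup plan rule. The sketch below builds a plan document in the shape accepted by boto3's `create_backup_plan`. The plan name, vault names, account ID, schedule and 35-day retention are all hypothetical, and the sketch assumes cross-account backup is already enabled in your organisation and that the destination vault allows the copy.

```python
def build_backup_plan(plan_name, local_vault, destination_vault_arn, retain_days=35):
    """Build a backup plan with one daily rule that copies to another vault."""
    lifecycle = {"DeleteAfterDays": retain_days}
    return {
        "BackupPlanName": plan_name,
        "Rules": [
            {
                "RuleName": "daily-with-cross-account-copy",
                "TargetBackupVaultName": local_vault,
                # Every day at 02:00 UTC (an arbitrary example schedule).
                "ScheduleExpression": "cron(0 2 * * ? *)",
                "Lifecycle": dict(lifecycle),
                # Copy each recovery point to a vault in the isolated account.
                "CopyActions": [
                    {
                        "DestinationBackupVaultArn": destination_vault_arn,
                        "Lifecycle": dict(lifecycle),
                    }
                ],
            }
        ],
    }


# Hypothetical names and account ID.
plan = build_backup_plan(
    "orders-daily",
    "primary-vault",
    "arn:aws:backup:eu-west-1:222222222222:backup-vault:isolated-vault",
)
# With boto3 installed and credentials configured, this could be sent as:
# boto3.client("backup").create_backup_plan(BackupPlan=plan)
```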
Set organisation-wide restrictions on deleting backups
If your business uses AWS Organizations to manage your accounts, you can set policies to deny the deletion of any backups. This will protect your individual accounts from both accidental and malicious deletion. This kind of restriction protects you even if the root user of one of those accounts is compromised.
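A minimal sketch of such a policy, assuming service control policies are the mechanism, is shown below. The list of denied actions is illustrative rather than exhaustive, and the statement ID is hypothetical. A real policy would probably also need a carefully guarded exception so that someone can still clean up old backups.

```python
import json


def deny_backup_deletion_policy():
    """Build a policy document that denies common backup deletion actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyBackupDeletion",
                "Effect": "Deny",
                # Illustrative selection of actions, not a complete list.
                "Action": [
                    "backup:DeleteRecoveryPoint",
                    "backup:DeleteBackupVault",
                    "rds:DeleteDBSnapshot",
                    "rds:DeleteDBClusterSnapshot",
                    "rds:DeleteDBInstanceAutomatedBackup",
                ],
                "Resource": "*",
            }
        ],
    }


policy_document = json.dumps(deny_backup_deletion_policy(), indent=2)
# With boto3 installed and credentials configured, the JSON could be passed as
# the Content argument of boto3.client("organizations").create_policy(...)
# with Type="SERVICE_CONTROL_POLICY", then attached to an OU or account.
```

Service control policies do not restrict the organisation's management account, so backups stored there would need separate protection.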
Review your configuration
Is deletion protection enabled? If you are replicating to other accounts, are you doing this continuously or only once a day? Will you be notified if there is a failure?
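Some of these questions can be turned into an automated check. The sketch below reviews one instance description of the kind returned by boto3's `describe_db_instances`, using its `DeletionProtection` and `BackupRetentionPeriod` fields. The seven-day minimum and the sample instance are hypothetical.

```python
def review_db_instance(instance, min_retention_days=7):
    """Return a list of backup-related findings for one DB instance."""
    findings = []
    if not instance.get("DeletionProtection", False):
        findings.append("deletion protection is disabled")
    retention = instance.get("BackupRetentionPeriod", 0)
    if retention == 0:
        # A retention period of zero means automated backups are turned off.
        findings.append("automated backups are disabled")
    elif retention < min_retention_days:
        findings.append(
            f"backup retention is {retention} days, "
            f"below the {min_retention_days}-day minimum"
        )
    return findings


# Hypothetical instance description.
sample = {
    "DBInstanceIdentifier": "orders-db",
    "DeletionProtection": False,
    "BackupRetentionPeriod": 1,
}
findings = review_db_instance(sample)
```

Replication frequency and failure notifications are configured elsewhere, so this check covers only part of the review.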
I’ve used RDS and snapshots as an example, but you may have critical data in other services, such as DynamoDB or S3. Do the backup mechanisms for those services meet your requirements?
Running a game day
Many backups are never restored until the day they are needed. By that point it is too late to discover that they aren’t fit for purpose. Have your team practise restoring from backups to work out the difficulties ahead of time.
Don’t assume you’re protected until you have tested a solution. Some behaviour isn’t intuitive and what works for RDS won’t necessarily work for S3 or DynamoDB.
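A game day is easier to learn from if the result is recorded. The sketch below times a restore drill and checks it against a recovery time objective. The `restore` and `verify` callables are hypothetical placeholders for your own procedure, and the four-hour objective is an arbitrary example.

```python
import time
from datetime import timedelta


def run_restore_drill(restore, verify, rto):
    """Time a restore, verify the restored data, and compare against the RTO."""
    started = time.monotonic()
    restore()           # placeholder: perform the actual restore
    data_ok = verify()  # placeholder: check the restored data is usable
    elapsed = timedelta(seconds=time.monotonic() - started)
    return {"elapsed": elapsed, "data_ok": data_ok, "within_rto": elapsed <= rto}


# Stand-in callables so the sketch runs without touching any infrastructure.
result = run_restore_drill(
    restore=lambda: None,
    verify=lambda: True,
    rto=timedelta(hours=4),
)
```

Running the same drill separately for RDS, S3 and DynamoDB shows where the restore behaviour differs between services.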
Getting what you need from backups
Taking backups isn’t binary. Different strategies protect you from different risks. If all you care about is recovering from failure you get decent backups out of the box.
Preventing accidents requires a little more effort but can usually be done within the services you are already using. Stopping malicious actors or doing anything cross-region will probably require other services, such as AWS Backup and Service Control Policies.
Decide what level of protection is right for your business and check your backups meet that standard.
We’ve seen it all at The Scale Factory, and our expert consultants have helped guide numerous companies through their backup and disaster recovery journeys. Whether you need help setting up baseline tooling like AWS Backup, or want a holistic assessment of your disaster recovery procedures to ensure they are fit for purpose, we can help. Book in your free disaster recovery healthcheck today.
This blog is written exclusively by The Scale Factory team. We do not accept external contributions.