Whether you are already using AWS or not, there’s a fair chance at some point you will think “I have this database, but I need it over here now”, whether that’s from on-premises to cloud, between cloud providers, or from AWS account A to AWS account B.
It’s no surprise that AWS has a service for this, and unlike some AWS services (I’m looking at you, Elastic Beanstalk), it has a sensible name: AWS DMS (Database Migration Service). Today we’ll talk through how it works, what works well, what sort of works, and what could be a lot nicer to use.
What does DMS offer?
DMS offers a managed migration service across a range of source and target endpoints. Primarily, DMS aims to migrate you onto AWS, but it will also support any homogeneous data migration. Importantly for our case, it supports migrations onto many AWS services including RDS Aurora, DynamoDB, Redshift, and even Kinesis Data Streams.
DMS can perform 3 types of tasks:
- Full load - A one-time load from source to target,
- Full load and CDC (Change Data Capture) - A full load followed by ongoing incremental data capture,
- CDC only - Incremental data capture only.
This is important because DMS can allow you to run your source and target databases with the same data in parallel, then switch over your applications with little to no downtime and with an immediate rollback option available.
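To make the three task types concrete, here is a minimal sketch of how they map onto the `MigrationType` values accepted by the DMS `create_replication_task` API call. The `create_task` helper and its argument names are our own illustration, not part of any AWS library; it accepts any boto3-style DMS client so you can try it without touching a real account.

```python
from enum import Enum


class MigrationType(str, Enum):
    """The three DMS task types, using the string values the DMS API expects."""
    FULL_LOAD = "full-load"
    FULL_LOAD_AND_CDC = "full-load-and-cdc"
    CDC_ONLY = "cdc"


def create_task(dms_client, name, source_arn, target_arn, instance_arn,
                migration_type, table_mappings_json):
    """Create a replication task (hypothetical helper, for illustration only).

    dms_client is assumed to behave like boto3.client("dms").
    """
    return dms_client.create_replication_task(
        ReplicationTaskIdentifier=name,
        SourceEndpointArn=source_arn,
        TargetEndpointArn=target_arn,
        ReplicationInstanceArn=instance_arn,
        # Raises ValueError early if an unknown task type is passed in.
        MigrationType=MigrationType(migration_type).value,
        TableMappings=table_mappings_json,
    )
```

With real credentials you would pass `boto3.client("dms")` as the first argument, along with the ARNs of your own endpoints and replication instance.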
DMS also allows you to configure table mappings, letting you apply transformations to your data as you migrate such as adding, removing, or renaming tables and columns.
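As a rough sketch of what a table mapping looks like, the snippet below builds a small mapping document with one selection rule and two transformation rules. The helper functions, and the schema, table, and column names, are made up for illustration; the rule keys follow the DMS table mapping JSON format as we understand it, so do check them against the current AWS documentation before relying on them.

```python
import json


def selection_rule(rule_id, schema, table="%"):
    """Include every table matching the schema/table pattern (% is a wildcard)."""
    return {
        "rule-type": "selection",
        "rule-id": str(rule_id),
        "rule-name": f"select-{rule_id}",
        "object-locator": {"schema-name": schema, "table-name": table},
        "rule-action": "include",
    }


def rename_table_rule(rule_id, schema, old_name, new_name):
    """Rename a table as it is written to the target."""
    return {
        "rule-type": "transformation",
        "rule-id": str(rule_id),
        "rule-name": f"rename-{rule_id}",
        "rule-target": "table",
        "object-locator": {"schema-name": schema, "table-name": old_name},
        "rule-action": "rename",
        "value": new_name,
    }


def remove_column_rule(rule_id, schema, table, column):
    """Drop a column from the migrated table."""
    return {
        "rule-type": "transformation",
        "rule-id": str(rule_id),
        "rule-name": f"remove-{rule_id}",
        "rule-target": "column",
        "object-locator": {
            "schema-name": schema,
            "table-name": table,
            "column-name": column,
        },
        "rule-action": "remove-column",
    }


# Hypothetical schema, table, and column names.
table_mappings = json.dumps(
    {
        "rules": [
            selection_rule(1, "shop"),
            rename_table_rule(2, "shop", "cust", "customers"),
            remove_column_rule(3, "shop", "orders", "legacy_flag"),
        ]
    },
    indent=2,
)
print(table_mappings)
```

The resulting string is what you would paste into the console's JSON editor or pass as the table mappings when creating a task.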
What doesn’t DMS offer?
DMS is primarily focused on migrating data and table structures; it will not migrate indexes, triggers, stored procedures, and other database objects. For these, AWS offers the Schema Conversion Tool, or you can use any other third-party database change management utility. If you don’t have such tooling at your disposal, it is worth setting it up and it’s something The Scale Factory can help you with.
How does DMS work?
At the core of your DMS process is the replication instance. This is the compute resource that will connect to the source and target endpoints to run the task that we’ll define. Sizing this instance may turn out to be a challenge. As we’ll see later, DMS offers a lot of customisation around how your task runs that can impact the instance size required.
Alternatively, you can choose to use serverless replication which tries to simplify this issue with automatic provisioning and scaling, but at the cost of exposing yourself to other limitations. It’s also worth noting that serverless instances cannot autoscale down during a full load.
DMS Configuration
Here’s where the “fun” begins.
As previously mentioned, DMS offers a lot of customisation for your task. This includes parallelisation, logging, change application, data validation, error handling and so on. You can configure this in the AWS console or via the CLI, but either way you’ll soon find out it’s all one single JSON file.
On one hand, I love that there’s such detail available as some AWS services abstract this too far away. On the other hand, it can be daunting for first-time DMS users or those without extensive database knowledge. Unfortunately each database is different from the next, so you’re going to need to configure this yourself case by case. Or, for that matter, reach out to The Scale Factory for support.
I highly recommend reading through the options and taking note of which you think could impact you as this will save time later when you run into error logs (but remember you need to configure this logging yourself!).
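To give a feel for that single JSON document, here is a small sketch that generates a partial task settings file covering parallelisation, logging, validation, and error handling. The `build_task_settings` function is our own helper and the values are illustrative rather than recommendations; the real document has many more sections, and DMS falls back to defaults for anything you leave out.

```python
import json


def build_task_settings(max_parallel_tables=8, enable_validation=True):
    """Return a partial DMS task settings document as a JSON string.

    Hypothetical helper: the chosen values are examples only.
    """
    log_components = ["SOURCE_UNLOAD", "TARGET_LOAD", "SOURCE_CAPTURE", "TARGET_APPLY"]
    settings = {
        "FullLoadSettings": {
            # Leave existing target tables alone rather than dropping them.
            "TargetTablePrepMode": "DO_NOTHING",
            # How many tables DMS loads in parallel during the full load.
            "MaxFullLoadSubTasks": max_parallel_tables,
        },
        "Logging": {
            # Logging is off unless you switch it on.
            "EnableLogging": True,
            "LogComponents": [
                {"Id": component, "Severity": "LOGGER_SEVERITY_DEFAULT"}
                for component in log_components
            ],
        },
        "ValidationSettings": {"EnableValidation": enable_validation},
        "ErrorBehavior": {
            "DataErrorPolicy": "LOG_ERROR",
            "TableErrorPolicy": "SUSPEND_TABLE",
        },
    }
    return json.dumps(settings, indent=2)


print(build_task_settings(max_parallel_tables=4))
```

Keeping a generator like this in version control makes it easier to review changes to the settings between test runs.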
Trying it out
Before running a migration task, DMS recommends performing a pre-migration assessment. This checks the source and target databases for potential issues like permissions, timeouts, column types, and primary keys. While a failed assessment won’t block you starting the task, it’s a good way to identify problems early with minimal impact on your source database.
Once you start your task, you can pause or restart it at any time. In the console, you’re provided with a useful live reading of its progress on each identified table. This also applies to your Change Data Capture tasks, so you can see how many changes have been captured. As with most AWS services, there are some helpful CloudWatch metrics too.
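The same per-table progress is available outside the console. As a sketch, the function below condenses a response shaped like the one returned by the DMS `describe_table_statistics` API call into one row per table. The `summarise_tables` name and the sample data are ours, and the sketch ignores pagination.

```python
def summarise_tables(response):
    """Condense a describe_table_statistics-style response.

    Returns a list of (table, state, full_load_rows, cdc_changes) tuples.
    Hypothetical helper; only the first page of results is considered.
    """
    rows = []
    for stat in response.get("TableStatistics", []):
        # CDC changes applied so far: inserts + updates + deletes.
        changes = (
            stat.get("Inserts", 0)
            + stat.get("Updates", 0)
            + stat.get("Deletes", 0)
        )
        rows.append(
            (
                f'{stat["SchemaName"]}.{stat["TableName"]}',
                stat.get("TableState", "unknown"),
                stat.get("FullLoadRows", 0),
                changes,
            )
        )
    return rows


# Made-up sample data in the shape of the API response.
sample = {
    "TableStatistics": [
        {
            "SchemaName": "shop",
            "TableName": "orders",
            "TableState": "Table completed",
            "FullLoadRows": 1200,
            "Inserts": 5,
            "Updates": 2,
            "Deletes": 1,
        }
    ]
}
for row in summarise_tables(sample):
    print(row)
```

Against a real task you would pass in `boto3.client("dms").describe_table_statistics(ReplicationTaskArn=task_arn)`, where `task_arn` is the ARN of your own task.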
Conclusion and alternatives
DMS offers a flexible but very configuration-heavy service, with features like full load, incremental change data capture, and data transformation.
However, DMS has limitations. It does not migrate database objects like indexes and triggers, and the configuration to set up a DMS task can be complex.
Ultimately, DMS is a useful service, but it may not be the best fit for every database migration project. Depending on the specific requirements, alternative approaches may be more appropriate when you don’t require the advanced features of DMS, such as ongoing replication or data transformation.
- AWS Snapshots: For migrations between AWS accounts (e.g. between different RDS instances), AWS Snapshots can be a simpler and cost-effective solution. Snapshots allow you to create a point-in-time backup of your database, which can then be restored to a new instance.
- Native backup/restore tools (e.g. pg_dump or mysqldump): These tools allow you to create a backup of your database, which can then be applied to a new instance. This approach is simpler to set up and may be suitable for smaller-sized databases.
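For comparison, the native route for PostgreSQL can be as small as two commands. The sketch below only builds the `pg_dump` and `pg_restore` argument lists so you can inspect them before running anything; the function names, connection strings, and file path are placeholders of our own.

```python
import shlex


def dump_command(source_dsn, dump_path):
    """Build a pg_dump command that writes a custom-format archive."""
    return [
        "pg_dump",
        "--format=custom",
        f"--file={dump_path}",
        f"--dbname={source_dsn}",
    ]


def restore_command(target_dsn, dump_path):
    """Build a pg_restore command that loads the archive into the target."""
    return [
        "pg_restore",
        "--no-owner",  # do not try to recreate the original object ownership
        f"--dbname={target_dsn}",
        dump_path,
    ]


# Placeholder connection strings and path.
dump = dump_command("postgresql://old-host/shop", "/tmp/shop.dump")
restore = restore_command("postgresql://new-host/shop", "/tmp/shop.dump")
print(shlex.join(dump))
print(shlex.join(restore))
```

You could run each list with `subprocess.run(cmd, check=True)`. Bear in mind the source keeps accepting writes while you dump and restore, which is exactly the gap that DMS and CDC are designed to close.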
When deciding between DMS and these approaches, consider the complexity of your databases, the need for ongoing replication or data transformation, availability requirements, and the technical expertise available. Understanding and evaluating trade-offs can help you choose the best solution for your database migration.
Our team knows how to manage the risks around moving data to the cloud. We also know how risky it can be if you don’t have a cloud copy of your critical data. Book a free chat to find out how we can help.
This blog is written exclusively by The Scale Factory team. We do not accept external contributions.