Database cloud migrations.

Sergey Barinov
7 min readMar 5, 2021

The Good the Bad and The Ugly.

Originally posted on Linked-In

Introduction

In the current market of cloud providers, a number of firms are offering Relational Database Management Systems (RDBMS) managed instances as a service based on: Oracle, Microsoft SQL Server, MySQL, and Postgres. Where the smaller firms usually differentiate themselves by offering additional automation/abstraction by leveraging offerings from larger firms such as Amazon and Microsoft.

The landscape of the managed database providers is also filled with provider specific SQL and NoSQL offerings that are fully ACID compliant and transitionally safe, such as Azure Cosmos DB and AWS Dynamo DB.

Not to mention enterprises such as Cockroach Labs with a viable product called cockroachdb, of whom I learned during the March 2019 NYC Postgres Conference. Cockroachdb is a massively scalable solution claiming to power companies such as: Comcast, Baidu, and MetroAG. And in my opinion the product called cockroachdb based on PostgreSQL immediately deserves special attention based on the name and database architecture alone!

In short, going to the cloud is confusing, expensive, and requires allot of preparation. As always, “The devil is in the detail”, and migrating on premises RDBMS databases into cloud managed instances can still be a viable option. To maximize the success of the migration, everyone involved in the migration process including: project managers, application developers, systems engineers, database administrators, and C-level executives for themselves must

  • Clearly articulate the reason to go to the cloud
  • Quantify migration benefits/limitations/caveats, and
  • Define a clean plan of action for each use case.

For the purpose of simplifying this article two Relational Database Services will be referenced.

  1. Amazon Web Services (referred to below as AWS), and
  2. Microsoft Azure — Singleton Databases, Elastic Pools, and Managed Instances (referred to below as Azure).

It is worthwhile to highlight a key attribute of a managed service, which is an abstraction of the underlying hardware and the operating system from the RDBMS service provided. Commonly referred to as a “PaaS offering”

THE GOOD

HA and DR

Both AWS and Azure managed database services offer high availability options with redundant hardware as well as built-in disaster redundancy with automatic database backup and restore out of the box.

Ease of Scale

With just a few clicks of the mouse and little knowledge of the underlying RDBMS one can deploy highly redundant, highly available multi-region databases with SSL and in rest data encryption support.

  • Both AWS and Azure offer to easily scale CPU and RAM for the managed instanced, as well offer sideways scaling by leveraging read replicas.

Ease of Automation

Server-less services such as AWS Lambda’s and Azure Functions combined with native CLIs and APIs aid in the automation of the database deployments.

  • For advanced automation enthusiasts cloud deployment integration is available using open source technologies such as Terraform , Ansible and others: Chef and Puppet to name a few.

Migration Tools

Both Azure and AWS provide tools, services, and guides to analyze on premises database migration compatibility to the cloud.

Transparent Cost

The cost of a managed instances can be easily calculated using AWS Simple Monthly Calculator , Azure’s Pricing Calculator

  • Managed instances are offered in a CPU/RAM cookie cutter stamp
  • In general, for RDBMS instances one would prefer higher CPU per RAM ratio due to inherit penalty of fetching data from disk.
  • The “cookie cutter” deployment is also highly beneficial, because I firmly believe that providing a small amount of good options and services are magnitudes better than providing stakeholders/clients with a plethora of options.
  • Limited options also remove stakeholder confusion, and help to “bake-in” guard rails which prevent stakeholders from torpedoing themselves.

THE BAD

Scaling is not 100% transparent … out of the box … yet.

Scaling up RDBMS instances by increasing instance’s CPU/RAM in most cases is not seamless.

  • For example, one of the ways to upgrade your Postgres Aurora AWS RDS db.r5.2xlarge managed instance (8 CPUs 64 GBs of RAM) to the next level which is db.r5.4xlarge (16 CPU 64 GB RAM ) is to first ensure you have a db.r5.4xlarge replica in your cluster and then failover to it.
  • Make sure you understand the failover mechanism for each managed RDBMS.

Understand your use case and the underlying RDBMS

  • Although the managed instances come tuned out of the box by the cloud provider, you may need to tune it further, requiring in-depth understanding of the underlying RDBMS technology and your specific use case.
  • Knowing if your database is read or write heavy will influence the choice of disk volumes for your deployment.
  • RDBMS Monitoring is abstracted and becomes cloud provider specific

At its core, managed RDBMS instances offer an endpoint abstracted from the host and the operating system making monitoring — provider specific.

Native monitoring metrics may come at a cost

  • You may be charged directly for querying the metrics
  • You may need to integrate monitoring into your existing analytics ecosystems.
  • Third-party monitoring vendors also have difficulty penetrating the managed instance vail to get at the OS level metrics, which reduces their effectiveness

Fully understand your total cost!

Understanding the cost of the complete database migration is paramount, and getting specifications for desired managed instances may not be enough.

  • Ensure to price in both development and production environments at peak times of utilization.
  • You may gain significant savings by pre-purchasing managed instances for 1- 3 years, but keep in mind vendor lock in when pre-purchasing instances.

Price the data flow cost

In most cases data flowing out of cloud providers will incur additional cost per GB.

Fully understand network dependencies

Analyze ahead of time implication of deploying the database to the cloud and keeping the application on premises including any imposed firewall restrictions.

  • For larger enterprises dedicated network links between your existing data centers and the cloud provider may be required in order to reduce latency and to ensure overall stability.

“Put the money where your mouth is”, and fully engage your resources for the migration

  • Complicated database migrations inside large enterprises should be choreographed by project managers.

Facilitate inter department communication between:

  • Application owners
  • System engineers
  • Database administrators.

Conduct stand up meetings, and hold everyone in the migration chain accountable.

  • Remember that the weakest link in your plan may fail your migration or impede your progress
  • Dedicate domain experts in the underlying RDBMS and the Cloud Provider of choice to ensure a successful migration.

THE UGLY

Cloud migrations can turn very ugly. To avoid the ugliness as much as possible:

Keep the initial migrations simple

  • Any desire to convert RDBMS vendor as part of the migration effort creates allot of complexity, best to avoid this whenever possible.
  • Most of the RDBMS deployment scenarios power an application or a process, which may have embedded SQL code, masking the migration complexity.
  • This article assumes an application source control repository. If you do not have an application source control repository, please stop everything and obtain one.

Prior to the migration ensure application source code which references the database to be migrated is:

  1. Properly inventoried
  2. Has correct Connection Strings taking advantage of the HA/DR
  3. There is a dedicated engineer(s) to support the migration

Do your due diligence

Don’t be naïve, and validate database performance with realistic workflows.

  • Be especially mindful when your sales rep insists that your managed instance will see multiple performance boost and promises lower managed overhead vs existing on premises deployments.
  • Document the database workflows, and understand how the data is consumed upstream and downstream.
  • Accept the critical nature of early testing and do not leave it to the very end, because if you do, you may see allot of ugly in terms of cost for expensive instances and wasted engineering hours spent to fine tune or worse rewrite database workflows.

Understand RDBMS Vendor License

Failure to do may result in financial peril by being out of compliance with the vendor.

  • For-profit developers and vendors expect to get paid for their products and services and will actively protect their interests.
  • Fully understand specific RDBMS vendor caveats when it comes to licensing instances in the cloud.
  • Be adept at reading vendor fine print as it relates to towards PaaS and IaaS conversions

In general RDBMS licenses can be purchased

  • Per CPU with a virtual/physical distinction
  • Per connection
  • And other conditions…

Be especially mindful of cloud provider specific caveats for managed instances — especially when choosing to deploy to IaaS (or Virtual Machine) to host databases.

In Conclusion

Whether you have a few of a few thousand databases, it is in the best interest of your enterprise to consider leveraging cloud based managed instances.

I hope for this article to serve as a high-level guide for those who are willing to brave the waters of database cloud migrations.

Do remember there is no one way to migrate a database and to consider all available informed options prior to making the decision to go to the cloud.

And should you do find yourself lost in the sea of database migrations remind yourself that you are not the first to take on this venture and consider outsourcing the migration to experienced engineers and firms.

Happy Migrations!

Sergey Barinov

--

--

Sergey Barinov
0 Followers

Full Stack Engineer | DBMS Specialist | Entrepreneur