Near-zero Downtime Scaling in Azure Database for PostgreSQL Flexible Server

kabharati · ‎Nov 15 2023

In the fast-paced world of technology, businesses rely heavily on scalable server solutions to meet their ever-changing demands. One critical aspect of server management is the ability to modify storage and compute tiers without causing disruptive downtime. Traditional scaling methods often involve several minutes of downtime, which disrupts customer applications. The Near-zero Downtime Scaling feature changes this, significantly reducing downtime and improving the overall availability of your flexible server workloads.

We are excited to announce that Near-zero Downtime Scaling for Azure Database for PostgreSQL - Flexible Server is now generally available in all Azure regions. In this blog post, we look at the benefits and functionality of this feature and at how it lets you perform both compute and storage scaling operations for your Azure PostgreSQL databases with less than 30 seconds of downtime.

Understanding Near-zero Downtime Scaling

Near-zero Downtime Scaling is a feature designed to minimize disruption when you modify storage and compute tiers. When you make an adjustment, such as changing the number of vCores or the compute tier, the server restarts to apply the new configuration. During this transition, new connections cannot be established. With traditional scaling methods, this window could last anywhere from 2 to 10 minutes; with Near-zero Downtime Scaling, it is reduced to less than 30 seconds.

This decrease in downtime matters for businesses, because critical operations can continue with only a brief interruption during server modifications. What makes this feature even more appealing is that it is enabled across all public regions and requires no additional action from customers.

How Near-zero Downtime Scaling Works

Let us explore how this feature works using the screenshot below. In the command window located at the top left, we are initiating a compute scaling operation by upgrading the SKU from 2 cores to 4 cores. Simultaneously, in the window below, a Python script is running in a continuous loop, fetching the current time, server address, and a specific column from the table every second.
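The script from the screenshot is not reproduced in this post, so the following is only a minimal sketch of what such a probe loop could look like. It assumes the third-party psycopg2 driver, and the host name, user, table (`probe_table`) and column (`note`) are hypothetical placeholders. It opens a fresh connection on every attempt, because new connections are what the scaling operation blocks.

```python
import datetime
import time
from typing import Any, Callable, Iterator, Optional, Tuple

Sample = Tuple[datetime.datetime, bool, Any]


def probe(
    connect: Callable[[], Any],
    query: str,
    iterations: Optional[int] = None,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Sample]:
    """Yield (timestamp, ok, detail) once per interval.

    `connect` is any callable returning a DB-API connection. With
    iterations=None the loop runs until interrupted.
    """
    count = 0
    while iterations is None or count < iterations:
        now = datetime.datetime.now()
        try:
            conn = connect()
            try:
                cur = conn.cursor()
                cur.execute(query)
                sample = (now, True, cur.fetchone())
            finally:
                conn.close()
        except Exception as exc:  # any failure counts as a failed sample
            sample = (now, False, exc)
        yield sample
        count += 1
        sleep(interval)


def run_against_server() -> None:
    """Hypothetical entry point; requires psycopg2 to be installed."""
    import psycopg2

    # Placeholder host, user, table and column names (assumptions).
    # libpq reads the password from the PGPASSWORD environment variable.
    dsn = (
        "host=myserver.postgres.database.azure.com "
        "dbname=postgres user=myadmin sslmode=require"
    )
    # inet_server_addr() returns the address of the server that answered.
    query = "SELECT inet_server_addr(), note FROM probe_table LIMIT 1"
    connect = lambda: psycopg2.connect(dsn, connect_timeout=2)
    for ts, ok, detail in probe(connect, query):
        print(ts.isoformat(timespec="seconds"), "OK" if ok else "FAILED", detail)
```

Calling `run_against_server()` starts the loop; stop it with Ctrl+C once the scaling operation has finished. The short `connect_timeout` keeps a failed attempt from stalling the loop for much longer than the sampling interval.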

During the scaling-up process, a new virtual machine (VM) is provisioned, followed by a failover and subsequent recovery. In the screenshot on the right, you can see that connections are interrupted for approximately 23 seconds during the upgrade and failover. In this phase, the old VM with the previous configuration is replaced by a new VM with the upgraded configuration. This is a great improvement over the regular scaling process, where the downtime can last up to 10 minutes.
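An interruption like this can be measured from a list of per-second connection results. The sketch below is one way to do that, not the method used for the screenshot: it takes (timestamp, ok) pairs and returns the longest stretch between the first failed attempt and the next successful one. The sample data is synthetic and only illustrates the calculation.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple


def longest_outage(samples: Iterable[Tuple[datetime, bool]]) -> timedelta:
    """Return the longest gap from a first failure to the next success.

    An outage still open at the end of the samples is not counted,
    because its end time is unknown.
    """
    longest = timedelta(0)
    outage_start: Optional[datetime] = None
    for ts, ok in samples:
        if not ok and outage_start is None:
            outage_start = ts  # first failed attempt of a new outage
        elif ok and outage_start is not None:
            longest = max(longest, ts - outage_start)
            outage_start = None
    return longest


# Synthetic example (not measured data): one sample per second,
# with failed attempts from t=10s through t=32s.
start = datetime(2023, 1, 1, 12, 0, 0)
samples = [(start + timedelta(seconds=s), not (10 <= s <= 32)) for s in range(40)]
print(longest_outage(samples))  # 0:00:23
```

With a one-second sampling interval the result is accurate to about one second on either side, which is enough to tell a sub-30-second interruption from a multi-minute one.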

Limitations and Considerations

While Near Zero Downtime Scaling is a game-changer for server management, there are certain limitations to be aware of:

Regional Capacity Constraints and Quota Limits: Near Zero Downtime Scaling will not work if there are regional capacity constraints or quota limits on customer subscriptions. It's essential to monitor these limits to ensure seamless scaling operations.
Replica Servers: This feature is not supported on replica servers, only on the source server. Replica servers automatically go through the regular scaling process.
VNet-injected Servers: Near Zero Downtime Scaling won't work if a VNet-injected server's delegated subnet does not have enough usable IP addresses. For standalone servers, additional IP addresses are necessary, and for HA-enabled servers, two extra IP addresses are required.
HA-enabled Servers: Near Zero Downtime Scaling for HA servers is currently enabled only in a few initial regions. It will be enabled in more regions in future releases, following thorough testing.
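For the VNet limitation above, a rough pre-check of the delegated subnet can be sketched as follows. This is an illustration under stated assumptions, not how the service itself validates capacity: it relies on the fact that Azure reserves five addresses in every subnet, and it expects you to supply the number of addresses already in use (for example, from the portal). The function names are hypothetical.

```python
import ipaddress

# Azure reserves 5 addresses in every subnet (network address,
# default gateway, two for Azure DNS, and the broadcast address).
AZURE_RESERVED_PER_SUBNET = 5


def free_addresses(subnet_cidr: str, addresses_in_use: int) -> int:
    """Estimate how many addresses in the subnet are still assignable."""
    subnet = ipaddress.ip_network(subnet_cidr, strict=True)
    usable = subnet.num_addresses - AZURE_RESERVED_PER_SUBNET
    return max(usable - addresses_in_use, 0)


def has_room_for_scaling(subnet_cidr: str, addresses_in_use: int, extra_needed: int) -> bool:
    """True if the subnet can supply the extra addresses scaling needs.

    `extra_needed` is passed in by the caller, for example 2 for an
    HA-enabled server as stated in the list above.
    """
    return free_addresses(subnet_cidr, addresses_in_use) >= extra_needed


# A /28 subnet has 16 addresses, 11 of them assignable.
print(has_room_for_scaling("10.0.0.0/28", addresses_in_use=10, extra_needed=2))
print(has_room_for_scaling("10.0.0.0/28", addresses_in_use=9, extra_needed=2))
```

The first call reports that there is not enough room (only one address is free), and the second reports that there is. When the subnet is too small, the server falls back to the regular scaling process.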

Getting started

In this post, I explained near-zero downtime scaling for Azure Database for PostgreSQL flexible server and its benefits. To learn more about this feature, see the Near-zero downtime scaling documentation. To learn more about our Flexible Server managed service, see the Azure Database for PostgreSQL service page.

We are always eager to get your feedback. Please reach out via email to Ask Azure DB for PostgreSQL.
