Stretched Clusters Explained: What is a Stretch Cluster?

Published On: 21st November 2025 // 5 min read

Stretched clusters, also known as stretch clusters or metro clusters, are a computing configuration that spreads virtualized infrastructure across two or more physically separate locations, such as data centers or buildings. If one site fails, the other can take over in near real time to keep operations running. This setup improves availability and redundancy and strengthens an organization’s disaster recovery posture.

How a Stretched Cluster Works

Understanding what a stretch cluster is begins with its fundamental architecture. A stretched cluster creates a unified computing environment from geographically distributed resources, operating as a single logical system despite physical separation.

Multiple Site Distribution 

Servers are strategically placed in separate physical locations to create a single logical cluster. This geographic distribution is the foundation of a stretch cluster’s resilience, ensuring that localized disruptions at one site don’t compromise the entire infrastructure.

Shared Storage Architecture 

Data storage is synchronously mirrored, or “stretched,” across sites, giving each location access to the data needed for virtual machines (VMs) and applications.

This synchronous replication ensures that both sites maintain identical copies of critical information, enabling seamless transitions if a failover event occurs.
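To make the idea concrete, here is a minimal sketch of a synchronous mirrored write: the write is acknowledged only once every site has stored it. The `SiteStore` class and `synchronous_write` function are hypothetical names used for illustration, not any vendor’s API, and a real platform would also handle degraded modes and resynchronization.

```python
from dataclasses import dataclass, field


@dataclass
class SiteStore:
    """Hypothetical in-memory stand-in for one site's storage."""
    name: str
    online: bool = True
    blocks: dict = field(default_factory=dict)

    def persist(self, key: str, value: bytes) -> bool:
        # An offline site cannot store the write.
        if not self.online:
            return False
        self.blocks[key] = value
        return True


def synchronous_write(key: str, value: bytes, sites: list) -> bool:
    """Acknowledge the write only if every site persisted it."""
    acks = [site.persist(key, value) for site in sites]
    return all(acks)
```

Because the acknowledgement waits for the slowest site, every write pays the cost of the inter-site link, which is why the connection between sites matters so much.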

Automatic Failover Capability

When a failure occurs at one site, whether from a natural disaster, power outage, or equipment malfunction, the other site’s servers automatically assume the workload and continue running VMs. 

This automation minimizes downtime and eliminates the need for manual intervention during a failure.
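As a rough illustration, the failover step can be thought of as recomputing where each VM runs. The sketch below assumes a simple two-site setup and a hypothetical `fail_over` helper; real platforms restart or resume VMs through their own high-availability mechanisms.

```python
def fail_over(placement: dict, failed_site: str, surviving_site: str) -> dict:
    """Return a new VM-to-site mapping with the failed site's VMs moved.

    `placement` maps a VM name to the site currently running it.
    The input mapping is left unchanged.
    """
    return {
        vm: (surviving_site if site == failed_site else site)
        for vm, site in placement.items()
    }
```

For example, if `vm-1` and `vm-3` run at `site-a` and that site fails, the returned mapping places both at `site-b` alongside the VMs already running there.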

Site-to-site Connectivity 

A dedicated, low-latency network link connects the locations, keeping storage and servers synchronized in real time.

Witness

A third, dedicated host, often called a witness, acts as a tie-breaker during network partitions between sites. This component helps the stretched cluster determine which site should remain active during network splits, preventing split-brain scenarios that could lead to data corruption.
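The tie-breaking logic can be sketched as a three-vote quorum, with one vote per site and one for the witness. This is a simplified model under stated assumptions: the `surviving_sites` function and the idea of a `preferred` site that wins when both sites can still reach the witness are illustrative choices, not a description of any specific product.

```python
def surviving_sites(a_sees_b: bool, a_sees_witness: bool,
                    b_sees_witness: bool, preferred: str = "site-a") -> set:
    """Decide which sites stay active in a two-site cluster with a witness."""
    if a_sees_b:
        # The two sites together already hold 2 of 3 votes.
        return {"site-a", "site-b"}

    # Sites are partitioned: the witness grants its single vote to at most
    # one side, so only one site can reach a majority (no split-brain).
    candidates = {"site-a": a_sees_witness, "site-b": b_sees_witness}
    other = "site-b" if preferred == "site-a" else "site-a"
    for site in (preferred, other):
        if candidates[site]:
            return {site}

    # Neither side can reach the witness: no majority, so both stop.
    return set()
```

The key property is that two isolated sites can never both stay active at once, which is what protects the data from diverging.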

4 Benefits of Stretched Clusters

For most industries, continuous operations are paramount, and stretched clusters offer compelling advantages.

1. Enabling Better Disaster Recovery

Stretch clusters protect against site-wide outages caused by catastrophic events. Think of natural disasters like fires, hurricanes, or flooding. By maintaining operations across multiple locations, businesses can continue serving customers even when one location gets shut down.

Example: You’re a large retail organization with a data storage site in a flood-prone region. When severe flooding forces the site offline, your stretch cluster keeps operations running at your secondary location. Customers can keep shopping, transactions keep processing, and the outage at one site doesn’t become an outage for your business.

2. High Availability

A stretched cluster keeps services running even when one site fails, a property known as high availability. It maximizes uptime and minimizes operational disruption, providing peace of mind for organizations that cannot afford downtime.

Example: You’re a manufacturer with assembly lines running 24/7. When a power outage strikes your primary data site mid-production, a stretched cluster lets your secondary site take over automatically. The systems your production depends on, such as inventory databases and quality control applications, keep running with minimal interruption. This means no halted assembly lines, no missed deadlines, and no costly downtime.

3. Flexible Maintenance Windows

Planned maintenance at one site shouldn’t mean service interruptions. With stretched clusters, workloads can be seamlessly moved to the other location, allowing IT teams to perform updates, hardware replacements, and infrastructure improvements without impacting users.

Example: You’re a banking service processing hundreds of transactions per minute, and scheduled system maintenance can’t interrupt customer service. A stretched cluster lets you shift workloads to the secondary site before performing critical updates on the primary.

4. Optimized Costs 

Stretch clusters can make hardware use more cost-effective because both sites actively participate in the cluster, rather than one sitting idle as a standby. While they aren’t a complete alternative to a comprehensive disaster recovery setup, they put the capacity at both locations to work.

Example: You’re a healthcare provider that needs to control infrastructure costs so more of your budget can support patient care. When an outage hits one of your sites, your stretched cluster doesn’t rely on expensive backup systems that have been sitting idle. Instead, your secondary site, which was already serving workloads, absorbs the additional load. You get the uptime you need, and users don’t notice the impact.
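One practical consequence is that each site needs enough headroom to carry the other’s workload during an outage. The check below is a minimal sketch that assumes both sites have the same capacity and that load is measured in one consistent unit (for example, vCPUs); the `can_absorb` name is hypothetical.

```python
def can_absorb(load_a: float, load_b: float, capacity_per_site: float) -> bool:
    """True if a single site could carry the combined load of both sites."""
    return load_a + load_b <= capacity_per_site
```

With 100 units of capacity per site, loads of 40 and 45 fit on one site after a failure, while loads of 70 and 60 do not, so some workloads would have to be deprioritized or left offline.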

What Impacts a Stretched Cluster Deployment and Performance?

Some organizations experience sluggish performance with stretched clusters. Your first instinct might be: “We need more bandwidth!” So, you invest in upgrading the network connection between sites, expanding capacity. The new infrastructure goes live, and the performance issues persist. What went wrong?

This scenario plays out repeatedly because bandwidth and latency are often treated as interchangeable, when they are two distinct constraints.

Think of it like rush hour traffic. Adding more lanes to a highway only solves congestion problems when the lanes are actually full. But if there’s a strict 30 mph speed limit on every lane, opening up additional lanes won’t get anyone home faster. The fundamental travel speed is simply too slow.

The same principle applies to stretch clusters. When data travels between sites, bandwidth determines how much information can move simultaneously (the number of lanes), while latency dictates how quickly each piece of data completes the journey (the speed limit). A stretched cluster with massive bandwidth but high latency is like a ten-lane highway where everyone still crawls along under the same low speed limit.
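A back-of-the-envelope model shows why extra bandwidth often doesn’t help. The sketch below assumes a single synchronous write costs one round trip plus the time to put the bytes on the wire; this is a deliberate simplification that ignores protocol overhead, queuing, and storage service time, and `write_time_ms` is a hypothetical helper.

```python
def write_time_ms(size_bytes: int, bandwidth_gbps: float, rtt_ms: float) -> float:
    """Rough time for one synchronous write: round trip plus serialization."""
    serialization_ms = (size_bytes * 8) / (bandwidth_gbps * 1e9) * 1000
    return rtt_ms + serialization_ms


# Compare a bandwidth upgrade with a latency reduction for a 4 KiB write.
baseline = write_time_ms(4096, bandwidth_gbps=10, rtt_ms=20.0)
more_bandwidth = write_time_ms(4096, bandwidth_gbps=100, rtt_ms=20.0)
lower_latency = write_time_ms(4096, bandwidth_gbps=10, rtt_ms=2.0)
```

Under these assumptions, a 4 KiB write over a 20 ms link takes about 20 ms whether the link runs at 10 Gbps or 100 Gbps, while cutting the round trip to 2 ms brings it down to about 2 ms.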

How do you fix this? Assess the constraints of each stretched cluster deployment case by case to ensure it performs as you need.

A reliable, high-speed connection with low latency is essential for performance and stability. Many enterprise stretch cluster platforms recommend keeping inter-site round-trip time (RTT) below roughly 5 to 10 ms to avoid degrading application performance and failover times.
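If you collect RTT measurements between sites (for example, from ping or a monitoring tool), a small check like the one below can flag links that exceed the budget. The 5 ms default and the choice to judge the 95th percentile rather than the average are assumptions for illustration; use the threshold your platform documents.

```python
import statistics


def rtt_report(samples_ms: list, budget_ms: float = 5.0) -> dict:
    """Summarize RTT samples (at least two) and compare them with a budget."""
    # quantiles(n=20) returns 19 cut points; the last is the 95th percentile.
    p95 = statistics.quantiles(samples_ms, n=20)[-1]
    return {
        "median_ms": statistics.median(samples_ms),
        "p95_ms": p95,
        "within_budget": p95 <= budget_ms,
    }
```

Looking at a high percentile rather than the median helps catch links that are usually fast but occasionally spike, since those spikes are what synchronous writes end up waiting on.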

If you’re looking to deploy stretched clusters, seek support from a software vendor with proven experience designing, deploying, and managing them across diverse industries and use cases. The right vendor will help you navigate bandwidth and latency requirements, configure appropriate witness components, test failure scenarios thoroughly, and optimize your deployment for both performance and resilience.
