SIOS SANless clusters

The single best way to deploy quorum/witness

April 26, 2022 by Jason Aw

During a recent meeting, a customer asked a question about High Availability (HA) and the need for a quorum/witness. Their question was, “What is the best way to deploy quorum/witness?” The answer is simple: there is no single best way to deploy quorum. To understand why, let’s start by defining three key things: a witness resource, a quorum resource, and a split-brain scenario.

What is split brain?

In a normal cluster environment, the protected application runs on the primary node in the cluster. If the application or that primary node fails, the clustering software moves the application to a secondary or remote node, which assumes the role of primary. At any given time, there is only one primary node.

Split brain is a condition that occurs when members of a cluster are unable to communicate with each other but are still running and operable, and subsequently take ownership of common resources simultaneously. In effect, you have two bus drivers fighting for the steering wheel. Because of its destructive nature, split brain can cause data loss or data corruption and is best avoided through the use of fencing, quorum, witness, or combined quorum/witness functionality for cluster arbitration.

In most cluster managers, quorum is maintained when:

  1. All servers are able to see the same state for all cluster peers and the witness
  2. All servers are able to see the same state for all of the cluster peers, though not the witness
  3. All servers are able to see the witness resource, though not each other, and avoid split-brain scenarios

In most cluster managers, quorum is lost when:

  1. Servers are unable to see all cluster peers and the witness server
  2. Servers are unable to see a majority of cluster peers, even though they can see the witness server
  3. Servers are unable to access or maintain access to the quorum resource to successfully arbitrate quorum membership and resource access
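The two lists above can be read as a vote count. The following is a minimal sketch under stated assumptions, not the algorithm of any particular cluster manager: each node has one vote, the witness has one vote that it grants to only one side of a partition, and a node keeps quorum only while its side holds a strict majority. The function name `has_quorum` and its parameters are hypothetical.

```python
def has_quorum(total_nodes, visible_peers, holds_witness_vote, witness_configured=True):
    """Return True if this node's side holds a strict majority of votes.

    Simplified model (an assumption, not a vendor algorithm): every node
    has one vote, and the witness, if configured, has one vote that it
    grants to only one side of a partition at a time.
    """
    total_votes = total_nodes + (1 if witness_configured else 0)
    my_votes = 1 + visible_peers  # this node plus the peers it can see
    if witness_configured and holds_witness_vote:
        my_votes += 1
    return 2 * my_votes > total_votes


if __name__ == "__main__":
    # Two-node cluster with a witness: three votes in total.
    cases = [
        ("sees peer and witness", 1, True),
        ("sees peer, not witness", 1, False),
        ("sees witness, not peer", 0, True),
        ("sees nothing", 0, False),
    ]
    for label, peers, witness in cases:
        print(label, "->", has_quorum(2, peers, witness))
```

The `holds_witness_vote` flag is what stops two isolated nodes from both claiming the tie-breaking vote; how a real witness enforces that (a lock, a reservation, a lease) depends on the product.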

What is a witness resource (or server)?

A witness resource is a server, network endpoint, or device that is used to achieve and maintain quorum when a cluster has an even number of members. A cluster with an odd number of members, using cluster majority, does not need a witness resource, as all members of the cluster serve to arbitrate majority membership.
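The arithmetic behind the odd/even distinction is short enough to sketch. This assumes a plain majority model with one vote per member; the helper names `votes_needed` and `tolerated_failures` are hypothetical.

```python
def votes_needed(total_votes):
    """Smallest number of votes that is a strict majority."""
    return total_votes // 2 + 1


def tolerated_failures(total_votes):
    """How many votes can be lost while a majority still remains."""
    return total_votes - votes_needed(total_votes)


if __name__ == "__main__":
    for nodes in (2, 3, 4, 5):
        without = tolerated_failures(nodes)
        with_witness = tolerated_failures(nodes + 1)  # the witness adds one vote
        print(f"{nodes} nodes: tolerates {without} lost vote(s), "
              f"{with_witness} with a witness")
```

Under this model, a witness raises the tolerance of an even-sized cluster (two nodes go from zero lost votes to one) but leaves an odd-sized cluster where it was, which matches the guidance above.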

What is quorum and a quorum resource?

A quorum resource is a resource (device, system, block storage, file storage, file share, etc.) that serves as a means of arbitrating cluster state and membership. In some cluster managers, quorum is a resource within the cluster that aids or is required for any cluster state and cluster membership decisions. In other cluster managers, quorum functions as a tie-breaker to avoid split brain.

More than One Way to Deploy a Quorum

Given the critical nature of quorum, it is essential that HA architectures deploy quorum/witness resources properly, and fortunately (or unfortunately) there is no single best way to deploy quorum. Several factors may shape the way in which your witness and quorum resources behave. These factors include:

1. Whether or not your deployment will be on-premises, cloud, or hybrid

Deploying in an on-premises data center where additional storage devices (such as Fibre Channel storage), power control devices or connections, or traditional STONITH devices are present gives customers additional options for quorum and witness functionality that may not be available in the cloud. Likewise, cloud and hybrid environments differ in what can be deployed and in which failure cases quorum is being deployed to prevent. Additionally, latency requirements and differences may limit what types of devices and resources are available for a quorum/witness configuration.

2. Your recovery objectives

Recovery objectives are also important to consider when designing and architecting your quorum and witness resources. In an example two-node cluster (node A and node B), when node A loses connectivity to node B, what is the highest priority for recovery? If the witness/quorum resources are in the same network as node A, this could result in node A remaining online but severed from clients, while node B is unable to assess quorum and take over. Likewise, if the quorum device lived only in the region, data center, or network with node B, a loss could result in a failover of resources to a defunct network or data center, or away from a functional and operational primary node.
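The placement tradeoff in the two-node example can be sketched as a small simulation. Everything here is an illustrative assumption: nodes A and B live in sites named "A" and "B", site "C" is a hypothetical third location, an isolated site can reach nothing outside itself, and a node needs two of the three votes (itself, its peer, the witness).

```python
def can_reach(src, dst, isolated_sites):
    """A site always reaches itself; otherwise neither end may be isolated."""
    return src == dst or (src not in isolated_sites and dst not in isolated_sites)


def quorum_by_node(witness_site, isolated_sites):
    """Return {node: keeps_quorum} for nodes A and B after a site isolation."""
    result = {}
    for node, peer in (("A", "B"), ("B", "A")):
        votes = 1  # the node's own vote
        if can_reach(node, peer, isolated_sites):
            votes += 1
        if can_reach(node, witness_site, isolated_sites):
            votes += 1
        result[node] = votes >= 2  # two of three votes is a majority
    return result


if __name__ == "__main__":
    for witness_site in ("A", "B", "C"):
        for isolated in ("A", "B"):
            print(f"witness in {witness_site}, site {isolated} isolated:",
                  quorum_by_node(witness_site, {isolated}))
```

With the witness in site A and site A cut off, node A keeps quorum even though clients cannot reach it, and node B cannot take over. Moving the witness to the third site lets the node that is still reachable keep quorum. The sketch does not model a broken A-to-B link where both nodes still reach the witness; that case needs the witness to grant its vote to only one side.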

3. Redundancy of Available Data Centers (or Regions) Within Your Infrastructure

The redundancy of the data center or region is also an important factor in your HA topology with quorum/witness. If your data center has only two levels of redundancy, you must understand the tradeoff between placement of the quorum/witness in the same data center as the primary or standby cluster node. If the data center has more than two redundant tiers, such as a third availability zone or access to a second region, this option would provide a higher level of redundancy for the cluster.

4. Disaster Recovery Requirements

Understanding your true disaster recovery requirements is also a major factor in your design. If your cluster manager software requires access to the quorum/witness in order to recover from a total data center outage (or region failure) then you’ll need to understand this impact on your design. Many high availability software packages have tools or methods for this scenario, but if your software does not, your design and placement of quorum/witness may need to accommodate this reality.

5. Number Of Members Within the Cluster, and Their Location

An additional quorum/witness server is typically not required when the cluster contains an odd number of nodes. However, using only two nodes in a cluster, or deploying a DR node that is not always available, may change your architecture. As VP of Customer Experience, I have worked with customers who have deployed three-node architectures but, for cost savings, automate periodic shutdown of the third server.

6. Operating System and Cluster Manager

The final factor to mention on quorum/witness is the cluster manager and operating system. Not all HA software and cluster managers are equal when it comes to deployment of quorum/witness or arbitration of quorum status. Some clustering software requires shared disks for arbitration; others are more flexible, allowing shares (NFS, SMB, EFS, Azure Files, and S3). Being aware of what your cluster manager requires, and the modes that it supports with regard to quorum (simple majority, witness, file share, etc.), will impact not only what you deploy, but how you deploy.

The single best way to deploy a quorum/witness server is to understand your vendor’s definition of quorum/witness and their available options, know your requirements, factor in the limitations or opportunities presented by your data center (or cloud environment) and architect the solution that provides your critical systems the highest level of protection against split-brains, false failovers, and downtime.

-Cassius Rhue, VP, Customer Experience

Reproduced from SIOS

Filed Under: Clustering Simplified Tagged With: cluster quorum, Quorum, quorum witness, split brain

Clustering 101: Configuring A Windows Cluster Quorum

March 13, 2018 by Jason Aw

Clustering 101: Configuring A Windows Cluster Quorum – What You Need To Know

In case you missed it, I held this in-depth webinar on cluster quorums. In 30 minutes, I go over everything you need to know about quorums, from node majority through Cloud Witness and everything in between. If you have additional questions about quorums, post them as a comment on this article. I will be glad to help.

Reproduced with permission from https://clusteringformeremortals.com/2015/03/03/clustering101-configuring-a-windows-cluster-quorum-what-you-need-to-know/

Filed Under: Clustering Simplified Tagged With: cluster, Clustering, Quorum, Windows, Windows Cluster Quorum

Windows Server Failover Cluster Quorum Types In Windows Server 2012 R2

February 21, 2018 by Jason Aw

Cluster Quorum types? What Does It Do?

Before we get started with all the great new cluster quorum types in Windows Server 2012 R2, we should take a moment and understand what it does and how we got to where we are today. Rob Hindman describes quorum best in his blog post…

“The quorum configuration in a failover cluster determines the number of failures that the cluster can sustain while still remaining online.”

The Beginning: Disk Only

Prior to Windows Server 2003, there was only one quorum type, Disk Only. Now there are several cluster quorum types. Disk Only is still available today, but it is not recommended because the quorum disk is a single point of failure. In Windows Server 2003, Microsoft introduced the Majority Node Set (MNS) quorum. This was an improvement, as it eliminated the Disk Only quorum as a single point of failure in the cluster. However, it did have its limitations. As implied by its name, Majority Node Set must have a majority of nodes to form a quorum and stay online. So this quorum model is not ideal for a two-node cluster, where the failure of one node would leave only one node remaining. One out of two is not a majority, so the remaining node would go offline.

The Introduction Of File Share Witness

Microsoft introduced a hotfix that allowed for the creation of a File Share Witness (FSW) on Windows Server 2003 SP1 and 2003 R2 clusters. Essentially, the FSW is a simple file share on another server that is given a vote in an MNS cluster. The driving force behind this innovation was Exchange Server 2007 Cluster Continuous Replication (CCR), which allowed for clustering without shared storage. Of course, without shared storage, a Disk Only quorum was not an option, and an effective MNS cluster would require three or more cluster nodes. Hence the introduction of the FSW to support two-node Exchange CCR clusters.

The New Disk Witness Keeps A Copy Of Cluster Database

Windows Server 2008 saw the introduction of a new witness type, Disk Witness. Unlike the old Disk Only quorum type, the Disk Witness allows users to configure a small partition on a shared disk that acts as a vote in the cluster, similar to that of the FSW. However, the Disk Witness is preferable to the FSW because it keeps a copy of the cluster database and eliminates the possibility of “partition in time”. If you’d like to read more about partition in time, I suggest you read the article “File Share Witness vs. Disk Witness for local clusters”.

Improvements

Windows Server 2012 continued to improve upon quorum options. It is my belief that many of these new features were driven by two forces: Hyper-V and SQL Server AlwaysOn Availability Groups. With Hyper-V, we began to see clusters that contained many more nodes than we have typically seen in the past. In a majority node set, as soon as you lose a majority of your votes, the remaining nodes go offline. For example, if you have a Hyper-V cluster with seven nodes, and you were to lose four of those nodes, the remaining nodes would go offline, even though there are three nodes remaining. This might not be exactly what you want to happen. So in Windows Server 2012, Microsoft introduced Dynamic Quorum.

Dynamic Quorum

Dynamic Quorum does what its name implies: it adjusts the quorum dynamically. So in the scenario described above, assuming I didn’t lose all four servers at the same time, as servers in the cluster went offline, the number of votes in the quorum would adjust dynamically. When node one went offline, I would then in theory have a six-node cluster. When node two went offline, I would then have a five-node cluster, and so on. In reality, if I continued to lose cluster nodes one by one, I could go all the way down to a two-node cluster and still remain online. And if I had configured a witness (Disk or File Share), I could actually go all the way down to a single node and still remain online.
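The seven-node scenario can be walked through with a short simulation. This is a deliberately simplified model, not Microsoft’s implementation: one vote per node, an optional witness vote that never fails, and, when `dynamic` is on, a vote total that is recalculated after every failure step the cluster survives. The function name `simulate` and its parameters are hypothetical.

```python
def simulate(nodes, failure_batches, witness=False, dynamic=True):
    """Return (still_online, nodes_left) after applying failures in order.

    Each entry in failure_batches is the number of nodes lost at the same
    time. The cluster stays online only while the remaining votes are a
    strict majority of the current vote total.
    """
    witness_vote = 1 if witness else 0
    total_votes = nodes + witness_vote
    for lost in failure_batches:
        nodes -= lost
        remaining_votes = nodes + witness_vote
        if 2 * remaining_votes <= total_votes:
            return False, nodes  # majority lost, cluster goes offline
        if dynamic:
            total_votes = remaining_votes  # shrink the quorum to the survivors
    return True, nodes


if __name__ == "__main__":
    print("lose 4 of 7 at once:        ", simulate(7, [4]))
    print("lose 4 of 7 one by one:     ", simulate(7, [1, 1, 1, 1]))
    print("same, without dynamic:      ", simulate(7, [1, 1, 1, 1], dynamic=False))
    print("down to 2 nodes, no witness:", simulate(7, [1] * 5))
    print("down to 1 node, no witness: ", simulate(7, [1] * 6))
    print("down to 1 node, witness:    ", simulate(7, [1] * 6, witness=True))
```

In this model, losing four of seven nodes at once takes the cluster offline, while losing them one at a time does not. Without a witness the cluster survives down to two nodes, and with one it survives down to a single node, which is the behavior described above.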

Read more about cluster quorum types at….

http://blogs.msdn.com/b/microsoft_press/archive/2014/04/28/from-the-mvps-understanding-the-windows-server-failover-cluster-quorum-in-windows-server-2012-r2.aspx

Reproduced with permission from https://clusteringformeremortals.com/2014/04/29/understanding-the-windows-server-failover-cluster-quorum-in-windows-server-2012-r2/

Filed Under: Clustering Simplified Tagged With: cluster, cluster quorum types, Disk Only, Disk Witness, File Share Witness, Quorum, Windows Server, Windows Server 2012 R2

It’s Time To Demystify Quorums

June 29, 2015 by sios2017

As I read through the SQL Server forums and field questions from IT pros, it surprises me how many are simply put off by the mention of the word quorum. It’s a six-letter word treated like a part of our collective four-letter vocabulary.

Quorum Defined: A “quorum” in the IT sense is simply a voting mechanism to ensure the correct ownership of cluster resources. Most commonly, it’s used in conjunction with Always On (both Failover Cluster Instances and Availability Groups), Hyper-V clusters, and all Windows Server Failover Clusters.

The key isn’t knowing where to use them; it’s knowing how to use them. I put together a short 30-minute webcast that dives into the variety of quorum types most commonly used, including the pros, cons, and illustrations, to demystify quorums and, dare I say, make them easy to understand. I also suggest staying tuned after the 30-minute mark: we had some great questions come in as part of the Q&A.

View the webinar now

Additional Resources

  • White Paper: Step-by-Step: How to Configure a SQL Server Failover Cluster Instance (FCI) in Microsoft Azure IaaS
  • Dave Bermingham’s Blog: ClusteringForMereMortals.com

Filed Under: Blog posts, News and Events Tagged With: Quorum, Windows Cluster Quorum, Windows Server Failover Clustering

Feb 25, 2015: Live Webinar – Clustering 101: Configuring a Windows Cluster Quorum – What You Need to Know

February 11, 2015 by Margaret Hoagland

Are you wondering how your Microsoft cluster quorum should be configured? Microsoft MVP Dave Bermingham will simplify your quorum configuration options and give you the building blocks to better define or improve your cluster and quorum configuration. Dave will also break down quorum configurations based on your current server version and the advantages of upgrading, and he’ll showcase some new options to achieve a majority and tackle interesting cases such as: What happens if you have no shared storage or want to use the cloud as a file share witness…

You can participate before the event by tweeting your questions to @SIOSTech using #Clustering101 and of course bring your own questions and comments.

Register Now

Date: February 25, 2015
Time: 10:00am PST / 1:00pm EST

About Clustering 101 Series

Clustering 101 is a webinar series hosted by Microsoft MVP Dave Bermingham, focused on addressing the numerous facets of clustering for high availability, data replication, and any combination thereof. This series will air monthly, on the fourth Wednesday of the month at 10:00am PST / 1:00pm EST.

About Dave Bermingham

David Bermingham is recognized within the technology community as a high availability expert and has been honored by his peers by being elected a Microsoft MVP in clustering since 2010. David’s work as director, technical evangelist at SIOS Technology Corp. has him focused on Microsoft high availability and disaster recovery solutions, as well as providing hands-on support, training, and professional services for cluster implementations. David holds numerous technical certifications and draws on more than twenty years of experience in IT, including work in the finance, healthcare, and education fields, to help organizations design solutions to meet their high availability and disaster recovery needs. Learn more at www.us.sios.com

Filed Under: Event posts, News and Events Tagged With: #SANLess Clusters for Windows Environments, Amazon EC2, Azure, Cloud, Clustering 101, Quorum, Webinar, Windows
