SQL Server High Availability Archives

Guide: Deploying a Multi-Zone and Multi-Region SQL Server FCI in Azure

April 27, 2026 by Jason Aw

Organizations running mission-critical databases in the cloud need architectures that deliver both high availability and disaster recovery. In an InfoWorld feature, Dave Bermingham provides a detailed, step-by-step guide to building a multi-zone and multi-region Microsoft SQL Server Failover Cluster Instance in Microsoft Azure, including automated networking configuration and best practices for resilient cluster deployments. The article also explains how technologies like Windows Server Failover Clustering and SIOS DataKeeper enable reliable failover across availability zones and regions, helping organizations achieve true business continuity in the cloud.

Reproduced with permission from SIOS

Empowering Education: Enhancing System Availability with SIOS Solutions

October 26, 2023 by Jason Aw

Education has been evolving rapidly, and the infusion of technology into the classroom has accelerated this change, especially in the last few years. With the increasing importance of online learning, educational institutions face unique challenges that demand innovative solutions. One such challenge is ensuring high system availability to provide uninterrupted education services to students.

Challenges in the Education Sector

Educational institutions encounter a variety of challenges in the digital age. These include managing high-traffic peak seasons, smoothly transitioning to online learning, mitigating system failures and their impacts, safeguarding data, accommodating a diverse global student population, and working within budget constraints. These challenges require robust solutions to maintain educational continuity.

Why High Availability (HA) Matters

High Availability (HA) is a critical concept in ensuring that systems remain operational without disruptions. In the context of education, HA ensures that essential services like learning management systems, communication tools, and data storage are always available. This minimizes downtime and ensures students and educators can access resources whenever they need them.

Assessing the Need for HA

Before implementing an HA solution, institutions must evaluate the impact of downtime on their operations. Understanding downtime tolerance and estimating cost implications are essential steps. It’s crucial to recognize the potential consequences of system failures, as they can disrupt learning and negatively affect an institution’s reputation.

Choosing the Right HA Solution

Selecting the right HA solution isn’t a one-size-fits-all decision. Institutions must consider factors such as cost implications, reliability, and system compatibility and integration. The goal is to find a solution that aligns with their specific needs and budget constraints.

Understanding Disaster Recovery (DR)

While HA focuses on minimizing downtime, Disaster Recovery (DR) is about restoring operations after major disruptions. It’s vital to distinguish between HA and DR and recognize their complementary roles. HA ensures uninterrupted access, while DR ensures recovery from significant losses, such as data breaches or natural disasters.

Introduction to SIOS Solutions

One company that specializes in providing HA and DR solutions tailored to educational institutions is SIOS Technology Corp. SIOS offers products like DataKeeper, designed to enhance system availability and data protection.

Deep Dive into DataKeeper

DataKeeper offers a range of features and benefits specifically aimed at meeting the unique challenges faced by educational institutions. It provides data replication and protection, ensuring that critical data remains accessible even in the face of hardware failures or other disruptions.

Real-World Case Study

To illustrate the effectiveness of solutions like DataKeeper, we will delve into a real-world case study. We’ll explore a university’s journey, including their initial problem statement, the HA/DR solution they chose, and the results and benefits they experienced.

Demonstration

During the webinar, attendees will have the opportunity to witness SIOS solutions in action. A live demonstration will showcase the setup and operation of these solutions, giving a practical insight into how they work.

Engage with Poll Questions

As part of the interactive experience, the webinar will include poll questions that allow attendees to provide feedback and share their insights on the challenges and solutions discussed.

Conclusion & Next Steps

In conclusion, as the education sector continues to evolve, ensuring system availability is paramount. The right HA and DR solutions can make a significant difference in maintaining educational continuity and safeguarding critical data. By exploring SIOS solutions and understanding their benefits, educational institutions can better prepare for the challenges of the digital age.

We invite you to join us in this informative webinar on October 24th, where we will delve deeper into these topics and showcase how SIOS solutions can empower education.

Don’t miss this opportunity to enhance your institution’s system availability and resilience in an ever-changing educational landscape.

Webinar Details:

Date: October 24th
Time: 1 PM EDT
Location: Online
Register here

We look forward to your participation in this empowering webinar!

Reproduced with permission from SIOS

Achieving HA/DR for SQL Server Without Breaking the Bank

June 30, 2023 by Jason Aw

High availability and disaster recovery (HA/DR) are essential requirements for all database environments, especially mission-critical ones. However, many businesses face challenges in achieving HA/DR without significantly inflating costs. If you’re grappling with these issues, this article will shed light on an effective solution.

SQL Server Standard Edition is widely used, but it comes with certain limitations: it supports only two nodes in a cluster. However, by leveraging the capabilities of SIOS DataKeeper Cluster Edition, you can overcome this limitation, enabling replication of data to a third node for disaster recovery.

This strategy could save you over 70% on your SQL Server licensing by allowing you to use SQL Server Standard Edition to create a SANLess SQL Server Failover Cluster Instance (FCI) instead of upgrading to SQL Server Enterprise Edition and using Always On Availability Groups.
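
As a rough illustration of where a figure like "over 70%" can come from, the sketch below compares per-core licensing costs for the two editions. The prices, the VM size, and the assumption that both designs license the same number of nodes are placeholders chosen for illustration, not figures from this article; check current Microsoft pricing and licensing terms before relying on them.

```python
# Illustrative licensing comparison.
# ASSUMPTION: the prices below are placeholder figures (USD per 2-core pack),
# not values taken from this article. Verify against current pricing.
ASSUMED_ENTERPRISE_PER_2_CORES = 15_123
ASSUMED_STANDARD_PER_2_CORES = 3_945


def license_cost(price_per_2_cores: float, cores_per_node: int, licensed_nodes: int) -> float:
    """Total cost of per-core licenses for the given number of licensed nodes."""
    packs_per_node = cores_per_node / 2
    return price_per_2_cores * packs_per_node * licensed_nodes


def savings_fraction(enterprise_cost: float, standard_cost: float) -> float:
    """Fraction saved by choosing Standard Edition over Enterprise Edition."""
    return 1 - standard_cost / enterprise_cost


if __name__ == "__main__":
    cores = 8   # hypothetical VM size
    nodes = 1   # ASSUMPTION: same number of licensed nodes in both designs
    ent = license_cost(ASSUMED_ENTERPRISE_PER_2_CORES, cores, nodes)
    std = license_cost(ASSUMED_STANDARD_PER_2_CORES, cores, nodes)
    print(f"Enterprise: ${ent:,.0f}  Standard: ${std:,.0f}  "
          f"saving: {savings_fraction(ent, std):.1%}")
```

Because the core count and node count cancel out when they are the same on both sides, the saving in this sketch depends only on the ratio of the two per-core prices.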

This blog post aims to guide you through the process of using SIOS DataKeeper for data recovery on a third node that is not part of the cluster.

Configuring Your Nodes

In this scenario, let’s consider that you have two nodes, namely DataKeeper-1 and DataKeeper-2, configured in a cluster. These nodes have their E drive replicating with each other. Also, DataKeeper-1 is replicating to a third node, DataKeeper-3, which is not part of the cluster. It’s important to note that with SQL Server Standard Edition, the third node can never be part of the cluster.

Preparing the Third Node

Firstly, ensure that DataKeeper-3 is separate from the cluster. With this, you now have a two-node cluster (DataKeeper-1 and DataKeeper-2) with SQL Server configured as a failover cluster instance, but still replicating to the third node, DataKeeper-3, using SIOS DataKeeper.

Navigating a Disaster Recovery Process

So, how would this work in an actual disaster? Here are the steps you would need to follow:

Simulate a Disaster: To simulate a disaster, we take SQL Server offline on the cluster (DataKeeper-1 and DataKeeper-2).
Switch to DataKeeper-3: With SQL Server offline, we switch over to DataKeeper-3. Volume E on DataKeeper-3, however, is initially not accessible.
Unlock the Volume: To unlock the volume on DataKeeper-3, run the command-line operation shown in the tutorial video, ‘emcmd . switchovervolume’.
Attach Databases: In a real disaster, you’ll want to have a standalone instance of SQL Server running on DataKeeper-3. From this standalone instance, you can then attach the user-defined databases.
Replicate Back to the Cluster: Data written on DataKeeper-3 is replicated back to DataKeeper-1 and DataKeeper-2. You can verify this in the SIOS DataKeeper interface.
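
The unlock and attach steps above can be sketched as a small runbook generator. This is a minimal sketch that only builds the commands as text and executes nothing. The database name and file paths are hypothetical, and passing the drive letter as the last argument to emcmd is an assumption, since the article only shows ‘emcmd . switchovervolume’; confirm the exact syntax in the SIOS documentation or the tutorial video.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AttachSpec:
    """A user database to attach on the standalone instance (names are hypothetical)."""
    name: str
    data_file: str
    log_file: str


def switchover_command(volume: str, system: str = ".") -> List[str]:
    """Build the emcmd call that unlocks the mirrored volume on this node.

    ASSUMPTION: the drive letter is passed as the final argument; the article
    only shows 'emcmd . switchovervolume'.
    """
    return ["emcmd", system, "switchovervolume", volume]


def attach_statement(db: AttachSpec) -> str:
    """Return T-SQL that attaches an existing data/log file pair."""
    return (
        f"CREATE DATABASE [{db.name}] ON "
        f"(FILENAME = N'{db.data_file}'), "
        f"(FILENAME = N'{db.log_file}') FOR ATTACH;"
    )


def dr_plan(volume: str, databases: List[AttachSpec]) -> List[str]:
    """List the manual steps in order as printable strings (nothing is executed)."""
    steps = [" ".join(switchover_command(volume))]
    steps += [attach_statement(db) for db in databases]
    return steps


if __name__ == "__main__":
    # Hypothetical database and paths on the replicated E volume.
    demo = AttachSpec("SalesDb", r"E:\Data\SalesDb.mdf", r"E:\Data\SalesDb_log.ldf")
    for step in dr_plan("E", [demo]):
        print(step)
```

Keeping the plan as plain strings makes it easy to review the steps before a real failover, or to paste them into a documented operational playbook.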

Post-Disaster Recovery

Once the disaster is resolved, you can switch back the volume to the original source using a similar process.

By leveraging SIOS DataKeeper Cluster Edition, you can implement a robust, cost-effective, and efficient high availability/disaster recovery strategy for your SQL Server environment. This process not only helps save significant costs by eliminating the need for upgrading to SQL Server Enterprise Edition, but it also ensures data availability and a quick recovery during a disaster.

Check out this video for a complete walkthrough of the process and ensure your SQL Server remains resilient, without breaking the bank.

Reproduced with permission from SIOS

High Availability Options for SQL Server on Azure VMs

February 28, 2023 by Jason Aw

Microsoft Azure infrastructure is designed to provide high availability for your applications and data. Azure offers a variety of infrastructure options for achieving high availability, including Availability Zones, Paired Regions, redundant storage, and high-speed, low-latency network connectivity. All of these services are backed by Service Level Agreements (SLAs) to ensure the availability of your business-critical applications. This blog post will focus on high availability options when running SQL Server in Azure Virtual Machines.

Azure Infrastructure

Before we jump into the high availability options for SQL Server, let’s discuss the vital infrastructure that must be in place. Availability Zones, Regions, and Paired Regions are key concepts in Azure infrastructure that are important to understand when planning for the high availability of your applications and data.

Availability Zones are physically separate locations within a region, each providing redundant power, cooling, and networking. Each Availability Zone consists of one or more data centers. By placing your resources in different Availability Zones, you can protect your applications and data from outages caused by planned or unplanned maintenance, hardware failures, or natural disasters. When leveraging Availability Zones for your SQL Server deployment, you qualify for the 99.99% availability SLA for Virtual Machines.
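
To put the 99.99% figure in concrete terms, the short sketch below converts an availability percentage into the downtime it permits over a year. The 99.9% and 99.95% rows are included only as comparison points and are not taken from this article.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 (ignores leap years)


def allowed_downtime_minutes(sla_percent: float, period_minutes: int = MINUTES_PER_YEAR) -> float:
    """Maximum downtime that still meets the SLA over the given period."""
    return (1 - sla_percent / 100) * period_minutes


if __name__ == "__main__":
    # 99.9 and 99.95 are comparison points only (an assumption, not from the article).
    for sla in (99.9, 99.95, 99.99):
        print(f"{sla}% -> {allowed_downtime_minutes(sla):.1f} minutes per year")
```

At 99.99%, the allowance works out to roughly 52.6 minutes per year.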

Regions are geographic locations where Azure services are available. Azure currently has more than 60 regions worldwide, many of which offer multiple Availability Zones. By placing your resources in different regions, you can provide even greater protection against outages caused by natural disasters or other significant events.

Paired Regions are pre-defined region pairs that have unique relationships. Most notably, paired Regions replicate data to each other when geo-redundant storage is in use. The other benefits of paired regions are region recovery sequence, sequential updating, physical isolation, and data residency. When designing your disaster recovery plan, it is advisable to use Paired Regions for your primary and disaster recovery locations.

Using Availability Zones and Paired Regions in conjunction with high availability options such as Availability Groups and Failover Cluster Instances, you can create highly available, resilient SQL Server deployments that can withstand a wide range of failures, minimizing downtime.

SQL Server Availability Groups and Failover Cluster Instances

SQL Server Availability Groups (AGs) and SQL Server Failover Cluster Instances (FCIs) are both high availability (HA) and disaster recovery (DR) solutions for SQL Server, but they work in different ways.

An AG is a feature of SQL Server Enterprise edition that provides an HA solution by replicating a database across multiple servers (called replicas) to ensure that the database is always available in case of failure. AGs can be used to provide HA for both a single database and multiple databases.

SQL Server Standard Edition supports something called a Basic AG, which has some limitations. Firstly, a Basic AG only supports a single database. If you have more than one database, you need a separate AG for each one, along with its associated IP address and load balancer. Additionally, Basic AGs do not support read-only replicas. While Basic AGs provide a simple way to implement HA for a single database, they may not be suitable for more complex scenarios.

On the other hand, a SQL Server FCI runs on a Windows Server Failover Cluster (WSFC) and provides an HA solution by creating a cluster of multiple servers (called nodes) that use shared storage. In the event of a failure, the SQL Server instance running on one node can fail over to another.

In SQL Server 2022 Enterprise Edition, the new Contained Availability Groups (CAG) address some of the AG limitations by giving the availability group its own contained system databases, which are then replicated along with the user databases. CAG eliminates the need to synchronize things like SQL logins and SQL Agent jobs manually.

Availability Groups and Failover Cluster Instances have their own pros and cons. AGs have advanced features like readable secondaries and synchronous and asynchronous replication. However, AGs require the Enterprise Edition of SQL Server, which can be cost-prohibitive, particularly if you don’t need any other Enterprise Edition features.

FCIs protect the entire SQL Server instance, including all user-defined databases and system databases. FCIs make management easier since all changes, including those made to SQL Server Agent jobs, user accounts and passwords, and database additions and deletions, are automatically reconciled on all versions of SQL Server, not just SQL 2022 with CAG. FCIs are available with SQL Server Standard Edition, which makes it more cost-effective. However, FCIs require shared storage, which presents challenges when deploying in environments that span Availability Zones, Regions, or hybrid cloud configurations. Read more about how SIOS software enables high availability for SQL servers.

Storage Options for SQL Server Failover Cluster Instances

For SQL Server Failover Cluster Instances that span Availability Zones, there are three storage options: Azure File Share, Azure Shared Disk with Zone Redundant Storage, and SIOS DataKeeper Cluster Edition. A fourth option, Storage Spaces Direct (S2D), is limited to single-AZ deployments, so clusters based on S2D would not qualify for the 99.99% SLA and would be susceptible to failures that impact an entire AZ.

Azure File Share

Azure File Share with zone-redundant storage (ZRS) stores multiple copies of your data across different Availability Zones in an Azure region, providing increased durability and availability. This data is then exposed as a file share, and the cluster connects to it using the SMB 3 protocol.

Azure Shared Disk

Azure Shared Disk with Zone Redundant Storage (ZRS) is a shared disk that can store SQL Server data for use in a cluster. SCSI persistent reservations ensure that only the active cluster node can access the data. If a primary Availability Zone fails, the data in the standby Availability Zone becomes active. At the time of writing, Shared Disk with ZRS is only available in the West US 2, West Europe, North Europe, and France Central regions.

SIOS DataKeeper Cluster Edition

SIOS DataKeeper Cluster Edition is a storage HA solution that supports SQL Server Failover Clusters in Azure. It is available in all regions and is the only FCI storage option that supports both cross-Availability Zone and cross-Region failover. It also enables hybrid cloud configurations that span on-premises and cloud environments. DataKeeper is a software solution that keeps locally attached storage in sync across all the cluster nodes. It integrates with WSFC as a third-party storage class cluster resource called a DataKeeper volume. Failover Cluster controls all the management of the DataKeeper volume, making the experience seamless for the end user. Learn more about SIOS DataKeeper.
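
The storage options described in this section can be summarized as a small lookup table. The sketch below encodes only the capabilities stated in this article (cross-zone and cross-region support); the class and function names are hypothetical, and the capability flags should be verified against current Azure and SIOS documentation before making a design decision.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StorageOption:
    name: str
    cross_zone: bool
    cross_region: bool


# Capabilities as described in this article, not an authoritative matrix.
OPTIONS = [
    StorageOption("Azure File Share (ZRS)", cross_zone=True, cross_region=False),
    StorageOption("Azure Shared Disk (ZRS)", cross_zone=True, cross_region=False),
    StorageOption("SIOS DataKeeper Cluster Edition", cross_zone=True, cross_region=True),
    StorageOption("Storage Spaces Direct (S2D)", cross_zone=False, cross_region=False),
]


def candidates(need_cross_zone: bool, need_cross_region: bool) -> List[str]:
    """Return the names of the options that satisfy the stated requirements."""
    return [
        o.name
        for o in OPTIONS
        if (o.cross_zone or not need_cross_zone)
        and (o.cross_region or not need_cross_region)
    ]


if __name__ == "__main__":
    print("Cross-zone only:      ", candidates(True, False))
    print("Cross-zone and region:", candidates(True, True))
```

A real selection would also weigh regional availability, cost, and performance, which this sketch deliberately leaves out.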

Summary

In conclusion, Azure provides various infrastructure options for achieving high availability for SQL Server deployments, such as Availability Zones, Regions, and Paired Regions. By leveraging these options, in conjunction with high availability solutions like Availability Groups and Failover Cluster Instances, you can create a highly available, resilient SQL Server deployment that can withstand a wide range of failures and minimize downtime. Understanding the infrastructure required and the pros and cons of each option is essential before choosing the best solution for your specific needs. It’s advisable to consult with a SQL and Azure expert to guide you through the process and also review the Azure documentation and best practices. With the proper planning and implementation, you can ensure that your SQL Server deployments on Azure are always available to support your business-critical applications.

Reproduced with permission from SIOS

Fifty Ways to Improve Your High Availability

April 5, 2021 by Jason Aw

I love the start of another year. Well, most of it. I love the optimism, the mystery, the potential, and the hope that seems to usher its way into life as the calendar flips to another year. But there are some downsides to the turn of the calendar. Every year, the start of the New Year brings a wave of “____ ways to do _____” lists. My inbox is always filled with, “Twenty ways to lose weight.” “Ten ways to build your portfolio.” “Three tips for managing stress.” “Nineteen ways to use your new iPhone.” Lists for self improvement, culture change, stress management, and weight loss abound for nearly every area of life and work, including “Thirteen ways to improve your home office.” But what about high availability? You only have so much time every week. So how do you make your HA solution more efficient and robust than ever? Where is your list? Here it is, fifty ways to make your high availability architecture and solution better:

Get more information from the cluster faster
Set up alerts for key monitoring metrics
Add analytics. Multiply your knowledge
Establish a succinct architecture from an authoritative perspective
Connect more resources. Link up with similar partners and other HA professionals
Hire a consultant who specializes in high availability
100x existing coverage. Expand what you protect
Centralize your log and management platforms
Remove busywork
Remove hacks and workarounds
Create solid repeatable solution architectures
Utilize your platforms: Public, private, hybrid or multi-cloud
Discover your gaps
Search for Single Points of Failure (SPOFs)
Refuse to implement incomplete solutions
Crowdsource ideas and enhancements
Go commercial and purpose built
Establish a clear strategy for each life cycle phase
Clarify the decision-making process
Document your processes
Document your operational playbook
Document your architecture
Plan staffing rotation
Plan maintenance
Perform regular maintenance (patches, updates, security fixes)
Define and refine on-boarding strategies
Clarify responsibility
Improve your lines of communication
Overcommunicate with stakeholders
Implement crisis resolution before a crisis
Upgrade your infrastructure
Upsize your VM: CPU, memory, and IOPS
Add redundancy at the zone or region level
Add data replication and disaster recovery
Go OS and Cloud agnostic
Get training for the team (cloud, OS, HA solution, etc.)
Keep training the team
Explore chaos testing
Imitate the best in class architectures
Be creative. Innovation expands what you can protect and automate.
Increase your automation
Tune your systems
Listen more
Implement strict change management
Deploy QA clusters. Test everything before updating/upgrading production
Conduct root cause analysis exercises on any failures
Address RCA and Closed Loop Corrective Action reports
Learn your lesson the first time. Reuse key learnings.
Declutter. Don’t run unnecessary services or applications on production clusters
Be persistent. Keep working at it.

So, what are the ideas and ways that you have learned to increase and improve your enterprise availability? Let us know!

-Cassius Rhue, VP, Customer Experience

Reproduced from SIOS