SIOS SANless clusters

SteelEye DataKeeper Cluster Edition Wins Windows IT Pro Best High Availability/Disaster Recovery Awards

January 20, 2018 by Jason Aw

I am pleased to announce that Windows IT Pro has awarded SteelEye DataKeeper Cluster Edition the Best High Availability and Disaster Recovery Product in two categories: the Community Choice Gold Award and the Editors’ Best Silver Award.

SteelEye DataKeeper Cluster Edition – Best High Availability/Disaster Recovery Product

I am really proud to be a part of the SteelEye DataKeeper team, and I appreciate everyone in the Windows IT Pro community who voted for us in the Community Choice award!

Reproduced with permission from https://clusteringformeremortals.com/2009/11/20/steeleye-datakeeper-cluster-edition-wins-windows-it-pro-best-high-availabilitydisaster-recovery-awards/

Filed Under: Clustering Simplified, Datakeeper Tagged With: DataKeeper, DataKeeper Cluster Edition, disaster recovery, High Availability, Windows IT Pro

How Can Asynchronous Replication Be Used In A Multi-Site Cluster? Isn’t The Data Out Of Sync?

January 18, 2018 by Jason Aw

I have been asked this question more than a few times, so I thought I would answer it in my first blog post.  The basic answer is yes, you can lose data in an unexpected failure when using asynchronous replication in a multi-site cluster.  In an ideal world, every company would have a dark fiber connection to their DR site and use synchronous replication with their multi-site cluster, eliminating the possibility of data loss.  However, the reality is that in many cases, the WAN connectivity to the DR site has too much latency to support synchronous replication.  In such cases, asynchronous replication is an excellent alternative.

What Are My Options?

There are more than a few options when choosing an asynchronous replication solution to use with your WSFC multi-site cluster. These include array-based solutions from companies like EMC, IBM, HP, etc., and host-based solutions, like the one that is near and dear to me, “SteelEye DataKeeper Cluster Edition”.  Since I know DataKeeper best, I will explain how this all works from DataKeeper’s perspective.

What About SteelEye DataKeeper?

When using SteelEye DataKeeper and asynchronous replication, we allow a certain number of writes to be stored in the async queue.  The number of writes which can be queued is determined by the “high water mark”. This is an adjustable value used by DataKeeper to determine how much data can be in the queue before the mirror state is changed from “mirroring” to “paused”.  A “paused” state is also entered anytime there is a communication failure between the secondary and primary server. While in a paused state, automatic failover in a multi-site cluster is disabled, limiting the amount of data that can be lost in an unexpected failure.  If the original data set is deemed “lost forever”, then the remaining data on the target server can be manually unlocked and the cluster node can then be brought into service.

While in the “paused” state, DataKeeper allows the async queue to drain until we reach the “low water mark”, at which point the mirror enters a “resync” state until all of the data is once again in sync.  At that point, the mirror is once again in the “mirroring” state and automatic failover is once again enabled.
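To make the queueing behavior above concrete, here is a minimal, purely illustrative sketch of that state machine in PowerShell. It is not DataKeeper code, and the high and low water mark values are placeholders standing in for DataKeeper's tunable settings.

# Illustrative sketch only - not DataKeeper's implementation. Thresholds are placeholder values.
function Get-NextMirrorState {
    param(
        [string]$CurrentState,       # "Mirroring", "Paused", or "Resync"
        [int]$QueuedWrites,          # writes currently sitting in the async queue
        [int]$HighWaterMark = 20000, # placeholder for the configurable high water mark
        [int]$LowWaterMark  = 150    # placeholder for the configurable low water mark
    )
    switch ($CurrentState) {
        "Mirroring" { if ($QueuedWrites -ge $HighWaterMark) { return "Paused" } }    # too far behind: pause, automatic failover disabled
        "Paused"    { if ($QueuedWrites -le $LowWaterMark)  { return "Resync" } }    # queue has drained: start resynchronizing
        "Resync"    { if ($QueuedWrites -eq 0)              { return "Mirroring" } } # back in sync: automatic failover re-enabled
    }
    return $CurrentState
}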

As long as your WAN link is not saturated or broken, you should never see more than a few writes in this async queue at any given time.  In an unexpected failure (think pulled power cord) you will lose any write that is in the async queue.  This is the trade-off you make when you want the excellent recovery point objective (RPO) and recovery time objective (RTO) that you achieve with a multi-site cluster, but your WAN link has too much latency to effectively support synchronous replication.

Try SteelEye DataKeeper

Take time to monitor the DataKeeper Async Queue via Windows Performance Logs and Alerts. I think you will be pleasantly surprised to find that most of the time the async queue is empty, thanks to the efficiency of the DataKeeper replication engine.  Even in times of heavy writes, the async queue seldom grows very large and drains almost immediately, so the amount of data at risk at any given time is minimal.  Compared to the alternative of restoring from last night’s backup after a disaster, the number of writes you could lose in an unexpected failure using asynchronous replication is minimal!
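If you would rather script that monitoring than build a Performance Logs and Alerts collector by hand, the following is a minimal PowerShell sketch. The LogicalDisk counter shown is a standard Windows counter used only to illustrate the sampling pattern; substitute the actual DataKeeper async queue counter path that Performance Monitor lists on your server.

# Sample a queue-length counter every 15 seconds for one hour and save it to a binary perf log.
# Replace the counter path below with the DataKeeper async queue counter shown in Performance Monitor.
Get-Counter -Counter "\LogicalDisk(E:)\Current Disk Queue Length" -SampleInterval 15 -MaxSamples 240 |
    Export-Counter -Path "C:\PerfLogs\AsyncQueue.blg" -FileFormat BLG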

Of course, there are some instances where losing even a single write is not tolerable.  In those cases, it is recommended to use SteelEye DataKeeper’s synchronous replication option across a high-speed, low-latency LAN or WAN connection.

Reproduced with permission from Clusteringformeremortals.com

Filed Under: Clustering Simplified, Datakeeper Tagged With: cluster

Why I Should Convert My #Azure Clusters To Managed Disks Today!

August 9, 2017 by Jason Aw

You may have heard about the recent storage outage that impacted some instances in the US East region back on March 16th. A root cause analysis of the outage is posted here.

March 16th US East Storage Outage

Customer impact: A subset of customers using Storage in the East US region may have experienced errors and timeouts while accessing their storage account in a single Storage scale unit

You might be asking, “What is a single Storage scale unit?” Well, you can think of it as a single storage cluster, or single SAN, or however you want to think about it. I don’t think Azure publishes its exact infrastructure, but you can probably assume that behind the scenes they are using Scale-Out File Servers for backend storage.

So the question is, how could I have survived this outage with minimal downtime? If you read further down that root cause analysis you come across this little nugget.

Virtual Machines using Managed Disks in an Availability Set would have maintained availability during this incident.

What are Managed Disks, you ask? Well, just on February 8th Corey Sanders announced the GA of Managed Disks. You can read all about Managed Disks here: https://azure.microsoft.com/en-us/services/managed-disks/

The reason why Managed Disks would have helped in this outage is that by leveraging an Availability Set combined with Managed Disks, you ensure that each of the instances in your Availability Set is connected to a different “Storage scale unit”. So in this particular case, only one of your cluster nodes would have failed, leaving the remaining nodes to take over the workload.

Prior to Managed Disks being available (anything deployed before 2/8/2017), there was no way to ensure that the storage attached to your servers resided on different Storage scale units. Sure, you could use a different storage account for each instance, but in reality that did not guarantee that those Storage Accounts provisioned storage on different Storage scale units.

So while an Availability Set ensured that your instances resided in different Fault Domains and Update Domains to keep the instances themselves available, the additional storage attached to each instance really represented a single point of failure. Although the storage itself is highly resilient, with three copies of your data and geo-redundant options available, in this case a power failure took down the entire Storage scale unit along with all the servers attached to it.

So, long story short…migrate to Managed Disks as soon as possible in order to help minimize downtime.

https://docs.microsoft.com/en-us/azure/virtual-machines/virtual-machines-windows-migrate-to-managed-disks
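For reference, here is a hedged sketch of that conversion using the current Az PowerShell module (the original post predates the Az module). The resource group, availability set, and VM names are hypothetical, and each VM is deallocated during its conversion, so plan for downtime per node.

# Sketch only - adjust the names to your environment and test outside production first.
# 1. Convert the Availability Set so it supports managed disks.
$avSet = Get-AzAvailabilitySet -ResourceGroupName "cluster-rg" -Name "cluster-as"
Update-AzAvailabilitySet -AvailabilitySet $avSet -Sku Aligned

# 2. Deallocate and convert each VM in the set (repeat for every cluster node).
Stop-AzVM -ResourceGroupName "cluster-rg" -Name "sql1" -Force
ConvertTo-AzVMManagedDisk -ResourceGroupName "cluster-rg" -VMName "sql1"   # the VM is started again after conversion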

And if you really want to minimize downtime you should consider Hybrid Cloud Deployments that span cloud providers or on-prem to cloud!

 

Reposted from the original post by Dave Bermingham, Microsoft Clustering MVP – https://clusteringformeremortals.com/2017/03/22/why-i-should-convert-my-azure-clusters-to-managed-disks-today/

Filed Under: Clustering Simplified Tagged With: #SANLess Clusters for SQL Server Environments, #SANLess Clusters for Windows Environments, Awards, Azure, azure availability managed disk, Clustering 101, High Availability

Deploy 2-node File Server Failover Cluster in Azure using ARM

July 19, 2017 by Jason Aw

In this post we will detail the specific steps required to deploy a 2-node File Server Failover Cluster in a single region of Azure using Azure Resource Manager. I will assume you are familiar with basic Azure concepts as well as basic Failover Cluster concepts and will focus this article on what is unique about deploying a File Server Failover Cluster in Azure.

With DataKeeper Cluster Edition you are able to take the locally attached storage, whether it is Premium or Standard Disks, and replicate those disks synchronously, asynchronously, or a mix of both, between two or more cluster nodes. In addition, a DataKeeper Volume resource is registered in Windows Server Failover Clustering which takes the place of a Physical Disk resource. Instead of controlling SCSI-3 reservations like a Physical Disk Resource, the DataKeeper Volume controls the mirror direction, ensuring the active node is always the source of the mirror. As far as Failover Clustering is concerned, it looks, feels and smells like a Physical Disk and is used the same way a Physical Disk Resource would be used.

Pre-Requisites

  • You have used the Azure Portal before and are comfortable deploying virtual machines in Azure IaaS.
  • You have obtained a license or evaluation license of SIOS DataKeeper.

Deploying A File Server Failover Cluster Instance Using The Azure Portal

To build a 2-node File Server Failover Cluster Instance in Azure, we are going to assume you have a basic Virtual Network based on Azure Resource Manager and you have at least one virtual machine up and running and configured as a Domain Controller. Once you have a Virtual Network and a Domain configured, you are going to provision two new virtual machines which will act as the two nodes in our cluster.

Our environment will look like this:

DC1 – Our Domain Controller and File Share Witness
SQL1 and SQL2 – The two nodes of our File Server Cluster

Provisioning The Two Cluster Nodes (SQL1 And SQL2)

Using the Azure Portal, we will provision both SQL1 and SQL2 exactly the same way. There are numerous options to choose from including instance size, storage options, etc. This guide is not meant to be an exhaustive guide to deploying Servers in Azure as there are some really good resources out there and more published every day. However, there are a few key things to keep in mind when creating your instances, especially in a clustered environment.

Availability Set – It is important that SQL1, SQL2, and DC1 all reside in the same Availability Set. By putting them in the same Availability Set we are ensuring that each cluster node and the file share witness reside in a different Fault Domain and Update Domain. This helps guarantee that during both planned and unplanned maintenance the cluster will continue to be able to maintain quorum and avoid downtime.

Figure 3 – Be sure to add both cluster nodes and the file share witness to the same Availability Set
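If you prefer to script this step rather than use the portal, a minimal sketch with the Az PowerShell module follows. The names and domain counts are hypothetical, and the VMs must be assigned to this Availability Set at the time they are created.

# Create the Availability Set that the two cluster nodes and the file share witness will share.
New-AzAvailabilitySet -ResourceGroupName "cluster-rg" -Name "cluster-as" -Location "eastus" `
    -PlatformFaultDomainCount 2 -PlatformUpdateDomainCount 5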

Static IP Address

Once each VM is provisioned, you will want to go into its network settings and change the IP address to Static. We do not want the IP address of our cluster nodes to change.

Figure 4 – Make sure each cluster node uses a static IP
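The same change can be scripted. Here is a minimal sketch assuming a NIC named "sql1-nic" in resource group "cluster-rg" (hypothetical names); repeat it for each cluster node.

# Switch the primary IP configuration of the node's NIC from Dynamic to Static.
$nic = Get-AzNetworkInterface -ResourceGroupName "cluster-rg" -Name "sql1-nic"
$nic.IpConfigurations[0].PrivateIpAllocationMethod = "Static"
Set-AzNetworkInterface -NetworkInterface $nic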

Storage

As far as Storage is concerned, you will want to consult Performance best practices for SQL Server in Azure Virtual Machines. In any case, you will minimally need to add at least one additional disk to each of your cluster nodes. DataKeeper can use Basic Disks, Premium Storage, or even Storage Pools consisting of multiple disks. Just be sure to add the same amount of storage to each cluster node and configure it identically. Also, be sure to use a different storage account for each virtual machine to ensure that a problem with one Storage Account does not impact both virtual machines at the same time.

Figure 5 – Make sure to add additional storage to each cluster node
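If you are scripting the deployment, attaching a data disk looks roughly like the sketch below. The names and size are hypothetical and the example assumes managed disks; repeat it identically on the second node.

# Attach one empty 128 GB data disk at LUN 0 to SQL1; do the same for SQL2.
$vm = Get-AzVM -ResourceGroupName "cluster-rg" -Name "sql1"
Add-AzVMDataDisk -VM $vm -Name "sql1-data1" -Lun 0 -CreateOption Empty -DiskSizeInGB 128 -Caching None
Update-AzVM -ResourceGroupName "cluster-rg" -VM $vm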

Create The Cluster

Assuming both cluster nodes (SQL1 and SQL2) have been provisioned as described above and added to your existing domain, we are ready to create the cluster. Before we create the cluster, there are a few Features that need to be enabled. These features are .Net Framework 3.5 and Failover Clustering, and they need to be enabled on both cluster nodes. You will also need to enable the File Server role.

Figure 6 – Enable both the .Net Framework 3.5 and Failover Clustering features and the File Server role on both cluster nodes
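You can also enable the role and features from an elevated PowerShell prompt instead of Server Manager. Run the following on both nodes; note that installing .NET Framework 3.5 may require the -Source parameter pointing at installation media on some builds.

# Enable .NET Framework 3.5, Failover Clustering, and the File Server role.
Install-WindowsFeature -Name NET-Framework-Core, Failover-Clustering, FS-FileServer -IncludeManagementTools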

Once that role and those features have been enabled, you are ready to build your cluster. Most of the steps I’m about to show you can be performed both via PowerShell and the GUI. However, I’m going to recommend that for this very first step you use PowerShell to create your cluster. If you choose to use the Failover Cluster Manager GUI to create the cluster, you will find that the cluster winds up being issued a duplicate IP address.

Without going into great detail, what you will find is that Azure VMs have to use DHCP. By specifying a “Static IP” when we create the VM in the Azure portal, all we did was create a sort of DHCP reservation. It is not exactly a DHCP reservation because a true DHCP reservation would remove that IP address from the DHCP pool. Instead, specifying a Static IP in the Azure portal simply means that if that IP address is still available when the VM requests it, Azure will issue that IP to it. However, if your VM is offline and another host comes online in that same subnet, it very well could be issued that same IP address.

There is another strange side effect to the way Azure has implemented DHCP. When creating a cluster with the Windows Server Failover Cluster GUI when hosts use DHCP (which they have to), there is no option to specify a cluster IP address. Instead it relies on DHCP to obtain an address. The strange thing is, DHCP will issue a duplicate IP address, usually the same IP address as the host requesting a new IP address. The cluster creation will usually complete, but you may see some strange errors and you may need to run the Windows Server Failover Cluster GUI from a different node in order to get it to run. Once you get it to run, you will want to change the cluster IP address to an address that is not currently in use on the network.

You can avoid that whole mess by simply creating the cluster via PowerShell and specifying the cluster IP address as part of the PowerShell command to create the cluster.

You can create the cluster using the New-Cluster command as follows:

New-Cluster -Name cluster1 -Node sql1,sql2 -StaticAddress 10.0.0.101 -NoStorage

After the cluster creation completes, you will also want to run the cluster validation by running the following command:

Test-Cluster

Figure 7 – The output of the cluster creation and the cluster validation commands

Create The File Share Witness

Because there is no shared storage, you will need to create a file share witness on another server in the same Availability Set as the two cluster nodes. By putting it in the same Availability Set you can be sure that you only lose one vote from your quorum at any given time. If you are unsure how to create a File Share Witness, you can review this article: http://www.howtonetworking.com/server/cluster12.htm. In my demo I put the file share witness on the domain controller. I have published an exhaustive explanation of cluster quorums at https://blogs.msdn.microsoft.com/microsoft_press/2014/04/28/from-the-mvps-understanding-the-windows-server-failover-cluster-quorum-in-windows-server-2012-r2/
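Once the share exists and the cluster computer account has write access to it, pointing the quorum at it is a one-line PowerShell command. The share name below is hypothetical.

# Configure the cluster quorum to use the file share witness hosted on DC1.
Set-ClusterQuorum -FileShareWitness \\DC1\FSW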

Install DataKeeper

After the cluster is created, it is time to install DataKeeper. It is important to install DataKeeper after the initial cluster is created so the custom cluster resource type can be registered with the cluster. If you installed DataKeeper before the cluster was created, you will simply need to run the install again and do a repair installation.

Figure 8 – Install DataKeeper after the cluster is created

During the installation you can take all of the default options.  The service account you use must be a domain account and be in the local administrators group on each node in the cluster.

Figure 9 – The service account must be a domain account that is in the Local Admins group on each node

Once DataKeeper is installed and licensed on each node you will need to reboot the servers.

Create The DataKeeper Volume Resource

To create the DataKeeper Volume Resource you will need to start the DataKeeper UI and connect to both of the servers.
Connect to SQL1

Connect to SQL2

Once you are connected to each server, you are ready to create your DataKeeper Volume. Right-click on Jobs and choose “Create Job”.

Give the Job a name and description.

Choose your source server, IP, and volume. The IP address determines where the replication traffic will travel.

Choose your target server.

Choose your options. For our purposes where the two VMs are in the same geographic region we will choose synchronous replication. For longer distance replication you will want to use asynchronous and enable some compression.

By clicking yes at the last pop-up you will register a new DataKeeper Volume Resource in Available Storage in Failover Clustering.

You will see the new DataKeeper Volume Resource in Available Storage.

Create The File Server Cluster Resource

To create the File Server Cluster Resource, we will use PowerShell once again rather than the Failover Cluster interface. This is because when the virtual machines are configured to use DHCP, the GUI-based wizard will not prompt us to enter a cluster IP address; instead, it will issue a duplicate IP address. To avoid this, we will use a simple PowerShell command to create the File Server Cluster Resource and specify the IP address.

Add-ClusterFileServerRole -Storage "DataKeeper Volume E" -Name FS2 -StaticAddress 10.0.0.201

Make note of the IP address you specify here. It must be a unique IP address on your network. We will use this same IP address later when we create our Internal Load Balancer.

Create The Internal Load Balancer

Here is where failover clustering in Azure differs from traditional infrastructures. The Azure network stack does not support gratuitous ARPs, so clients cannot connect directly to the cluster IP address. Instead, clients connect to an internal load balancer and are redirected to the active cluster node. What we need to do is create an internal load balancer. This can all be done through the Azure Portal as shown below.

First, create a new Load Balancer

You can use a Public Load Balancer if your clients connect over the public internet, but assuming your clients reside in the same vNet, we will create an Internal Load Balancer. The important thing to take note of here is that the Virtual Network is the same as the network where your cluster nodes reside. Also, the Private IP address that you specify must be EXACTLY the same as the address you used to create the File Server Cluster Resource (10.0.0.201 in our example).

After the Internal Load Balancer (ILB) is created, you will need to edit it. The first thing we will do is add a backend pool. Through this process you will choose the Availability Set where your cluster VMs reside. However, when you choose the actual VMs to add to the Backend Pool, be sure you do not choose your file share witness. We do not want to redirect file server traffic to the file share witness.

The next thing we will do is add a Probe. The probe we add will probe Port 59999. This probe determines which node is active in our cluster.

And then finally, we need a load balancing rule to redirect the SMB traffic on TCP port 445. The important thing to notice in the screenshot below is that Direct Server Return is Enabled. Make sure you make that change.
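For completeness, here is a hedged sketch of the same ILB built with the Az PowerShell module instead of the portal. The resource group, vNet, and subnet names are hypothetical; the frontend IP must match the File Server cluster address (10.0.0.201), and -EnableFloatingIP is the PowerShell equivalent of Direct Server Return. After the load balancer is created you still need to add the NICs of SQL1 and SQL2 to the backend pool.

# Sketch only - names are placeholders; verify everything against your own environment.
$vnet   = Get-AzVirtualNetwork -ResourceGroupName "cluster-rg" -Name "cluster-vnet"
$subnet = Get-AzVirtualNetworkSubnetConfig -VirtualNetwork $vnet -Name "default"

$fe    = New-AzLoadBalancerFrontendIpConfig -Name "FS2-frontend" -PrivateIpAddress "10.0.0.201" -Subnet $subnet
$pool  = New-AzLoadBalancerBackendAddressPoolConfig -Name "FS2-pool"
$probe = New-AzLoadBalancerProbeConfig -Name "FS2-probe" -Protocol Tcp -Port 59999 -IntervalInSeconds 5 -ProbeCount 2
$rule  = New-AzLoadBalancerRuleConfig -Name "SMB-445" -FrontendIpConfiguration $fe -BackendAddressPool $pool `
         -Probe $probe -Protocol Tcp -FrontendPort 445 -BackendPort 445 -EnableFloatingIP

New-AzLoadBalancer -ResourceGroupName "cluster-rg" -Name "FS2-ilb" -Location "eastus" `
    -FrontendIpConfiguration $fe -BackendAddressPool $pool -Probe $probe -LoadBalancingRule $rule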


Fix The File Server IP Resource

The final step in the configuration is to run the following PowerShell script on one of your cluster nodes. This will allow the Cluster IP Address to respond to the ILB probes and ensure that there is no IP address conflict between the Cluster IP Address and the ILB. Please take note: you will need to edit this script to fit your environment. The subnet mask is set to 255.255.255.255; this is not a mistake, leave it as is. This creates a host-specific route to avoid IP address conflicts with the ILB.

# Define variables
$ClusterNetworkName = ""
# the cluster network name
# (use Get-ClusterNetwork on Windows Server 2012 or higher to find the name)
$IPResourceName = ""
# the IP Address resource name
$ILBIP = ""
# the IP address of the Internal Load Balancer (ILB)

Import-Module FailoverClusters

# If you are using Windows Server 2012 or higher:
Get-ClusterResource $IPResourceName | Set-ClusterParameter -Multiple @{Address=$ILBIP;ProbePort=59999;SubnetMask="255.255.255.255";Network=$ClusterNetworkName;EnableDhcp=0}

# If you are using Windows Server 2008 R2 use this:
#cluster res $IPResourceName /priv enabledhcp=0 address=$ILBIP probeport=59999 subnetmask=255.255.255.255

Creating File Shares

You will find that using the File Share Wizard in Failover Cluster Manager does not work. Instead, you will simply create the file shares in Windows Explorer on the active node. Failover clustering automatically picks up those shares and puts them in the cluster.
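Here is a minimal sketch of creating a share from PowerShell on the active node; the folder path, share name, and group are hypothetical.

# Create a folder on the replicated volume and share it; the cluster picks the share up automatically.
New-Item -Path "E:\Shares\Data" -ItemType Directory -Force | Out-Null
New-SmbShare -Name "Data" -Path "E:\Shares\Data" -FullAccess "CONTOSO\Domain Users"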

Note that the “Continuous Availability” option of a file share is not supported in this configuration.

Conclusion

You should now have a functioning File Server Failover Cluster. If you have ANY problems, please reach out to me on Twitter @daveberm. Need a DataKeeper evaluation key? Fill out the form at http://us.sios.com/clustersyourway/cta/14-day-trial and SIOS will send an evaluation key out to you.

Reproduced with permission from Clusteringformeremortals

Filed Under: Clustering Simplified, Datakeeper Tagged With: #SANLess, #SANLess Clusters for SQL Server Environments, #SANLess Clusters for Windows Environments, Azure, DataKeeper Cluster Edition, file server failover cluster, highly available cluster, MSSQLTips, SQL Server
