SIOS SANless clusters


Introduction To Clusters – Part 1

November 18, 2021 by Jason Aw


What is clustering in the first place?

Clustering is a technology that connects multiple servers so that they act as a single functional unit.

Types of clustering

You can cluster servers for several purposes. For example, you can combine the processing power of multiple small servers for high performance. You can also distribute processing work to multiple nodes using a load balancer for added efficiency.

High availability (HA) clustering is a process of combining server nodes to protect important applications from downtime and data loss.

In a traditional shared storage failover cluster, a primary node and secondary or remote node share the same storage.

HA Clustering

High availability (HA) clustering is a mechanism that reduces downtime by eliminating single points of failure (SPOF). In an HA cluster, important applications are run on a primary node which is connected to one or more secondary or remote nodes in a cluster. Clustering software monitors the health of the application, server, and network. In the event of a failure on the primary node, it moves application operations over to a secondary node in a process called a failover, where operation continues.

High Availability

Application high availability is a measure of how much time in a given year an application is available and operational. In general, HA clusters provide 99.99% ("four nines") availability, or a little more than 52 minutes of downtime over the course of a given year.
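
To see where that figure comes from, multiply the minutes in a year by the allowed unavailability:

365 days × 24 hours × 60 minutes = 525,600 minutes per year
525,600 minutes × (1 − 0.9999) ≈ 52.6 minutes of downtime per year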

It is important to note that in a traditional HA cluster, all of the cluster nodes are connected to the same shared storage – typically a SAN. In this way, after a failover, the secondary node is accessing the same data as the primary node and operation can continue.

SANless cluster synchronizes local storage using host-based block level replication.

SANless Clusters

However, many companies prefer to use a SANless cluster for several reasons. First, shared storage represents a critical single point of failure. Second, shared storage is often not an option in public cloud environments. Third, SANs can sometimes impede performance of database applications, such as SQL Server, Oracle, and SAP.

Instead of shared storage, these companies use efficient, host-based, block-level replication to synchronize local storage on all cluster nodes. In the event of a failover, the secondary node is connected to local storage with an identical copy of the primary storage. This not only eliminates the SAN SPOF risk but also enables the addition of fast disk (SSD) to local on-premises storage for cost-efficient high performance. SANless clustering also enables companies to migrate on-premises HA environments to the cloud with minimal effort or disruption of ongoing business processes.

Reproduced from SIOS

Filed Under: Clustering Simplified Tagged With: cluster, Clustering, SIOS

Disaster Recovery Made Simple

October 22, 2021 by Jason Aw


Heard the term disaster recovery (DR) thrown around often? DR is a strategy and set of policies, procedures, and tools. It ensures critical IT systems, databases, and applications continue to operate and be available to users when a man-made or natural disaster happens. It typically involves moving application operation to a redundant DR environment that is geographically separated from the primary environment. While the IT team owns the disaster recovery strategy, DR is an important component of every organization’s Business Continuity Plan. The latter is a strategy and set of policies, procedures, and tools to ensure business operations continue through an interruption in service.

It may sound confusing at first. But we’ve collected some quick facts to make disaster recovery simple to understand:

Point 1. Implement an IT disaster recovery plan (DRP)

A DRP is a strategy and set of policies, procedures, and tools that ensure critical IT systems, databases, and applications continue to operate and be available to users when a disaster strikes the organization’s primary computing environment. While the IT team owns the disaster recovery strategy, DR is an important component of every organization’s Business Continuity Plan.

Point 2. Ensure Geographic Separation

An essential part of application disaster recovery is ensuring there is a redundant, geographically separated application environment available, along with efficient block-level replication and clustering software that can fail operation over to it in the event of a disaster. If your application is running in a cloud, your clustering environment should fail over across cloud regions and availability zones for disaster recovery.

Point 3. Test, test, and test some more

In a recent Spiceworks survey, 59 percent of organizations indicated they had experienced one to three outages (that is, any interruption to normal levels of IT-related service) over the course of one year, 11 percent had experienced four to six, and 7 percent had experienced seven or more. In short, a DR event is nearly inevitable. Be sure you conduct regular testing to ensure you know exactly what will happen when it does.

Point 4. Understand Your Risk

The disaster in DR does not need to be a full-fledged hurricane, tornado, flood, or earthquake that impacts your business. Disasters come in many forms, including a cyber-attack, fire, theft, or vandalism. In fact, simple human error still rates among the leading causes of IT data center downtime. In short, a disaster is any crisis that results in a down system for a long duration and/or major data loss on a large scale that impacts your IT infrastructure, data center, and your business.

Point 5. Ensure Your DRP has a Checklist

It should include critical IT systems and networks, prioritized by their recovery time objective (RTO). Document the steps needed to restart, reconfigure, and recover systems and networks. Employees should know where to locate the DRP and how to execute basic emergency steps in the event of an unforeseen incident.

Point 6. Substantiate DRPs through testing

Testing a DRP identifies deficiencies and provides opportunities to fix problems before a disaster occurs. Testing can offer proof that the plan is effective and that it will enable you to meet recovery point and recovery time objectives (RPOs and RTOs). Since IT systems and technologies are constantly changing, DR testing also helps ensure a disaster recovery plan is up to date.

Choose a failover clustering technology that makes DR testing simple by facilitating fast, simple, reliable switchover of application operation to DR nodes and back.

When you look at those statistics, you know you are living on borrowed time if you don't have a disaster recovery plan in place. The SIOS disaster recovery solution is a multi-site, geographically dispersed cluster that meets RPOs and RTOs with ease. What makes SIOS different from many other DR providers is that it offers one solution that meets both high availability and disaster recovery needs. To learn more about our DR solutions, check out the insights page here.

Reproduced with permission from SIOS

Filed Under: Clustering Simplified Tagged With: cluster, clusters, disaster recovery, RPO, RTO

Step-By-Step: How to configure a SANless MySQL Linux failover cluster in Amazon EC2

August 18, 2020 by Jason Aw


In this step-by-step guide, I will take you through all the steps required to configure a highly available, 2-node MySQL cluster (plus witness server) in Amazon's Elastic Compute Cloud (Amazon EC2). The guide includes screenshots, shell commands, and code snippets as appropriate. I assume that you are somewhat familiar with Amazon EC2 and already have an account. If not, you can sign up today. I'm also going to assume that you have basic familiarity with Linux system administration and failover clustering concepts like Virtual IPs, etc.

Failover clustering has been around for many years. In a typical configuration, two or more nodes are configured with shared storage to ensure that in the event of a failure on the primary node, the secondary or target node(s) will access the most up-to-date data. Using shared storage not only enables a near-zero recovery point objective, it is a mandatory requirement for most clustering software. However, shared storage presents several challenges. First, it is a single point of failure risk: if the shared storage – typically a SAN – fails, all nodes in the cluster fail. Second, SANs can be expensive and complex to purchase, set up, and manage. Third, shared storage in public clouds, including Amazon EC2, is either not possible or not practical for companies that want to maintain high availability (99.99% uptime), near-zero recovery time and recovery point objectives, and disaster recovery protection.

The following demonstrates how easy it is to create a SANless cluster in the cloud to eliminate these challenges while meeting stringent HA/DR SLAs. The steps below use a MySQL database with Amazon EC2, but the same steps could be adapted to create a 2-node cluster in AWS to protect SQL Server, SAP, Oracle, or any other application.

NOTE: Your view of features, screens and buttons may vary slightly from screenshots presented below

1. Create a Virtual Private Cloud (VPC)
2. Create an Internet Gateway
3. Create Subnets (Availability Zones)
4. Configure Route Tables
5. Configure Security Group
6. Launch Instances
7. Create Elastic IP
8. Create Route Entry for the Virtual IP
9. Disable Source/Dest Checking for ENIs
10. Obtain Access Key ID and Secret Access Key
11. Linux OS Configuration
12. Install EC2 API Tools
13. Install and Configure MySQL
14. Install and Configure Cluster
15. Test Cluster Connectivity

Overview

This article will describe how to create a cluster within a single Amazon EC2 region. The cluster nodes (node1, node2, and the witness server) will reside in different Availability Zones for maximum availability. This also means that the nodes will reside in different subnets.

The following IP addresses will be used:

  • node1: 10.0.0.4
  • node2: 10.0.1.4
  • witness: 10.0.2.4
  • virtual/”floating” IP: 10.1.0.10

Step 1: Create a Virtual Private Cloud (VPC)

First, create a Virtual Private Cloud (aka VPC). A VPC is an isolated network within the Amazon cloud that is dedicated to you. You have full control over things like IP address blocks and subnets, route tables, security groups (i.e. firewalls), and more. You will be launching your EC2 instances (VMs) into this VPC.

From the main AWS dashboard, select “VPC”

Under “Your VPCs”, make sure you have selected the proper region at the top right of the screen. In this guide the “US West (Oregon)” region will be used, because it is a region that has 3 Availability Zones. For more information on Regions and Availability Zones, click here.

Give the VPC a name, and specify the IP block you wish to use. 10.0.0.0/16 will be used in this guide:

You should now see the newly created VPC on the “Your VPCs” screen:
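
If you prefer to script your environment rather than click through the console, the same VPC can be created from the command line. This is a minimal sketch assuming the modern AWS CLI is installed and configured with credentials (this guide otherwise uses the console and the older EC2 API tools); the Name tag value is illustrative:

# Create the VPC with the 10.0.0.0/16 block; note the VpcId in the output
aws ec2 create-vpc --cidr-block 10.0.0.0/16

# Optionally give it a friendly name (substitute the VpcId returned above)
aws ec2 create-tags --resources vpc-xxxxxxxx --tags Key=Name,Value=MyVPC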

Step 2: Create an Internet Gateway

Next, create an Internet Gateway. This is required if you want your Instances (VMs) to be able to communicate with the internet.

On the left menu, select Internet Gateways and click the Create Internet Gateway button. Give it a name, and create:

Next, attach the internet gateway to your VPC:

Select your VPC, and click Attach:
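
For reference, the equivalent AWS CLI calls look like this (a sketch; the igw-/vpc- IDs are placeholders for the values returned in your environment):

# Create the internet gateway, then attach it to the VPC
aws ec2 create-internet-gateway
aws ec2 attach-internet-gateway --internet-gateway-id igw-xxxxxxxx --vpc-id vpc-xxxxxxxx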

 

Step 3: Create Subnets (Availability Zones)

Next, create 3 subnets. Each subnet will reside in its own Availability Zone. The 3 Instances (VMs: node1, node2, witness) will be launched into separate subnets (and therefore Availability Zones) so that the failure of an Availability Zone won't take out multiple nodes of the cluster.

The US West (Oregon) region, aka us-west-2, has 3 availability zones (us-west-2a, us-west-2b, us-west-2c). Create 3 subnets, one in each of the 3 availability zones.

Under VPC Dashboard, navigate to Subnets, and then Create Subnet:

Give the first subnet a name ("Subnet1"), select the availability zone us-west-2a, and define the network block (10.0.0.0/24):

Repeat to create the second subnet in availability zone us-west-2b:

Repeat to create the third subnet in availability zone us-west-2c:

Once complete, verify that the 3 subnets have been created, each with a different CIDR block, and in separate Availability Zones, as seen below:
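
For CLI users, the three subnets map to three create-subnet calls (a sketch; substitute your actual VpcId for the placeholder):

aws ec2 create-subnet --vpc-id vpc-xxxxxxxx --cidr-block 10.0.0.0/24 --availability-zone us-west-2a
aws ec2 create-subnet --vpc-id vpc-xxxxxxxx --cidr-block 10.0.1.0/24 --availability-zone us-west-2b
aws ec2 create-subnet --vpc-id vpc-xxxxxxxx --cidr-block 10.0.2.0/24 --availability-zone us-west-2c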

Step 4: Configure Route Tables

Update the VPC’s route table so that traffic to the outside world is sent to the Internet Gateway created in a previous step. From the VPC Dashboard, select Route Tables. Go to the Routes tab, and by default only one route will exist which allows traffic only within the VPC.

Click Edit:

Add another route:

The Destination of the new route will be “0.0.0.0/0” (the internet) and for Target, select your Internet Gateway. Then click Save:
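
The same default route can be added with the CLI (a sketch; the rtb-/igw- IDs are placeholders for your route table and internet gateway):

aws ec2 create-route --route-table-id rtb-xxxxxxxx --destination-cidr-block 0.0.0.0/0 --gateway-id igw-xxxxxxxx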

Next, associate the 3 subnets with the Route Table. Click the “Subnet Associations” tab, and Edit:

Check the boxes next to all 3 subnets, and Save:

Verify that the 3 subnets are associated with the main route table:

Later, we will come back and update the Route Table once more, defining a route that will allow traffic to communicate with the cluster’s Virtual IP, but this needs to be done AFTER the linux Instances (VMs) have been created.

Step 5: Configure Security Group

Edit the Security Group (a virtual firewall) to allow incoming SSH and VNC traffic. Both will later be used to configure the linux instances as well as installation/configuration of the cluster software.

On the left menu, select “Security Groups” and then click the “Inbound Rules” tab. Click Edit:

Add rules for both SSH (port 22) and VNC. VNC generally uses ports in the 5900 range, depending on how you configure it, so for the purposes of this guide we will open the 5900-5910 port range. Configure accordingly based on your VNC setup:
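
As a CLI sketch (the sg- ID is a placeholder), the two inbound rules look like this; in production you would restrict the source CIDR to your own address range rather than 0.0.0.0/0:

aws ec2 authorize-security-group-ingress --group-id sg-xxxxxxxx --protocol tcp --port 22 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-id sg-xxxxxxxx --protocol tcp --port 5900-5910 --cidr 0.0.0.0/0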

Step 6: Launch Instances

We will be provisioning 3 Instances (Virtual Machines) in this guide. The first two VMs (called "node1" and "node2") will function as cluster nodes with the ability to bring the MySQL database and its associated resources online. The 3rd VM will act as the cluster's witness server for added protection against split-brain.

To ensure maximum availability, all 3 VMs will be deployed into different Availability Zones within a single region. This means each instance will reside in a different subnet.

Go to the main AWS dashboard, and select EC2:

 

Create “node1”

Create your first instance (“node1”). Click Launch Instance:

Select your linux distribution. The cluster software used later supports RHEL, SLES, CentOS and Oracle Linux. In this guide we will be using RHEL 7.X:

Size your instance accordingly. For the purposes of this guide and to minimize cost, t2.micro size was used because it’s free tier eligible. See here for more information on instance sizes and pricing.

Next, configure instance details. IMPORTANT: make sure to launch this first instance (VM) into “Subnet1“, and define an IP address valid for the subnet (10.0.0.0/24) – below 10.0.0.4 is selected because it’s the first free IP in the subnet.
NOTE: .1/.2/.3 in any given subnet in AWS is reserved and can’t be used.

Next, add an extra disk to the cluster nodes (this will be done on both "node1" and "node2"). This disk will store our MySQL databases and will later be replicated between nodes.

NOTE: You do NOT need to add an extra disk to the “witness” node. Only “node1” and “node2”. Add New Volume, and enter in the desired size:

Define a Tag for the instance, Node1:

Associate the instance with the existing security group, so the firewall rules created previously will be active:

Click Launch:

IMPORTANT: If this is the first instance in your AWS environment, you’ll need to create a new key pair. The private key file will need to be stored in a safe location as it will be required when you SSH into the linux instances.

Create “node2”

Repeat the steps above to create your second linux instance (node2). Configure it exactly like Node1. However, make sure that you deploy it into “Subnet2” (us-west-2b availability zone). The IP range for Subnet2 is 10.0.1.0/24, so an IP of 10.0.1.4 is used here:

Make sure to add a 2nd disk to Node2 as well. It should be the same exact size as the disk you added to Node1:

Give the second instance a tag: "Node2":

Create “witness”

Repeat the steps above to create your third linux instance (witness). Configure it exactly like Node1 and Node2, EXCEPT you DON'T need to add a 2nd disk, since this instance will only act as a witness to the cluster and won't ever bring MySQL online.

Make sure that you deploy it into "Subnet3" (us-west-2c availability zone). The IP range for Subnet3 is 10.0.2.0/24, so an IP of 10.0.2.4 is used here:

NOTE: default disk configuration is fine for the witness node. A 2nd disk is NOT required:

Tag the witness node:

It may take a little while for your 3 instances to provision. Once complete, you'll see them listed as running in your EC2 console:

Step 7: Create Elastic IP

Next, create an Elastic IP, which is a public IP address that will be used to connect to your instance from the outside world. Select Elastic IPs in the left menu, and then click "Allocate New Address":

 

Select the newly created Elastic IP, right-click, and select “Associate Address”:

Associate this Elastic IP with Node1:

Repeat this for the other two instances if you want them to have internet access or be able to SSH/VNC into them directly.

Step 8: Create Route Entry for the Virtual IP

At this point all 3 instances have been created, and the route table will need to be updated one more time in order for the cluster’s Virtual IP to work. In this multi-subnet cluster configuration, the Virtual IP needs to live outside the range of the CIDR allocated to your VPC.

Define a new route that directs traffic destined for the cluster's Virtual IP (10.1.0.10) to the primary cluster node (Node1).

From the VPC Dashboard, select Route Tables, click Edit. Add a route for "10.1.0.10/32" with a target of Node1:
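
For reference, the equivalent CLI call (a sketch; the rtb- and i- IDs are placeholders for your route table and the Node1 instance):

aws ec2 create-route --route-table-id rtb-xxxxxxxx --destination-cidr-block 10.1.0.10/32 --instance-id i-xxxxxxxx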

Step 9: Disable Source/Dest Checking for ENIs

Next, disable Source/Dest Checking for the Elastic Network Interfaces (ENI) of your cluster nodes. This is required in order for the instances to accept network packets for the virtual IP address of the cluster.

Do this for all ENIs.

Select “Network Interfaces”, right-click on an ENI, and select “Change Source/Dest Check”.

Select “Disabled“:
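
This can also be scripted per ENI with the CLI (a sketch; the eni- ID is a placeholder); repeat it for each cluster node's interface:

aws ec2 modify-network-interface-attribute --network-interface-id eni-xxxxxxxx --no-source-dest-check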

Step 10: Obtain Access Key ID and Secret Access Key

Later in the guide, the cluster software will use the AWS Command Line Interface (CLI) to manipulate a route table entry for the cluster’s Virtual IP to redirect traffic to the active cluster node. In order for this to work, you will need to obtain an Access Key ID and Secret Access Key so that the AWS CLI can authenticate properly.

In the top-right of the EC2 Dashboard, click on your name, and underneath select “Security Credentials” from the drop-down:

Expand the “Access Keys (Access Key ID and Secret Access Key)” section of the table, and click “Create New Access Key”. Download Key File and store the file in a safe location.

Step 11: Configure Linux OS

Connect to the linux instance(s):

To connect to your newly created linux instances (via SSH), right-click on the instance and select “Connect”. This will display the instructions for connecting to the instance. You will need the Private Key File you created/downloaded in a previous step:

Example:

Here is where we will leave the EC2 Dashboard for a little while and get our hands dirty on the command line, which as a Linux administrator you should be used to by now.

You aren’t given the root password to your Linux VMs in AWS (or the default “ec2-user” account either), so once you connect, use the “sudo” command to gain root privileges:

$ sudo su -

Unless you already have a DNS server set up, you'll want to create host file entries on all 3 servers so that they can properly resolve each other by name. Edit /etc/hosts.

Add the following lines to the end of your /etc/hosts file:

10.0.0.4 node1
10.0.1.4 node2
10.0.2.4 witness
10.1.0.10 mysql-vip

Disable SELinux

Edit /etc/sysconfig/selinux and set "SELINUX=disabled":

# vi /etc/sysconfig/selinux

 

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#   enforcing - SELinux security policy is enforced.
#   permissive - SELinux prints warnings instead of enforcing.
#   disabled - No SELinux policy is loaded.
SELINUX=disabled

# SELINUXTYPE= can take one of these two values:
#   targeted - Targeted processes are protected,
#   mls - Multi Level Security protection.
SELINUXTYPE=targeted

Set Hostnames

By default, these Linux instances will have a hostname that is based upon the server’s IP address, something like “ip-10-0-0-4.us-west-2.compute.internal”

You might notice that if you attempt to modify the hostname the “normal” way (i.e. editing /etc/sysconfig/network, etc), after each reboot, it reverts back to the original!! I found a great thread in the AWS discussion forums that describes how to actually get hostnames to remain static after reboots.

Details here: https://forums.aws.amazon.com/message.jspa?messageID=560446

Comment out the modules that set the hostname in the "/etc/cloud/cloud.cfg" file. The following modules can be commented out using #.

# – set_hostname

# – update_hostname

Next, also change your hostname in /etc/hostname.
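
On RHEL 7.x, an alternative to hand-editing /etc/hostname is hostnamectl, which writes the same file (this is a sketch, not part of the original forum thread):

# hostnamectl set-hostname node1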

Reboot Cluster Nodes

Reboot all 3 instances so that SELinux is disabled, and the hostname changes take effect.

Install and Configure VNC (and related packages)

In order to access the GUI of our linux servers, and to later install and configure our cluster, install VNC server, as well as a handful of other required packages (cluster software needs the redhat-lsb and patch rpms).

# yum groupinstall "X Window System"

# yum groupinstall "Server with GUI"

# yum install tigervnc-server xterm wget unzip patch redhat-lsb

# vncpasswd

For RHEL 7.x / CentOS 7.x, the following URL is a great guide to getting VNC Server running:

https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-vnc-remote-access-for-the-gnome-desktop-on-centos-7

NOTE: This example configuration runs VNC on display 2 (:2, aka port 5902) and as root (not secure). Adjust accordingly!

# cp /lib/systemd/system/vncserver@.service /etc/systemd/system/vncserver@:2.service

# vi /etc/systemd/system/vncserver@:2.service

[Service]

Type=forking

# Clean any existing files in /tmp/.X11-unix environment
ExecStartPre=/bin/sh -c '/usr/bin/vncserver -kill %i > /dev/null 2>&1 || :'

ExecStart=/sbin/runuser -l root -c "/usr/bin/vncserver %i -geometry 1024x768"
PIDFile=/root/.vnc/%H%i.pid

ExecStop=/bin/sh -c '/usr/bin/vncserver -kill %i > /dev/null 2>&1 || :'

# systemctl daemon-reload

# systemctl enable vncserver@:2.service

# vncserver :2 -geometry 1024x768

For RHEL/CentOS 6.x systems:

# vi /etc/sysconfig/vncservers

 

VNCSERVERS="2:root"
VNCSERVERARGS[2]="-geometry 1024x768"

 

# service vncserver start

# chkconfig vncserver on

Open a VNC client and connect to <ElasticIP>:2. If you can't connect, it's likely the Linux firewall is in the way. Either open the VNC port we are using here (port 5902), or, for now, disable the firewall (NOT RECOMMENDED FOR PRODUCTION ENVIRONMENTS):

# systemctl stop firewalld

# systemctl disable firewalld
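
A less drastic option than disabling firewalld entirely is to open just the VNC display port used in this guide:

# firewall-cmd --permanent --add-port=5902/tcp
# firewall-cmd --reload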

Partition and Format the “data” disk

When the linux instances were launched, an extra disk was added to each cluster node to store the application data we will be protecting. In this case it happens to be MySQL databases.

The second disk should appear as /dev/xvdb. You can run the "fdisk -l" command to verify. You'll see that /dev/xvda (OS) is already being used.

# fdisk -l

Disk /dev/xvda: 10.7 GB, 10737418240 bytes, 20971520 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: gpt

#  Start    End       Size  Type             Name
1  2048     4095      1M    BIOS boot parti
2  4096     20971486  10G   Microsoft basic

Disk /dev/xvdb: 2147 MB, 2147483648 bytes, 4194304 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Here I will create a partition (/dev/xvdb1), format it, and mount it at the default location for MySQL, which is /var/lib/mysql. Perform the following steps on BOTH "node1" and "node2":

# fdisk /dev/xvdb

Welcome to fdisk (util-linux 2.23.2).

 

Changes will remain in memory only, until you decide to write them. Be careful before using the write command.

 

Device does not contain a recognized partition table

Building a new DOS disklabel with disk identifier 0x8c16903a.

 

Command (m for help): n

Partition type:

   p   primary (0 primary, 0 extended, 4 free)
   e   extended

Select (default p): p

Partition number (1-4, default 1): 1

First sector (2048-4194303, default 2048): <enter>

Using default value 2048

Last sector, +sectors or +size{K,M,G} (2048-4194303, default 4194303): <enter>

Using default value 4194303

Partition 1 of type Linux and of size 2 GiB is set

 

Command (m for help): w

The partition table has been altered!

Calling ioctl() to re-read partition table. Syncing disks.

# mkfs.ext4 /dev/xvdb1
# mkdir /var/lib/mysql

On node1, mount the filesystem:

# mount /dev/xvdb1 /var/lib/mysql

Step 12: Install EC2 API Tools

The EC2 API Tools (EC2 CLI) must be installed on each of the cluster nodes, so that the cluster software can later manipulate Route Tables, enabling connectivity to the Virtual IP.

The following URL is an excellent guide to setting this up:

http://docs.aws.amazon.com/AWSEC2/latest/CommandLineReference/set-up-ec2-cli-linux.html

Here are the key steps:

Download, unzip, and move the CLI tools to the standard location (/opt/aws):

# wget http://s3.amazonaws.com/ec2-downloads/ec2-api-tools.zip

# unzip ec2-api-tools.zip

# mv ec2-api-tools-1.7.5.1/ /opt/aws/

# export EC2_HOME="/opt/aws"

If java isn’t already installed (run “which java” to check), install it:

# yum install java-1.8.0-openjdk

# export JAVA_HOME="/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.71-

Example (based on the default config of a RHEL 7.2 system; adjust the path to match the JDK version actually installed).

You’ll need your AWS Access Key and AWS Secret Key. Keep these values handy, because they will be needed later during cluster setup too! Refer to the following URL for more information:

https://console.aws.amazon.com/iam/home?#security_credential

# export AWS_ACCESS_KEY=your-aws-access-key-id

# export AWS_SECRET_KEY=your-aws-secret-key

Test CLI utility functionality:

# /opt/aws/bin/ec2-describe-regions
REGION eu-west-1 ec2.eu-west-1.amazonaws.com
REGION ap-southeast-1 ec2.ap-southeast-1.amazonaws.com
REGION ap-southeast-2 ec2.ap-southeast-2.amazonaws.com
REGION eu-central-1 ec2.eu-central-1.amazonaws.com
REGION ap-northeast-2 ec2.ap-northeast-2.amazonaws.com
REGION ap-northeast-1 ec2.ap-northeast-1.amazonaws.com
REGION us-east-1 ec2.us-east-1.amazonaws.com
REGION sa-east-1 ec2.sa-east-1.amazonaws.com
REGION us-west-1 ec2.us-west-1.amazonaws.com
REGION us-west-2 ec2.us-west-2.amazonaws.com

Step 13: Install and Configure MySQL

Next, install the MySQL packages, initialize a sample database, and set “root” password for MySQL. In RHEL7.X, the MySQL packages have been replaced with the MariaDB packages.

On “node1”:

# yum install mariadb mariadb-server

# mount /dev/xvdb1 /var/lib/mysql

# /usr/bin/mysql_install_db --datadir="/var/lib/mysql/" --user=mysql

# mysqld_safe --user=root --socket=/var/lib/mysql/mysql.sock --port=3306 --datadir=/var/lib/mysql

#
# # NOTE: This next command allows remote connections from ANY host. NOT a good security practice!
# echo "update user set Host='%' where Host='node1'; flush privileges" | mysql mysql
#
# # Set MySQL's root password to 'SIOS'
# echo "update user set Password=PASSWORD('SIOS') where User='root'; flush privileges" | mysql mysql

Create a MySQL configuration file. We will place this on the data disk that will later be replicated (/var/lib/mysql/my.cnf). Example:

# vi /var/lib/mysql/my.cnf

 

[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
pid-file=/var/run/mariadb/mariadb.pid
user=root
port=3306
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0

[mysqld_safe]
log-error=/var/log/mariadb/mariadb.log
pid-file=/var/run/mariadb/mariadb.pid

[client]
user=root
password=SIOS

Move the original MySQL configuration file aside, if it exists:

# mv /etc/my.cnf /etc/my.cnf.orig
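
Before moving on, a quick sanity check is worthwhile. Assuming the daemon is still running with the socket path used above and the root password set earlier, this should list the system databases:

# mysql --socket=/var/lib/mysql/mysql.sock -u root -pSIOS -e "show databases;"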

On "node2", you ONLY need to install the MariaDB/MySQL packages. The other steps aren't required:

[root@node2 ~]# yum install mariadb mariadb-server

Step 14: Install and Configure the Cluster

At this point, we are ready to install and configure our cluster. SIOS Protection Suite for Linux (aka SPS-Linux) will be used in this guide as the clustering technology. It provides both high availability failover clustering features (LifeKeeper) as well as real-time, block level data replication (DataKeeper) in a single, integrated solution. SPS-Linux enables you to deploy a “SANLess” cluster, aka a “shared nothing” cluster meaning that cluster nodes don’t have any shared storage, as is the case with EC2 Instances.

Install SIOS Protection Suite for Linux

Perform the following steps on ALL 3 VMs (node1, node2, witness):

Download the SPS-Linux installation image file (sps.img) and obtain either a trial license or purchase permanent licenses. Contact SIOS for more information.

You will loopback mount it and run the "setup" script inside, as root (or first "sudo su -" to obtain a root shell). For example:

# mkdir /tmp/install

# mount -o loop sps.img /tmp/install

# cd /tmp/install

# ./setup

During the installation script, you’ll be prompted to answer a number of questions. You will hit Enter on almost every screen to accept the default values. Note the following exceptions:

  • On the screen titled “High Availability NFS” you may select “n” as we will not be creating a highly available NFS server
  • Towards the end of the setup script, you can choose to install a trial license key now, or later. We will install the license key later, so you can safely select “n” at this point
  • In the final screen of the “setup” select the ARKs (Application Recovery Kits, i.e. “cluster agents”) you wish to install from the list displayed on the screen.
    • The ARKs are ONLY required on "node1" and "node2". You do not need to install them on "witness". Navigate the list with the up/down arrows, and press SPACEBAR to select the following:
        • lkDR – DataKeeper for Linux
        • lkSQL – LifeKeeper MySQL RDBMS Recovery Kit
      • This will result in the following additional RPMs installed on "node1" and "node2":
        • steeleye-lkDR-9.0.2-6513.noarch.rpm
        • steeleye-lkSQL-9.0.2-6513.noarch.rpm

Install Witness/Quorum package

The Quorum/Witness Server Support Package for LifeKeeper (steeleye-lkQWK) combined with the existing failover process of the LifeKeeper core allows system failover to occur with a greater degree of confidence in situations where total network failure could be common. This effectively means that failovers can be done while greatly reducing the risk of “split-brain” situations.

Install the Witness/Quorum rpm on all 3 nodes (node1, node2, witness):

# cd /tmp/install/quorum

# rpm -Uvh steeleye-lkQWK-9.0.2-6513.noarch.rpm

On ALL 3 nodes (node1, node2, witness), edit /etc/default/LifeKeeper, set NOBCASTPING=1
On ONLY the Witness server (“witness”), edit /etc/default/LifeKeeper, set WITNESS_MODE=off/none

Install the EC2 Recovery Kit Package

SPS-Linux provides specific features that allow resources to failover between nodes in different availability zones and regions. Here, the EC2 Recovery Kit (i.e. cluster agent) is used to manipulate Route Tables so that connections to the Virtual IP are routed to the active cluster node.

Install the EC2 rpm (node1, node2):

# cd /tmp/install/amazon

# rpm -Uvh steeleye-lkECC-9.0.2-6513.noarch.rpm

Install a License key

On all 3 nodes, use the “lkkeyins” command to install the license file that you obtained from SIOS:

# /opt/LifeKeeper/bin/lkkeyins <path_to_file>/<filename>.lic

Start LifeKeeper

On all 3 nodes, use the “lkstart” command to start the cluster software:

# /opt/LifeKeeper/bin/lkstart

Set User Permissions for LifeKeeper GUI

On all 3 nodes, create a new linux user account (i.e. "tony" in this example). Edit /etc/group and add the "tony" user to the "lkadmin" group to grant access to the LifeKeeper GUI. By default only "root" is a member of the group, and since we don't have the root password in AWS, we'll use this new account:

 

# useradd tony

# passwd tony

# vi /etc/group

 

lkadmin:x:1001:root,tony
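
Equivalently, instead of editing /etc/group by hand, usermod can append the user to the group:

# usermod -a -G lkadmin tony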

Open the LifeKeeper GUI

Make a VNC connection to the Elastic IP (Public IP) address of node1. Based on the VNC configuration from above, you would connect to <Public_IP>:2 using the VNC password you specified earlier. Once logged in, open a terminal window and run the LifeKeeper GUI using the following command:

# /opt/LifeKeeper/bin/lkGUIapp &

You will be prompted to connect to your first cluster node (“node1”). Enter the linux userid and password specified during VM creation:

Next, connect to both “node2” and “witness” by clicking the “Connect to Server” button highlighted in the following screenshot:

You should now see all 3 servers in the GUI, with a green checkmark icon indicating they are online and healthy:

Create Communication Paths

Right-click on “node1” and select Create Comm Path

Select BOTH "node2" and "witness" and then follow the wizard. This will create comm paths between:

node1 & node2
node1 & witness


A comm path still needs to be created between node2 & witness. Right click on “node2” and select Create Comm Path. Follow the wizard and select “witness” as the remote server:


At this point the following comm paths have been created:

node1 <-> node2
node1 <-> witness
node2 <-> witness

The icons in front of the servers have changed from a green “checkmark” to a yellow “hazard sign”. This is because we only have a single communication path between nodes.

If the VMs had multiple NICs, you would create redundant comm paths between each server (creating instances with multiple NICs won't be covered in this article).


To remove the warning icons, go to the View menu and de-select “Comm Path Redundancy Warning”:


Result:

 

Verify Communication Paths

Use the “lcdstatus” command to view the state of cluster resources. Run the following commands to verify that you have correctly created comm paths on each node to the other two servers involved:

# /opt/LifeKeeper/bin/lcdstatus -q -d node1
MACHINE  NETWORK  ADDRESSES/DEVICE   STATE  PRIO
node2    TCP      10.0.0.4/10.0.1.4  ALIVE  1
witness  TCP      10.0.0.4/10.0.2.4  ALIVE  1

# /opt/LifeKeeper/bin/lcdstatus -q -d node2
MACHINE  NETWORK  ADDRESSES/DEVICE   STATE  PRIO
node1    TCP      10.0.1.4/10.0.0.4  ALIVE  1
witness  TCP      10.0.1.4/10.0.2.4  ALIVE  1

# /opt/LifeKeeper/bin/lcdstatus -q -d witness
MACHINE  NETWORK  ADDRESSES/DEVICE   STATE  PRIO
node1    TCP      10.0.2.4/10.0.0.4  ALIVE  1
node2    TCP      10.0.2.4/10.0.1.4  ALIVE  1

Create a Data Replication cluster resource (i.e. Mirror)

Next, create a Data Replication resource to replicate the /var/lib/mysql partition from node1 (source) to node2 (target). Click the “green plus” icon to create a new resource:


Follow the wizard with these selections:

Please Select Recovery Kit: Data Replication
Switchback Type: intelligent
Server: node1
Hierarchy Type: Replicate Existing Filesystem
Existing Mount Point: /var/lib/mysql
Data Replication Resource Tag: datarep-mysql
File System Resource Tag: /var/lib/mysql
Bitmap File: (default value)
Enable Asynchronous Replication: No

After the resource has been created, the “Extend” (i.e. define backup server) wizard will appear.

Use the following selections:

Target Server: node2
Switchback Type: Intelligent
Template Priority: 1
Target Priority: 10
Target Disk: /dev/xvdb1
Data Replication Resource Tag: datarep-mysql
Bitmap File: (default value)
Replication Path: 10.0.0.4/10.0.1.4
Mount Point: /var/lib/mysql
Root Tag: /var/lib/mysql

The cluster will look like this:

Create Virtual IP

Next, create a Virtual IP cluster resource. Click the “green plus” icon to create a new resource:


Follow the wizard to create the IP resource with these selections:

Select Recovery Kit: IP
Switchback Type: Intelligent
IP Resource: 10.1.0.10
Netmask: 255.255.255.0
Network Interface: eth0
IP Resource Tag: ip-10.1.0.10

Extend the IP resource with these selections:

Switchback Type: Intelligent
Template Priority: 1
Target Priority: 10
IP Resource: 10.1.0.10
Netmask: 255.255.255.0
Network Interface: eth0
IP Resource Tag: ip-10.1.0.10

The cluster will now look like this, with both Mirror and IP resources created:

Configure a Ping List for the IP resource

By default, SPS-Linux monitors the health of IP resources by performing a broadcast ping. In many virtual and cloud environments, broadcast pings don't work. In a previous step, we set "NOBCASTPING=1" in /etc/default/LifeKeeper to turn off broadcast ping checks. Instead, we will define a ping list.

This is a list of IP addresses to be pinged during IP health checks for this IP resource.

In this guide, we will add the witness server (10.0.2.4) to our ping list.

Right-click on the IP resource (ip-10.1.0.10) and select Properties:

You will see that initially, no ping list is configured for our 10.1.0.0 subnet. Click “Modify Ping List”:

Enter “10.0.2.4” (the IP address of our witness server), click “Add address” and finally click “Save List”:


You will be returned to the IP properties panel, and can verify that 10.0.2.4 has been added to the ping list. Click OK to close the window:

Create the MySQL resource hierarchy

Next, create a MySQL cluster resource. The MySQL resource is responsible for stopping/starting/monitoring of your MySQL database.

Before creating MySQL resource, make sure the database is running. Run “ps -ef | grep sql” to check.

If it’s running, great – nothing to do. If not, start the database back up:

# mysqld_safe --user=root --socket=/var/lib/mysql/mysql.sock --port=3306 --datadir=/var/lib/mysql

To create, click the "green plus" icon to create a new resource, then follow the wizard with these selections:

Select Recovery Kit: MySQL Database
Switchback Type: Intelligent
Server: node1
Location of my.cnf: /var/lib/mysql
Location of MySQL executables: /usr/bin
Database Tag: mysql

Extend the MySQL resource with the following selections:

Target Server: node2
Switchback Type: intelligent
Template Priority: 1
Target Priority: 10

As a result, your cluster will look as follows. Notice that the Data Replication resource was automatically moved underneath the database (dependency automatically created) to ensure it’s always brought online before the database:

Create an EC2 resource to manage the route tables upon failover


To create, click the “green plus” icon to create a new resource:


Follow the wizard to create the EC2 resource with these selections:

Select Recovery Kit: Amazon EC2
Switchback Type: Intelligent
Server: node1
EC2 Home: /opt/aws
EC2 URL: ec2.us-west-2.amazonaws.com
AWS Access Key: (enter Access Key obtained earlier)
AWS Secret Key: (enter Secret Key obtained earlier)
EC2 Resource Type: RouteTable (Backend cluster)
IP Resource: ip-10.1.0.10
EC2 Resource Tag: ec2-10.1.0.10

Extend the EC2 resource with the following selections:

Target Server: node2
Switchback Type: intelligent
Template Priority: 1
Target Priority: 10
EC2 Resource Tag: ec2-10.1.0.10

The cluster will look like this. Notice how the EC2 resource is underneath the IP resource:

Create a Dependency between the IP resource and the MySQL Database resource

Create a dependency between the IP resource and the MySQL Database resource so that they failover together as a group. Right click on the “mysql” resource and select “Create Dependency”:

On the following screen, select the “ip-10.1.0.10” resource as the dependency. Click Next and continue through the wizard:

At this point the SPS-Linux cluster configuration is complete. The resource hierarchy will look as follows:

Step 15: Test Cluster Connectivity

At this point, all of our Amazon EC2 and Cluster configurations are complete! Cluster resources are currently active on node1:

Test connectivity to the cluster from the witness server (or another linux instance if you have one). SSH into the witness server, "sudo su -" to gain root access, and install the mysql client if needed:

[root@witness ~]# yum -y install mysql

Test MySQL connectivity to the cluster:

[root@witness ~]# mysql --host=10.1.0.10 mysql -u root -p

Execute the following MySQL query to display the hostname of the active cluster node:

MariaDB [mysql]> select @@hostname;
+------------+
| @@hostname |
+------------+
| node1      |
+------------+
1 row in set (0.00 sec)

MariaDB [mysql]>

Using the LifeKeeper GUI, fail over from Node1 to Node2. Right-click on the mysql resource underneath node2, and select "In Service…":

After failover has completed, re-run the MySQL query. You'll notice that the MySQL client detects that the session was lost during failover and automatically reconnects. The hostname returned verifies that "node2" is now active:

MariaDB [mysql]> select @@hostname;

ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    12
Current database: mysql

+------------+
| @@hostname |
+------------+
| node2      |
+------------+
1 row in set (0.53 sec)

MariaDB [mysql]>

Reproduced with permission from SIOS

 

Filed Under: Clustering Simplified Tagged With: #SANLess, amazon, AWS, cluster, Linux

Video: The SIOS Clustering Advantage

June 24, 2019 by Jason Aw


Each year, you are likely tasked with providing higher levels of service using existing infrastructure and a smaller IT budget. Tolerance for downtime or data loss is gone. Applications have to be up 24/7, and you need to be protected whether the failure is a server outage, a networking outage, an application outage, or even the loss of an entire data center. The expectation is that the amount of downtime and the amount of data loss converges on zero.

IT professionals have more options than ever for supporting end users, whether that means deploying physical servers, virtual servers, or cloud technologies. Choosing a solution comes down to understanding business objectives, technical requirements, and budget limitations, as well as understanding how you're going to protect the environment to ensure it is always available without downtime or data loss.

This is typically done by implementing a traditional SAN-based cluster involving two or more servers connected to some type of shared storage. If there is an issue, the cluster fails the application over to another node and brings everything back online. SIOS software supports this and makes it easy to set up and manage. While a SAN-based cluster is great for local high availability, the SAN generally represents high cost, complexity, and a potential point of failure in your clustering architecture, and it doesn't help you solve the disaster recovery problem.

SIOS software allows you to build out your cluster using your choice of hardware, but now leveraging local storage. SIOS provides real-time, block-level data replication that's fully cluster-aware and integrated, allowing you to leverage very fast local storage in your cluster configuration. Adopting a SANless cluster can also reduce the overall cost of the solution: by eliminating the SAN, you remove not only the cost of the SAN hardware but also the SAN infrastructure, licensing, and administrative costs that come along with it. In addition, you cut out a single point of failure in your clustering architecture, so a storage failure won't take down the entire environment. You can also eliminate data loss, because the real-time block-level data replication technology keeps the local storage in sync. The software also provides user-friendly, wizard-based interfaces.

To sum things up, SIOS gives you the flexibility to protect your mission-critical applications and data in physical, virtual, or cloud environments.  Learn more about our high availability solutions.

Learn how SIOS clustering software makes protecting applications easy.

Filed Under: News and Events Tagged With: AWS QuickStart, cluster, HA clusters-cloud, High Performance Storage, Linux, Physical Servers, SQL Server Failover Clusters, Virtual / VMware

Step-By-Step: How To Configure A SQL Server 2008 R2 Failover Cluster Instance On Windows Server 2008 R2 In Azure

April 24, 2019 by Jason Aw

Intro

On July 9, 2019, support for SQL Server 2008 and 2008 R2 will end. That means the end of regular security updates. However, if you move those SQL Server instances to Azure, Microsoft will give you three years of Extended Security Updates at no additional charge. If you are currently running SQL Server 2008/2008 R2 and you are unable to update to a later version of SQL Server before the July 9th deadline, you will want to take advantage of this offer rather than running the risk of facing a future security vulnerability. An unpatched instance of SQL Server could lead to data loss, downtime or a devastating data breach.

One of the challenges you will face when running SQL Server 2008/2008 R2 in Azure is ensuring high availability. On premises you may be running a SQL Server Failover Cluster Instance (FCI) for high availability, or possibly you are running SQL Server in a virtual machine and relying on VMware HA or a Hyper-V cluster for availability. When moving to Azure, none of those options is available. Downtime in Azure is a very real possibility that you must take steps to mitigate.

In order to mitigate the possibility of downtime and qualify for Azure's 99.95% or 99.99% SLA, you have to leverage SIOS DataKeeper. DataKeeper overcomes Azure's lack of shared storage and allows you to build a SQL Server FCI in Azure that leverages the locally attached storage on each instance. SIOS DataKeeper not only supports SQL Server 2008 R2 and Windows Server 2008 R2 as documented in this guide, it supports any version of Windows Server from 2008 R2 through Windows Server 2019 and any version of SQL Server from SQL Server 2008 through SQL Server 2019.

This guide will walk through the process of creating a two-node SQL Server 2008 R2 Failover Cluster Instance (FCI) in Azure, running on Windows Server 2008 R2. Although SIOS DataKeeper also supports clusters that span Availability Zones or Regions, this guide assumes each node resides in the same Azure Region, but different Fault Domains. SIOS DataKeeper will be used in place of the shared storage normally required to create a SQL Server 2008 R2 FCI.

Create The First SQL Server Instance In Azure

This guide will leverage the SQL Server 2008R2SP3 on Windows Server 2008R2 image that is published in the Azure Marketplace.

When you provision the first instance you will have to create a new Availability Set. During this process be sure to increase the number of Fault Domains to 3. This allows the two cluster nodes and the file share witness each to reside in their own Fault Domain.

Add additional disks to each instance. Premium or Ultra SSD are recommended. Disable caching on the disks used for the SQL log files. Enable read-only caching on the disk used for the SQL data files. Refer to Performance guidelines for SQL Server in Azure Virtual Machines for additional information on storage best practices.

If you don’t already have a virtual network configured, allow the creation wizard to create a new one for you.

Once the instance is created, go in to the IP configurations and make the Private IP address static. This is required for SIOS DataKeeper and is best practice for clustered instances.

Make sure that your virtual network is configured to set the DNS server to be a local Windows AD controller. This is to ensure you will be able to join the domain in a later step.

Create The Second SQL Server Instance In Azure

Follow the same steps as above, except be sure to place this instance in the same virtual network and Availability Set that you created with the 1st instance.

Create A File Share Witness (FSW) Instance

In order for the Windows Server Failover Cluster (WSFC) to work optimally, you are required to create another Windows Server instance and place it in the same Availability Set as the SQL Server instances. By placing it in the same Availability Set, you ensure that each cluster node and the FSW reside in different Fault Domains, so your cluster stays online should an entire Fault Domain go offline. This instance does not require SQL Server; it can be a simple Windows Server, as all it needs to do is host a simple file share.

This instance will host the file share witness required by WSFC. It does not need to be the same size, nor does it require any additional disks. Its only purpose is to host a simple file share, and it can in fact be used for other purposes. In my lab environment, my FSW is also my domain controller.

Uninstall SQL Server 2008 R2

Each of the two SQL Server instances provisioned already have SQL Server 2008 R2 installed on them. However, they are installed as standalone SQL Server instances, not clustered instances. SQL Server must be uninstalled from each of these instances before we can install the cluster instance. The easiest way to do that is to run the SQL Setup as shown below.

When you run setup.exe /Action=RunDiscovery you will see everything that is preinstalled:

setup.exe /Action=RunDiscovery

Running setup.exe /Action=Uninstall /FEATURES=SQL,AS,RS,IS,Tools /INSTANCENAME=MSSQLSERVER kicks off the uninstall process:

setup.exe /Action=Uninstall /FEATURES=SQL,AS,RS,IS,Tools /INSTANCENAME=MSSQLSERVER

Running setup.exe /Action=RunDiscovery again confirms the uninstallation completed:

setup.exe /Action=RunDiscovery

Run this uninstallation process again on the 2nd instance.

Add Instances To The Domain

All three of these instances will need to be added to a Windows Domain.

Add Windows Failover Clustering Feature

The Failover Clustering Feature needs to be added to the two SQL Server instances

Add-WindowsFeature Failover-Clustering

Turn Off Windows Firewall

For simplicity's sake, turn off the Windows Firewall during the installation and configuration of the SQL Server FCI. Consult Azure Network Security Best Practices for advice on securing your Azure resources. Details on required Windows ports can be found here, SQL Server ports here, and SIOS DataKeeper ports here. The Internal Load Balancer that we will configure later also requires access on port 59999, so be sure to account for that in your security configuration.

NetSh Advfirewall set allprofiles state off

Install Convenience Rollup Update For Windows Server 2008 R2 SP1

There is a critical update (KB2854082) that is required in order to configure a Windows Server 2008 R2 instance in Azure. That update and many more are included in the Convenience Rollup Update for Windows Server 2008 R2 SP1. Install this update on each of the two SQL Server instances.

Format The Storage

The additional disks that were attached when the two SQL Server instances were provisioned need to be formatted. Do the following for each volume on each instance.

Microsoft best practices say the following:

“NTFS allocation unit size: When formatting the data disk, it is recommended that you use a 64-KB allocation unit size for data and log files as well as TempDB.”
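
For example, assuming the new data disk came up as drive F: (a placeholder; check Disk Management for the actual letter), a quick format with a 64-KB allocation unit looks like this:

format F: /FS:NTFS /A:64K /V:SQLData /Q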

Run Cluster Validation

Run cluster validation to ensure everything is ready to be clustered.

Your report will contain WARNINGS about Storage and Networking. You can ignore those warnings as we know there are no shared disks and only a single network connection exists between the servers. You may also receive a warning about network binding order which can also be ignored. If you encounter any ERRORS you must address those before you continue.

Create The Cluster

Best practice for creating a cluster in Azure is to use PowerShell, as shown below. PowerShell allows us to specify a static IP address, whereas the GUI method does not. Unfortunately, Azure's implementation of DHCP does not work well with Windows Server Failover Clustering: if you use the GUI method, you will wind up with a duplicate IP address as the Cluster IP Address. It's not the end of the world, but you will need to fix that, as I show below.

As I said, the Powershell method generally works best. However, for some reason, it seems to be failing on Windows Server 2008 R2 as shown below.

New-Cluster -Name cluster1 -Node sql1,sql2 -StaticAddress 10.1.0.100 -NoStorage

You can try that method and if it works for you – great! I need to go back and investigate this a bit more to see if it was a fluke. Another option I need to explore if Powershell is not working is Cluster.exe. Running cluster /create /? gives the proper syntax to use for creating clusters with the deprecated cluster.exe command.

However, if Powershell or Cluster.exe fails you, the steps below illustrate how to create a cluster via the Windows Server Failover Clustering UI, including fixing the duplicate IP address that will be assigned to the cluster.

Remember, the name you specify here is just the Cluster Name Object (CNO). This is not the name that your SQL clients will use to connect to the cluster; we will define that during the SQL Server cluster setup in a later step. 

At this point, the cluster is created, but you may not be able to connect to it with the Windows Server Failover Clustering UI due to the duplicate IP address problem.

Fix The Duplicate IP Address

As I mentioned earlier, if you create the cluster using the GUI, you are not given the opportunity to choose an IP address for the cluster. Because your instances are configured to use DHCP (required in Azure), the GUI wants to automatically assign an IP address using DHCP. Unfortunately, Azure's implementation of DHCP does not work as expected, and the cluster will be assigned the same address that is already being used by one of the nodes. Although the cluster will create properly, you will have a hard time connecting to it until you fix this problem.

To fix this problem, from one of the nodes run the following command to ensure the Cluster service is started on that node.

Net start clussvc /fq

On that same node you should now be able to connect to the Windows Server Failover Clustering UI, where you will see the IP Address has failed to come online.

Open the properties of the Cluster IP address and change it from DHCP to Static, and assign it an unused IP address.

Bring the Name resource online

Add The File Share Witness

Next we need to add the File Share Witness. On the 3rd server we provisioned as the FSW, create a folder and share it. You will need to grant the Cluster Name Object (CNO) read/write permissions at both the Share and Security levels, as shown below.

Once the share is created, run the Configure Cluster Quorum wizard on one of the cluster nodes and follow the steps illustrated below.

Create Service Account For DataKeeper

We are almost ready to install DataKeeper. However, before we do that you need to create a Domain account and add it to the Local Administrators group on each of the SQL Server cluster instances. We will specify this account when we install DataKeeper.

Install DataKeeper

Install DataKeeper on each of the two SQL Server cluster nodes as shown below.

This is where we will specify the Domain account we added to the local Administrators group on each node.

Configure DataKeeper

Once DataKeeper is installed on each of the two cluster nodes, you are ready to configure DataKeeper.

NOTE – The most common error encountered in the following steps is security related, most often pre-existing Azure network security groups blocking required ports. Please refer to the SIOS documentation to ensure the servers can communicate over the required ports.

First, you must connect to each of the two nodes.

If everything is configured properly, you should then see the following in the Server Overview report.

Next, create a New Job and follow the steps illustrated below

Choose Yes here to register the DataKeeper Volume resource in Available Storage

Complete the above steps for each of the volumes. Once you are finished, you should see the following in the Windows Server Failover Clustering UI.

You are now ready to install SQL Server into the cluster.

NOTE – At this point the replicated volume is only accessible on the node that is currently hosting Available Storage. That is expected, so don’t worry!

Install SQL Server On The First Node

On the first node, run the SQL Server setup.

Choose New SQL Server Failover Cluster Installation and follow the steps as illustrated.

Choose only the options you need. 

Please note, this document assumes you are using the Default instance of SQL Server. If you use a Named Instance, you need to make sure you lock down the port that it listens on, and use that port later on when you configure the load balancer. You also will need to create a load balancer rule for the SQL Server Browser Service (UDP 1434) in order to connect to a Named Instance. Neither of those two requirements is covered in this guide, but if you require a Named Instance, it will work if you do those two additional steps.

Here you will need to specify an unused IP address

Go to the Data Directories tab and relocate data and log files. At the end of this guide we talk about relocating tempdb to a non-mirrored DataKeeper Volume for optimal performance. For now, just keep it on one of the clustered disks.

Install SQL On Second Node

Run the SQL Server setup again on the second node. Then, choose Add node to a SQL Server Failover Cluster.

Congratulations, you are almost done! However, due to Azure’s lack of support for gratuitous ARP, we will need to configure an Internal Load Balancer (ILB) to assist with client redirection as shown in the following steps.

Update The SQL Cluster IP Address

In order for the ILB to function properly, you must run the following command from one of the cluster nodes. It enables the SQL Cluster IP address to respond to the ILB health probe while also setting the subnet mask to 255.255.255.255 in order to avoid IP address conflicts with the health probe.

cluster res <IPResourceName> /priv enabledhcp=0 address=<ILBIP> probeport=59999 subnetmask=255.255.255.255

NOTE – It may be a fluke, but on occasion I have run this command and it looks like it works, yet it doesn't complete the job and I have to run it again. The way I can tell whether it worked is by looking at the Subnet Mask of the SQL Server IP Resource: if it is not 255.255.255.255, then you know it didn't run successfully. It may simply be a GUI refresh issue, so try restarting the cluster GUI to verify the subnet mask was updated.
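
To check, you can also dump the private properties of the IP resource with the same deprecated cluster.exe syntax and confirm the SubnetMask value:

cluster res <IPResourceName> /priv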

After it runs successfully, take the resource offline and bring it back online for the changes to take effect.

Create The Load Balancer

The final step is to create the load balancer. In this case we are assuming you are running the Default Instance of SQL Server, listening on port 1433.

The Private IP Address you define when you Create the load balancer will be the exact same address your SQL Server FCI uses.

Add just the two SQL Server instances to the backend pool. Do NOT add the FSW to the backend pool.

In this load balancing rule, you must enable Floating IP.
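
If you script your Azure resources, the health probe and load-balancing rule can be created with the Azure CLI. This is a sketch; the resource group, load balancer, frontend, and backend pool names are placeholders for whatever you created above:

az network lb probe create --resource-group MyRG --lb-name SQLLB --name SQLProbe --protocol tcp --port 59999
az network lb rule create --resource-group MyRG --lb-name SQLLB --name SQLRule --protocol Tcp --frontend-port 1433 --backend-port 1433 --frontend-ip-name MyFrontEnd --backend-pool-name MyBackendPool --probe-name SQLProbe --floating-ip true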

Test The Cluster

The simplest test is to open SQL Server Management Studio on the passive node and connect to the cluster. If it connects, congratulations, you did everything correctly! If you can't connect, don't fear: I wrote a blog article to help troubleshoot the issue. Managing the cluster is exactly the same as managing a traditional shared-storage cluster; everything is controlled through Failover Cluster Manager.
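
You can also test from the command line with sqlcmd against the SQL network name (or the ILB address) using a trusted connection; the name below is a placeholder for whatever you chose during SQL setup:

sqlcmd -S <SQLClusterNetworkName> -E -Q "SELECT @@SERVERNAME"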

Optional – Relocate TempDB

For optimal performance, it would be advisable to move tempdb to the local, non-replicated SSD. However, SQL Server 2008 R2 requires tempdb to be on a clustered disk. SIOS has a solution called a Non-Mirrored Volume Resource which addresses this issue. Create a non-mirrored volume resource of the local SSD drive and move tempdb there. Do note, the local SSD drive is non-persistent: you must take care to ensure the folder holding tempdb and the permissions on that folder are recreated each time the server reboots.

After you create the Non-Mirrored Volume Resource of the local SSD, follow the steps in this article to relocate tempdb. The startup script described in that article must be added to each cluster node.

Reproduced with permission from Clusteringformeremortals.com

Filed Under: Clustering Simplified, Datakeeper Tagged With: cluster, failover cluster, High Availability, Windows Server Failover Cluster
