SIOS SANless clusters


Introduction To Clusters – Part 2

November 23, 2021 by Jason Aw Leave a Comment

What Types of Clusters Are There and How Do They Work?

An Overview of HA Clusters and Load Balancing Clusters

Clustering helps improve reliability and performance of software and hardware systems by creating redundancy to compensate for unforeseen system failure. If a system is interrupted due to hardware or software failure or natural disaster, this can have a major impact on business and revenue, wasting crucial time and expense to get things back up and running.

This is where clustering comes in. There are three main types of clustering solutions – HA clusters, load balancing clusters, and HPC clusters. Which type will best increase system availability and performance for your business? Let’s have a look at the three types of clustering solutions in more detail below.

What is HA Clustering?

High Availability clustering, also known as HA clustering, is effective for mission-critical business applications, ERP systems, and databases, such as SQL Server, SAP, and Oracle, that require near-continuous availability.

HA clustering can be divided into two types: the “active-active” configuration and the “active-standby” configuration.

Let’s take a look at the difference between these two HA clustering types.

HA Clustering Type 1: Active-Active Configuration

In the active-active configuration, processing is performed on all nodes in the cluster. For example, in a two-node cluster, both nodes are active. If one node stops, its processing will be taken over by the other.

However, if each node is operating at close to 100% and one node stops, it will be difficult for another node to take on the additional processing load. Therefore, capacity planning with a margin is important for HA clustering.

HA Clustering Type 2: Active-Standby Configuration

Let’s use our two-node example again. In the active-standby configuration, one node is configured as the active node and the other node is configured as the standby node. The active node and the standby node exchange signals called “heartbeats” to constantly check whether they are operating normally.

If the standby node cannot receive the heartbeat of the active node, the standby node determines that the active node has stopped and will take over the processing of the active node. This mechanism is called “failover”. Conversely, the mechanism that recovers the stopped operating node and transfers the processing back to the recovered active node is called “failback.”
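The heartbeat-and-failover logic described above can be sketched in a few lines of Python. This is an illustrative model only, not SIOS code; the heartbeat interval and missed-heartbeat threshold are assumed tuning values.

```python
def standby_monitor(heartbeat_ages, interval=1.0, missed_limit=3):
    """Simulate the standby node's view of the active node.

    `heartbeat_ages` is the observed age (in seconds) of the most recent
    heartbeat at each check; `interval` and `missed_limit` are assumed,
    illustrative tuning values.
    """
    missed = 0
    for age in heartbeat_ages:
        if age > interval:            # no heartbeat arrived in time
            missed += 1
            if missed >= missed_limit:
                return "failover"     # standby takes over processing
        else:
            missed = 0                # heartbeat received: reset counter
    return "active-ok"

# Three consecutive missed heartbeats trigger a failover
print(standby_monitor([0.5, 2.0, 2.0, 2.0]))   # → failover
```

Real clustering software adds safeguards (redundant heartbeat networks, quorum) so that a lost network link is not mistaken for a dead node, but the basic count-and-switch idea is the same.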

In an active-standby configuration, when a failure occurs, the simple switch from the active node to the standby node makes recovery relatively easy. However, bear in mind that while the active node is operating normally, the standby node’s resources sit idle and are effectively wasted.

Two Components of HA Clustering: Application and Storage

For an HA cluster to be effective, two areas need to be addressed: application orchestration and storage protection. Clustering software monitors the health of the protected application and, if it detects an issue, moves operation of that application over to the standby node. The standby node needs access to the most up-to-date version of the data – preferably identical to the data the primary node was accessing before the incident. This can be accomplished in two ways: shared storage or shared-nothing storage. In the shared storage model, both cluster nodes access the same storage – typically a SAN. In shared-nothing (aka SANless) configurations, local storage on each node is mirrored using replication software.

Clustering software products vary widely in their ability to monitor and detect issues that may cause application failure and in their ability to orchestrate failovers reliably. Many clustering products only detect whether the application server is operational, but do not detect a wide range of software, services, network, and other issues that can cause application failure.

Application Awareness is Essential

Similarly, complex ERP and database applications have multiple components that must be stored on the correct server or instance, started up in the right order, and brought online in accordance with complex best practices. Choose clustering software with specialized application recovery kits designed specifically to maintain best practices for application- and database-specific requirements.

There are multiple ways to configure an HA Cluster:

Traditional Two Node Clusters with Shared Storage

Two servers are clustered with shared storage.

Two Node SANless Cluster

Clusters can be configured using a local LAN and high-speed synchronous block-level replication.

Real-time replication can be used to synchronize storage on the primary server with storage on a standby server located in the same data center, in your disaster recovery site, or both. This allows you to build two-node or multi-node high availability and disaster recovery configurations flexibly. SIOS block-level replication is highly optimized for performance. You can even use super-fast, locally attached storage such as PCIe flash devices on your physical servers to achieve very low-cost, high-performance, high-availability configurations. Both your data on the flash device and your application are protected.


SAN-based cluster with a third node

Third Node for Disaster Protection

This configuration uses a SAN-based cluster and adds a third, SANless node in a remote data center or the cloud to achieve full disaster recovery protection. In the event of a disaster, the standby remote physical server is brought into service automatically with no data loss, eliminating the hours needed for restoration from backup media.

What is a Load Balancing Cluster?

Load balancing clustering is a mechanism that distributes processing across multiple nodes behind a load balancer so that they act as a single system, improving performance. While it can isolate a failed node and prevent that node’s failure from affecting the entire system, the load balancer itself is a critical single point of failure, so this is not a high availability option. It is effective only for applications such as web server load balancing; if the load balancer itself fails, the entire system stops.
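The distribution behavior can be sketched as a simple round-robin dispatcher that skips unhealthy nodes. This is a hypothetical model for illustration; in practice the `healthy` map would be driven by health checks, and production load balancers are far more sophisticated.

```python
from itertools import cycle

def dispatch(requests, nodes, healthy):
    """Round-robin dispatch that isolates unhealthy nodes (hypothetical
    model: the `healthy` map would come from health checks in practice)."""
    ring = cycle(nodes)
    assignments = []
    for request in requests:
        for _ in range(len(nodes)):           # try each node at most once
            node = next(ring)
            if healthy[node]:
                assignments.append((request, node))
                break
    return assignments

# Node "b" has failed, so all traffic is routed to node "a"
print(dispatch(["r1", "r2"], ["a", "b"], {"a": True, "b": False}))
# → [('r1', 'a'), ('r2', 'a')]
```

Note that the dispatcher itself has no backup here, which is exactly the single point of failure the paragraph above warns about.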

What is HPC Clustering?

You can also use clustering for performance instead of high availability. High-Performance Computing (HPC) clusters combine the processing power of multiple nodes (sometimes thousands) to deliver the CPU performance needed in compute-intensive settings such as scientific and engineering environments requiring large-scale simulations, CAE analysis, and parallel processing.

Are you ready to find the right HA clustering solution for your business?

Learn more about SIOS High Availability clustering here.

Reproduced with permission from SIOS

Filed Under: Clustering Simplified Tagged With: clusters

Introduction To Clusters – Part 1

November 18, 2021 by Jason Aw Leave a Comment

What is clustering in the first place?

Clustering is a technology that allows you to connect multiple servers so that they act as a single functional unit.

Types of clustering

You can cluster servers for several purposes. For example, you can combine the processing power of multiple small servers for high performance. You can also distribute processing work to multiple nodes using a load balancer for added efficiency.

High availability (HA) clustering is a process of combining server nodes to protect important applications from downtime and data loss.

In a traditional shared storage failover cluster, a primary node and secondary or remote node share the same storage.

HA Clustering

High availability (HA) clustering is a mechanism that reduces downtime by eliminating single points of failure (SPOF). In an HA cluster, important applications are run on a primary node which is connected to one or more secondary or remote nodes in a cluster. Clustering software monitors the health of the application, server, and network. In the event of a failure on the primary node, it moves application operations over to a secondary node in a process called a failover, where operation continues.

High Availability

Application high availability is a measure of how much time in a given year an application is available and operational. In general, HA clusters provide 99.99% (“four nines”) availability, or a little more than 52 minutes of downtime over the course of a given year.
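The “four nines” figure follows directly from the arithmetic; here is a quick sketch (illustrative only):

```python
MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600

def annual_downtime_minutes(availability_pct):
    """Downtime budget implied by an availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)

print(round(annual_downtime_minutes(99.99), 1))   # → 52.6 (four nines)
print(round(annual_downtime_minutes(99.9), 1))    # → 525.6 (three nines)
```

Each additional nine cuts the annual downtime budget by a factor of ten, which is why moving beyond four nines gets expensive quickly.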

It is important to note that in a traditional HA cluster, all of the cluster nodes are connected to the same shared storage – typically a SAN. In this way, after a failover, the secondary node is accessing the same data as the primary node and operation can continue.

SANless cluster synchronizes local storage using host-based block level replication.

SANless Clusters

However, many companies prefer to use a SANless cluster for several reasons. First, shared storage represents a critical single point of failure. Second, shared storage is often not an option in public cloud environments. Third, SANs can sometimes impede performance of database applications, such as SQL Server, Oracle, and SAP.

Instead of shared storage, these companies use efficient, host-based, block-level replication to synchronize local storage on all cluster nodes. In the event of a failover, the secondary node is connected to local storage with an identical copy of the primary storage. This not only eliminates the SAN SPOF risk but also enables the addition of fast disk (SSD) to local on-premises storage for cost-efficient high performance. SANless clustering also enables companies to migrate on-premises HA environments to the cloud with minimal effort or disruption of ongoing business processes.

Reproduced from SIOS

Filed Under: Clustering Simplified Tagged With: cluster, Clustering, SIOS

Clustering Software for High Availability and Disaster Recovery

November 13, 2021 by Jason Aw Leave a Comment

What is Clustering Software?

Clustering software lets you configure your servers as a grouping or cluster so that multiple servers can work together to provide availability and prevent data loss. Each server maintains the same information – operating systems, applications, and data. If one server fails, another server immediately picks up the workload. IT professionals rely on clustering to eliminate a single point of failure and minimize the risk of downtime. In fact, 86 percent of all organizations are operating their HA applications with some kind of clustering or high availability mechanism in place.[1]

Types of Cluster Management Software

There are a variety of cluster management software solutions available for Windows and Linux distributions. Examples include:

  • Windows Server Failover Clustering (WSFC),
  • SUSE Linux Enterprise High Availability Extension,
  • Red Hat Cluster Suite,
  • Oracle Real Application Clusters (RAC), and
  • SIOS software.

Except for SIOS, these products support a single operating system or require expensive SAN hardware, constraining flexibility and deployment options. Moreover, Linux open-source HA extensions require a high degree of technical skill, creating complexity and reliability issues that challenge most operators.

SIOS products uniquely protect any Windows- or Linux-based application operating in physical, virtual, cloud or hybrid cloud environments and in any combination of site or disaster recovery scenarios. Applications such as SAP and databases, including Oracle, SQL Server, DB2, SAP HANA and many others, benefit from SIOS software. The “out-of-the-box” simplicity, configuration flexibility, reliability, performance, and cost-effectiveness of SIOS products set them apart from other clustering software.

How SIOS Clustering Software Provides High Availability for Windows and Linux Clusters

If you are running a critical application in a Windows or Linux environment, you may want to consider SIOS Technology Corporation’s high availability software clustering products.

In a Windows environment, SIOS DataKeeper Cluster Edition seamlessly integrates with and extends Windows Server Failover Clustering (WSFC) by providing a performance-optimized, host-based data replication mechanism. While WSFC manages the software cluster, SIOS performs the replication to enable disaster protection and ensure zero data loss in cases where shared storage clusters are impossible or impractical, such as in cloud, virtual, and high-performance storage environments.

In a Linux environment, the SIOS Protection Suite for Linux provides a tightly integrated combination of high availability failover clustering, continuous application monitoring, data replication, and configurable recovery policies, protecting your business-critical applications from downtime and disasters.

Whether you are in a Windows or Linux environment, SIOS products free your IT team from the complexity and challenges of computing infrastructures. They provide the intelligence, automation, flexibility, high availability, and ease-of-use IT managers need to protect business-critical applications from downtime or data loss. With over 80,000 licenses sold, SIOS is used by many of the world’s largest companies.

Here is one case study that discusses how a leading Hospital Information Systems (HIS) provider deployed SIOS DataKeeper Cluster Edition to improve high availability and network bandwidth in their Windows cluster environment.


How One HIS Provider Improved RPO and RTO With SIOS DataKeeper Clustering Software

This leading HIS provider has more than 10,000 U.S.-based health care organizations (HCOs) using a variety of its applications, including patient care management, patient self-service, and revenue management. To support these customers, the organization had more than 20 SQL Server clusters located in two geographically dispersed data centers, as well as a few smaller servers and SQL Server log shipping for disaster recovery (DR).

The organization has a large customer base and a vast IT infrastructure, and needed a solution that could handle heavy network traffic and eliminate network bandwidth problems when replicating data to its DR site. The organization also needed to improve its Recovery Point Objective (RPO) and Recovery Time Objective (RTO) to reduce the volume of data at risk and get IT operations back up and running faster after a disaster or system failure. RPO is the maximum amount of data loss that can be tolerated when a server fails or a disaster happens. RTO is the maximum tolerable duration of any outage.

To address these challenges, this organization chose SIOS DataKeeper Cluster Edition, which provides seamless integration with WSFC, making it possible to create SANless clusters.

Once SIOS DataKeeper Cluster Edition passed the organization’s stringent POC testing, the IT team deployed the solution in the company’s production environment. The team deployed SIOS across a three-node cluster comprising two SAN-based nodes in the organization’s primary, on-premises data center and one SANless node in its remote DR site.

The SIOS solution synchronizes replication across the three nodes in the cluster and eliminates the bandwidth issues at the DR site, improving both RPO and RTO and reducing the cost of bandwidth. Today, the organization uses SIOS DataKeeper Cluster Edition to protect their SQL Server environment across more than 18 cluster nodes.

See the full case study to learn more.


How SIOS Clustering Software Works

SIOS software is an essential part of your cluster solution, protecting your choice of Windows or Linux environments in any configuration (or combination) of physical, virtual and cloud (public, private, and hybrid) environments without sacrificing performance or availability.

If you need fast, efficient replication to transfer data across low-bandwidth local or wide area networks, SIOS DataKeeper protects business-critical Windows environments, including Microsoft SQL Server, Oracle, SharePoint, Lync, Dynamics, and Hyper-V, from downtime and data loss in physical, virtual, or cloud environments.

SIOS Protection Suite for Linux supports all major Linux distributions, including Red Hat Enterprise Linux, SUSE Linux Enterprise Server, CentOS, and Oracle Linux, and accommodates a wide range of storage architectures.

To see how SIOS clustering software works to protect Windows and Linux environments, request a demo or get a free trial.

Learn more about:

SAP clustering
SQL Server clustering
Oracle clustering
Linux clustering

Check out recent blog posts about our clustering products.

References

  • https://searchdomino.techtarget.com/definition/application-clustering
  • https://www.itprotoday.com/cloud-computing/clustering-software
  • https://searchwindowsserver.techtarget.com/definition/Windows-Server-failover-clustering

[1] SIOS in partnership with ActualTech Research, (2018) The State of Application High Availability Survey Report

Reproduced from SIOS

Filed Under: Clustering Simplified Tagged With: Clustering

Disaster Recovery Fundamentals

November 8, 2021 by Jason Aw Leave a Comment

Disaster recovery overview

Disaster recovery refers to the ability to quickly restore/repair a system and minimize damage in the event of a sitewide or even regional failure. Disaster recovery is a crucial part of business continuity management and having a robust disaster recovery protocol in place will help prevent unnecessary data loss and expense associated with system downtime.

What constitutes the ‘disaster’ part of disaster recovery? It could be a natural disaster such as an earthquake or flood, but also a wide range of events such as fire, terrorism, unauthorized intrusion, large-scale hacking, or a long-term, large-scale power outage. In short, it is anything with the potential to cause catastrophic damage to an IT system.

The Real Impact of System Failure

In addition to the potential physical damage and data loss associated with a system failure, the lack of a disaster recovery plan can cause unrecoverable revenue loss for businesses. Every minute of system downtime means lost sales and opportunities, potentially negative customer experiences, a tarnished business reputation, and high emergency IT repair expenses.

The Importance of Disaster Recovery

For a company that provides mission-critical services, building a business continuity system that can handle unexpected system downtime is essential. Having the ability to prevent failure in the first place, and to recover quickly when a local failure or even a sitewide or regional disaster occurs, will help protect data, maintain rapport with customers, and avoid lost time and potentially devastating financial loss.

It’s important to recognize that catastrophic system failure is something that will happen, not something that may happen, so putting a proper disaster recovery plan in place will protect your business.

Disaster Recovery Challenges


While a disaster recovery protocol is essential, it is not without its challenges to set up and implement. Here are some common barriers to proper disaster recovery implementation:

Challenge 1: Geographic separation.
The essence of disaster protection is keeping systems and data in a location that is geographically separated from the primary data center or cloud instance so that, in the event of a disaster or cloud outage, the secondary systems can be brought online and operation can continue.

Challenge 2: Network bandwidth Requirements
Replicating data to an offsite location for disaster recovery can mean added network bandwidth requirements and latency issues.

Challenge 3: Data volume continues to increase
The storage capacity requirements on the disaster recovery site will increase over time. A proper disaster recovery plan needs to establish “protection priority” to clarify which data should be protected and optimize available storage resources.

Challenge 4: Recovery procedure at the time of recovery
If a system goes down due to a disaster, service recovery is required. Often, companies find their data scattered across multiple locations with no standardized recovery procedures, resulting in immense loss of time and expense. Developing a clear, standardized restoration procedure eliminates this headache and allows for quick action when it matters most.

Data backup vs Availability Protection

Traditionally, data backup – essentially the process of making a copy of data and applications and moving it to an offsite location – has been performed to protect data in case of IT equipment failure and for recordkeeping/archiving in compliance with regulatory requirements such as HIPAA (the Health Insurance Portability and Accountability Act). To recover operations, any servers, storage, and other hardware, as well as networking affected by the incident, need to be replaced or repaired. Servers have to be configured, and applications have to be restored, brought back online, and connected to recovered data. These steps can take months.

Without an availability protection process in place, recovery operations with backup alone can be a time-consuming and expensive process. Availability processes keep fully operational systems ready to take over in the event of a disaster, enabling resumption of service in minutes.

Disaster recovery indicators


The main metrics for disaster recovery are “RPO” and “RTO”.

RPO (Recovery Point Objective)
RPO indicates how far back in time data recovery is guaranteed, measured from the moment the disaster occurs – in other words, the maximum amount of data that can be lost. An “RPO = 5 minutes” target, for example, means that no more than the last five minutes of data may be lost.

When aiming for “RPO = 0 (zero data loss)”, an availability protection mechanism such as failover clustering is required.

RTO (Recovery Time Objective)
RTO indicates how much time your business can allow to pass from initial downtime to restoration of operation. If “RTO = 1 month or more”, you may be able to handle recovery with remote backup alone plus a substitute device. But if “RTO = within minutes”, failover clustering is required.
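As a rough illustration, the RPO/RTO guidance above can be turned into a simple decision sketch. The cutoff values here are illustrative assumptions based on the thresholds mentioned in the text, not SIOS recommendations.

```python
def recommend_dr_method(rpo_minutes, rto_minutes):
    """Map RPO/RTO targets to a DR approach using the rough thresholds
    discussed above; cutoff values are illustrative assumptions."""
    MONTH = 30 * 24 * 60                  # one month, in minutes
    if rpo_minutes == 0 or rto_minutes <= 5:
        # Zero data loss or near-instant recovery needs clustering
        return "failover clustering (synchronous replication)"
    if rto_minutes >= MONTH:
        # Very relaxed RTO: backup plus a substitute device may suffice
        return "remote backup plus a substitute device"
    return "failover clustering (asynchronous replication)"

print(recommend_dr_method(0, 60))
# → failover clustering (synchronous replication)
```

The point of the sketch is simply that tighter RPO/RTO targets push you from backup-based recovery toward clustering.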

Selecting a Disaster Recovery Method


When determining the right disaster recovery method for your business, consider these important factors:

  • Criticality of business processes and tolerance for impact
  • The data type and capacity that you want to protect
  • Recovery requirements – your RPO and RTO
  • Budget

Focus on Business Impact

While IT departments take the technical lead in developing disaster recovery measures for IT systems, business owners must consider the impact and extent of each system outage to ensure the least harmful effect on the business.

Protected data type (data integrity)

It is important to classify the type and importance of protected data. For data that does not require very precise consistency (such as file servers), a simple primary storage backup may be sufficient.

On the other hand, ERP systems and databases such as SQL Server, Oracle, and SAP have multiple services and parts that need to be located on specific servers, started up in specific orders, and managed according to a variety of application-specific best practices. They typically require high availability protection and an application-aware clustering solution to orchestrate failover.

——————————————————————————————————————

Key Disaster Recovery Terms


Remote backup – essentially keeping a copy of applications and data in a geographically separated remote location.

Synchronous Storage Mirroring
Keeping a local and a remote copy of storage synchronized for DR protection. In this method, data is written to local storage and immediately replicated to remote storage. The write is not “committed” until the data has also been written to the remote location. This keeps both locations identical, eliminating the discrepancies that can result when data in transit at the time of an event fails to reach the remote location. Data integrity is guaranteed between the primary and backup sites.

Asynchronous Storage Mirroring
This method writes data to local storage and then replicates it to the remote location. It enables greater network utilization efficiency and reduces bandwidth contention when geographic separation introduces latency.
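The difference between the two mirroring modes can be sketched as follows. This is a simplified illustrative model; real replication engines handle write ordering, retries, and failure cases.

```python
def write_sync(data, local, remote):
    """Synchronous mirroring: the write is acknowledged only after the
    remote copy has also been written, so both sides stay identical."""
    local.append(data)
    remote.append(data)      # must complete before the acknowledgment
    return "ack"             # zero data loss if the primary fails now

def write_async(data, local, pending):
    """Asynchronous mirroring: acknowledge immediately and replicate in
    the background, so data still in transit can be lost."""
    local.append(data)
    pending.append(data)     # drained to the remote site later
    return "ack"

local, remote = [], []
write_sync("order-1", local, remote)
assert local == remote       # guaranteed by synchronous mirroring
```

The trade-off is latency versus durability: synchronous writes wait on the remote site (painful over a high-latency WAN), while asynchronous writes return immediately but leave a window of potential data loss.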

“Cold standby” and “hot standby”

Cold standby
A process of keeping a copy of data or secondary system offline in case of disaster. If the primary system goes down, the systems and software have to be manually started up – in some cases configured – and data has to be restored before operation can continue.

Hot standby
This is a process of keeping secondary systems operational and switching over to them in the event of downtime on the primary system.

Disaster Recovery Method Cost Comparisons


The smaller the RPO and RTO, the shorter the downtime, but the cost will increase accordingly.

Considering the cost and asset value of each type of data, you need to find the optimum method for the level of protection required. The balance between in-house implementation and outsourced services will also affect costs.

To learn more about high availability and disaster recovery solutions at SIOS, click here.

Reproduced from SIOS

Filed Under: Clustering Simplified Tagged With: disaster recovery

Disaster Recovery

November 2, 2021 by Jason Aw Leave a Comment

How to Enable Disaster Recovery with a Single Clustering Software Solution

Protect Windows or Linux Applications Operating in any Combination of Physical, Virtual, Cloud, or Hybrid Cloud Infrastructures with Disaster Recovery

What is Disaster Recovery?

Disaster Recovery is Critical to Continued Business Operations

Disaster recovery (DR) is a strategy and set of policies, procedures, and tools that ensure critical IT systems, databases, and applications continue to operate and be available to users when a man-made or natural disaster happens. While the IT team owns the disaster recovery strategy, DR is an important component of every organization’s Business Continuity Plan, which is a strategy and set of policies, procedures, and tools that get the entire business back up and running after a disaster.

But when we speak of a disaster, it does not need to be a full-fledged hurricane, tornado, flood, or earthquake that impacts your business. Disasters come in many forms, including a cyber-attack, user error, fire, theft, vandalism, even a terrorist attack. In short, a disaster is any crisis that results in a down system for a long duration and/or major data loss on a large scale that impacts your IT infrastructure, data center, and your business.

In a recent Spiceworks survey, 59 percent of organizations indicated they had experienced one to three outages (that is, any interruption to normal levels of IT-related service) over the course of one year, 11 percent had experienced four to six, and 7 percent had experienced seven or more. The survey also indicates that larger companies, which rely on a greater number of services, are more likely to experience outages than smaller organizations: 71 percent of small businesses experienced one or more outages in the last 12 months, compared to 79 percent of mid-size businesses and 87 percent of large businesses. Looking at those statistics, you know you are living on borrowed time if you don’t have a disaster recovery plan in place.

But there is good news. Compared with statistics from previous years, it appears that organizations of all sizes and from all industries are doing better when it comes to having a disaster recovery plan in place. According to the same Spiceworks survey, 95 percent of organizations have a DR plan in place, but unfortunately, 23 percent never test or exercise their plan. Exercising your DR plan is as important as a fire drill or muster drill. Having a plan in place is just the first step; if the people involved in executing the plan don’t know what to do, you won’t be able to recover from a disaster.

High Availability Vs. Disaster Recovery

But before we go any further, let’s be clear on the difference between best practices for handling a system failure versus a disaster. To recover from a system failure, redundant systems, software, and data should be on your local area network (LAN). For critical database applications, you can replicate data synchronously across the LAN. This makes your standby instance “hot” and in sync with your active instance, so it is ready to take over immediately in the event of a failure. This is referred to as high availability (HA).

However, to recover systems, software, and data in the event of a disaster means redundant components must be on a wide-area network (WAN). With a WAN, data replication is asynchronous to avoid negatively impacting throughput performance. This means that updates to standby instances will lag updates made to the active instance, resulting in a delay during the recovery process. Since disasters are rare, some delay may be tolerable and is dependent upon (a) how critical it is to your business to achieve the lowest possible Recovery Time Objective (RTO) and Recovery Point Objective (RPO) and (b) how much budget you can allocate to achieve the best RTO and RPO.

RTO is the maximum tolerable duration of any outage, and RPO is the maximum amount of data loss that can be tolerated when a disaster happens. For disaster recovery, RTOs of many minutes or even hours are common with some solutions, as it is too expensive to try to recover across a WAN in just a few minutes. For mission-critical applications, your organization will want a low RPO, but the lower your RPO, the more processes you need in place to ensure all data has been replicated to the standby server before failover. This effort tends to increase recovery time.

But with the SIOS disaster recovery solution, you can achieve a minimal-to-no-data-loss RPO and an RTO of one to two minutes.

SIOS Delivers One Solution to Meet Your HA and DR Needs

Whether you need local HA within a single site or fast, efficient DR across multiple sites, SIOS solutions meet all your business continuity needs.

The SIOS disaster recovery solution is a multi-site, geographically dispersed cluster that provides RPOs of seconds and RTOs of minutes. What makes SIOS different than many other DR providers is that it offers one solution that meets both high availability and disaster recovery needs.

To support DR, you configure your clusters the same way as you do for high availability but with two distinct differences previously discussed:

  • The DR cluster node(s) are in a geographic site – on-premises, virtual, or in the cloud – that is further away from the HA instance.
  • The DR site is on a wide-area network (WAN), which means that data replication will be asynchronous to avoid negatively impacting throughput performance.

Remember, asynchronous data replication means that updates to the DR instances will lag updates made to the active instance but typically only by a few seconds at most. But with SIOS’ incredibly fast data replication across the WAN, you can keep real-time copies of data synchronized across multiple servers and data centers to achieve both HA and DR.

In addition to one single solution for HA/DR and real-time data replication, the SIOS HA/DR solution also provides:

  • Block-level data compression to minimize network loads
  • Bandwidth throttling to regulate and minimize network congestion
  • WAN optimization to improve network performance
  • Integration with push-button failover to support DR and automatic failover to support HA
  • An agnostic platform approach, allowing you to choose on-premises, virtual, cloud, or a hybrid DR solution
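Of the capabilities above, bandwidth throttling is easiest to picture with a small sketch. The token-bucket limiter below is a generic illustration of the idea (not SIOS's implementation): replication chunks are sent only when enough transmission budget has accumulated, which caps the load placed on the WAN.

```python
class TokenBucket:
    """Generic token-bucket rate limiter for replication traffic."""

    def __init__(self, rate_bytes_per_s, capacity_bytes):
        self.rate = rate_bytes_per_s      # sustained budget per second
        self.capacity = capacity_bytes    # maximum burst size
        self.tokens = capacity_bytes      # start with a full budget

    def refill(self, elapsed_s):
        """Accumulate budget as time passes, up to the burst cap."""
        self.tokens = min(self.capacity, self.tokens + self.rate * elapsed_s)

    def try_send(self, nbytes):
        """Send a replication chunk only if the budget allows it."""
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False

bucket = TokenBucket(rate_bytes_per_s=1_000_000, capacity_bytes=2_000_000)
print(bucket.try_send(1_500_000))  # True: within the initial budget
print(bucket.try_send(1_500_000))  # False: must wait for tokens to refill
bucket.refill(1.0)                 # one second passes at 1 MB/s
print(bucket.try_send(1_500_000))  # True again
```

The same structure underlies most throttling schemes: the rate parameter bounds steady-state network usage while the capacity parameter allows short bursts.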

The following case study showcases the use of SIOS DataKeeper to deliver HA and DR in a single solution.

———————————————————————————————————————————–

Enabling HA and DR Protection at a Premier Health Center

ALYN Hospital, located in Israel, is a premier pediatric rehabilitation health center, specializing in diagnosing and rehabilitating infants, children, and adolescents with physical disabilities. Parents bring their children from Israel and abroad to receive a wide range of medical services, paramedical therapies, and additional state-of-the-art rehabilitation services.

The Search for the Right Solution

ALYN Hospital operates a variety of applications – including electronic medical records (EMR), customer relationship management (CRM), SQL Server databases, Microsoft Exchange, and Microsoft Office – in support of its clinical and administrative operations. As a healthcare provider, the hospital is subject to strict government regulations and needed to implement strong DR provisions to ensure the protection and availability of its mission-critical applications. The hospital chose Hyper-V Replica to support its DR strategy, operating two physically separated server rooms on-premises so that all critical virtual machines (VMs) running on any Hyper-V host server could be replicated to a host in the other room. Unfortunately, this configuration was not satisfying RPO and RTO requirements, so the IT team started to investigate other options.

In looking for the right DR solution, the IT team considered Windows Server Failover Clustering (WSFC), which uses shared storage. Unfortunately, ALYN did not have a SAN in place and because of budget restrictions, it was cost-prohibitive to implement identical SANs in both server rooms. For this reason, ALYN investigated third-party solutions.

In its search for third-party failover clustering software, ALYN established three criteria:

  • The solution had to work with existing hardware.
  • The solution had to provide both high availability (HA) and disaster recovery (DR) protection across all hospital critical applications.
  • The total cost of ownership (TCO) had to fit within the department’s limited budget.

SIOS DataKeeper – The Obvious Choice

After evaluating several different solutions, the IT staff chose SIOS DataKeeper, which the team described as a solution “that delivers carrier-class capabilities with a remarkably low total cost of ownership” and delivers HA and DR in a single cost-effective solution.

SIOS DataKeeper combines real-time, block-level data replication with continuous application-level monitoring and flexible failover/failback policies in a total solution that is easy to implement and manage. DataKeeper leverages WSFC and maintains compatibility with the operating environment, making it easy for the IT team to quickly learn how to use the solution and quickly complete HA configurations for all applications.

With DataKeeper, the IT team can create three-node SANless failover clusters with a single active instance and two standby instances. With this configuration, ALYN can continuously update systems and software without disrupting operations because the active instance can be moved to any server in a three-node cluster and remain fully protected during periods of planned hardware and software maintenance.
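The maintenance scenario described above can be illustrated with a toy model (not SIOS code): with three nodes, the active role can move to any node, so at least one standby remains available while another node is taken down for patching.

```python
class Cluster:
    """Toy three-node cluster: one active instance, the rest standbys."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.active = self.nodes[0]

    def standbys(self):
        return [n for n in self.nodes if n != self.active]

    def switch_over(self, target):
        """Move the active role to another node (e.g. for maintenance)."""
        assert target in self.nodes
        self.active = target

c = Cluster(["node-a", "node-b", "node-c"])
print(c.standbys())            # ['node-b', 'node-c']
c.switch_over("node-b")        # take node-a down for patching
print(c.active, c.standbys())  # node-b ['node-a', 'node-c']
```

Even mid-maintenance, `standbys()` never drops to zero in a three-node cluster, which is why the active instance stays fully protected throughout planned hardware and software updates.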

In addition, SIOS works with any type of storage and provides WAN-optimized data replication, which simplified the implementation of ALYN's remote DR site. To maintain high transactional throughput performance, data replication across the WAN occurs asynchronously, but SIOS DataKeeper employs special techniques to optimize data transmission, allowing ALYN to achieve demanding RPOs and RTOs.

The Bottom Line

Today, SIOS DataKeeper is providing high availability protection for all of ALYN Hospital's mission-critical applications. Comments Uri Inbar, ALYN Hospital IT Director, "With SIOS we found a solution that delivers carrier-class capabilities with a remarkably low total cost of ownership. For us, it was an obvious choice."

ALYN Hospital tests the configuration regularly, and routinely changes the active and standby designations, while redirecting the data replication as needed during planned software updates. The applications continue to run uninterrupted.

———————————————————————————————————————————–

Final Thoughts on SIOS Disaster Recovery

In a Windows environment, SIOS DataKeeper for Windows Server is available in both a Standard Edition and a more robust Cluster Edition. SIOS DataKeeper Standard Edition provides real-time data replication for disaster recovery protection in a Windows Server environment. SIOS DataKeeper Cluster Edition seamlessly integrates with Windows Server Failover Clustering (WSFC), enabling both high availability and disaster recovery configurations.

SIOS LifeKeeper and DataKeeper support all major Linux distributions, including Red Hat Enterprise Linux, SUSE Linux Enterprise Server, CentOS, and Oracle Linux, and accommodate a wide range of storage architectures.

Visit the references below for more information on SIOS DataKeeper or SIOS LifeKeeper:

References

  • https://betanews.com/2019/05/28/disaster-recovery-sql-server/
  • https://community.spiceworks.com/blog/3138-data-snapshot-how-well-equipped-are-businesses-to-bounce-back-from-disaster
  • https://www.spiceworks.com/press/releases/spiceworks-study-reveals-one-in-four-companies-never-test-their-disaster-recovery-plan/

See our White Paper: Understanding Disaster Recovery Options for SQL Server

Reproduced with permission from SIOS

Filed Under: Clustering Simplified Tagged With: disaster recovery, SIOS
