SIOS SANless clusters


How does Data Replication between Nodes Work?

June 19, 2022 by Jason Aw Leave a Comment

In the traditional datacenter scenario, data is commonly stored on a storage area network (SAN). The cloud environment doesn’t typically support shared storage.

SIOS DataKeeper presents ‘shared’ storage using replication technology to create a copy of the currently active data. It creates a NetRAID device that works as a RAID1 device (data mirrored across devices).

Data changes are replicated from the Mirror Source (disk device on the active node – Node A in the diagram below) to the Mirror Target (disk device on the standby node – Node B in the diagram below).

In order to guarantee consistency of data across both devices, only the active node has write access to the replicated device (/datakeeper mount point in the example below). Access to the replicated device (the /datakeeper mount point) is not allowed while it is a Mirror Target (i.e., on the standby node).
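The write-gating rule described above can be sketched as a toy model. This is purely illustrative Python (the class and method names are invented for this sketch, not SIOS's actual NetRAID implementation): writes are accepted only on the active node, every accepted write is mirrored to the target, and a failover simply swaps the roles.

```python
class MirroredVolume:
    """Toy RAID1-style mirror: writes land on the Mirror Source and are
    replicated to the Mirror Target; the standby side is write-locked."""

    def __init__(self):
        self.source = {}   # block -> data on the active node (Mirror Source)
        self.target = {}   # block -> data on the standby node (Mirror Target)
        self.active = "A"  # which node currently owns the Mirror Source

    def write(self, node, block, data):
        if node != self.active:
            # Only the active node may write; this is what keeps the
            # two copies from diverging.
            raise PermissionError(f"node {node} is a Mirror Target: no write access")
        self.source[block] = data
        self.target[block] = data   # replicate the change to the standby

    def failover(self):
        # The standby is promoted; roles swap, and because every write was
        # mirrored, the new Mirror Source already holds identical data.
        self.active = "B" if self.active == "A" else "A"
        self.source, self.target = self.target, self.source

vol = MirroredVolume()
vol.write("A", 0, b"payroll")   # Node A (active) writes; change is mirrored
vol.failover()                  # Node B takes over with identical data
print(vol.source[0])            # b'payroll'
```

After failover, an attempt to write from the old active node raises `PermissionError`, mirroring the rule that the `/datakeeper` mount point is inaccessible on whichever node is currently the Mirror Target.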

Reproduced with permission from SIOS

Filed Under: Clustering Simplified Tagged With: data replication

Data Replication

December 13, 2021 by Jason Aw Leave a Comment

Real-Time Data Replication for High Availability

What is Data Replication

Data replication is the process by which data residing on a physical or virtual server or cloud instance (the primary instance) is continuously replicated or copied to a secondary server or cloud instance (the standby instance). Organizations replicate data to support high availability, backup, and/or disaster recovery. Depending on the location of the secondary instance, data is either synchronously or asynchronously replicated. How the data is replicated impacts Recovery Time Objectives (RTOs) and Recovery Point Objectives (RPOs).

For example, if you need to recover from a system failure, your standby instance should be on your local area network (LAN). For critical database applications, you can then replicate data synchronously from the primary instance across the LAN to the secondary instance. This makes your standby instance “hot” and in sync with your active instance, so it is ready to take over immediately in the event of a failure. This is referred to as high availability (HA).

In the event of a disaster, you want to be sure that your secondary instance is not co-located with your primary instance. This means you want your secondary instance in a geographic site away from the primary instance or in a cloud instance connected via a WAN. To avoid degrading throughput performance, data replication over a WAN is asynchronous. This means that updates to standby instances lag updates made to the active instance, so a failover may lose the most recent changes and recovery takes longer.
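The synchronous-versus-asynchronous trade-off can be sketched in a few lines of Python. This is a toy model with invented names, not how SIOS DataKeeper is actually implemented: a synchronous write commits only once the standby has the copy (RPO of zero), while an asynchronous write commits immediately and leaves the copy trailing in a backlog.

```python
class Replicator:
    """Toy model contrasting synchronous and asynchronous replication."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.primary = []    # records committed on the active instance
        self.standby = []    # records the standby has actually received
        self.backlog = []    # async changes still in flight over the WAN

    def write(self, record):
        self.primary.append(record)
        if self.synchronous:
            # Commit completes only once the standby holds the copy:
            # RPO = 0, but every write pays the network round trip,
            # which is why sync replication is kept to low-latency links.
            self.standby.append(record)
        else:
            # Commit returns immediately; replication trails behind,
            # which is WAN-friendly but leaves a window of unsent data.
            self.backlog.append(record)

    def drain(self):
        # The standby catches up as the link delivers queued changes.
        self.standby.extend(self.backlog)
        self.backlog.clear()

sync_rep = Replicator(synchronous=True)
async_rep = Replicator(synchronous=False)
for i in range(3):
    sync_rep.write(i)
    async_rep.write(i)

# If the primary fails right now, the sync standby is current,
# while the async standby is missing the in-flight writes.
print(len(sync_rep.primary) - len(sync_rep.standby))    # 0 records at risk
print(len(async_rep.primary) - len(async_rep.standby))  # 3 records at risk
```

The gap between `primary` and `standby` at the moment of failure is exactly what an RPO measures: zero for the synchronous replica, non-zero for the asynchronous one until the backlog drains.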

Why Replicate Data to the Cloud?

There are five reasons why you want to replicate your data to the cloud.

  1. As we discussed above, cloud replication keeps your data offsite and away from the company’s site. While a major disaster, such as a fire, flood, storm, etc., can devastate your primary instance, your secondary instance is safe in the cloud and can be used to recover the data and applications impacted by the disaster.
  2. Cloud replication is less expensive than replicating data to your own data center. You can eliminate the costs associated with maintaining a secondary data center, including the hardware, maintenance, and support costs.
  3. For smaller businesses, replicating data to the cloud can be more secure, especially if you do not have security expertise on staff. The physical and network security provided by cloud providers are difficult to match in-house.
  4. Replicating data to the cloud provides on-demand scalability. As your business grows or contracts, you do not need to invest in additional hardware to support your secondary instance or have that hardware sit idle if business slows down. You also have no long-term contracts.
  5. When replicating data to the cloud, you have many geographic choices, including having a cloud instance in the next city, across the country, or in another country as your business dictates.

Why Replicate Data Between Cloud Instances?

While cloud providers take every precaution to ensure 100 percent uptime, individual cloud servers can still fail as a result of physical damage to hardware or software glitches, the same reasons on-premises hardware fails. For this reason, organizations that run their mission-critical applications in the cloud should replicate their cloud data to support high availability and disaster recovery. You can replicate data between availability zones in a single region, between regions, between different cloud platforms, to on-premises systems, or in any hybrid combination.

SIOS Real-Time Data Replication for High Availability and Disaster Recovery

SIOS DataKeeper™ uses efficient, block-level data replication to keep your primary and secondary instances synchronized. If a failover happens, the secondary instance takes over, providing users with access to the most recent data. With SIOS solutions, RPO is zero, and RTO depends on the application but is typically 30 seconds to a few minutes.

SIOS products uniquely protect any Windows- or Linux-based application operating in physical, virtual, cloud or hybrid cloud environments and in any combination of site or disaster recovery scenarios, enabling high availability and disaster recovery for applications such as SAP and databases, including Oracle, HANA, MaxDB, SQL Server, DB2, and many others. The “out-of-the-box” simplicity, configuration flexibility, reliability, performance, and cost-effectiveness of SIOS products set them apart from other clustering software.

In a Windows environment, SIOS DataKeeper Cluster Edition seamlessly integrates with and extends Windows Server Failover Clustering (WSFC) by providing a performance-optimized, host-based data replication mechanism. While WSFC manages the software cluster, SIOS performs the data replication to enable disaster protection and ensure zero data loss in cases where shared storage clusters are impossible or impractical, such as in cloud, virtual, and high-performance storage environments.

In a Linux environment, SIOS LifeKeeper and SIOS DataKeeper provide a tightly integrated combination of high availability failover clustering, continuous application monitoring, data replication, and configurable recovery policies, protecting your business-critical applications from downtime and disasters.

———————————————————————————————————————————

Here is a real-world example of how one leading manufacturing company uses SIOS to create a high availability solution in the cloud using real-time data replication.

How to Achieve HA in a Cloud Environment with Real-Time Data Replication

Bonfiglioli is a leading Italian design, manufacturing, and distribution company, specializing in industrial automation, mobile machinery, and wind energy products and employing over 3,600 employees in locations around the globe. To run its business, the company relies on various mission-critical applications, including its SAP ERP system. The company’s IT infrastructure includes an on-premises VMware data center and a remote data center for business continuity and disaster protection. Since most of their applications run in a Windows environment, Bonfiglioli used guest-level Windows Server failover clustering in their VMware environment to provide high availability and disaster protection.

The company’s IT team implemented a program to move part of its IT operations into the Microsoft Azure cloud and to leverage Azure as their disaster recovery site. An important requirement of the company’s migration plan was to ensure the cloud architecture could provide better high availability protection than before and ensure Bonfiglioli could continue to meet its strict Service Level Agreements (SLAs).

In its on-premises environment, the company uses VMware clustering, which allows Windows Server Failover Clustering (WSFC) to manage failover to a secondary server in the event of an infrastructure failure. However, it was a challenge to provide this type of protection in the cloud because guest-clustering with shared-bus disks is not a viable cloud solution. Creating a cluster in VMware using Raw Device Mapping (RDM) and shared-bus disks is challenging and creates limitations for backing up the virtual machines.

The Solution

After evaluating several solutions, Bonfiglioli chose SIOS DataKeeper as their cloud high availability and disaster recovery solution upon learning that SIOS DataKeeper is the only certified high availability clustering solution for SAP in a public cloud. In addition, Bonfiglioli’s management consulting partner, BGP, had experience with SIOS DataKeeper and knew that it is easy to install, transparent to the operating system, and a proven, highly effective solution.

With SIOS, the IT team fashioned a cluster environment without RDM. They created a two-node cluster in VMware and added SIOS DataKeeper Cluster Edition to synchronize storage via real-time data replication in each cluster instance. In an on-premises environment, synchronized storage appears to WSFC as a single shared storage disk.

SIOS DataKeeper also provides high availability protection for the company’s SAP instance and eliminates single points of failure. Using SIOS DataKeeper, the IT team replicated an SSD-tiered disk partition in the company’s on-premises data center using real-time data replication. This allows Bonfiglioli to restore its virtual machines to Microsoft Azure in the event of a disaster.

The Results

Daniele Bovina, Systems Architect at Bonfiglioli, comments about the results, “SIOS DataKeeper gave us an easy way to move our business-critical SAP system to the Microsoft Azure cloud while meeting our stringent SLAs for availability, disaster recovery, and performance.”

———————————————————————————————————————————

For more information about SIOS Clustering Solutions, contact us or request a free trial.

Reproduced from SIOS

Filed Under: Clustering Simplified Tagged With: data replication, High Availability

Glossary: Data Replication

May 19, 2021 by Jason Aw Leave a Comment

Glossary of Terms: Data Replication

Definition: The practice of copying information between redundant servers and keeping the copies consistent to improve reliability, fault tolerance, or accessibility.

Reproduced from SIOS

Filed Under: Clustering Simplified Tagged With: data replication, glossary

Case Study: Chris O’Brien Lifehouse Hospital Ensures High Availability in the AWS Cloud with SIOS DataKeeper

May 12, 2020 by Jason Aw Leave a Comment

SIOS Chosen for its Ability to Deliver both High Availability and High Performance

Chris O’Brien Lifehouse (www.mylifehouse.org.au) is an integrated and focused center of excellence specializing in state-of-the-art treatment and research for patients suffering from rare and complex cancers. Lifehouse offers everything a cancer patient might need in one place, including advanced oncology surgery, chemotherapy, radiation therapy, clinical trials, research, education, complementary therapies, and psychosocial support. Situated alongside Royal Prince Alfred Hospital and the University of Sydney in Camperdown, the not-for-profit hospital sees more than 40,000 patients annually for screening, diagnosis, and treatment. As one of Australia’s largest clinical trial centers, Lifehouse also provides its patients access to the world’s latest cancer treatment breakthroughs.

The Environment

Lifehouse uses the MEDITECH healthcare Electronic Medical Record and patient administration system, which stores the electronic health records for all patients in a database.

“The health information system and database are vital to the care we provide, and if either goes down, patient records would not be accessible, and that would paralyze the hospital’s operations,” explains Peter Singer, Director Information Technology at Lifehouse.

In the hospital’s datacenter, mission-critical uptime has been provided by Windows Server Failover Clustering (WSFC) running on a Storage Area Network (SAN). But like many organizations, Lifehouse wanted to migrate to the cloud to take advantage of its superior agility and affordability.

The Challenge

Lifehouse chose Amazon Web Services as its cloud service provider, and had hoped to “lift and shift” its environment directly to the AWS cloud. To simulate its on-premises configuration, Peter chose a “cloud volumes” service available in the AWS Marketplace. Failover clusters were configured using software defined storage volumes to share data between active and standby instances, and testing proved that the approach could provide the automatic failover needed to satisfy the hospital’s demanding recovery point and recovery time objectives.

There was a problem, however: the use of software-defined cloud volumes had a substantial adverse impact on throughput performance. With so many elements and layers involved, performance problems are notoriously difficult to troubleshoot in software-defined configurations deployed in the cloud. With the “No Protection” option specified, the cloud volumes performed well. But “No Protection” was not really an option for the hospital.

“We made every reasonable effort to find and fix the root cause, and eventually concluded that software-defined storage would never be able to deliver the throughput performance we needed,” recalls Peter, who is responsible for the mission-critical MEDITECH application and its database. So the team at Lifehouse began looking for another solution.

The Evaluation

In its search for another solution capable of providing both high availability and high performance, Lifehouse established four criteria:

  • Validation for use in the AWS cloud
  • Ability to work across multiple Availability Zones
  • Performance that was as good as or better than what had been achieved on-premises
  • Security / Privacy with support for encryption in motion and at rest

Validation was important to minimize risk associated with using a third-party solution in the cloud. The ability to work across multiple Availability Zones would assure business continuity in the event an entire AWS datacenter was impacted by a localized disaster. The sub-millisecond latency AWS delivers between Availability Zones would be critical to being able to replicate data synchronously to “hot” standby instances to meet the hospital’s demanding recovery time and recovery point objectives.

After conducting an exhaustive search, Peter concluded that the best available solution was SIOS DataKeeper Cluster Edition from SIOS Technology. SIOS DataKeeper was available on the AWS Marketplace, which assured it was proven to operate reliably in the AWS cloud. And because it did not use software-defined storage, Peter was confident SIOS DataKeeper would be able to deliver the performance Lifehouse needed.

The Solution

SIOS DataKeeper provides the high-performance, synchronous data replication Lifehouse needs. By using real-time, block-level data mirroring between the local storage attached to all active and standby instances, the solution overcomes the problems caused by the lack of a SAN in the cloud, including the poor performance that often plagues software-defined storage. The resulting SANless cluster is compatible with Windows Server Failover Clustering, provides continuous monitoring for detecting failures at the application and database levels, and offers configurable policies for failover and failback.

Lifehouse currently has eight instances in SANless failover clusters to support its MEDITECH application and database across different AWS availability zones to protect against widespread disasters. The latency inherent across the long distances involved normally requires the use of asynchronous data replication to avoid delaying commits to the active instance of the database. But the real-time, block level data mirroring technology used in SIOS DataKeeper still enables Peter Singer to achieve a near-zero recovery point.

The Results

Unlike software-defined shared storage, SIOS DataKeeper is purpose-built for high-performance high availability, so it came as no surprise to Peter Singer that the cloud-based configuration now works as needed. What was a bit surprising was just how easy the solution has been to implement and operate: “We were able to go from testing to production in a matter of days. Ongoing maintenance is also quite simple, which we expect will minimize our operational expenditures associated with high availability and disaster recovery.”

SIOS DataKeeper has enabled Lifehouse to take full advantage of the economies of scale afforded in the cloud without sacrificing uptime or performance. “If it were not for SIOS, we might not have been able to migrate our environment to the cloud,” Peter Singer concluded.

Download the pdf

Filed Under: Success Stories Tagged With: data replication, High Availability

Platforms to replicate data (Host-Based Replication vs SAN Replication)

December 10, 2018 by Jason Aw Leave a Comment

Two common platforms for replicating data are the server host that operates on the data and the storage array that holds the data.

When creating remote replicas for business continuity, the decision whether to deploy a host- or storage-based solution depends heavily on the platform that is being replicated and the business requirements for the applications that are in use. If the business demands zero impact to operations in the event of a site disaster, then host-based techniques provide the only feasible solution.

Host-Based Replication

The first of the two platforms is host-based replication, which doesn’t lock users into a particular storage array from any one vendor. SIOS SteelEye DataKeeper, for example, can replicate from any array to any array, regardless of vendor. This ability ultimately lowers costs and gives users the flexibility to choose what is right for their environment. Most host-based replication solutions can also replicate data natively over IP networks, so users don’t need to buy expensive hardware to achieve this functionality.

Host-based solutions are storage-agnostic, giving IT managers complete freedom to choose any storage that matches the needs of the enterprise. The replication software functions with any storage hardware that can be mounted to the application platform, offering heterogeneous storage support. Solutions that operate at the block or volume level are also ideally suited for cluster configurations.

One disadvantage is that host-based solutions consume server resources and can affect overall server performance. Despite this possibility, a host-based solution might still be appropriate when IT managers need a multi-vendor storage infrastructure or have a legacy investment or internal expertise in a specific host-based application.

Storage-Based Replication

The second platform is storage-based replication, which is OS-independent and adds no processing overhead on the host. However, vendors often require users to replicate from and to similar arrays. This requirement can be costly, especially when you use a high-performance disk at your primary site and must then use the same at your secondary site. Also, storage-based solutions natively replicate over Fibre Channel and often require extra hardware to send data over IP networks, further increasing costs.

A storage-based alternative does provide the benefit of an integrated solution from a dedicated storage vendor. These solutions leverage the controller of the storage array as an operating platform for replication functionality. The tight integration of hardware and software gives the storage vendor unprecedented control over the replication configuration and allows for service-level guarantees that are difficult to match with alternative replication approaches. Most storage vendors have also tailored their products to complement server virtualization and use key features such as virtual machine storage failover. Some enterprises might also have a long-standing business relationship with a particular storage vendor; in such cases, a storage solution might be a relevant fit.

Choices

High quality of service comes at a cost, however. Storage-based replication invariably sets a precondition of like-to-like storage device configuration. This means that two similarly configured high-end storage arrays must be deployed to support replication functionality, increasing costs and tying the organization to one vendor’s storage solution.

This locking in to a specific storage vendor can be a drawback. Some storage vendors have compatibility restrictions within their storage-array product line, potentially making technology upgrades and data migration expensive. When investigating storage alternatives, IT managers should pay attention to the total cost of ownership: The cost of future license fees and support contracts will affect expenses in the longer term.

Cost is a key consideration, but it is affected by several factors beyond the cost of the licenses. Does the solution require dedicated hardware, or can it be used with pre-existing hardware? Will the solution require network infrastructure expansion and if so, how much? If you are using replication to place secondary copies of data on separate servers, storage, or sites, realize that this approach implies certain hardware redundancies. Replication products that provide options to redeploy existing infrastructure to meet redundant hardware requirements demand less capital outlay.

Pros And Cons

Before deciding between a host- or storage-based replication solution, carefully consider the pros and cons of each, as illustrated in the following table.

Host-Based Replication

Pros
  • Storage agnostic
  • Sync and async replication
  • Data can reside on any storage
  • Unaffected by storage upgrades

Cons
  • Use of computing resources on host

Best Fit
  • Multi-vendor storage environment
  • Need option of sync or async
  • Implementing failover cluster
  • Replicating to multiple targets

Storage-Based Replication

Pros
  • Single vendor for storage and replication
  • No burden on host system
  • OS agnostic

Cons
  • Vendor lock-in
  • Higher cost
  • Data must reside on array
  • Distance limitations of Fibre Channel

Best Fit
  • Prefer single vendor
  • Limited distance and controlled environment
  • Replicating to single target

To understand how SIOS solutions replicate data on either platform, read our success stories.

Reproduced with permission from Linuxclustering

Filed Under: Clustering Simplified Tagged With: data replication, platforms to replicate data, storage
