SIOS SANless clusters

8 Changes That Can Undermine Your High Availability Solution

March 2, 2023 by Jason Aw Leave a Comment


As VP of Customer Experience, I have observed that most organizations are conscious and careful about deploying any tools or processes that could affect their business's high availability. These companies typically take great care with HA, including strict change vetting for any HA clustering, disaster recovery (DR), security, or backup solution change. Most companies understand that changes to these tools need to be carefully considered and tested to avoid impacting overall application availability and system stability. IT administrators know that even the most inconspicuous change in their HA clustering, DR, security, or backup solution can lead to a major disruption.

However, changes to other workplace and productivity tools are often not considered with the same diligence.

Here are eight changes that can undermine your HA solution:

  1. Lost Documentation

Your existing tools often hold a great deal of documentation about the company, its decisions, integrations, and overall HA architecture. As teams transition to new tools, these documents are often lost, or access to them becomes blocked or hampered.

Suggested Improvement: Export and import all existing documents into the new tool.  Use archive storage and backups to retain complete copies of the data before the import.

  2. Lost Requirements

Similar to lost documents, requirements are often the first thing to be lost when transferring tools.

Suggested Improvement: Document known requirements and export requirements-related documents from any existing productivity tools.

  3. Lost History and Mindshare

Almost as important as the documentation and requirements is the history behind changes, revisions, and decisions. Many organizations keep historical information within workplace and office productivity tools.  Such information could include decisions around tools and solutions that have been previously evaluated.  When these workplace tools are changed or transitioned, this type of history can be lost. Existing tools often have a lot of tacit knowledge involved with them as well.  As the new tools are integrated, that knowledge and mindshare disappears. Two decades ago our team migrated bug tracking solutions.  The knowledge gap between the tools was huge and impacted multiple departments, including the IT team now tasked with managing, backing up, and resolving issues.

Suggested Improvement: Be sure to adequately train users and transfer mindshare and knowledge to the new tools, and ensure that the history, context, and decisions around the current and previous tools are documented before retiring the current tool.

  4. Lost Access / Access Control

Every new tool has a different set of security and access rules. Often, during the transition, teams end up with too many admins, too few admins, or overly restrictive permissions.

Suggested Improvement: Map access and user controls, based on requirements and security rules, in advance and have a process for quick resolution.

  5. Lost Contacts

Email and contact system migrations are rarely seamless; even upgrades between versions of an existing system can have consequences. One downside of migrating from one tool to another (e.g., Exchange to Gmail) can be lost contacts. Our team once worked with a customer who called our support team for help retrieving their partner contacts; their email system transition had stalled, and access to critical contacts was delayed.

Suggested Improvement: Plan for contact migration and validation. Be sure that any critical contacts for your HA cluster are part of a validated migration step.

  6. Broken Integrations

Broken integrations are a common issue affecting high availability, monitoring, and alerting. As companies move to newer productivity tools, existing integrations may no longer work and may require additional development. For example, a company previously using Skype for messaging moved to Slack; many of the tools that delivered messages via Skype needed to be adjusted. In your HA environment, a broken integration between dashboards or alert systems could mean critical notifications are not received in a timely manner.

Suggested Improvement: Map out any automated workflows to help identify integration points between tools.  Also work to identify any new requirements and integration opportunities.  Plan and test integrations during the proof of concept or controlled deployment phase.

  7. Lost Champions

Every tool set has a champion and a critic.  The champion may or may not be the same as your administrator.  The role of the champion changes within each organization and often with each tool, but what is common among them is their willingness to address issues, problems, or challenges with the new productivity tool for the benefit of themselves and others.  The champion is the first to find the new features, uncover and report new issues, and help onboard new people to the toolset.  Champions go beyond mindshare and history.  Often with the changing of tool sets, your team will lose a champion.

  8. Lost Productivity

New tools, even those not directly related to HA, have an impact on your team's productivity. Even tools related to priority management, development, and code repositories require ramp-up and onboarding time. This time often translates into lost productivity, which can translate into risks to your cluster. Make sure your processes for all existing and new tools are well documented so that the change to a new tool does not cause confusion, break process flow, and lead to even greater losses of productivity.

Suggested Improvement: Reduce the risk of lost productivity by using training tools, leveraging a product champion, and making sure that the rollout focuses on shortening the learning curve.
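The integration mapping suggested under "Broken integration" above can be kept honest with a simple inventory that tracks which integration points have had a test alert delivered end to end. This is an illustrative sketch only; the system names and fields below are hypothetical, not part of any SIOS product:

```python
# Minimal sketch of an integration inventory for a tool migration.
from dataclasses import dataclass

@dataclass
class Integration:
    source: str     # system that emits the event (e.g. a cluster monitor)
    target: str     # system that should receive it (e.g. a chat or paging tool)
    channel: str    # transport, e.g. "webhook" or "email"
    verified: bool  # has a test alert been delivered end to end?

def unverified(integrations):
    """Return the integration points that still need an end-to-end test."""
    return [i for i in integrations if not i.verified]

# Hypothetical inventory assembled during a messaging-tool migration
inventory = [
    Integration("cluster-monitor", "ops-chat", "webhook", verified=True),
    Integration("cluster-monitor", "on-call-pager", "email", verified=False),
    Integration("backup-job", "ops-chat", "webhook", verified=False),
]

for i in unverified(inventory):
    print(f"TODO: send a test alert {i.source} -> {i.target} via {i.channel}")
```

Running the sketch prints a to-do line for each integration that has not yet been verified, which makes a good checklist for the proof-of-concept phase.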

Changing workplace productivity tools without undermining your high availability solution requires capturing requirements, identifying key documents, transferring mindshare, mapping dependencies, testing and configuring proper access, and identifying a toolset champion. It also means making sure that your new tools actually improve productivity rather than pulling your key resources away from maintaining uptime.

Cassius Rhue, VP Customer Experience

Reproduced with permission from SIOS

Filed Under: Clustering Simplified Tagged With: disaster recovery, High Availability

High Availability Options for SQL Server on Azure VMs

February 28, 2023 by Jason Aw Leave a Comment


Microsoft Azure infrastructure is designed to provide high availability for your applications and data. Azure offers a variety of infrastructure options for achieving high availability, including Availability Zones, Paired Regions, redundant storage, and high-speed, low-latency network connectivity. All of these services are backed by Service Level Agreements (SLAs) to ensure the availability of your business-critical applications. This blog post will focus on high availability options when running SQL Server in Azure Virtual Machines.

Azure Infrastructure

Before we jump into the high availability options for SQL Server, let’s discuss the vital infrastructure that must be in place. Availability Zones, Regions, and Paired Regions are key concepts in Azure infrastructure that are important to understand when planning for the high availability of your applications and data.

Availability Zones are physically separate locations within a region that provide redundant power, cooling, and networking. Each Availability Zone consists of one or more data centers. By placing your resources in different Availability Zones, you can protect your applications and data from outages caused by planned or unplanned maintenance, hardware failures, or natural disasters. When leveraging Availability Zones for your SQL Server deployment, you qualify for the 99.99% availability SLA for Virtual Machines.
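To make SLA figures like 99.99% concrete, an availability percentage translates directly into a maximum downtime budget per period; 99.99% allows roughly 52.6 minutes of downtime per year. A quick back-of-the-envelope calculation (a simple sketch, not tied to any Azure tooling):

```python
# Convert an availability SLA percentage into a maximum downtime budget.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes (ignoring leap years)

def max_downtime_minutes(sla_percent, period_minutes=MINUTES_PER_YEAR):
    """Downtime allowed under a given SLA over a period (default: one year)."""
    return period_minutes * (1 - sla_percent / 100)

for sla in (99.9, 99.95, 99.99):
    print(f"{sla}% availability -> {max_downtime_minutes(sla):.1f} min of downtime per year")
```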

Regions are geographic locations where Azure services are available. Azure currently has more than 60 regions worldwide, each with multiple Availability Zones. By placing your resources in different regions, you can provide even greater protection against outages caused by natural disasters or other significant events.

Paired Regions are pre-defined region pairs that have unique relationships. Most notably, paired Regions replicate data to each other when geo-redundant storage is in use. The other benefits of paired regions are region recovery sequence, sequential updating, physical isolation, and data residency. When designing your disaster recovery plan, it is advisable to use Paired Regions for your primary and disaster recovery locations.
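As a rough illustration of how region pairing can drive the choice of a disaster recovery site, the lookup below uses a small, illustrative subset of pairs; consult the Azure documentation for the current, authoritative pairing list:

```python
# Illustrative subset of Azure region pairs -- an example only; the
# authoritative list lives in the Azure documentation.

REGION_PAIRS = {
    "East US": "West US",
    "North Europe": "West Europe",
    "Southeast Asia": "East Asia",
}

def dr_region(primary):
    """Suggest the paired region as a DR site, in either direction."""
    reverse = {v: k for k, v in REGION_PAIRS.items()}
    return REGION_PAIRS.get(primary) or reverse.get(primary)

print(dr_region("North Europe"))  # West Europe
print(dr_region("West Europe"))   # North Europe
```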

Using Availability Zones and Paired Regions in conjunction with high availability options such as Availability Groups and Failover Cluster Instances, you can create highly available, resilient SQL Server deployments that can withstand a wide range of failures, minimizing downtime.

SQL Server Availability Groups and Failover Cluster Instances

SQL Server Availability Groups (AGs) and SQL Server Failover Cluster Instances (FCIs) are both high availability (HA) and disaster recovery (DR) solutions for SQL Server, but they work in different ways.

An AG is a feature of SQL Server Enterprise Edition that provides an HA solution by replicating a database across multiple servers (called replicas) to ensure that the database remains available in case of failure. AGs can provide HA for a single database or for multiple databases.

SQL Server Standard Edition supports something called a Basic AG. There are some limitations to Basic AGs. First, a Basic AG supports only a single database, so if you have more than one database, you need a separate AG for each, along with an associated IP address and load balancer. Additionally, Basic AGs do not support read-only replicas. While Basic AGs provide a simple way to implement HA for a single database, they may not be suitable for more complex scenarios.

On the other hand, a SQL Server FCI is a Windows Server Failover Cluster (WSFC) that provides an HA solution by creating a cluster of multiple servers (called nodes) that use shared storage. In the event of a failure, the SQL Server instance running on one node can fail over to another.

In SQL Server 2022 Enterprise Edition, the new Contained Availability Groups (CAGs) address some AG limitations by allowing users to add system databases to a CAG so that they can be replicated. This eliminates the need to manually synchronize things like SQL logins and SQL Agent jobs.

Availability Groups and Failover Cluster Instances have their own pros and cons. AGs have advanced features like readable secondaries and synchronous and asynchronous replication. However, AGs require the Enterprise Edition of SQL Server, which can be cost-prohibitive, particularly if you don’t need any other Enterprise Edition features.

FCIs protect the entire SQL Server instance, including all user-defined and system databases. FCIs make management easier since all changes, including those made to SQL Server Agent jobs, user accounts and passwords, and database additions and deletions, are automatically reconciled on all versions of SQL Server, not just SQL Server 2022 with CAGs. FCIs are available with SQL Server Standard Edition, which makes them more cost-effective. However, FCIs require shared storage, which presents challenges when deploying in environments that span Availability Zones, Regions, or hybrid cloud configurations. Read more about how SIOS software enables high availability for SQL Server.

Storage Options for SQL Server Failover Cluster Instances

For SQL Server Failover Cluster Instances that span Availability Zones, there are three storage options: Azure File Share, Azure Shared Disk with Zone Redundant Storage, and SIOS DataKeeper Cluster Edition. There is a fourth option, Storage Spaces Direct (S2D), but it is limited to single-AZ deployments, so clusters based on S2D would not qualify for the 99.99% SLA and would be susceptible to failures that impact an entire AZ.

Azure File Share

Azure File Share with zone-redundant storage (ZRS) allows you to store multiple copies of your data across different Availability Zones in an Azure region, providing increased durability and availability. The data can then be shared as a CIFS file share, and the cluster connects to it using the SMB 3 protocol.

Azure Shared Disk

Azure Shared Disk with Zone Redundant Storage (ZRS) is a shared disk that can store SQL Server data for use in a cluster. SCSI persistent reservations ensure that only the active cluster node can access the data. If a primary Availability Zone fails, the data in the standby availability zone becomes active. Shared Disk with ZRS is only available in the West US 2, West Europe, North Europe, and France Central regions.

SIOS DataKeeper Cluster Edition

SIOS DataKeeper Cluster Edition is a storage HA solution that supports SQL Server Failover Clusters in Azure. It is available in all regions and is the only FCI storage option that supports cross-Availability Zone failover and cross-Region failover. It also enables hybrid cloud configurations that span from on-premises to the cloud. DataKeeper is a software solution that keeps locally attached storage in sync across all the cluster nodes. It integrates with WSFC as a third-party storage-class cluster resource called a DataKeeper volume. Failover Cluster handles all management of the DataKeeper volume, making the experience seamless for the end user. Learn more about SIOS DataKeeper.

Summary

In conclusion, Azure provides various infrastructure options for achieving high availability for SQL Server deployments, such as Availability Zones, Regions, and Paired Regions. By leveraging these options, in conjunction with high availability solutions like Availability Groups and Failover Cluster Instances, you can create a highly available, resilient SQL Server deployment that can withstand a wide range of failures and minimize downtime. Understanding the infrastructure required and the pros and cons of each option is essential before choosing the best solution for your specific needs. It’s advisable to consult with a SQL and Azure expert to guide you through the process and also review the Azure documentation and best practices. With the proper planning and implementation, you can ensure that your SQL Server deployments on Azure are always available to support your business-critical applications.

Contact us for more information about our high availability solutions.

Reproduced with permission from SIOS

Filed Under: Clustering Simplified Tagged With: Azure, High Availability, SQL Server High Availability

Exploring High Availability Use Cases in Regulated Industries

February 24, 2023 by Jason Aw Leave a Comment


While downtime in business-critical systems, databases, and applications imposes costs on every organization, different industries have different consequences associated with unplanned downtime. In this post, we explore high availability use cases and SIOS customer success stories in the financial services, healthcare, manufacturing, and education industries.

High Availability for Financial Services

Ranging from small credit unions to regional banks to global investment firms, financial services is a highly regulated, fast-paced industry in which billions of dollars in electronic transactions occur every second. Thus, the average cost of downtime ($300,000 for a single hour of downtime according to the ITIC 2021 Hourly Cost of Downtime Survey) can be significantly higher for a financial services firm than for other industries.

Large Financial Services Firm Adds High Availability/Disaster Recovery for Critical Securities Applications on Oracle Databases

One of the oldest financial services firms in China provides securities and futures brokerage as well as investment banking, asset management, private equity, alternative investments, and financial leasing services. It is listed on both the Shanghai and Hong Kong Stock Exchanges. The firm has 343 branches spanning 30 provinces, municipalities, and autonomous regions in the People's Republic of China. It also operates across 14 countries and major international cities, serving approximately 18 million customers.

THE ENVIRONMENT
The company relies on securities trading applications based on Oracle Database running in a Red Hat Linux operating system environment. While the firm’s IT team was backing up these applications and database frequently, they could not recover operation quickly in the event of a failure or disaster.

THE CHALLENGE
The financial services firm wanted to implement high availability (99.99% uptime) protection for its critical applications and the Oracle database upon which it relies.

THE EVALUATION
The firm’s IT team wanted a clustering solution that would ensure they could reliably meet their service level agreements (SLAs) for high availability as well as their stringent recovery time and recovery point objectives (RTO, RPO). It needed to be proven to provide high availability in a Linux environment. They also wanted a solution that would reduce the complexity of clustering in an open-source environment.

THE SOLUTION
The firm created a two-node cluster on physical servers using SIOS Protection Suite for Linux clustering software. The SIOS clustering software monitors the entire application stack: network, storage, OS, and application. In the event of a failure, the software orchestrates failover of application operation to the secondary node in the cluster. Application-aware modules in SIOS Protection Suite simplify the complexity of configuring a cluster for Linux environments.

THE RESULTS
The firm has been using SIOS Protection Suite for Linux clustering for many years and continues to implement SIOS products in its environments. In that time, the firm has consistently met its availability SLAs. From its straightforward implementation to its reliable, easy-to-use management, SIOS clustering software has met or exceeded the firm's expectations.

High Availability for Healthcare

Downtime for applications and storage in the healthcare industry can literally be a matter of life and death. It’s imperative to assure reliable access to critical systems used in hospitals and surgery centers, as well as electronic health records (EHR) and medical imaging technology such as picture archiving and communication systems (PACS). The healthcare industry has also increasingly been targeted in ransomware attacks, leading to significant downtime.

Lifehouse Hospital Ensures High Availability in Amazon Web Services with SIOS DataKeeper

Chris O’Brien Lifehouse Hospital (www.mylifehouse.org.au) specializes in state-of-the-art research and treatment of rare and complex cancer cases. The not-for-profit hospital sees more than 40,000 patients annually for screening, diagnosis, and treatment.

THE CHALLENGE
Lifehouse uses MEDITECH for patient administration and central storage of its patients’ electronic health records. “The health information system and database are vital to the care we provide. If either goes down, patient records would not be accessible, and that would paralyze the hospital’s operations,” explains Peter Singer, director of Information Technology at Lifehouse. In the hospital’s data center, mission-critical uptime has been provided by Windows Server Failover Clustering (WSFC) running on a SAN. Like many organizations, Lifehouse planned to migrate to the cloud to take advantage of its agility and affordability.

Lifehouse chose Amazon Web Services (AWS) and had hoped to “lift and shift” its environment directly to the AWS cloud. To simulate its on-premises configuration, Singer chose a “cloud volumes” service available in the AWS Marketplace. Failover clusters were configured using Amazon FSx software-defined storage volumes to share data between active and standby instances. However, the software-defined cloud volumes had a substantial adverse impact on throughput performance. With the “No Protection” option, the cloud volumes performed well, but “no protection” wasn’t really an option for the mission-critical MEDITECH application and its database.

THE SOLUTION
After conducting an exhaustive search, Singer concluded that the best solution was SIOS DataKeeper. SIOS DataKeeper provides the high-performance, synchronous data replication that Lifehouse needs. By using real-time, block-level data mirroring between the local storage attached to all active and standby instances, the solution overcomes the problems caused by the lack of a SAN in the cloud, including the poor performance that often plagues software-defined storage (see Figure 3). The resulting SANless cluster is compatible with WSFC, provides continuous monitoring for detecting failures at the application and database levels, and offers configurable policies for failover and failback.
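The synchronous, block-level mirroring described above can be illustrated with a toy model in which a write is acknowledged only after both copies hold the block. This is a conceptual sketch of synchronous replication in general, not how SIOS DataKeeper is actually implemented:

```python
# Toy model of synchronous block-level mirroring: a write is acknowledged
# only after both the primary and the replica hold the block. Conceptual
# sketch only -- not SIOS DataKeeper's internal design.

class MirroredVolume:
    def __init__(self, n_blocks):
        self.primary = [b""] * n_blocks
        self.replica = [b""] * n_blocks

    def write(self, block_no, data):
        self.primary[block_no] = data
        self.replica[block_no] = data  # replicate before acknowledging
        return "ack"                   # the caller unblocks only now

    def in_sync(self):
        """With synchronous mirroring, both copies match after every ack."""
        return self.primary == self.replica

vol = MirroredVolume(8)
vol.write(0, b"page header")
vol.write(3, b"row data")
print(vol.in_sync())  # True
```

Because the acknowledgment waits for the replica, a failover to the standby copy loses no acknowledged writes, which is what makes synchronous mirroring suitable for a SANless WSFC cluster.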

THE RESULTS
Unlike software-defined storage, SIOS DataKeeper is purpose-built for high performance and high availability, so it was no surprise to Singer that the cloud-based configuration works as needed. But Singer was pleasantly surprised by how easy the solution is to implement and operate: “We were able to go from testing to production in a matter of days. Ongoing maintenance is also quite simple, which we expect will minimize our operational expenditures associated with high availability and disaster recovery.”

High Availability for Manufacturing

There’s a great deal of attention on the supply chain today. Although much of the recent focus has been on logistics issues (such as congestion in the Port of Los Angeles) and cyberattacks (such as the Colonial Pipeline ransomware attack), many critical challenges extend further up the supply chain. For more than 50 years, lean manufacturing (also known as just-in-time inventory management) has been a hallmark of efficiency in the manufacturing industry. However, “just-in-time” means exactly that and there’s no room for system or application downtime in manufacturing.

SIOS DataKeeper Cluster Edition Protects Van de Lande Data Systems

Van de Lande BV (VDL) specializes in the manufacture of PVC-U and PE pressure fittings and valves for plastic piping systems, both made from tube and injection molded. Its products are used all over the world in industrial and technical installations. What sets VDL apart is its impressive range of product types and sizes, and its continuous commitment to product improvement and enhancement. As a result, VDL has been the brand of choice for builders of systems and installations for more than 50 years.

THE CHALLENGE
VDL started with a virtualized server environment, based on Xen and CentOS. Later, the company implemented KVM and Hyper-V. This heterogeneous environment proved difficult to maintain so VDL gradually switched to the Windows Hyper-V environment.

Before implementing the SIOS DataKeeper solution, VDL relied on shared storage (SAN) for its main storage. To improve performance, they decided to move to local storage based on solid-state disk (SSD) instead of traditional spinning disks.

However, VDL relies heavily on the availability of its ERP database. With only one primary data processing system, VDL needed a reliable, comprehensive disaster recovery solution to ensure the availability of its systems in the event of a sitewide disaster.

To prevent downtime, the company needed its servers to replicate data to a backup server for disaster protection. If one server fails, the other server takes over operation. This failover process sustains operations, maximizes uptime, and enables user productivity.

THE SOLUTION
To deliver full failover and disaster recovery protection, VDL built a Windows Server Failover Cluster (WSFC) system, with each node replicating data to the other. If one node fails, operation continues on the other server and no data is lost. The joint solution of Microsoft Hyper-V with SIOS DataKeeper Cluster Edition software provided the availability and disaster protection that was essential to VDL.

VDL uses SIOS DataKeeper Cluster Edition software to ensure continuous availability of applications, databases, and web services. SIOS DataKeeper software integrates with WSFC to create a “mirrored” server system between two Windows cluster nodes. If the primary node fails, WSFC transfers all operations to the other node while enabling continuous access to applications and data (which is protected at the volume level). SIOS DataKeeper software enables disaster recovery without the long downtime and recovery time associated with traditional backup and restore technology. SIOS DataKeeper works with Microsoft WSFC to monitor system and application health, maintain client connectivity, and provide uninterrupted data access, giving VDL the reliable, fault-resilient system the company needed.

SIOS DataKeeper Cluster Edition further extends the capabilities of Microsoft Cluster Services and Windows Server Failover Clustering. SIOS DataKeeper Cluster Edition also supports real-time replication of Hyper-V virtual machines between physical servers across either LAN or WAN connections.

For companies like VDL, SIOS DataKeeper Cluster Edition software reduces the cost of deploying clusters by enabling them to create a SANless cluster that eliminates the cost, complexity, and single-point-of-failure risk of a SAN in a traditional shared-storage cluster. The cluster implementation ran smoothly and took less than a day. Following a thorough evaluation of the VDL server configuration and testing, the installation team found that the SANless cluster with SIOS DataKeeper Cluster Edition software met all of their criteria for disaster recovery, performance, and high availability. During the system failover test, the network services team failed over and failed back the system quickly and easily.

THE RESULTS
VDL now has a comprehensive high availability/disaster recovery solution that keeps its mission-critical applications such as its web services and ERP database always available. SIOS DataKeeper software provides continuous real-time, host-based, block-level replication delivering continuous access to customer and inventory records. VDL deployed two SIOS DataKeeper clusters that protect a file server, print server, SQL Server (ERP), Microsoft Dynamics NAV web services, NiceLabel NiceWatch label service, and iSCSI server.

One two-node cluster works as a file server and iSCSI server, while the other supports a SQL Server cluster and Dynamics NAV web services. The IT infrastructure consists of three Hyper-V hosts with 60 VMs installed on them, one BackupExec server, 50 desktop users, and 25 mobile barcode scanners, which are connected via web services to the ERP system. Every host contains 240GB SSDs in a RAID 60 configuration with a total of 3TB of local storage. The systems are connected through 10 Gigabit interfaces.

High Availability for Education

In the wake of the global pandemic, distance learning has become a key teaching format in postsecondary education, as well as primary and secondary education. In postsecondary education, distance learning enables global outreach for colleges and universities to attract a diverse student body. Thus, uptime has become increasingly important in education, with students and professors requiring access to various systems including library databases, student records, and high-performance computing (HPC)—for example, to support medical research, testing applications, and more. Downtime can also be costly as students (potentially from around the world) rush to register online, vying for limited class space.

Major University Gives SIOS LifeKeeper for Linux Top Marks

When a leading university in New York decided to revamp its enterprise resource planning (ERP) system, it hoped to improve performance, especially during peak registration periods, and reduce overall total cost of ownership (TCO). The university serves more than 10,000 students and uses an Oracle database to maintain all the information for student registration on an HP/UX SAN-based storage environment with replication of its full SAN architecture between two fabrics in a cluster.

THE CHALLENGE
The university needed an alternative solution that would be more cost-effective, deliver faster performance, and provide high availability and data protection for its mission-critical Oracle database. Realizing that it needed a more robust clustering solution than it could get with a typical Linux solution, the university turned to SIOS. “We wanted a suitable clustering replacement,” says the university’s assistant vice president of IT. “SIOS explained that they could integrate and set up a cluster with SanDisk Fusion ioMemory products. They were completely proactive.”

THE SOLUTION
SIOS LifeKeeper for Linux provides application failover, SIOS DataKeeper provides data replication, and the SIOS Oracle Application Recovery Kit offers extra protection for the school’s database right out of the box. SIOS’s strategic alliance with Western Digital also delivers a high-performance SanDisk flash solution for the school’s clustered environment.

Employing servers configured with SanDisk Fusion ioMemory-based IO accelerators, the university could replace its bulky and expensive SAN-based setup with a streamlined server set running Linux. The integration of ioMemory-based IO accelerators with both the servers and SIOS LifeKeeper for Linux offers better performance and availability than traditional legacy solutions, making the SIOS solution a perfect fit.

THE RESULTS
The cost of three servers and the SIOS solution was less than the cost of one of the university’s old HP/UX servers. The school used the savings to add more memory. The university also eliminated risk because the shared disks in a traditional SAN-based cluster can be a single point of failure. That’s why it originally had two SANs. SIOS replication with SanDisk Fusion ioMemory eliminated this single point of failure, cost far less than fabric switches, and provided more data copies. The combination of SanDisk’s flash-based Fusion ioMemory for storage and SIOS replication improved the school’s TCO, reduced data center costs, and reduced its environmental impact thanks to reduced power and cooling demands.

Learn more about SIOS high availability solutions.

Reproduced with permission from SIOS

Filed Under: Clustering Simplified Tagged With: High Availability

How to Get Started Successfully with SIOS Documentation

February 16, 2023 by Jason Aw Leave a Comment


Introduction:

Documentation is very important: it provides the foundation for how a product or service functions, enables quick troubleshooting, and offers a wealth of information that can help identify a problem and understand its solution or workaround. Navigating that wealth of information can sometimes feel like trying to find the small eyeglass repair kit in a kitchen junk drawer of thingamabobs, screwdrivers, tape spools, and random nuts and bolts. How can you make the most of the useful tools the SIOS documentation site offers? Here are three tips to help you get the most out of the SIOS documentation site.

Tip #1: How do I get to SIOS documentation?

There are three ways to reach our documentation site:

  1. Go directly to docs.us.sios.com.

  2. From us.sios.com, use the “Documentation” link at the top left-hand area of the screen.

  3. From the “Support” tab next to the “Documentation” tab, select “Product Documentation” to be brought to docs.us.sios.com.

When initially landing on our doc page, you may want to select the “Technical Documentation” link for the product you purchased.

All of the links lead to pages within our documentation, but this is the best place to get started. Please be sure to select the correct operating system when selecting a topic; we have documentation for both Windows AND Linux.

In case you need further clarification on the products SIOS offers for both Windows and Linux, we have provided product names and abbreviations below. By following the link for the product(s) you purchased, or the product that interests you most, you will find a summary of each product along with suggested starting points on each product’s “Technical Documentation” page.

Once you have landed within our documentation, the next step is to understand which features and recovery kits work best for your project environment, or to search for topics that interest you. Here is the list of products SIOS offers:

Products SIOS Offers: 

Linux

  1. LifeKeeper for Linux (LK/LK4L)
  2. LifeKeeper Single Server for Linux (LKSSP/LKSSP4L)

Windows

  1. LifeKeeper for Windows (LK-W/LK4W)
  2. LifeKeeper Single Server for Windows (LKSSP4W)
  3. DataKeeper Standard Edition (DK)
  4. DataKeeper Cluster Edition (DKCE)

Application Recovery Kits – Tools and utilities that allow LifeKeeper for Windows/Linux to manage and control a specific application.

  • Application Recovery Kits for Windows
  • Application Recovery Kits for Windows – Single Server
  • Application Recovery Kits for Linux
  • Application Recovery Kits for Linux – Single Server

All Application Recovery Kits are available for use with our LifeKeeper product.

Why is it important to know your product and its accompanying recovery kits? First, it helps you understand the attached Recovery Kits so you can use our product effectively and with ease. Second, if you need additional assistance, knowing the ins and outs of what is presented allows support to point out something about the product you may not have known, giving you even more self-sufficiency and the ability to resolve issues yourself.

For more information about products and resources, please visit our product page at us.sios.com/products.

If the two initial paths mentioned earlier do not work for you when landing on the page linked above, use our left navigation menu or search bar to find the topic you are looking for.

Tip #2: How do I search for a particular topic?

Knowing what to search for, and how to search, can be a challenge. After selecting a product, you can find the search bar at the top right-hand corner of our documentation.

Also be sure to take note of the version number and product. (See product names and acronyms above):

Tip #2a: How do I find documentation on older product versions?

As we update the product with every release, we archive older versions on our documentation homepage. At the bottom of docs.us.sios.com there is a section called “All Supported Releases of Windows/Linux Products”.

Right below that section is our Product Support Schedule, where we list the releases that are still supported.

Once you understand which release you need documentation for, you will be able to navigate to a particular topic or solution within our documentation without a problem.

You can also use Google to narrow down a problem: enter the problem text, followed by site:us.sios.com (e.g., "split brain" site:us.sios.com). This can be helpful in finding the information you are looking for.
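As an illustration, a site-restricted query like the one above can also be assembled programmatically. The helper below is a hypothetical sketch (not a SIOS tool); it only relies on Google’s standard `q` query parameter and the `site:` operator:

```python
# Hypothetical helper that builds a site-restricted Google search URL
# for a given problem description. Quoting the phrase forces an exact
# match; site:us.sios.com limits results to the SIOS site.
from urllib.parse import quote_plus

def sios_docs_query(problem: str) -> str:
    """Return a Google search URL restricted to us.sios.com."""
    query = f'"{problem}" site:us.sios.com'
    return "https://www.google.com/search?q=" + quote_plus(query)

print(sios_docs_query("split brain"))
# https://www.google.com/search?q=%22split+brain%22+site%3Aus.sios.com
```

Pasting the resulting URL into a browser is equivalent to typing the quoted phrase plus the `site:` filter into the Google search box.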

Here is a list of the most commonly searched words within the last year. As you can see, the range is remarkably broad!

Our documentation contains information on status updates, error codes, commands, specific issues/solutions, etc.

Tip #3: Commonly Requested Topics – Staying Informed

Depending on what you come to the documentation to look at, consider starting with our most viewed topics: Technical Documentation, the support matrix, release notes, the quick start guide, and the product support schedule. If you encounter an issue, or arrive from a support case, check these sections to make sure there was not an update or bug fix addressing an issue in the version you are running.

Within the last year, these are the most commonly searched/visited topics in DataKeeper Cluster Edition, LifeKeeper for Windows and LifeKeeper for Linux:

DataKeeper Cluster Edition top 5 topics:

LifeKeeper for Windows top 5 topics:

LifeKeeper for Linux top 5 topics:

The top topics SIOS recommends taking a look at are our Upgrade, Solutions/Video Solutions, Release Notes, Known Issues and Workarounds, and Best Practices sections. Solution pages are present in both Windows and Linux documentation.

It is very important to stay informed on our new releases and upgrades whenever possible.

Upgrading to the latest version is worth considering because issues in earlier versions are usually fixed in newer releases. The release notes are equally important: they list bug fixes, known issues, recently unsupported items, discontinued features, new features, and more. Check them out!

Conclusion:

We hope this post helps you get started successfully with our documentation site: where to look, which key components to pay attention to, and how to stay up to date on what is new with our product(s). When an issue arises, we hope our documentation provides a quicker and easier resolution. Our goal is not only to help solve the problem at hand but also to explain the why behind it. Please let us know how we can improve our documentation even further. Also, please see our second blog, which dives further into using documentation to resolve a specific issue.

Reproduced with permission from SIOS

Filed Under: Clustering Simplified Tagged With: SIOS Documentation

Multi-Cloud High Availability for Business-Critical Applications

February 12, 2023 by Jason Aw Leave a Comment


Cloud computing has become ubiquitous over the last decade with 99% of organizations using at least one public or private cloud according to the Flexera 2021 State of the Cloud Report. While AWS, Microsoft Azure, and GCP are the top three public cloud providers today, many organizations—whether by design or by accident—have adopted a multi-cloud strategy that allows them to pick and choose which cloud services are most compelling and best suited to their unique business requirements. According to the Flexera report, 92% of enterprises today have a multicloud strategy and use an average of 2.6 public and 2.7 private clouds, including Software-as-a-Service (SaaS), Platform-as-a-Service (PaaS), and Infrastructure-as-a-Service (IaaS) offerings.


What is multicloud?

A multi-cloud is simply an environment that consists of two or more public and/or private clouds (including SaaS, PaaS, and IaaS). The different services in a multi-cloud environment may interoperate (in which case it might be a hybrid cloud) or may not necessarily interoperate (essentially operating as separate cloud silos). Remember, although all hybrid clouds are multi-clouds, not all multi-clouds are hybrid clouds.


The Evolution (and Wide Adoption) of Multi-Cloud as a Strategy

A multi-cloud environment consists of a combination of any two or more public or private cloud offerings including SaaS, PaaS, and IaaS. Thus, an organization’s multi-cloud strategy may consist of an enterprise workload running on Amazon Elastic Compute Cloud (EC2) and using Microsoft 365 for email and back-office applications. Or an organization may connect a custom database hosted in a private cloud to Salesforce, a public cloud SaaS offering.

A hybrid cloud environment consists of a mix of on-premises, private cloud, and public cloud environments. According to the Flexera report, 80% of enterprises have a hybrid cloud strategy (see Figure 4). Multi-cloud environments often evolve as a result of shadow IT, in which different departments procure cloud services to meet their individual needs without necessarily consulting a centralized IT department. For example, your marketing team may have started using Salesforce long before IT deployed its first workload in AWS, while your HR and finance departments were busy adding Workday and Concur to the mix of SaaS applications that your organization now depends on. Or perhaps you have application development teams that work on different projects across the globe. One development team may prefer Azure DevOps, whereas another team may prefer the open source tools in AWS. Thus, your multi-cloud strategy may have evolved purely by accident—which isn’t necessarily a bad thing.

Your different departments are empowered to select best-of-breed solutions to meet their needs while your app dev teams can maximize productivity and reduce time-to-market working in their preferred development environments.

Multi-cloud environments also evolve by design, for example, due to regulatory requirements, mergers and acquisitions, or to implement high availability and disaster recovery strategies.

Regulatory language can be vague and confusing. For example, the Financial Conduct Authority (FCA) regulations on outsourcing IT state that firms must be able to “know how they would transition to an alternative service provider and maintain business continuity.” This statement implies that regulated firms need to at least plan for a secondary cloud environment. Given the risk-averse nature of many heavily regulated firms, these types of issues have led many to adopt a multicloud strategy.

Integrating IT systems and consolidating data centers and cloud environments after a merger or acquisition is a significant challenge. There are a number of factors that can complicate this challenge, including existing contracts with cloud providers or co-location providers. Similar to consolidating physical data centers, consolidating cloud workloads can be a major effort that doesn’t deliver significant business value, so it’s frequently delayed for higher-priority projects.

Finally, multi-cloud strategies are often adopted to support high availability and disaster recovery requirements. In evaluating major public cloud outages across AWS and Azure, most outages are typically limited to a single cloud region at a time (and are most commonly software-related).

More and more organizations (34% according to the Flexera report) have taken the added step of deploying their mission-critical workloads across multiple public cloud providers. This can be much easier for static workloads, such as websites and applications that can run independently of one another. For distributed systems, such as databases and directory services (for example, Active Directory), multi-cloud disaster recovery can be far more challenging.

Understanding Unique Challenges in Multi-Cloud Environments

Multi-cloud environments are more complex and thus more challenging to manage than single cloud deployments. Some unique challenges in multi-cloud environments include:

• End-to-end visibility: Ensuring complete visibility is a challenge in any IT environment—and it’s exponentially more complex and challenging in a highly dynamic multi-cloud environment. However, end-to-end visibility is critical to troubleshooting performance issues and bottlenecks, securing your digital footprint, and identifying single points of failure in mission-critical systems and applications.

• Security and identity management: Ransomware and other cybersecurity threats are top of mind for every IT leader today. While moving to a public cloud platform generally improves the security posture of an organization by shifting certain security responsibilities (such as data center and physical security) to the public cloud provider and providing on-demand access to services like encryption and network segmentation, it can also make it easier to make costly mistakes. For example, network misconfigurations can be common—thousands of data breaches have been caused by improperly configured AWS S3 storage buckets. Identity management is yet another challenge. For example, Azure Active Directory may be quite familiar to organizations that have previously used Active Directory in their on-premises environments, but extending identity management beyond Azure to AWS, GCP, and SaaS offerings (such as Salesforce, ServiceNow, Workday, and others) can introduce new challenges.

• Application and data portability: The ability to dynamically move applications and data across different public cloud platforms in a hybrid (multi-cloud) environment is key to many multi-cloud strategies. Although public cloud providers don’t necessarily build their services to restrict application and data portability, they don’t necessarily work together to facilitate this capability and there may be costs involved. Different cloud providers also use different technologies for their various service offerings.

• Multi-cloud silos: If organizations don’t plan and design their multi-cloud deployments for application and data portability, they can end up with siloed applications and storage, essentially re-creating a common problem in traditional on-premises data center environments, across multiple cloud platforms. At the very least, organizations need multi-cloud security and management tools that allow them to effectively manage their risks and usage/costs across different cloud platforms.

According to the Flexera 2021 State of the Cloud Report, 81% of organizations cite security as the top challenge in their cloud deployments, followed by managing cloud spend (79%). Yet only 42% of organizations use multi-cloud cost management tools and only 38% use multi-cloud security tools.

Addressing High Availability and Disaster Recovery in Multi-Cloud Environments

While there are many challenges to multi-cloud deployments, they can provide additional availability and disaster recovery protection, especially in the event of a major cloud outage. If your organization is pursuing a multi-cloud strategy, work with a trusted, cloud-agnostic partner to help you design and implement your multi-cloud deployment using a holistic approach.

For high availability and disaster recovery, you also need a cloud-agnostic technology solution that spans your multi-cloud environment, irrespective of the cloud platforms you use. You always want to avoid a scenario where your high availability solution causes more downtime in your environment than a standalone solution. Early versions of SQL Server clustering presented this conundrum—to add disk space, you had to incur downtime that wouldn’t have occurred on a standalone solution. While failing over something like a static website can be trivial, moving a multi-tier application stack is extremely complicated in terms of networking and data synchronization. You also need to avoid failing over to a less secure cloud environment that has potentially been misconfigured due to a lack of understanding the nuances between different security solutions across cloud providers.

So What Should I Do?

Finally, in every public cloud, there are a handful of services that can increase costs quickly. These services are charged according to usage-based pricing and can mean steep cost increases after only a few days. One way to mitigate this risk is to ensure you’re taking advantage of the cost monitoring services and alerts that are in each of your cloud platforms.
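Each provider has its own service for this (for example, AWS Budgets, Azure Cost Management, and GCP budget alerts), but the underlying alert logic is simple threshold checking. The sketch below is a minimal, cloud-agnostic illustration; the function names and dollar figures are hypothetical, not any provider’s API:

```python
# Minimal, cloud-agnostic sketch of a usage-based cost alert.
# Real deployments would pull actual spend from the cloud provider's
# cost API; the figures here are illustrative only.

def forecast_month_end_spend(spend_so_far: float, day_of_month: int,
                             days_in_month: int = 30) -> float:
    """Linearly extrapolate current spend to the end of the month."""
    return spend_so_far / day_of_month * days_in_month

def should_alert(spend_so_far: float, day_of_month: int,
                 monthly_budget: float, threshold: float = 0.8) -> bool:
    """Alert when forecast spend crosses a fraction of the budget."""
    forecast = forecast_month_end_spend(spend_so_far, day_of_month)
    return forecast >= monthly_budget * threshold

# Example: $450 spent by day 10 forecasts to $1,350 for the month,
# which exceeds 80% of a $1,500 budget, so an alert fires.
print(should_alert(450.0, 10, 1500.0))  # True
```

Provider-native budget alerts apply this same idea with richer forecasting; the point is to set the threshold well below 100% so you are warned while there is still time to react.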

While multi-cloud deployments aren’t for all organizations, many will go down this path. Networking and security are among your biggest technical hurdles, and governance and cost management are key operational challenges. Testing is critical to ensuring your multi-cloud cluster solution works. Use a high availability clustering solution that enables simple switchover and switchback, understand how each of your applications behaves during failover, and, most importantly, regularly test that failover to uncover any networking or data hurdles.

Reproduced with permission from SIOS

Filed Under: Clustering Simplified Tagged With: High Availability, multi-cloud
