SIOS APAC Portal

January 13, 2022	Why You Need Business Continuity Plans

Facebook, Instagram And WhatsApp Just Had A Really Bad Monday

It’s the end of the work day here on the east coast and I see that Facebook is still unavailable. Facebook acknowledged the problem in two Tweets.

I can pinpoint the time that Facebook went offline for me. I was trying to post a comment on a post and my comment was not posting. I was a little annoyed, and almost thought the poster had blocked me or was deleting my comment. This was at 11:45 am EDT. More than five hours later, Facebook is still down for me.

While we don’t know the exact cause of the downtime, and whether it was user error, some nefarious assault, or just an unexpected calamity of errors, we can already learn a few things from this outage.

Downtime Is Expensive

While we may never know the exact cost of the downtime experienced today, there are a few costs that can already be measured. As of this writing, Facebook stock went down 4.89% today. That’s on top of an already brutal September for Facebook and other tech stocks. The correction may have been inevitable, but the outage today certainly didn’t help matters.

But what was the real cost to the company? With many brands leveraging social media as an important part of their marketing outreach, how will this outage impact future advertising spend? At a minimum, I anticipate that advertisers will investigate other social media platforms if they have not done so already. Only time will tell, but even before this outage we had seen more competition for marketing spend from other platforms such as TikTok.

Plan For The Worst-Case Scenario

Things happen; we know that and plan for that. Business Continuity Plans (BCP) should be written to address any possible disaster.
Again, we don’t know the exact cause of this particular disaster, but I would have to imagine that an RTO of 5+ hours is not written into any BCP that sits on the shelf at Facebook, Instagram or WhatsApp.

What’s in your BCP? Have you imagined any possible disaster? Have you measured the impact of downtime and defined an adequate recovery time objective (RTO) and recovery point objective (RPO) for each component of your business? I would venture to say that it’s impossible to plan for every possible thing that can go wrong. However, I would advise everyone to revisit their BCP on a regular basis and update it to include disasters that maybe weren’t on the radar the last time it was reviewed. Did you have a global pandemic in your BCP? If not, you may have been left scrambling to accommodate a “work from home” workforce.

The point is, plan for the worst and hope for the best.

Communications In A Disaster

Communications in the event of a disaster should be its own chapter in your BCP. One Facebook employee told Reuters that all internal tools were down.

Facebook’s response was made much more difficult because employees lost access to some of their own tools in the shutdown, people tracking the matter said. Multiple employees said they had not been told what had gone wrong.
https://www.reuters.com/technology/facebook-instagram-down-thousands-users-downdetectorcom-2021-10-04/

A truly robust BCP must include multiple fallback means of communication. This becomes much more important as your business spreads out across multiple buildings, regions or countries. Just think about how your team communicates today. Phone, text, email, and Slack might be your top four. But what if they are all unavailable? How would you reach your team? If you don’t know, you may want to start investigating other options.
You may not need a shortwave radio and a flock of carrier pigeons, but I’m sure there is a government agency that keeps both of those on hand for a “break glass in case of emergency” situation.

Summary

You have a responsibility to yourself, your customers and your investors to make sure you take every precaution concerning the availability of your business. Make sure you invest adequate resources in creating your BCP and that the teams responsible for business continuity have the tools they need to ensure they can do their part in meeting the RTO and RPO defined in your BCP.

Reproduced with permission from Clusteringformeremortals
January 9, 2022	Fixing Your Cloud Journey

In some way or another, the world-changing events of 2020 and 2021 have reshaped nearly everything that we knew, and high availability was no exception. Despite closures and restrictions, many IT teams traded on-prem data centers for the cloud. Many are asking, ‘Now what?’ Here are five things to do to fix your cloud journey in 2022.

Add high availability

In the push to the cloud, many IT and business leaders found themselves rushing to move services and applications into the cloud from data centers that they were closing due to COVID-19. Others rushed to the cloud, not because of data center closures, but to deal with the wave of exploding demand. For some, the journey to the cloud was so fast that HA wasn’t included, and now they’ve discovered the hard way that applications still crash in the cloud and that unexpected outages and unplanned downtime are as much the nemesis of AWS, Azure and GCP deployments as they were in their previous data center. The first step in fixing your cloud journey is to add high availability. This will mean several things to your enterprise:

- Designing and architecting a highly available and redundant architecture
- Choosing software and services that will protect critical components and applications
- Defining and documenting associated processes and procedures, and at least a minimal governance
- Deploying production copies for quality assurance, procedural testing, and chaos testing

Expand for higher availability and disaster recovery

Of course, not everyone made the move to cloud without considering some form of HA. Some IT teams had the foresight not to leave HA on-premises, but in the rush to cloud moved all of their critical servers to the same cloud Availability Zone.
While having some HA protections is better than complete vulnerability, if you’ve only deployed your servers and applications in a single Availability Zone (AZ), now is the time to expand to multi-AZ for your standby cluster node, or even build in disaster recovery by deploying a third node in a different region. SIOS has helped dozens of customers plan multiple-AZ architectures and add disaster recovery solutions.

Build your team

Overnight some companies, and their IT teams, went from being fully on-premises to wrestling with CloudFormation templates, Quick Start guides, IAM roles, internal load balancers, Overlay IPs, and deciphering what exactly that VM size means. Now is the time to build a team to support the journey to the cloud. This will mean several things:

- Adding capacity. Unless you were able to pull off a complete lift and shift, you likely have the same staff managing cloud and on-premises applications. Legacy solutions are known for being temperamental and requiring a lot of work to keep them stable and available. To navigate the cloud journey ahead you’ll need capacity capable of addressing availability requirements, understanding cloud architecture, and plotting the course forward for enterprise needs.
- Augmenting skills with training. Give your team training for the cloud. To manage and plan the course forward, look for ways to augment the IT excellence within your organization with additional training on cloud solutions, architecture, best practices, and trade-offs. A confidently trained staff will not only pay dividends in increased availability, but they will also pay dividends by addressing availability, maintenance, and growth in an economical, scalable and logical way. Translation: they’ll avoid wasting money as they build out the rest of your cloud infrastructure.

Integrating automation and analytics

As VP of Customer Experience at SIOS Technology Corp., I have worked with several companies that made the move to the cloud in 2021 without sacrificing HA, DR or their team. If you took achieving the required number of nines of uptime (99.99%) seriously and having a disaster plan was non-negotiable, then it’s time to add the rigor of analytics and additional monitoring. Ensure that your availability solution has application-aware automation and orchestration for recovery in the event of a disaster or unplanned downtime. Add analytics and automation to solidify your solution and take your cloud migration up another notch, from reactive failovers to proactive notification and mitigation of a failure before it occurs. Imagine being notified of underperforming applications, or of increasing latency, errors, or non-responsive VM behavior in time to avoid downtime during peak business times. Analytics are also important because they can reveal systems and applications that may have escaped your original availability architecture.

Update processes and governance

Many things we think of as a failure are rooted in a failure of process. Make sure that your organization’s processes are up to date, well-documented, properly communicated and adhered to. These processes should contain a few key minimums related to who, what, when, where, and how, all tied back to the business strategies, goals, and organizational needs as they pertain to the customer. Make sure that ownership and sign-off processes for your new cloud environment are well-documented. I have seen firsthand the frustration that comes from conflicting, clashing or unresolved roles and responsibilities for customers who have moved from hardware teams that acquire infrastructure to cloud teams. Muddling through a migration is one set of pain points; digging out of a disaster without clear governance is a much bigger, more costly issue.

If you’ve made the leap to cloud, staying there and making it work for you is the next part of the journey.
If your cloud journey was sudden or rocky, consider these five points for fixing your cloud journey and know that SIOS Technology can help you improve not only your high availability in the cloud, but also your processes for running in the cloud. Reproduced with permission from SIOS
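As a rough illustration of the "number of nines" mentioned in the cloud journey article above, the following minimal sketch converts an availability target into a yearly downtime budget. The function name and the 365-day year are assumptions made for illustration; they do not come from SIOS or any cloud provider.

```python
# Illustrative sketch only: turns an availability target (for example 99.99%)
# into the maximum downtime it allows per year.
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year (no leap day)


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the allowed downtime per year, in minutes, for a given target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Print the yearly budget for a few common targets.
    for target in (99.0, 99.9, 99.99, 99.999):
        budget = downtime_budget_minutes(target)
        print(f"{target}% uptime allows about {budget:.1f} minutes of downtime per year")
```

Seen this way, a 99.99% target leaves under an hour of downtime per year, which is why a single multi-hour outage can consume the budget many times over.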
January 6, 2022	How to Install a SIOS DataKeeper Cluster Edition License Key

Once you have installed SIOS DataKeeper Cluster Edition software and have activated your license, you will need to install your license key before you can get started. This 4-minute video reviews how to install SIOS DataKeeper Cluster Edition software and demonstrates how to activate your license so you can start protecting your critical applications.

Watch as a SIOS support representative demonstrates each of the three key prerequisites required to install SIOS licenses: ensuring you have the latest version and updates of SIOS DataKeeper software; using our simple license key manager to validate your activated licenses from purchased entitlements; and downloading and applying license keys and starting your SIOS DataKeeper software.

This video also walks through the process of accessing our SIOS Documentation portal, where you can find release notes, installation guides, technical documentation and information detailing SIOS DataKeeper Cluster Edition, as well as a wide range of topics on everything SIOS. View tips and convenient insights on how to complete steps quickly and simply.

Now you can begin protecting your critical applications with SIOS DataKeeper clustering software.

Reproduced from SIOS
January 1, 2022	Four Avoidance Strategies for Improving Cluster Resilience, Performance, and Outcomes

Simple Steps for Deployment in SIOS Protection Suite Cluster Environment

Avoiding something – we’ve all done it before. An old flame we see in the store while walking with our spouse, a salesperson when we aren’t “ready to buy”, and even a boss while we are out on “vacation”. When I was the manager of a development team, I caught a glimpse of a direct report browsing in a store while they were supposed to be out of the office sick. They ducked between clothing racks, scurried down the next aisle and hurried away. We’ve all done it before, and in some cases, for mental health, physical health, or reasons that remain private and personal, we all need some measure of avoidance. Even in HA.

So, how do you add avoidance to your High Availability environment, and why?

Four Reasons To Use An Avoidance Strategy In High Availability

1. Better Performance (minimizing server overload)

One reason to use avoidance strategies in HA is to increase application and server performance. Consider the case of three servers running production workloads; let’s call them Server Alpha, Server Beta, and Server Gamma. Servers Alpha and Beta are running critical applications backed by a database, while Server Gamma is running reports and data transformation jobs. In the event of a failure of Server Alpha, a failover to Server Beta would traditionally occur. However, because Server Beta is already running a large workload, the additional application load might result in an undesirable server overload and poor performance for both applications. So it might be wise to deploy an avoidance strategy to make sure that Server Gamma is chosen as the failover target.

2. Performance Optimization

Consider again the scenario of three servers, Alpha, Beta, and Gamma.
Servers Alpha and Beta are scaled to handle peak workloads, while Server Gamma is a cost-optimized server. In the event of a failure of Server Alpha and Server Beta, a failover will occur to the cost-optimized server, Gamma. However, this server is not scaled to handle peak workloads, nor the workloads of both Server Alpha and Server Beta at the same time. In this instance, an avoidance strategy can be used to optimize performance by automatically moving one or both of the workloads off Server Gamma as soon as another host is available.

3. High Availability Optimization

HA optimization is another scenario for deploying avoidance strategies. Like the performance optimization strategy, HA optimization is used to ensure that your environment can survive most failure scenarios and that your applications are optimized to provide the highest level of availability possible at any point in time. HA optimization is important for an application such as SAP with replicated enqueue processes. In any SAP environment, you do not want the ASCS (ABAP SAP Central Services) and ERS (Enqueue Replication Server) instances residing on the same server for extended periods of time because of the risk of lost locks and canceled jobs. To prevent this from occurring, you can use an avoidance strategy that causes the ERS and ASCS instances to always run on opposite cluster nodes.

Consider the case of three servers running production workloads; let’s call them Servers Alpha, Beta, and Gamma. Server Alpha is running the ASCS instance, while Server Beta is running the ERS instance. Server Gamma functions as a third node for failovers of both Server Beta (ERS) and Server Alpha (ASCS). If Beta crashes, you wouldn’t want the ERS resource running on the same node as the ASCS instance. To ensure this, you can deploy an avoidance strategy that automatically checks first and ensures the two applications are on separate servers, maintaining SAP ASCS/ERS best practices for lock failover.

4. DR Avoidance

Suppose you have two data centers, City Alpha and City Beta, which are about 70 miles apart, with most of your clients centrally located between them. However, due to recent changes in internal organization, mergers, closures and acquisitions, and governance requirements, your IT team has to add a third data center located in City Gamma, which is about 350 miles from Alpha and Beta. Now the resources which were primarily protected in Alpha and Beta are also extended to the Gamma location. Given that most of the users and teams are near the Alpha and Beta locations, and even the most distant users are located in neighboring cities, your team needs to avoid a failover to the Gamma location. Like the other strategies, a DR avoidance strategy seeks to optimize performance, in/out regional data costs, latency, and client access by avoiding the DR node should only one node within either region fail. It would also ensure that even if both nodes fail at different times, failover always occurs to the other node in the cluster or data center before moving to DR.

So, how do you deploy an avoidance strategy?

Many providers have affinity rules that can be configured, while others use a combination of server priorities or manual steps. In the case of the SIOS Protection Suite for Linux, you can use a number of built-in methods including:

1. Resource prioritization

In the event of a failure, resources will fail over to the server where they have the lowest remaining priority value and cascade to any additional servers (Alpha, Beta, and Gamma). Server Alpha is the primary server for Resource.HR, Server Beta is the primary server for Resource.MFG, and Server Gamma is the backup server for all resources/servers. Using resource prioritization, Resource.HR would have a priority of one (1) on Server Alpha and a priority of two (2) on Server Gamma, while Resource.MFG could have a priority of one (1) on Server Beta and a priority of two (2) on Server Gamma.
If customers wanted to optimize the use of the environment, then Resource.HR could have a priority of three (3) on Server Beta and Resource.MFG could have a priority of three (3) on Server Alpha. In the event of a failure of Server Alpha, Resource.HR would fail over to Server Gamma first before trying to come in-service (be restored) on Server Beta. SIOS Protection Suite for Linux (UI and CLI) allows users to specify a priority for each server and resource combination.

2. Policy or affinity rules

Policy rules can also be used to prevent a resource recovery from occurring on a given server, thereby allowing a resource to avoid a specified server that may be running a more critical or resource-intensive workload. Typical policies include:

- Constraint policies that will block an application from a specific server by default
- Resource policies that will block an application from a server that does not have sufficient resources
- Temporal policies that define a time period during which resources are allowed on or disallowed from a system
- Custom policies that define preferred servers or possible application ownership abilities within the cluster

The SIOS Protection Suite for Linux CLI allows users to specify policy rules which can disable failover to a specific resource for a specified server, provide temporal policies guarding failures, disable failovers of a specific application type, and define constraint policies and custom policies.

3. Specific avoidance resources

The most granular way to establish a resource avoidance strategy is to deploy specific avoidance scripts within each hierarchy. This method allows the user to configure specific applications (e.g., app1 and app2) to avoid one another whenever possible while allowing other applications to run without restriction. In the case of our three servers, Alpha, Beta, and Gamma, and three resources, app1, app2, and app3, this method would provide the greatest flexibility.
In this example, app1 and app2 will seek to avoid collocation when a server fails, but app3 will fail over to the next available node based on priorities without any collocation restrictions. For additional examples of avoidance strategies and resources, consult the SIOS Protection Suite for Linux documentation.

If a customer has two applications, app1 and app2, that they require to run on different nodes whenever possible, the customer can create two avoidance terminal leaf node resources using the SIOS Protection Suite for Linux gen/app resource and the ‘/opt/LifeKeeper/lkadm/bin/avoid_restore’ script.

– Cassius Rhue, VP, Customer Experience

Reproduced from SIOS
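To make the resource prioritization and avoidance ideas from the article above more concrete, here is a minimal sketch of how a failover target could be chosen. It is not SIOS Protection Suite code and does not use any LifeKeeper API. The priority table, the avoidance pairs, and the `pick_target` function are all hypothetical names invented for illustration, and the priorities for app3 are an assumption, since the article does not give them.

```python
from typing import Dict, Optional, Set

# Hypothetical priority table: a lower number means a more preferred server.
# app1 and app2 mirror the Resource.HR / Resource.MFG example; app3 is assumed.
PRIORITIES: Dict[str, Dict[str, int]] = {
    "app1": {"Alpha": 1, "Gamma": 2, "Beta": 3},
    "app2": {"Beta": 1, "Gamma": 2, "Alpha": 3},
    "app3": {"Gamma": 1, "Alpha": 2, "Beta": 3},
}

# Hypothetical avoidance pairs: applications that should not share a server.
AVOID: Dict[str, Set[str]] = {"app1": {"app2"}, "app2": {"app1"}}


def pick_target(app: str, placement: Dict[str, str], healthy: Set[str]) -> Optional[str]:
    """Pick a failover target for `app`.

    `placement` maps each application to the server it currently runs on.
    Healthy servers are tried in priority order, skipping any server that
    hosts an application `app` should avoid. If every healthy server hosts
    an avoided application, collocation is accepted as a last resort.
    """
    priorities = PRIORITIES[app]
    candidates = sorted((s for s in priorities if s in healthy),
                        key=lambda s: priorities[s])
    avoided = AVOID.get(app, set())
    for server in candidates:
        hosted = {a for a, s in placement.items() if s == server and a != app}
        if not hosted & avoided:
            return server
    # No conflict-free server exists: fall back to the best priority, if any.
    return candidates[0] if candidates else None


if __name__ == "__main__":
    # Server Alpha has failed; app2 happens to be running on Gamma already.
    placement = {"app1": "Alpha", "app2": "Gamma", "app3": "Beta"}
    print(pick_target("app1", placement, healthy={"Beta", "Gamma"}))
```

In the usage example, Gamma has the better priority for app1 but already hosts app2, so the sketch skips it and selects Beta. The fallback to collocation reflects the "whenever possible" wording above: avoidance is treated as a preference and not as a hard block.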
December 28, 2021	Windows Clustering

How to Achieve High Availability in Windows

To mitigate system downtime and ensure high availability for Windows, IT best practice recommends that you cluster servers (or nodes) so that if one node fails, one or more other nodes automatically take over processing. This is also referred to as Windows clustering. Clustering requires software that monitors the health of the primary node and initiates recovery actions if it detects an issue.

HA clustering also requires a way to ensure that, in the event of a failure, the secondary node is accessing the most current versions of data in storage. In most cases, this is achieved by connecting all nodes of the cluster to the same shared storage. The cluster nodes should be separated geographically to protect applications from sitewide and regional disasters.

In Windows Server environments, Microsoft includes Windows Server Failover Clustering (WSFC) in the Windows Server platform.

What is Windows Server Failover Clustering?

With WSFC, each active node has a standby node that has the same hardware specifications and shares the same storage. A third node is often configured as a “witness” server whose sole purpose is to ensure that the primary node is operational and, if an issue is detected, to signal the need to fail over operation to the standby node.

In addition to monitoring the health of the cluster, the nodes in a WSFC also work together to collectively provide:[1]

- Resource management – Individual nodes provide physical resources such as SAN and network interfaces. The hosted applications are registered as a cluster resource and can configure startup and health dependencies upon other resources.
- Failover coordination – Each resource is hosted on a primary node and can be automatically or manually transferred to one or more secondary nodes. Nodes and hosted applications are notified when failover occurs so that they can appropriately react.
WSFC works with Microsoft Always On Availability Groups and Always On Failover Clustering to coordinate failover in Microsoft SQL Server environments.

How SIOS DataKeeper Complements WSFC

WSFC requires shared storage to ensure all cluster nodes are accessing the most up-to-date data in the event of a failover. Often, companies use expensive SAN hardware to assure data redundancy. However, a SAN represents a single point of failure. And, if you want to run your application in the cloud with the same Windows Server Failover Clustering protection, there is no SAN available.

SIOS DataKeeper Cluster Edition seamlessly integrates with and extends WSFC and SQL Server Always On Failover Clustering by eliminating the need for shared storage. It provides performance-optimized, host-based replication to synchronize local storage in all cluster nodes, creating a SANless cluster. While WSFC manages the cluster, SIOS DataKeeper performs synchronous or asynchronous replication of the storage, giving the standby nodes immediate access to the most current data in the event of a failover.

SIOS DataKeeper not only eliminates the cost, complexity, and single-point-of-failure risk of a SAN, but also allows you to use the latest in fast PCIe Flash and SSD in your local storage for performance and protection in a single cost-efficient solution.

With SIOS DataKeeper, you can also balance network bandwidth and CPU utilization for each application. If fast replication is critical, SIOS DataKeeper can achieve more than 90 percent bandwidth utilization to accelerate data synchronization. If minimizing network impact is your top priority, SIOS DataKeeper offers integrated compression and bandwidth throttling.

In addition, SIOS DataKeeper’s Target Snapshots feature lets you run point-in-time reports from a secondary node to offload workloads that can impact performance on the primary node. This lets you query and run reports faster and make faster decisions.
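The difference between the synchronous and asynchronous replication modes mentioned above can be illustrated with a toy model. This is a conceptual sketch only and says nothing about how SIOS DataKeeper is implemented; the `ReplicatedVolume` class and its methods are hypothetical names, and in-memory dictionaries stand in for the local disks on two cluster nodes.

```python
from collections import deque
from typing import Deque, Dict, Tuple


class ReplicatedVolume:
    """Toy model of host-based block replication between two nodes."""

    def __init__(self, synchronous: bool) -> None:
        self.synchronous = synchronous
        self.source: Dict[int, bytes] = {}   # blocks on the active node
        self.target: Dict[int, bytes] = {}   # blocks on the standby node
        self.pending: Deque[Tuple[int, bytes]] = deque()  # not yet replicated

    def write(self, block: int, data: bytes) -> None:
        """Write a block on the source and replicate it per the chosen mode."""
        self.source[block] = data
        if self.synchronous:
            # Synchronous: the write completes only once the target has it.
            self.target[block] = data
        else:
            # Asynchronous: the write completes now and is shipped later.
            self.pending.append((block, data))

    def drain(self) -> None:
        """Ship all queued writes to the target (the background sender)."""
        while self.pending:
            block, data = self.pending.popleft()
            self.target[block] = data

    def blocks_at_risk(self) -> int:
        """Number of writes that would be lost if the source failed right now."""
        return len(self.pending)


if __name__ == "__main__":
    volume = ReplicatedVolume(synchronous=False)
    volume.write(0, b"orders")
    volume.write(1, b"payments")
    print("writes at risk before drain:", volume.blocks_at_risk())
    volume.drain()
    print("writes at risk after drain:", volume.blocks_at_risk())
```

The trade-off the model shows is the general one for replication: synchronous mode keeps the standby copy identical at the cost of waiting on the network for every write, while asynchronous mode returns sooner but leaves a window of unreplicated writes, which is what a recovery point objective has to account for.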
Working with WSFC, SIOS DataKeeper Cluster Edition protects business-critical Windows environments, including Microsoft SQL Server, SAP, SharePoint, Lync, Dynamics, and Hyper-V, using your choice of industry-standard hardware and local attached storage in a “shared-nothing” or SANless configuration.[2] SIOS DataKeeper also provides high availability and disaster recovery protection for your business-critical applications in cloud environments, such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Services, without sacrificing performance.

SIOS Protection Suite – Protecting a Windows Environment Without WSFC

SIOS Protection Suite for Windows includes DataKeeper, SIOS LifeKeeper, and optional application Recovery Kits for leading application and infrastructure operations. It is a tightly integrated clustering solution that combines high availability failover clustering, continuous application monitoring, data replication, and configurable recovery policies to protect your business-critical applications and data from downtime and disasters.

Distributed metadata and notifications – The WSFC service and node’s metadata/status are hosted on each node in the cluster. When changes occur on any node, updated information is automatically propagated to all other nodes.

SIOS Protection Suite does not require WSFC, as SIOS monitors the health of the application environment, including servers, operating systems, and databases. It can stop and restart an application both locally and on another cluster server at the same site or in another location. When a problem is detected, SIOS Protection Suite automatically performs the recovery actions and automatically manages cascading and prioritized failovers.

With SIOS Protection Suite, you can use your choice of SAN or SANless clusters using a wide array of storage devices, including direct-attached storage, iSCSI, Fibre Channel, and more.
SIOS Protection Suite for Windows can meet your high availability and disaster recovery needs within a single site and across multiple sites.

Popular SIOS Windows Clustering Solutions

Some of the most popular SIOS Windows clustering solutions – for SQL Server, SAP, and cloud-based environments – are discussed in more detail below.

Windows Clustering for SQL Server, SAP, S/4HANA, and Oracle

SIOS provides comprehensive SAP-certified protection for both applications and data, including high availability, data replication, and disaster recovery. To protect SAP in a Windows environment, SIOS Protection Suite includes SIOS LifeKeeper, which monitors the entire application stack. SIOS protects your Oracle Database whether you are using it with SAP or running standalone Oracle applications – you simply select the Application Recovery Kit that matches your configuration.

Windows Clustering in the Cloud

Whether you need SIOS DataKeeper to enable Windows Server Failover Clustering in the cloud or SIOS Protection Suite for Windows for application monitoring and failover orchestration, as well as efficient, block-level data replication, SIOS delivers complete configuration flexibility. SIOS allows you to create a cluster in any combination of physical, virtual, cloud, or hybrid cloud infrastructures.

For example, working with WSFC, SIOS DataKeeper can:

- Protect critical on-premises or hybrid business applications to a high availability Windows environment in AWS, Azure, or Google Cloud.
- Protect cloud applications, such as SQL Server and SAP, by creating a Windows cluster in AWS, Azure, or Google Cloud.
- Provide site-wide, local, or regional high availability and disaster recovery protection by failing over application instances across cloud availability zones or regions.
SIOS DataKeeper Cluster Edition can provide high availability cluster protection across cloud

Conclusion

SIOS provides offerings that support a breadth of applications, operating systems, and infrastructure environments, providing a single solution that can handle all your high availability needs. Here are just a few examples that demonstrate the power of SIOS.

- Perth Stadium in Western Australia implemented SIOS DataKeeper with WSFC to provide high availability for their Hyper-V virtual machines.
- PayGo (paygoutilities.com), based in the U.S., implemented SIOS DataKeeper with WSFC to provide high availability for SQL Server on AWS.
- Toyo Gosei, based in Japan, implemented SIOS DataKeeper with WSFC to provide high availability and disaster recovery for their SAP application on Azure.

For more information on high availability/disaster recovery solutions to support your Windows environment, click here.

References

https://www.techopedia.com/definition/24358/windows-clustering
https://searchwindowsserver.techtarget.com/definition/Windows-Server-failover-clustering
https://docs.microsoft.com/en-us/sql/sql-server/failover-clusters/windows/windows-server-failover-clustering-wsfc-with-sql-server?view=sql-server-ver15

[1] https://docs.microsoft.com/en-us/sql/sql-server/failover-clusters/windows/windows-server-failover-clustering-wsfc-with-sql-server?view=sql-server-ver15
[2] A shared-nothing architecture (SN) is a distributed-computing architecture in which each update request is satisfied by a single node (processor/memory/storage unit). https://en.wikipedia.org/wiki/Shared-nothing_architecture

Reproduced from SIOS
