SIOS SANless clusters - Page 2 of 183 - SIOS SANless clusters High-availability Machine Learning monitoring

June 21, 2024	Find it faster: 8 Secrets to Navigating Product Documentation
Here are a few ways you can find exactly what you are looking for in product documentation. Product documentation is very thorough, and finding what you need can appear overwhelming at first glance, but it is easy to navigate when you know how. Let’s use SIOS documentation as an example. It is here to answer your questions and help resolve issues that you may be running into. There are several tips for finding what you are looking for in SIOS product documentation:
When you go to docs.us.sios.com, you will first need to select Windows or Linux, depending on which product you are using, to get to the correct documentation. Are you using Windows or Linux products?
Choose the type of information you are looking for. Once you have selected your product (Windows or Linux), select the topic that relates to what you are looking for. Do you need information on upgrading to the latest SIOS version? Have questions about installation? Need an application recovery kit? You can also hover over each topic to see what’s included. Here’s an example of the topics that are available in the Linux documentation.
Try these best practices for using search most effectively. Once you select a topic, you can search for the information you are looking for. Here are a few ways to search most efficiently:
Search for the right keyword – Try a word with a similar meaning to the topic you are looking for, and use words that are likely to appear in the document. For example, if you are growing or expanding a volume, you may want to use the search term “resize”.
Ensure you are searching for the SIOS product that you are using – SIOS provides HA/DR protection for Windows and Linux operating systems. Double-check that you are searching in the documentation for the correct product, for example, LifeKeeper for Linux vs. LifeKeeper for Windows.
Include the correct product version. Ensure your search matches the version of the SIOS software you are using. You can check the version you are running in the GUI under About, Help.
Search for terms related to SIOS HA – Commercially available HA and replication solutions often have unique terms and concepts. For example, one provider may use the word takeover to indicate moving resources to a standby node, while another vendor uses the term switchover. Be sure that you are searching the documentation using SIOS terminology. Refer to the concepts and terminology section of the docs to get an understanding of the SIOS-specific key terms.
Ask for help from the support team. Some support tools and procedures require the Support team’s assistance and therefore are not documented in the public documentation.
Search for SIOS or SIOS Technology – Many years ago, SIOS Technology Corp. was called SteelEye Technology. Be sure to use the correct company name when searching for product names.
Avoid searching for acronyms – It’s important to search for words related to your query instead of abbreviations or acronyms.
Search for the product error code – The quickest way to resolve an issue is to search for the error code you are getting in the GUI, from the command line, or in the error log. This will return specific information on what the code means and how best to resolve it.
Note: We’d love to hear your feedback on SIOS product documentation! Feel free to post comments and suggestions in the feedback section at the bottom of every topic in the documentation pages.
Reproduced with permission from SIOS
June 11, 2024	Webinar: Achieving HA/DR Objectives in the Cloud
Register for the on-demand webinar. There still seems to be confusion about cloud SLAs. Cloud availability SLAs cover infrastructure availability, but what about applications like SAP, SQL Server, and Oracle? Do your applications need availability, high availability, or disaster recovery protection in the cloud? This Actual Tech Media MegaCast session covers how to achieve HA/DR objectives for your mission-critical applications in the cloud.
Reproduced with permission from SIOS
June 5, 2024	Strategies for Optimizing IT Systems for High Availability
Maintaining high availability (HA) of IT systems is essential for organizational success. From critical database management to ensuring seamless customer experiences, achieving uninterrupted operations presents unique challenges that require strategic planning. Here are some key strategies that organizations can leverage to optimize their IT systems for high availability.
Common Challenges in Optimizing IT Systems for High Availability
A few areas commonly pose challenges for IT systems. One that comes up very often is compatibility with antivirus (AV) solutions. Oftentimes the issue stems from the antivirus being overprotective of the systems and quarantining files that are critical to the functioning of applications or HA solutions. Of course, it’s always important to verify compatibility between solutions. To go a step further, though, everyone who administers the system should be familiar with how the AV solution works and understand the procedure to configure or request changes to it so critical applications aren’t interrupted. In addition to the AV solution, firewall configuration also comes up. HA solutions often transmit additional communication over the network to orchestrate cluster behavior. As a result, there are usually specific rules that need to be added to accommodate the HA solution and prevent it from taking erroneous cluster-recovery actions. Finally, the principles of access control become slightly more complex when configuring highly available systems.
While the individual teams (e.g., the DB team, SAP team, or cloud team, however things are distributed) each need permissions over their respective domain, any administrators who manage the HA solution may find that they have additional privileges accessible through the HA solution (e.g., initiating failover of an application, creating communication between nodes, or locking and unlocking storage). As a result, it is important to consider the actions available through the HA solution when delegating access permissions. It may be appropriate to allow HA controls only for root-level users, or you may define a procedure for taking actions via the HA solution so teams are notified and actions can be tracked. Regardless, in view of the principle of least privilege, HA solutions present complexity that should be considered to ensure that applications and systems are only accessible and mutable by the delegated parties.
The Role of Failover and Disaster Recovery Strategies in Ensuring System Uptime
Failover capabilities and disaster recovery (DR) strategies both have significant impacts on the uptime of critical systems. HA can provide failover capabilities to ensure single-server issues will not cause an outage for an application suite, and when configured properly, the failover can be nearly seamless. This allows recovery to proceed on the faulting system while standby systems take over the primary role and pick up the load. Disaster recovery can also be tightly interwoven with the HA strategy. If redundancy is already being configured, why not ensure that this redundancy exists across fault domains? Done properly, applications can be both highly available and fault-tolerant. From an IT perspective, properly configured HA and DR strategies can ensure that systems are utilized to their fullest potential, with minimal downtime.
A natural disaster or technological failure in a region where applications are hosted is far less likely to propagate to other regions. Leveraging the planned redundancy in tandem with your disaster recovery plan can cover more functional requirements with fewer resources, as careful planning can ensure redundancy and fault tolerance are both handled by the deployment of a standby site.
Balancing Cost-Effectiveness and High Availability: Strategies for Organizations
Configuring a clustered environment or a highly available system can get costly. Usually, at least one standby system is running alongside the primary system and accruing costs despite not handling a workload, but the costs can be mitigated. Here are a few ways I would suggest going about this:
Consider using a managed shared storage solution. If you don’t need redundant copies of data, you can save on storage by using shared storage. Something like Amazon EFS could mean you only need to pay for half of the storage compared with a replicated disk configuration.
Consider the use case for a DR system. Oftentimes, these systems are simply stop-gap solutions while a primary site is recovered. Resources don’t run on the DR site for long periods of time, so, depending on the workload, you may be able to provision a smaller system on your DR site to save on compute costs. Of course, you would need to communicate this design decision to stakeholders so everyone is aware the DR site is not a long-term hosting solution, but provided your workload and workforce can handle the added restriction, you can save on instance sizes. In the same vein, orchestrator and/or quorum systems that do not host workloads but only coordinate within a cluster may be able to be significantly smaller than the systems that workloads are delegated to.
Consider scaling up or scaling out.
Scaling up means increasing the compute capacity of a single machine. In cloud environments, this means a smaller instance increasing its resource pool to that of a larger instance when the workload overwhelms it. Scaling out means increasing the number of workers that share the load of your application when more compute power is necessary. The use case dictates when and where scaling up or scaling out is the better solution, but by being familiar with the software and environment at hand you will be able to make decisions and configure the systems to act appropriately when the time comes. Another thing to consider with a scaling solution is the aggressiveness of your scale-down rules. To save costs, ensure instances will scale back down to an appropriate resource pool, and evaluate the rules that dictate scale-down behavior to ensure that you are not leaving excessive resources provisioned longer than needed.
Establish strong communication between IT teams, stakeholders, cybersecurity teams, and HA vendors. Ensuring that there is a basis of communication can facilitate a cooperative rollout of any technologies or upgrades to the environment. Additionally, by keeping communication active, all teams will be better apprised of the activities occurring on systems. Keeping all teams up to date is crucial and can make it much easier to diagnose issues or begin a rollback procedure if necessary. Finally, maintaining strong communication also ensures that best practices can be efficiently shared so that teams work cooperatively rather than operating on different principles.
Implementing High Availability: Best Practices
The first and most important practice I would recommend for anyone deploying systems is to maintain a test environment.
Keep the test environment as close to identical to the production environment as possible, and perform dry runs of any procedures that will occur on the production environment so teams are well versed in procedures and runbooks when a production rollout occurs. This practice also feeds into the other best practices I would recommend. By maintaining your test environment, you are also maintaining a system that can be used to pre-test any changes. The test environment is the perfect place to verify product compatibility and ensure that any considerations for mutual operation between technologies are well established. An example I see time and time again is configuring exclusions for antivirus software. There are cases where these exclusions do not get configured and the production environment suffers outages because the antivirus quarantines a file that is accessed very frequently. Finally, make sure you are auditing your configuration regularly. Review aspects such as security groups, access controls, firewall rules, and software compatibility (especially between HA, protected applications, and antivirus). Maintain a thorough log of the findings and any changes made as a result of these audits. Keeping track of these details gives a solid record that can be reviewed if a configuration change seems to be causing an issue. Additionally, when requesting support from vendors, these audits can be a valuable tool to share in order to reach a full root cause analysis sooner. Most of all, these audits provide a record of how things should be configured. If there are ever any changes from the intended configuration, you can refer back to the results of past audits to re-align systems with the organization’s standard for system configuration.
SIOS understands optimizing IT systems for high availability is crucial for organizational success.
By addressing compatibility challenges with antivirus solutions and fine-tuning firewall configurations, organizations can enhance system resilience and uptime. Contact us today for more information.
Reproduced with permission from SIOS
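As a rough illustration of the audit practice described in the post above (keeping a record of how things should be configured and comparing later states against it), the comparison could be sketched in Python along these lines. This is not a SIOS tool or API. The function name `audit_drift` and every key and value in the sample snapshots are hypothetical and exist only for illustration.

```python
from typing import Any


def audit_drift(baseline: dict[str, Any], current: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Compare a current configuration snapshot against the last audited baseline."""
    # Settings present now that the baseline audit never recorded.
    added = {k: current[k] for k in current.keys() - baseline.keys()}
    # Settings the baseline recorded that are now missing.
    removed = {k: baseline[k] for k in baseline.keys() - current.keys()}
    # Settings present in both snapshots whose values differ.
    changed = {
        k: {"expected": baseline[k], "found": current[k]}
        for k in baseline.keys() & current.keys()
        if baseline[k] != current[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots; the keys and values are illustrative only.
baseline = {
    "firewall.cluster_comm_port_open": True,
    "antivirus.exclusions": ["/opt/ha-solution"],
    "ha.admin_group": "ha-admins",
}
current = {
    "firewall.cluster_comm_port_open": True,
    "antivirus.exclusions": [],
    "ha.admin_group": "ha-admins",
    "firewall.extra_rule": "allow-any",
}

report = audit_drift(baseline, current)
for category, entries in report.items():
    for key, detail in sorted(entries.items()):
        print(f"{category}: {key} -> {detail}")
```

In this sketch, the missing antivirus exclusion shows up under "changed" and the unrecorded firewall rule under "added", which is the kind of finding an audit log would capture before it causes a production outage.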
May 26, 2024	SIOS Technology helps strike the balance between high availability and cloud costs
Finding the right balance between high availability and cost optimization can be challenging. Dave Bermingham, Senior Technical Evangelist at SIOS Technology, talks about some of the key factors influencing cloud costs and some of the strategies for optimizing costs. He says, “We focus on practical and effective strategies that will help reduce the costs associated with not only deploying high availability, but also in minimizing unexpected downtime besides minimizing downtime associated with planned maintenance.”
Key factors influencing cost in cloud environments
Key factors influencing cloud costs in high availability configurations include efficient resource management, strategic architecture decisions, and continuous monitoring. Bermingham discusses how it is crucial to choose the right instance type and size to match the workload requirements and how autoscaling can help reduce costs and optimize cloud spend. He also highlights the importance of considering data transfer costs if high availability solutions are deployed across multiple time zones, along with strategies you can use to minimize those charges. Other key considerations include optimizing storage and implementing effective governance and cost management policies.
Finding the balance between high availability and cost optimization in the cloud
Bermingham explains that although high availability will incur some expense, this offsets the costs associated with any downtime, which can be substantial. It is important to strike a balance between high availability and minimizing cloud costs by creating systems that are modular and scalable, with an operational strategy that embraces a DevOps culture and utilizes CI/CD practices.
Cloud cost optimization and high availability challenges
Bermingham highlights common pitfalls of optimizing costs without compromising high availability, such as underestimating the complexity of cloud cost management and neglecting the importance of application performance monitoring. Inadequate training on cloud cost optimization best practices and on implementing HA solutions can often lead to inefficient resource utilization and unplanned downtime.
How SIOS Technology’s high availability solutions help
Bermingham explains how SIOS Technology can help address these challenges with HA solutions that simplify and automate HA in different cloud environments to minimize costs, minimize downtime, and manage maintenance.
Reproduced with permission from SIOS
May 22, 2024	SIOS LifeKeeper for Linux v 9.8.1 improves the way companies manage HA/DR
In today’s tech-driven landscape, companies are seeking innovative solutions to effectively maintain their complex application environments. In this video, Todd Doane, sales engineer at SIOS Technology, explains how the latest version of SIOS LifeKeeper for Linux helps companies safeguard critical enterprise systems against downtime and disasters. “The release features a new Web Management Console. It’s self-contained and does not require additional installations or third-party add-ons,” says Doane.
Reproduced with permission from SIOS