May 27, 2025 |
SIOS LifeKeeper Demo: How Rolling Updates and Failover Protect PostgreSQL in AWS
This week, Dave Bermingham, Director of Customer Success at SIOS Technology, walks through how LifeKeeper for Linux delivers high availability for PostgreSQL databases running in AWS.
High availability (HA) and zero-downtime maintenance have long been holy grails for enterprises running mission-critical databases in the cloud. Dave recently showcased how the LifeKeeper for Linux solution tackles these challenges for PostgreSQL databases in AWS. The demo, centered on minimizing downtime during planned maintenance and automating recovery from unplanned failures, highlights the growing demand for resilient cloud architectures.
Reproduced with permission from SIOS
|
May 21, 2025 |
How to Assess if My Network Card Needs Replacement
A network interface card (NIC), often referred to as a network card, is a vital component of any server infrastructure. It enables systems in a cluster to communicate with each other and the outside world. If your NIC is experiencing issues, it can compromise the health of your cluster, lead to false node failures, or increase the risk of split-brain scenarios. Recognizing the signs of a failing NIC early can save time, reduce downtime, and maintain high availability. In this blog, we’ll explore how to assess whether your network card needs replacement, the symptoms to look out for, and the tools that can aid you in diagnosing the issue.

Common Symptoms of a Failing NIC

1. Intermittent Connectivity
One of the first signs of NIC failure is unstable or sporadic connectivity. You may notice dropped packets, high latency, or difficulty reaching external hosts. These issues can cause nodes in a LifeKeeper cluster to temporarily lose connection and trigger unnecessary failovers.

2. Degraded Network Speed
If a system is underperforming on network-related tasks, showing slow replication, sluggish application response, or delayed heartbeat communication, the cause may be a faulty NIC that is no longer operating at its rated speed (e.g., running at 1 Gbps instead of 10 Gbps). In clustered environments, slow replication is especially concerning because it delays data synchronization between nodes. This not only increases recovery time in the event of a failover but also raises the risk of data loss or inconsistent state across systems if a complete failure occurs before the replication finishes.

3. System Logs Showing Network Errors
Frequent kernel or system log messages related to the NIC driver or interface, such as “link down,” “NIC reset,” or “device not responding,” are red flags. These messages indicate the OS is having trouble communicating with the card at a hardware or driver level.

4. Unusual Heat or Physical Damage
While not common, physical inspection may reveal damage such as scorch marks or excessive heat emission. Hardware issues at this level can quickly degrade performance or cause complete failures.

5. Issues in Virtual or Cloud Environments
In virtualized and cloud environments, NIC behavior can be affected not just by the underlying hardware but also by the configuration of the hypervisor or virtual networking layer. For example, virtual NICs assigned through VMware or Hyper-V may show degraded performance if incompatible or outdated drivers are used, or if the VM is assigned an adapter type that is not optimized for the desired workload.

Network Card Troubleshooting Tools for Windows and Linux
Diagnosing NIC issues early helps minimize downtime and prevent unnecessary failovers. The following are essential tools for identifying hardware or driver-related NIC issues, including options for both Linux and Windows environments:
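As one illustration of the kind of check involved on Linux, the kernel exposes per-interface counters as plain text files under /sys/class/net/<interface>/statistics. The sketch below is a minimal example, not a SIOS tool: the function names, the choice of counters, and the zero threshold are assumptions made for illustration. It reads the error and drop counters twice and reports whether any of them grew in between, which is one rough signal of the symptoms described above.

```python
from pathlib import Path

# Counters the Linux kernel exposes under <sysfs_root>/<iface>/statistics/.
# Which counters matter most is a judgment call; this subset is an assumption.
COUNTERS = ("rx_errors", "tx_errors", "rx_dropped", "tx_dropped")


def read_nic_counters(iface, sysfs_root="/sys/class/net"):
    """Return the current value of each counter for one interface."""
    stats_dir = Path(sysfs_root) / iface / "statistics"
    return {name: int((stats_dir / name).read_text().strip()) for name in COUNTERS}


def counter_deltas(before, after):
    """Return how much each counter grew between two samples."""
    return {name: after[name] - before[name] for name in before}


def counters_increasing(deltas, threshold=0):
    """True if any error or drop counter grew by more than the threshold."""
    return any(value > threshold for value in deltas.values())
```

In practice the two samples would be taken some seconds or minutes apart, for example by calling read_nic_counters("eth0") before and after a replication run, where "eth0" stands in for the real interface name. A counter that keeps climbing points to a problem worth investigating, though not necessarily to the card itself, since cabling, switch ports, and drivers can produce the same pattern.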
When to Replace Your NIC?
It may be time to replace your NIC if:
Preventative Measures to Avoid Network Card Failures
To avoid NIC-related failures:
Final Thoughts on Maintaining Network Interface Card Health
The NIC may not be the most glamorous piece of hardware, but its health is critical to a stable, highly available environment. Knowing when and how to assess a network card’s performance helps prevent unexpected downtime, ensures seamless failover behavior, and keeps your cluster communication resilient.
SIOS Technology Corporation provides high availability cluster software that protects & optimizes IT infrastructures with cluster management for your most important applications. Request a demo today.
Author: Aidan Macklen, Customer Experience Engineer Intern at SIOS Technology Corp.
Reproduced with permission from SIOS |
May 18, 2025 |
SIOS Technology to Demonstrate High Availability Clustering Software for Mission-Critical Applications at Red Hat Summit, Milestone Technology Day and XPerience Day, and SQLBits 2025
All practitioners are invited to provide input on high availability and disaster recovery trends as SIOS gathers insights for its 2025 HA/DR Practices Survey Report
SAN MATEO, Calif. – May 6, 2025 – SIOS Technology Corp., a leading provider of application high availability (HA) and disaster recovery (DR) solutions, today announced it will demonstrate its high availability clustering software for business-critical applications at four leading technology events this spring. SIOS also announced that it is inviting all IT practitioners to participate in its newly launched 2025 HA/DR Practices Survey, designed to gather insights into current trends, challenges, and strategies for ensuring application uptime and data protection.
At each event, SIOS experts will demonstrate how SIOS LifeKeeper and DataKeeper software provide high availability and disaster recovery for critical applications like SQL Server, SAP, and Oracle. Attendees will learn how SIOS clustering software ensures application uptime, eliminates data loss, and simplifies HA/DR across physical, virtual, cloud, and hybrid environments.
SIOS clustering software enables IT teams to create highly available application environments without the need for shared storage. Through intelligent application monitoring, real-time data replication, and automated failover and recovery, SIOS ensures business continuity with minimal complexity and reduced cost. With support for Windows and Linux in any infrastructure, SIOS solutions are trusted by enterprises worldwide to protect mission-critical operations.
SIOS Launches Survey to Gather Insights on HA/DR Practices
As part of its commitment to advancing resilience strategies in the enterprise, SIOS is launching its 2025 HA/DR Practices Survey to collect insights into the challenges, priorities, and real-world strategies used by IT professionals to ensure application uptime and data protection. The results will be compiled into the SIOS 2025 State of High Availability and Disaster Recovery Report, providing valuable benchmarks for the industry. All practitioners, including attendees of the Red Hat Summit, Milestone Technology Day, Milestone XPerience Day, and SQLBits, are invited to participate in the survey.
# # #
About SIOS Technology Corp.
SIOS Technology Corp. high availability and disaster recovery solutions ensure availability and eliminate data loss for critical Windows and Linux applications operating across physical, virtual, cloud, and hybrid cloud environments. SIOS clustering software is essential for any IT infrastructure with applications requiring a high degree of resiliency, ensuring uptime without sacrificing performance or data and protecting businesses from local failures and regional outages, planned and unplanned. Founded in 1999, SIOS Technology Corp. (https://us.sios.com) is headquartered in San Mateo, California, with offices worldwide.
SIOS, SIOS Technology, SIOS DataKeeper, SIOS LifeKeeper and associated logos are registered trademarks or trademarks of SIOS Technology Corp. and/or its affiliates in the United States and/or other countries. All other trademarks are the property of their respective owners.
Media Contact: Beth Winkowski
Reproduced with permission from SIOS |
May 12, 2025 |
Application Intelligence in Relation to High Availability
Application Intelligence in the context of High Availability (HA) refers to the system’s ability to understand and respond intelligently to the behavior and health of applications in real time to maintain continuous service availability.

What is Application Intelligence?
So, what is Application Intelligence? Application intelligence involves monitoring, analyzing, and reacting to several factors: application state (whether the application is up or down); performance metrics such as response time, error rates, throughput, and memory usage; application dependencies, such as databases or external services; and user behavior or patterns.
Application Intelligence takes a more holistic view of the application. It uses various data points to make educated decisions about the state of the application itself, not just the infrastructure. Take the example of a web server: it’s not enough to know that the server is running. Is the site accessible without any errors? Is the response slow? Are users refreshing multiple times while trying to access it? Is the database the website relies on also up, running, and accessible? All of these are examples of the factors that application intelligence takes into account.

How LifeKeeper Uses Application Intelligence
So, how does LifeKeeper use application intelligence to enhance high availability for critical applications? Let’s break it down. LifeKeeper uses application-specific recovery kits (ARKs) that contain knowledge for each application (SAP, SQL, PostgreSQL, Oracle, etc.). This allows LifeKeeper to handle the startup and shutdown procedures of each application, monitor the health and status of both the application and any dependencies, and orchestrate intelligent failover and failback operations without corrupting any data.
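The difference between "the process is running" and "the application is healthy" in the web server example can be sketched in a few lines. This is a minimal illustration, not LifeKeeper code: the ProbeResult fields, the find_problems function, and the two-second threshold are all hypothetical names and values chosen for the example, and the probe data is assumed to have been collected elsewhere.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProbeResult:
    """One snapshot of a web application's state (hypothetical fields)."""
    process_running: bool
    http_status: Optional[int]         # None when the site did not respond
    response_seconds: Optional[float]  # None when the site did not respond
    database_reachable: bool


def find_problems(probe: ProbeResult, max_response_seconds: float = 2.0) -> List[str]:
    """Return every problem found; an empty list means the application looks healthy."""
    problems = []
    if not probe.process_running:
        problems.append("process not running")
    if probe.http_status is None:
        problems.append("no HTTP response")
    elif probe.http_status >= 400:
        problems.append(f"HTTP error status {probe.http_status}")
    if probe.response_seconds is not None and probe.response_seconds > max_response_seconds:
        problems.append("slow response")
    if not probe.database_reachable:
        problems.append("database unreachable")
    return problems
```

A check that looked only at process_running would report the second case in the test below as healthy, even though the site is returning errors, responding slowly, and cut off from its database.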
Users can group related resources in a hierarchical relationship within LifeKeeper, which allows LifeKeeper to understand the dependencies between different application components (when a service relies on an IP or database, for example). This ensures LifeKeeper failovers happen in the correct order and recovery actions don’t break the application or leave it in an inconsistent state.
Additionally, LifeKeeper performs deep health checks, not just determining if the server is up, but also running more detailed checks, such as whether a database is accepting connections or whether a web service is returning expected responses. It can even monitor whether certain expected background processes are running. LifeKeeper also uses application-specific configuration files to keep configuration consistent across nodes and to ensure that application settings are preserved or restored correctly. Lastly, LifeKeeper can use custom scripts to fine-tune these deep checks so that less common or homegrown applications are supported intelligently as well.

PostgreSQL ARK: A Real-World Example of Application Intelligence
To take a deeper dive, we can look at how the PostgreSQL ARK uses Application Intelligence. The PostgreSQL ARK uses specific logic to monitor, start, stop, and fail over PostgreSQL. That logic draws on knowledge of the specific PostgreSQL startup and shutdown commands, awareness of critical config files like postgresql.conf and pg_hba.conf, and an understanding of the data directory layout and lock file behavior.

Intelligent Monitoring and Ordered Failover for PostgreSQL
Additionally, it doesn’t just check that PostgreSQL is running. It also checks whether the database is responding to queries, whether the correct data directory is accessible, and whether there is any corruption in the transaction logs. It uses dependency tracking to make sure that the resources PostgreSQL often depends on are available, such as the virtual IP for client connections and the mounted storage for its data directory.
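Dependency tracking of this kind can be pictured as a small graph in which each resource lists what it needs before it can start. The sketch below is an illustration only and does not reflect how LifeKeeper stores or walks its hierarchies: the resource names are hypothetical, and the ordering is done with a plain topological sort from Python's standard graphlib module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Each resource maps to the set of resources it depends on.
# Names are hypothetical stand-ins for a PostgreSQL hierarchy.
DEPENDENCIES = {
    "data-filesystem": set(),
    "virtual-ip": set(),
    "postgresql": {"data-filesystem", "virtual-ip"},
}


def start_order(dependencies):
    """Order resources so that every dependency starts before its dependents."""
    return list(TopologicalSorter(dependencies).static_order())


def stop_order(dependencies):
    """Stop in the reverse of start order, so dependents stop first."""
    return list(reversed(start_order(dependencies)))
```

With this graph, postgresql always comes last when starting and first when stopping. The sort does not fix the relative order of the filesystem and the virtual IP, because neither depends on the other here; a design that must mount storage before bringing up the IP would state that as an explicit dependency.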
This ensures that LifeKeeper can bring up the resources in the correct order in case of a failover: mounting the disk first, bringing up the IP, and then starting PostgreSQL before verifying the service health.

Preventing Split-Brain and Ensuring Data Integrity
Lastly, LifeKeeper uses application intelligence to avoid split-brain scenarios (where more than one node thinks it’s the ‘primary’ node). It does this by never starting two active PostgreSQL servers against the same data directory, and it avoids data corruption by not failing over while writes are still in progress. These are examples of the different ways LifeKeeper and the various ARKs have implemented application intelligence to make the combined product as resilient as possible.

Strengthen Application Resilience with Intelligent High Availability
In summary, LifeKeeper’s built-in application intelligence enables precise, fast, and reliable failover and recovery by understanding how applications behave and what they need to run correctly. To ensure application resilience and uninterrupted service, request a demo or start your free trial today and experience how SIOS LifeKeeper uses application intelligence to protect your critical workloads.
Author: Cassy Hendricks-Sinke, Principal Software Engineer, Team Lead
Reproduced with permission from SIOS |
May 8, 2025 |
Transitioning from VMware to Nutanix
10 Considerations for Choosing a High Availability Solution in a Nutanix Environment
If you’re planning a move from VMware to Nutanix, making sure your critical applications stay up and running should be at the top of your list. While Nutanix offers great benefits like simplified management and better performance, its built-in high availability only covers the virtual machine, not the applications themselves. This paper shares ten key insights to help you plan ahead and avoid downtime during and after your migration. You’ll get practical guidance on choosing the right clustering solutions for both Windows and Linux, how to handle shared storage in Nutanix, and what to consider if you’re running a mix of operating systems. Whether you’re moving to Nutanix AHV or managing hybrid environments, learn how to simplify your HA strategy, reduce risk, and keep your most important systems protected.
Reproduced with permission from SIOS |