SIOS SANless clusters


Think Before You Script: Best Practices for Gen/App Recovery

September 20, 2025 by Jason Aw


SIOS Recovery Kits provide a wealth of best practices for application-aware monitoring and recovery.  In general, each SIOS Recovery Kit provides a step-by-step programmatic approach to restoring the application, database, or service in accordance with High Availability (HA) best practices.  The SIOS Recovery Kits provide the intelligence needed to restore operation after a normal system shutdown, after an unexpected system failure or crash, and even in cases where the application, database, or service itself crashes or becomes unavailable.  In addition, each recovery kit includes experiential wisdom and improvements gained from over two decades in the field.

However, if a customer still needs to roll their own script for providing HA, SIOS LifeKeeper for Windows and SIOS LifeKeeper for Linux include an option for script integration via the Generic Application (Gen/App) Recovery Kit.

Best Practices for Writing Gen/App Recovery Scripts

1. Use Modern, Supported Scripting Languages for Gen/App Recovery

A common practice is to carry old scripts forward onto new systems and architectures.  However, it is essential to make sure you are using a modern, supported scripting language.

2. Avoid Hardcoded Values in Gen/App Scripts

Hardcoded values cause portability issues as well as long-term maintenance challenges.  Avoid hardcoding values that are likely to change in future deployments, such as directory paths, user names, or similar settings.
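
As a minimal sketch (the paths, variable names, and start command below are hypothetical), environment-specific values can be supplied as script parameters or environment variables rather than embedded in the script:

```bash
#!/bin/bash
# Hypothetical restart helper: the application home and service account are
# passed in by the caller instead of being hardcoded into the script.
APP_HOME="${1:?Usage: $0 <app_home> [app_user]}"
APP_USER="${2:-appsvc}"

if [ ! -d "$APP_HOME" ]; then
    echo "ERROR: application home '$APP_HOME' not found" >&2
    exit 1
fi

# Run the start command as the configured service account.
su - "$APP_USER" -c "$APP_HOME/bin/start.sh"
```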

3. Practice Code Reuse to Improve Gen/App Script Quality

Duplicate code is a common problem in customer-developed scripts.  Duplicate code creates quality, maintenance, and troubleshooting problems.  Practice code reuse, such as inheritance, functions, and subroutines.
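
As an illustration, a single shared helper can replace duplicated start/stop/status logic; the service name below is a placeholder:

```bash
#!/bin/bash
# Reusable helper: run a command, log what happened, and return its exit status.
run_logged() {
    local description="$1"; shift
    echo "$(date '+%F %T') INFO  ${description}"
    "$@"
    local rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "$(date '+%F %T') ERROR ${description} failed (rc=${rc})" >&2
    fi
    return "$rc"
}

# The same helper serves start, stop, and status paths, so the logging and
# error handling are written once rather than duplicated three times.
run_logged "starting application" systemctl start myapp.service
run_logged "checking application" systemctl is-active --quiet myapp.service
```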

4. Choose Meaningful Names for Functions and Variables

Descriptive variable names are more helpful than single-character names such as ‘n’ or ‘i’.  When looking at code months or years later, will the variable ‘n’ mean as much as iReturnCode?
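
A brief illustration of the difference (the start command shown is a placeholder):

```bash
# Hard to interpret months later:
#   /opt/myapp/bin/start.sh; n=$?; [ $n -ne 0 ] && exit $n

# Self-describing when revisited later:
/opt/myapp/bin/start.sh
iReturnCode=$?
if [ "$iReturnCode" -ne 0 ]; then
    echo "myapp start failed (rc=${iReturnCode})" >&2
    exit "$iReturnCode"
fi
```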

5. Remove Unused Functions and Variables to Prevent Code Bloat

While meaningful names for functions and variables are important, avoid cluttering the code with unused variables and functions.  Declaring variables and not using them creates confusion during future updates and troubleshooting.  While the days of 8 MB of memory are long gone, variables or functions that provide limited reuse or no additional value are still burdensome and create code bloat.

6. Verify All Input Parameters for Reliable Gen/App Execution

In the rush to get something working, don’t ignore input variable validation.  Verify all input to the script and to functions.  Don’t assume that if “we got here,” all of our inputs are valid.
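
A minimal sketch of the pattern, assuming a script that takes a resource tag and an action as arguments (the names are hypothetical):

```bash
#!/bin/bash
# Validate every input before acting on it; never assume the caller was correct.
TAG_NAME="$1"
ACTION="$2"

if [ -z "$TAG_NAME" ] || [ -z "$ACTION" ]; then
    echo "Usage: $0 <tag_name> <start|stop|status>" >&2
    exit 2
fi

case "$ACTION" in
    start|stop|status) ;;   # accepted values; anything else is rejected below
    *) echo "ERROR: unknown action '$ACTION'" >&2; exit 2 ;;
esac
```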

7. Log Helpful and Actionable Messages

Consider what output needs to be logged for status/progress, error conditions, or troubleshooting.  Each message should be thoughtfully considered and appropriately worded to provide helpful feedback to operators and future developers.
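
A small logging helper along these lines (the log file path and messages are examples only):

```bash
#!/bin/bash
# Minimal logging helper: timestamp, severity, and a message that tells the
# operator what happened and what to check next.
LOGFILE="${LOGFILE:-/var/log/genapp_myapp.log}"

log() {
    local level="$1"; shift
    printf '%s %-5s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$level" "$*" | tee -a "$LOGFILE"
}

log INFO  "restore starting for myapp on $(hostname)"
log ERROR "myapp did not respond on port 8080 after 30s; check the application log before retrying"
```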

8. Check Return Codes on All Method/Function/API Calls and Take Defensive Action

Commands executed within the body of a script or function return codes that indicate pass, fail, or some other condition.  Be sure to check, log, and properly handle both expected and unexpected return codes from methods, functions, and API calls.
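
A hedged sketch of the idea (the service name is a placeholder):

```bash
#!/bin/bash
# Check, log, and handle the return code of every external command.
systemctl start myapp.service
rc=$?
if [ "$rc" -eq 0 ]; then
    echo "myapp started successfully"
else
    echo "ERROR: 'systemctl start myapp.service' returned rc=${rc}" >&2
    # Defensive action: surface the failure to the HA framework rather than
    # continuing as if the start had succeeded.
    exit 1
fi
```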

9. Use Defensive Programming Techniques

Apply best practices for defensive programming, including least privilege access, input validation, error handling, etc.
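
A few shell-level defensive idioms, shown as a sketch rather than a complete policy (the control command and account names are hypothetical):

```bash
#!/bin/bash
set -u              # treat unset variables as errors
set -o pipefail     # a failing command anywhere in a pipeline fails the pipeline

# Fail fast on missing prerequisites instead of discovering them mid-recovery.
command -v myapp_ctl >/dev/null 2>&1 \
    || { echo "ERROR: myapp_ctl not found in PATH" >&2; exit 1; }

# Least privilege: verify the expected service account exists before using it,
# and run application commands as that account rather than as root.
id -u appsvc >/dev/null 2>&1 \
    || { echo "ERROR: service account 'appsvc' does not exist" >&2; exit 1; }
```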

10. Test Gen/App Recovery Scripts Beyond the Happy Path

Working code is not enough.  Develop a robust validation plan and test the code extensively, especially beyond the happy path when everything is expected to work.

11. Use Version Control for Script Management and Troubleshooting

Use version control and code management tools.  Version control is essential for troubleshooting, management, and tracking the inevitable fixes required for your scripts.

12. Catch Errors Early with Code Inspections and Peer Reviews

Use code inspections and peer reviews to increase the resilience and robustness of the code.  Code reviews help find problems early and reduce the cost, risk, and burden of late-stage failures and bugs.

13. Verify Permissions Required for Execution in Gen/App Recovery

Having well-organized, modern, reviewed, inspected, tested, and controlled code is an essential part of a well-crafted gen/app script.  However, the best-coded script will fail to execute if it does not have the right permissions.  Ensure that the script has the correct permissions to execute standalone as well as under the service/user accounts of the HA solution.
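
For example (the path and the ‘ha_svc’ account are illustrative only):

```bash
# Set ownership and execute permission on the script.
chown root:ha_svc /opt/myapp/ha/restart_myapp.sh
chmod 750 /opt/myapp/ha/restart_myapp.sh

# Verify it runs standalone...
/opt/myapp/ha/restart_myapp.sh status

# ...and under the account the HA software will actually use to invoke it.
sudo -u ha_svc /opt/myapp/ha/restart_myapp.sh status
```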

14. Comment Code Clearly to Explain Logic and Business Use Cases

Provide comments that help explain the business logic and use case, describe expected function inputs and returns, and contribute to overall understanding.  Well-written code still needs comments, especially if it is not obvious what business logic or requirement is being addressed.  An example comment block could look like:
Name:

Purpose:

Preconditions:

Postconditions:

Returns:
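
In shell comment syntax, such a header might look like the following (the contents are illustrative):

```bash
#-----------------------------------------------------------------------
# Name:           restart_myapp.sh
# Purpose:        Restart the myapp service after a local failure is
#                 detected, then confirm it is accepting requests.
# Preconditions:  myapp is installed under APP_HOME, its data volume is
#                 mounted, and the script runs with sufficient privileges.
# Postconditions: myapp is running and answering on its service port, or
#                 the failure has been logged and the script exits non-zero.
# Returns:        0 on success, 1 on restart failure, 2 on invalid input.
#-----------------------------------------------------------------------
```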

Ready to Simplify Gen/App Recovery with Confidence?

Don’t leave high availability to chance. With SIOS LifeKeeper and the Generic Application (Gen/App) Recovery Kit, you can safeguard critical applications, streamline recovery, and reduce downtime.

Request a demo today to see how SIOS can help you achieve reliable, cost-effective high availability and disaster recovery.

Author: Cassius Rhue, VP, Customer Experience at SIOS

Reproduced with permission from SIOS

Filed Under: Clustering Simplified Tagged With: Application availability, High Availability

How to Assess if My Network Card Needs Replacement

May 21, 2025 by Jason Aw


A network interface card (NIC), often referred to as a network card, is a vital component of any server infrastructure. It enables systems in a cluster to communicate with each other and the outside world. If your NIC is experiencing issues, it can compromise the health of your cluster, lead to false node failures, or increase the risk of split-brain scenarios. Recognizing the signs of a failing NIC early can save time, reduce downtime, and maintain high availability.

In this blog, we’ll explore how to assess whether your network card needs replacement, the symptoms to look out for, and the tools that can aid you in diagnosing the issue.

Common Symptoms of a Failing NIC

1. Intermittent Connectivity

One of the first signs of NIC failure is unstable or sporadic connectivity. You may notice dropped packets, high latency, or difficulty reaching external hosts. These issues can cause nodes in a LifeKeeper cluster to temporarily lose connection and trigger unnecessary failovers.

2. Degraded Network Speed

If a system is underperforming on network-related tasks such as slow replication, sluggish application response, or delayed heartbeat communication, it may be due to a faulty NIC that is no longer operating at its rated speed (e.g., 1 Gbps vs. 10 Gbps). In clustered environments, slow replication is especially concerning because it delays data synchronization between nodes. This not only increases recovery time in the event of a failover but also raises the risk of data loss or inconsistent state across systems if a complete failure occurs before the replication finishes.

3. System Logs Showing Network Errors

Frequent kernel or system log messages related to the NIC driver or interface, such as “link down,” “NIC reset,” or “device not responding,” are red flags. These messages indicate the OS is having trouble communicating with the card at a hardware or driver level.

4. Unusual Heat or Physical Damage

While not common, physical inspection may reveal damage such as scorch marks or excessive heat emission. Hardware issues at this level can quickly deteriorate performance or cause complete failures, which is certainly not desirable in any environment.

5. Issues in Virtual or Cloud Environments

In virtualized and cloud environments, NIC behavior can be affected not just by the underlying hardware but also by the configuration of the hypervisor or virtual networking layer. For example, virtual NICs assigned through VMware or Hyper-V may show degraded performance if incompatible/outdated drivers are used, or even if the VM is assigned an adapter type that is not optimized for the desired workload.

Network Card Troubleshooting Tools for Windows and Linux

Diagnosing NIC issues early helps minimize downtime and prevent unnecessary failovers. The following are essential tools for identifying hardware- or driver-related NIC issues, including options for both Linux and Windows environments; a combined Linux command-line sketch follows the list:

  • ethtool (Linux):
    Use this to view NIC statistics, driver information, and up-to-date link status. A high number of transmit/receive errors, dropped packets, or failed auto negotiations could indicate a deteriorating NIC.
  • PowerShell cmdlets (Windows):
    Get-NetAdapter and Get-NetAdapterStatistics allow you to inspect link status, speed, and adapter health on Windows systems. Combined with Get-NetEventSession, you can also track event logs related to NIC behavior over time.
  • dmesg / journalctl (Linux) or Event Viewer (Windows):
    These tools help uncover system or kernel-level alerts. Look for messages such as “NIC reset,” “link down,” or “device not responding.” In Windows, these might appear under “System” or “Application”  logs and indicate driver crashes or hardware unresponsiveness.
  • ping / iperf (Cross platform):
    Useful for testing basic connectivity and throughput. If packet loss, jitter, or unexpected latency spikes occur during tests, it could point to faulty hardware or cabling.
  • Network Bonding Failover Behavior:
    When using bonded or teamed interfaces for redundancy, observe whether one interface is triggering failover events more frequently than the others. This could mean the failing NIC is silently degrading, even if no system errors are reported.
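
The sketch below pulls several of these Linux checks together; the interface name, peer address, and iperf3 server are placeholders to replace with your own values:

```bash
# Quick Linux-side health pass for a suspect interface.
ip -s link show eth0                        # link state plus RX/TX error and drop counters
ethtool eth0                                # negotiated speed, duplex, and link status
ethtool -S eth0 | grep -Ei 'err|drop|crc'   # driver-level error statistics (if supported)
dmesg | grep -i eth0 | tail -n 20           # recent kernel messages: resets, link flaps
ping -c 100 -i 0.2 192.0.2.10               # packet loss and latency to a cluster peer
iperf3 -c 192.0.2.10 -t 30                  # sustained throughput against an iperf3 server
```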

When to Replace Your NIC?

It may be time to replace your NIC if:

  • You observe consistent or worsening symptoms outlined above.
  • Logs and tools confirm hardware or driver issues that persist after driver updates or firmware reinstallation.
  • The issue follows the NIC when moved to another system (if removable).
  • The card is outdated and unsupported by the current OS or clustering tools.
  • You are in a highly available (HA) environment where the continuity of service is critical. In these cases, it is especially best practice to proactively move services or resources to nodes with verified healthy NICs while troubleshooting to avoid risking a failover delay or unexpected downtime.

Preventative Measures to Avoid Network Card Failures

To avoid NIC-related failures:

  • Use redundancy: Implement bonding or teaming across multiple NICs.
  • Keep firmware up to date: Periodically check for driver and firmware updates from your hardware vendor.
  • Monitor proactively: Use tools and third-party network monitoring to catch early signs of NIC degradation.
  • Regular testing: Validate link speed and latency as part of regular cluster health checks.

Final Thoughts on Maintaining Network Interface Card Health

The NIC may not be the most glamorous piece of hardware, but its health is critical to a stable, highly available environment. Knowing when and how to assess a network card’s performance helps prevent unexpected downtime, ensures seamless failover behavior, and keeps your cluster communication resilient.

SIOS Technology Corporation provides high availability cluster software that protects & optimizes IT infrastructures with cluster management for your most important applications. Request a demo today.

Author: Aidan Macklen, Customer Experience Engineer Intern at SIOS Technology Corp.

Reproduced with permission from SIOS

Filed Under: Clustering Simplified Tagged With: Application availability

Application Intelligence in Relation to High Availability

May 12, 2025 by Jason Aw


Application Intelligence in the context of High Availability (HA) refers to the system’s ability to understand and respond intelligently to the behavior and health of applications in real time to maintain continuous service availability.

What is Application Intelligence?

So, what is Application Intelligence? Application intelligence involves monitoring, analyzing, and reacting to several factors: application state (whether the application is up or down), performance metrics (response time, error rates, throughput, and memory usage), application dependencies (such as databases or external services), and user behavior or usage patterns. Application Intelligence takes a more holistic view of the application, using these data points to make educated decisions about the state of the application itself, not just the infrastructure. Take the example of a web server: it is not enough to know that the server is running. Is the site accessible without errors? Is the response slow? Are users refreshing multiple times while trying to access it? Is the database the website relies on also up, running, and accessible? All of the above are factors that application intelligence considers in order to be effective.

How LifeKeeper Uses Application Intelligence

So, how does LifeKeeper use application intelligence to enhance high availability for critical applications? Let’s break it down.  LifeKeeper uses application-specific recovery kits (ARKs) that contain knowledge for each application (SAP, SQL, PostgreSQL, Oracle, etc.). This allows LifeKeeper to handle the startup/shutdown procedures of each application, monitor the health and status of both the application and any dependencies, as well as orchestrate intelligent failover/failback operations without corrupting any data. Users can group together related resources in a hierarchical relationship within LifeKeeper, which allows LifeKeeper to understand the dependencies between different application components (when a service relies on an IP or database, for example). This ensures LifeKeeper failovers happen in the correct order and recovery actions don’t break the application or leave it in an inconsistent or broken state.

Additionally, LifeKeeper does deep health checks, not just determining if the server is up, but also more detailed checks, such as whether a database is accepting connections or if a web service is returning expected responses. It can even monitor if certain expected background processes are running. LifeKeeper also uses application-specific configuration files to ensure data configuration consistency across nodes and that application settings are preserved or restored correctly.  Lastly, LifeKeeper has the ability to use custom scripts to further fine-tune these deep checks to support less common or homegrown applications intelligently as well.

PostgreSQL ARK: A Real-World Example of Application Intelligence

To take a deeper dive, we can look at how the PostgreSQL ARK uses Application Intelligence.  The PostgreSQL ARK uses specific logic to monitor, start, stop, and fail over PostgreSQL via knowledge of the specific PostgreSQL startup and shutdown commands, awareness of critical config files like postgresql.conf and pg_hba.conf, and understanding of the data directory layout and lock file behavior.

Intelligent Monitoring and Ordered Failover for PostgreSQL

Additionally, it doesn’t just check that PostgreSQL is running; it also checks whether the database is responding to queries, whether the correct data directory is accessible, and whether there is any corruption in the transaction logs.  It uses dependency tracking to make sure that the resources PostgreSQL depends on are available, such as the virtual IP for client connections and the mounted storage for its data directory.  This ensures that LifeKeeper can bring up resources in the correct order in case of a failover: mounting the disk first, bringing up the IP, and then starting PostgreSQL before verifying service health.
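
As a hedged illustration of this kind of deep check, a script using standard PostgreSQL client tools (not the ARK’s internal implementation) might do something like the following; the data directory and port are assumptions:

```bash
#!/bin/bash
# Deep health check: not just "is a postgres process running", but
# "is the database reachable and answering queries from its expected data directory".
PGDATA="${PGDATA:-/var/lib/pgsql/data}"
PGPORT="${PGPORT:-5432}"

# 1. Is the server accepting connections?
pg_isready -p "$PGPORT" -q \
    || { echo "ERROR: PostgreSQL is not accepting connections" >&2; exit 1; }

# 2. Does it answer a trivial query?
psql -p "$PGPORT" -d postgres -Atc 'SELECT 1;' >/dev/null 2>&1 \
    || { echo "ERROR: PostgreSQL connected but did not answer a query" >&2; exit 1; }

# 3. Is the expected data directory present and in use?
[ -f "$PGDATA/postmaster.pid" ] \
    || { echo "ERROR: no postmaster.pid found under $PGDATA" >&2; exit 1; }

echo "PostgreSQL health check passed"
```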

Preventing Split-Brain and Ensuring Data Integrity

Lastly, LifeKeeper uses application intelligence to avoid split-brain scenarios (a phenomenon where more than one node thinks it is the ‘primary’ node) by never starting two active PostgreSQL servers against the same data directory, and it avoids data corruption by not failing over while writes are still in progress. These are examples of the different ways LifeKeeper and the various ARKs implement application intelligence to make the combined product as resilient as possible.

Strengthen Application Resilience with Intelligent High Availability

In summary, LifeKeeper’s built-in application intelligence enables precise, fast, and reliable failover and recovery by understanding how applications behave and what they need to run correctly.

Ensure application resilience and uninterrupted service—request a demo or start your free trial today to experience how SIOS LifeKeeper uses application intelligence to protect your critical workloads.

Author: Cassy Hendricks-Sinke, Principal Software Engineer, Team Lead

Reproduced with permission from SIOS

Filed Under: Clustering Simplified Tagged With: Application availability, High Availability

High Availability & the Cloud: The More You Know

October 25, 2021 by Jason Aw


While researching reasons to migrate to the cloud, you’ve probably learned that the benefits of cloud computing include scalability, reliability, availability, and more. But what, exactly, do those terms mean? Let’s consider high availability (HA), as it is often the ultimate goal of moving to the cloud for many companies.

The idea is to make your products, services, and tools accessible to your customers and employees at any time from anywhere using any device with an internet connection. That means ensuring your critical applications are operational – even through hardware failures, software issues, human errors, and sitewide disasters – at least 99.99% of the time (that’s the definition of high availability).
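
For a sense of scale, that 99.99% target is a small downtime budget: (1 − 0.9999) × 365.25 days × 24 hours × 60 minutes ≈ 52.6 minutes of allowable downtime per year, or roughly 4.4 minutes per month.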

While public cloud providers typically guarantee some level of availability in their service level agreements, those SLAs only apply to the cloud hardware. There are many reasons for application downtime that aren’t covered by SLAs. For this reason, you need to protect these applications with clustering software that will detect issues and reliably move operations to a standby server if necessary. As you plan what and how you will make solutions available in the cloud, remember that it is important that your products and services and cloud infrastructure are scalable, reliable, and available when and where they are needed.

Quick Stats on High Availability in the Cloud in 2021

Now that we’ve defined availability in the cloud context, let’s look at its impact on organizations and businesses. PSA, these statistics may shock you, but don’t fret. We’ve also got some solutions to these pressing and costly issues.

  1. As much as 80% of Enterprise IT will move to the cloud by 2025 (Oracle).
  2. The average cost of IT downtime is between $5,600 and $11,600 per minute (Gartner; Comparitech).
  3. Average IT staffing to employee ratio is 1:27 (Ecityworks).
  4. 22% of downtime is the result of human error (Cloudscene).
  5. In 2020, 54% of enterprises’ cloud-based applications moved from an on-premises environment to the cloud, while 46% were purpose-built for the cloud (Forbes).
  6. 1 in 5 companies don’t have a disaster recovery plan (HBJ).
  7. 70% of companies have suffered a public cloud data breach in the past year (HIPAA).
  8. 48% of businesses store classified information on the cloud (Panda Security).
  9. 96% of businesses experienced an outage in a 3-year period (Comparitech).
  10.  45% of companies reported downtime from hardware failure (PhoenixNAP).

What You Can Do – Stay Informed

If you are interested in learning the fundamentals of availability in the cloud or hearing about the latest developments in application and database protection, join us. The SIOS Cloud Availability Symposium is taking place Wednesday, September 22nd (EMEA) and Thursday, September 23rd (US) in a global virtual conference format for IT professionals focusing on the availability needs of the enterprise IT customer. This event will deliver the information you need on application high availability clustering, disaster recovery, and protecting your applications now and into the future.

Cloud Symposium Speakers & Sessions Posted

We have selected speakers presenting a wide range of sessions supporting availability for multiple areas of the data application stack. Check out the sessions posted and check back for additional presentations to be announced! Learn more

Register Now

Whether you are interested in learning the fundamentals of availability in the cloud or hearing about the latest developments in application and database protection, this event will deliver the information you need on application high availability clustering, disaster recovery, and protecting your applications now and into the future.

Register now for the SIOS Cloud Availability Symposium.

Reproduced from SIOS

Filed Under: Clustering Simplified Tagged With: Application availability, Cloud, cloud migration, disaster recovery, High Availability

Beginning Well is Great, But Maintaining Uptime Takes Vigilance

September 28, 2021 by Jason Aw


Author Isabella Poretsis states, “Starting something can be easy, it is finishing it that is the highest hurdle.” It is great to have a kickoff meeting.  It is invigorating and exciting. Managers and leaders look out at the greenfield with excitement, and optimism is high.  But this moment of kickoff, and even the champagne-popping moment of a successful deployment, are just the beginning. Maintaining uptime requires ongoing vigilance.

High availability and the elusive four nines of uptime for your critical applications and databases aren’t momentary occurrences, but rather, a constant endeavor to end the little foxes that destroy the vineyard.  Staying abreast of threats, up-to-date on the updates, and properly trained and prepared is the work from which your team “is never entitled to take a vacation.”

For those who want to stay vigilant in maintaining uptime, here are five tips:

1. Monitor the Environment 

Very little in enterprise software still follows the “set it and forget it” mindset.  Everything, since the day you uncorked the grand opening champagne to now, has been moving toward a state of decline.  If you aren’t monitoring the servers, workloads, network traffic, and hardware (virtual or physical), you may lose uptime and stability.

2. Perform Maintenance

One thing that I have noticed in more than twenty years of software development and services is that all software comes with updates.  Apply them.  Remember to execute sound maintenance policies, including taking and verifying backups. One tech writer suggested that the only update you regret is the one you failed to make.

3. Learn Continuously

My first introduction to high availability came when, as an intern fresh from the CE-211 lab, I unplugged one end of the Token Ring for a server in our lab.  The administrator was in my face in minutes.  After an earful, he gave me an education.  Ideally, you and your team want to learn without taking down your network, but you absolutely want to keep learning.  Look into paid courses on existing technology, new releases, and emerging infrastructure.  Check your vendors for courses and materials related to your processes, environment, software deployments, and enterprise.  Free courses for many topics also exist if money is an issue.

4. Multiply the learning

In addition to continuous learning, make a plan to multiply the learning.  As VP of Customer Experience at SIOS, I have seen the tremendous difference between teams who share their learning and those who don’t.  Teams that share their learning avoid the knowledge gaps that lead to downtime.  The best way to know that you learned something is to teach it to somebody else. As you learn, share it with team members to reduce the risk of downtime due to error (or, for that matter, vacation).

5. End well . . .before the next beginning

All projects, servers, and software have an ending.  End well.  Decommission correctly.  Begin the next phase, deployment, or software relationship well by closing up loose ends and documenting what went well, what did not, and what to do next.  Treat your existing vendors well; you just may need them again later.  Understand the existing systems and high availability solutions before proceeding with a new deployment.  A proper ending helps you begin again from a better starting place, headed toward a stronger outcome.

Keeping the system highly available is a continuous process.  “Set it and forget it” is a nice catchphrase, but the reality is that uptime takes vigilance, continual monitoring, proper maintenance, and constant learning.

-Cassius Rhue, VP, Customer Experience

Reproduced with permission from SIOS

Filed Under: Clustering Simplified Tagged With: Application availability, clusters, disaster recovery, High Availability

