March 19, 2022 |
Improving Your Cloud Adoption Journey
1. Add high availability to the cloud
In the push to the cloud, many IT and business leaders found themselves rushing to move services and applications into the cloud from data centers that they were closing due to COVID-19. Others rushed to the cloud not because of data center closures, but to deal with the wave of exploding demand from the sudden increase in remote working. For some, the journey to the cloud was so fast that high availability wasn’t included. Now they’ve discovered (the hard way) that applications still crash in the cloud and that unexpected outages and unplanned downtime are still the nemesis of AWS, Azure and GCP, just as they were in their previous data center. The first step in fixing your cloud journey is to add high availability. This will mean several things to your enterprise:
2. Expand for higher availability for disaster recovery
Of course, not everyone made the move to cloud without considering some form of high availability. Some IT teams had the foresight not to leave HA on-premises, but in the rush to cloud moved all of their critical servers to the same cloud Availability Zone. While having some HA protection is better than complete vulnerability, if you’ve only deployed your servers and applications in a single Availability Zone (AZ), now is the time to expand to multi-AZ for your standby cluster node, or even build in disaster recovery by deploying a third node in a different region. SIOS has helped dozens of customers plan multiple-AZ architectures and add disaster recovery solutions.
3. Build your cloud journey team
Overnight, some companies and their IT teams went from being fully on-premises to wrestling with CloudFormation templates, QuickStart Guides, IAM roles, internal load balancers, Overlay IPs, and deciphering what exactly that VM size means. Now is the time to build a team to support the journey to the cloud. This will mean several things:
a. Adding capacity. Unless you were able to pull off a complete lift and shift, you likely have the same staff managing cloud and on-premises applications. Legacy solutions are known for being temperamental and requiring a lot of work to keep them stable and available. To navigate the cloud journey ahead, you’ll need capacity capable of addressing availability requirements, understanding cloud architecture, and plotting the course forward for enterprise needs.
b. Augmenting skills with training. Give your IT team training for the cloud. To manage and plan the course forward, look for ways to augment the IT excellence within your organization with additional training on cloud solutions, architecture, best practices, and trade-offs.
A confidently trained staff will pay dividends not only in increased availability, but also by addressing availability, maintenance, and growth in an economical, scalable, and logical way. Translation: they’ll avoid wasting money as they build out the rest of your cloud infrastructure.
4. Integrate automation and analytics to ensure uptime
As VP of Customer Experience at SIOS Technology Corp., I have worked with several companies who made the move to cloud in 2021 without sacrificing HA, DR or their team. If you took achieving the required number of nines of uptime (99.99%) seriously and treated having a disaster plan as non-negotiable, then it’s time to add the rigor of analytics and additional monitoring. Ensure that your availability solution has application-aware automation and orchestration for recovery in the event of a disaster or unplanned downtime. Add analytics and automation to solidify your solution and take your cloud migration up another notch, from reactive failovers to proactive notification and mitigation of a failure before it occurs. Imagine being notified of underperforming applications, increasing latency, errors, or unresponsive VMs in time to avoid downtime during peak business times. Analytics are also important because they can reveal systems and applications that may have escaped your original availability architecture.
5. Update IT processes and governance
Many things we think of as a failure are rooted in a failure of process. Make sure that your organization’s processes are up to date, well-documented, properly communicated, and adhered to. These processes should cover a few key minimums (who, what, when, where, and how), all tied back to the business strategies, goals, and organizational needs as they pertain to the customer. Make sure that ownership and sign-off processes for your new cloud environment are well-documented.
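As an aside on the uptime target from step 4, it can help to translate a "number of nines" into the downtime it actually allows. The short Python sketch below does only that arithmetic. The function name `downtime_budget_minutes` and the use of a 365-day year are assumptions made for this illustration, not part of any SIOS product or tool.

```python
# Illustrative only: convert an availability target into a yearly downtime budget.
# Assumes a 365-day (non-leap) year; the function name is hypothetical.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the maximum minutes of downtime per year for an uptime target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = downtime_budget_minutes(target)
        print(f"{target}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, four nines (99.99%) leaves a budget of roughly 52.6 minutes of downtime per year, which is why monitoring, automation, and clear processes matter so much at that level.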
I have seen firsthand the frustration that comes from conflicting, clashing, or unresolved roles and responsibilities for customers who have moved from hardware teams that acquire infrastructure to cloud teams. Muddling through a migration is one set of pain points; digging out of a disaster without clear governance is a much bigger, more costly issue.
If you’ve made the leap to cloud, staying there and making it work for you is the next part of the journey. If your cloud journey was sudden or rocky, consider these five points for improving it, and know that SIOS Technology can help you improve not only your high availability in the cloud, but also your processes for running in the cloud.
-Cassius Rhue, VP, Customer Experience
Reproduced with permission from SIOS |
March 14, 2022 |
How To Move A Mirror from One Network to Another
Greg Tucker, SIOS Senior Product (Windows) Support Engineer, provides an 8-minute tutorial demonstrating how to move a mirror on a two-node system.
Upon completion of the tutorial, Greg shares SIOS Support contact information for further questions or inquiries.
Reproduced with permission from SIOS |
March 8, 2022 |
Highly Available or Highly Vulnerable? A Checklist for High Availability
It’s no secret that businesses of all sizes have an ever-growing need for IT systems. But IT systems are only effective for these businesses and their clients if they are operational, resilient and highly available. As enterprises look to build out their availability, having a baseline for weighing and assessing vulnerability can be the difference that produces a successful merger of infrastructure, software, services and support. Sometimes, the most basic of checklists can help you sort out whether your solution is highly available or highly vulnerable.
Does your organization have the proper infrastructure to support high availability?
Some organizations deploy software but have instability within the network infrastructure, servers, and the datacenter itself. Cloud addresses a lot of the infrastructure issues, but not all cloud platforms are architected the same. Be sure to understand your datacenter, whether on-premises or in the cloud.
Does your organization have a runbook (or playbook) in place that covers design, architecture, and process?
If you answered, “What is a runbook or playbook?” then your first step is to find or create one. A runbook (or playbook) helps your organization maintain systems and processes with respect to the highly available system architecture. Some companies use automated tools to create scripts that deploy and configure servers; others use a version-controlled document to outline how all things work together to provide resilience and success. Your team needs a place where newcomers and existing team members can go to understand the environment, the process, and the tools being used.
Does your organization have resources dedicated to maintaining high availability best practices?
“I didn’t set these systems up,” the IT Admin stated. “I just inherited these systems with some other servers.” The lament was honest, and it describes a phenomenon often observed in organizations. Whether the cause is mergers and acquisitions, cost reductions, outsourcing, or general staff turnover, a key component of a highly available enterprise is sufficient staffing. A hallmark of a highly vulnerable enterprise is insufficient, undertrained, or undersupported staff.
Does your organization have proper change management controls in place?
Change management is important. Change management controls and policies are an absolute must in reducing risk and making sure that your systems are available. A user without proper restraints can add packages or updates that destroy stability, or make changes that disrupt the organization for hours. In addition, not having a defined policy often creates drift between what is expected (documented) and what is actually in place. Change management is also critical to ensure that your standby cluster is at the same patch and software levels as the primary/source system, and that QA (or Pre-Production) is not grossly deviating from Production.
Does your organization have proper access controls in place?
Our Services team joined a customer call and waited, and waited, and waited for the administrator with permissions to run a set of elevated commands to join the session to configure and update their software. Weeks later, our team joined a different customer call and watched in horror as multiple users, all with administrative privileges, ran a bevy of commands on the same cluster. The difference between the two calls pointed out with stunning clarity that access controls are important. A highly available enterprise needs to ensure that proper access controls are in place to prevent users from running elevated commands that could damage the configuration or diminish its operation. Be sure that users have limits on what they can do based on their roles, needs, and even experience.
Does your company have a regular test process?
Testing takes time, but in my role of assisting customers with their cloud migrations and high availability deployments, the time has always been well spent. Often, the difference between the highly available and the highly vulnerable comes down to the customer’s or partner’s test process. As solutions become more complex, testing and validation are becoming more and more essential to reducing risk and vulnerabilities. If everything goes straight from design to production, you’re running a highly vulnerable system. But if you’ve got tests and checkpoints, a process to verify changes before they make it into production, your risks are significantly reduced. In my time as VP of Customer Experience, our services team worked with a banner customer who deployed their systems for an entire year in QA before completing their go-live migration. Over that year they simulated outages, disasters, customer loads, downtime, maintenance, patching strategies, backups, recovery from backup, and a bevy of other test suites. Consequently, they’ve had remarkable results in performance, process adherence, high availability, and enterprise success.
While no checklist will be able to cover every potential vulnerability in high availability, answering these questions will give you a strong foundation for understanding whether your enterprise is highly available or highly vulnerable.
Reproduced with permission from SIOS |
March 3, 2022 |
Disney’s Encanto – Lessons on High Availability, IT Teams & downtime
Lessons on High Availability, IT Teams, and defeating downtime from Disney’s Encanto
Over the weekend I joined the masses of people who have tuned in to Disney’s Encanto and became a fan of the story, a student of its lessons and opportunities, and an absolute fan of Lin-Manuel Miranda. What does Disney’s Encanto offer in relation to High Availability, Clustering, and Resiliency?
In Encanto you quickly learn that the Family Madrigal is a special family. In one of the opening songs, “The Family Madrigal,” we learn that the members of the family have unique and special gifts: superhuman strength, the ability to hear for miles, prophecy and prediction, the power to conjure beautiful flowers and plants, the ability to shape-shift, the ability to heal, and the ability to control the weather. Everyone, it seems, has a ‘gift’ except Mirabel.
Lesson 1: You don’t need superhuman gifts to make a difference.
Mirabel, while not gifted like her siblings and the other members of the family, is the central figure in understanding the health and disease of the family. Moreover, she is able to help the family put things back together when it all falls apart, without the other gifts. You need High Availability, but you don’t have to break the budget, develop supernatural abilities, or depend on a miracle to achieve it.
As the movie continues, Pepa’s youngest son Antonio is readied for his gift ceremony. However, during the party and celebration Mirabel notices cracks in the foundation of Casita. But her warnings go unheeded.
Lesson 2: Don’t ignore the cracks.
When Mirabel sees the cracks it leads her on a quest to find out what is endangering Casita and how she can help. Initially, she is ignored by the others and even rebuked.
How will you respond if you see cracks or shortcomings in your IT infrastructure, or cracks in your architecture and design? Will you ignore the cracks, pretend you haven’t seen them, or even rebuke the team for finding them? Don’t ignore the cracks. Responding to the first sign of an issue is most often the best way to prevent a greater one.
As Mirabel continues her quest to find answers and save the miracle’s magic, Dolores tells her to talk to her super-strong older sister, Luisa, who initially insists that everything is okay and that there is absolutely nothing wrong. But Luisa eventually begins to reveal that the weight of knowing there is an issue is becoming too much for her to carry alone.
Lesson 3: The weight of HA is too big for a single person or team.
As Luisa put it, “It is pressure that breaks the camel’s back, pressure that’ll never stop.” Developing a High Availability solution and designing and architecting for resilience and data availability is not a simple process, and it is definitely not a task for a single person or single team. Your DBA, IT Admin, and ERP Administrators cannot handle the weight of maintaining critical enterprise availability alone. Likewise, a one-dimensional approach cannot carry the weight of four (4) nines of availability. Instead, it takes a fully aligned team working in concert with a complete HA solution to understand, design, develop, and deploy the tools and techniques. How well are the roles and responsibilities on your IT teams distributed and defined? Ensure no one is bearing the responsibility for HA alone.
When Mirabel seeks Bruno for the answers she is looking for, everyone says, “We Don’t Talk About Bruno.” Bruno’s gift is precognition, but because of his warnings and seemingly negative visions, he disappeared.
Lesson 4: Don’t be afraid of the person who sees trouble ahead.
As VP of Customer Experience, I’ve helped customers perform health assessments for their infrastructure and clustering solutions.
When the health check completes, not all customers are happy to hear that they have issues to resolve. We all do all we can to avoid the bad news. But ignoring upgrades, forgetting to do maintenance, and downplaying risks identified by the Bruno of your team will not make the trouble disappear. In fact, it may make your worst fears a reality.
Mirabel eventually finds a secret passage leading to Bruno and discovers that Bruno never left, but felt that he had to destroy his vision of her to protect her and himself.
Lesson 5: Corporate culture can crush or create higher availability
Your culture can either crush or create a space for higher availability and resiliency.
Mirabel asks Bruno if he has been patching the cracks in Casita, but Bruno replies that he is afraid of the cracks.
Lesson 6: Don’t be afraid of the cracks
HA requires continuous, coordinated, ongoing effort. An essential part of the effort is finding solutions and fixes for those IT cracks that could jeopardize your application, or for the gaps between architecture and execution.
Even as Bruno (or Hernando) tries to patch the cracks, it is apparent that the foundational issues are too much for spackle and superficial solutions.
Lesson 7: Spackle won’t fix a foundational problem
Take a look at your infrastructure and at the ways in which problems are being addressed. Are you deploying workarounds, band-aids, and temporary “hacks”, or are you looking at architectural and foundational solutions that address the root cause of the problem with your clusters, enterprise availability, and execution during disasters?
Lesson 8: Find your Jorge
If you’ve been deploying more hacks and workarounds than root cause solutions, find your Jorge. Find a skilled team member, partner, or solution provider and give them permission to grapple with implementing the foundational solution that will fix the problem or strengthen the infrastructure.
Bruno sees another vision, in which Casita could be saved if Mirabel hugged Isabela.
Mirabel offers Isabela an opportunity to blossom, but Abuela doesn’t see it that way. An argument between Mirabel and Abuela ensues, and Abuela blames Mirabel for the cracks in ‘Casita’. Mirabel blames Abuela for her impossible demands, unrealistic expectations, and misplaced hopes.
Lesson 9: Blame creates more problems
Pass the Blame is a great party game, but it is not great for HA, cluster resilience, or data protection. I once helped a customer whose organization illustrated the unproductiveness of blame. After a proof-of-concept cluster hit an issue causing a delay, the Project Manager blamed the application team for the delay. The application team blamed the backup administrator, who in turn blamed the infrastructure admin. Throughout the blaming session, their cluster remained unavailable, the proof of concept remained stalled, and the only progress being made was in the cracks of anger growing between teams. It was only when they put these differences aside that they could make the adjustments they needed to resolve their issue and continue with a successful POC.
‘Casita’ collapses and Mirabel runs away. Later, Alma finds Mirabel, and after reconciling they join the family and village in building Casita back better than ever.
Lesson 10: Build it back stronger
Of course, the final scenes of Encanto are filled with lessons in the confession of Alma (Abuela) such as:
But the most important of the final lessons is to build back better, stronger, and together. After every unplanned or planned outage, there will be lessons learned from root cause analysis, experience, and fresh understanding. As a result, there will also be an opportunity to build back a stronger solution and architecture for your high availability and disaster recovery. Consider the case of a customer who was able to create a standard deployment pipeline and QA system after discovering that an outage was caused by code deployed directly to production. Or another customer who uncovered that disk and database warnings had been suppressed for weeks before the outage. Don’t waste the time and opportunity that comes when you have downtime. Be sure to work together to avoid silos, dependencies on single strengths, or placing the hope of your infrastructure on the wrong thing. Of course, you should watch the whole movie for yourself; there are even more lessons for HA as you walk through the magic and music of the movie and pick up on the lives and lessons of a few of the other characters.
The movie closes with a great reunion, as Mirabel and the Madrigals stand in front of the finished house. When Mirabel touches the doorknob, ‘Casita’ springs back to life, and the home and the magical gifts of the family all return. Try these ten lessons for High Availability from Encanto, enjoy the movie, and remember “There is nothing you can’t do… together” with your team of customers, partners, solution providers, and administrators. |
February 27, 2022 |
How To Activate a License for SIOS Protection Suite for Linux
Now that you have acquired your SIOS Protection Suite for Linux software, you will need to activate your license. This seven-minute video will help you get started. It walks you through all of the steps needed to begin running your SIOS Protection Suite for Linux software. Watch as a SIOS support representative demonstrates the steps that are necessary to install SIOS licenses: how to insert entitlement/activation IDs, how to obtain and insert host IDs, and how to download the activation file. The video illustrates where to access software for download, how to view and validate host name and ID from purchased or trial entitlements, and how to download the activation files contained in your welcome email to complete the process. You will also learn how to access the SIOS Documentation portal, where you can find release notes, installation guides, technical documentation and in-depth information on SIOS Protection Suite for Linux, as well as a wide range of topics for every SIOS product. Receive helpful tips and convenient insights on how to complete the steps quickly and easily. See how simple it is to start running SIOS Protection Suite for Linux.
Reproduced with permission from SIOS |