Azure Archives - Page 2 of 18 - SIOS SANless clusters

How to use Azure Site Recovery (ASR) to replicate a Windows Server Failover Cluster (WSFC) that uses SIOS DataKeeper for cluster storage

October 14, 2022 by Jason Aw Leave a Comment

How to use Azure Site Recovery (ASR) to replicate a Windows Server Failover Cluster (WSFC) that uses SIOS DataKeeper for cluster storage

Intro

So you have built a SQL Server Failover Cluster Instance (FCI), or maybe an SAP ASCS/ERS cluster in Azure. Each node of the cluster resides in a different Availability Zone (AZ), or maybe you have strict latency requirements and are using Placement Proximity Groups (PPG) and your nodes all reside in the same Availability Set. Regardless of the scenario, you now have a much higher level of availability for your business critical application than if you were just running a single instance.

Now that you have high availability (HA) covered, what are you going to do for disaster recovery? Regional disasters that take out multiple AZs are rare, but as recent history has shown us, Mother Nature can really pack a punch. You want to be prepared should an entire region go offline.

Azure Site Recovery (ASR) is Microsoft’s disaster recovery-as-a-service (DRaaS) offering that allows you to replicate entire VMs from one region to another. It can also replicate virtual machines and physical servers from on-prem into Azure, but for the purpose of this blog post we will focus on the Azure Region-to-Region DR capabilities.

Setting up Azure Site Recovery

We are going to assume you have already built your cluster using SIOS DataKeeper. If not, here are some pointers to help get you started.

Failover Cluster Instances with SQL Server on Azure VMs

SIOS DataKeeper Cluster Edition for the SAP ASCS/SCS cluster share disk

We are also going to assume you are familiar with Azure Site Recovery. Instead of yet another guide on setting up ASR, I suggest you read the latest documentation from Microsoft. This article will focus instead on some things you may not have considered and the specific steps required to fix your cluster after a failover to a different subnet.

Paired Regions

Before you start down the DR path, you should be aware of the concept of Azure Paired Regions. Every Region in Azure has a preferred DR Region. If you want to learn more about Paired Regions, the documentation provides a great background. There are some really good benefits of using your paired region, but it’s really up to you to decide on what region you want to use to host your DR site.

Cloud Witness Location

When you originally built your cluster you had to choose a witness type for your quorum. You may have selected a File Share Witness or a Cloud Witness. Typically either of those witness types should reside in an AZ that is separate from your cluster nodes.

However, when you consider that, in the event of a disaster, your entire cluster will be running in your DR region, there is a better option. You should use a cloud witness, and place it in your DR region. By placing your cloud witness in your DR region, you provide resiliency not only for local AZ failures, but it also protects you should the entire region fail and you have to use ASR to recover your cluster in the DR region. Through the magic of Dynamic Quorum and Dynamic Witness, you can be sure that even if your DR region goes offline temporarily, it will not impact your production cluster.

Multi-VM Consistency

When using ASR to replicate a cluster, it is important to enable Multi-VM Consistency to ensure that each cluster node’s recovery point is from the same point in time. That ensures that the DataKeeper block level replication occurring between the VMs will be able to continue after recovery without requiring a complete resync.

Crash Consistent Recovery Points

Application consistent recovery points are not supported in replicated clusters. When configuring the ASR replication options do not enable application consistent recovery points.

Keep IP Address After Failover?

When using ASR to replicate to your DR site there is a way to keep the IP address of the VMs the same. Microsoft described it in the article entitled Retain IP addresses during failover. If you can keep the IP address the same after failover it will simplify the recovery process since you won’t have to fix any cluster IP addresses or DataKeeper mirror endpoints, which are based on IP addresses.

However, in my experience, I have never seen anyone actually follow the guidance above, so recovering a cluster in a different subnet will require a few additional steps after recovery before you can bring the cluster online.

Your First Failover Attempt

Recovery Plan

Because you are using Multi-VM Consistency, you have to failover your VMs using a Recovery Plan. The documentation provides pretty straightforward guidance on how to do that. A Recovery Plan groups the VMs you want to recover together to ensure they all failover together. You can even add multiple groups of VMs to the same Recovery Plan to ensure that your entire infrastructure fails over in an orderly fashion.

A Recovery Plan can also launch post recovery scripts to help the failover complete the recovery successfully. The steps I describe below can all be scripted out as part of your Recovery Plan, thereby fully automating the complete recovery process. We will not be covering that process in this blog post, but Microsoft documents this process.

Static IP Addresses

As part of the recovery process you want to make sure the new VMs have static IP addresses. You will have to adjust the interface properties in the Azure Portal so that the VM always uses the same address. If you want to add a public IP address to the interface you should do so at this time as well.

Network Configuration

After the replicated VMs are successfully recovered in the DR site, the first thing you want to do is verify basic connectivity. Is the IP configuration correct? Are the instances using the right DNS server? Is name resolution functioning correctly? Can you ping the remote servers?

If there are any problems with network communications then the rest of the steps described below will be bound to fail. Don’t skip this step!

Load Balancer

As you probably know, clusters in Azure require you to configure a load balancer for client connectivity to work. The load balancer does not fail over as part of the Recovery Plan. You need to build a new load balancer based on the cluster that now resides in this new vNet. You can do this manually or script this as part of your Recovery Plan to happen automatically.

Network Security Groups

Running in this new subnet also means that you have to specify what Network Security Group you want to apply to these instances. You have to make sure the instances are able to communicate across the required ports. Again, you can do this manually, but it would be better to script this as part of your Recovery Plan.

Fix the IP Cluster Addresses

If you are unable to make the changes described earlier to recover your instances in the same subnet, you will have to complete the following steps to update your cluster IP addresses and the DataKeeper addresses for use in the new subnet.

Every cluster has a core cluster IP address. What you will see if you launch the WSFC UI after a failover is that the cluster won’t be able to connect. This is because the IP address used by the cluster is not valid in the new subnet.

If you open the properties of that IP Address resource you can change the IP address to something that works in the new subnet. Make sure to update the Network and Subnet Mask as well.

Once you fix that IP Address you will have to do the same thing for any other cluster address that you use in your cluster resources.

Fix the DataKeeper Mirror Addresses

SIOS DataKeeper mirrors use IP addresses as mirror endpoints. These are stored in the mirror and mirror job. If you recover a DataKeeper based cluster in a different subnet, you will see that the mirror comes up in a Resync Pending state. You will also notice that the Source IP and the Target IP reflect the original subnet, not the subnet of the DR site.

Fixing this issue involves running a command from SIOS called CHANGEMIRRORENDPOINTS. The usage for CHANGEMIRRORENDPOINTS is as follows.

emcmd <NEW source IP> CHANGEMIRRORENDPOINTS <volume letter> <ORIGINAL target IP> <NEW source IP> <NEW target IP>

In our example, the command and output looked like this.

After the command runs, the DataKeeper GUI will be updated to reflect the new IP addresses as shown below. The mirror will also go to a mirroring state.

Conclusions

You have now successfully configured and tested disaster recovery of your business critical applications using a combination of SIOS DataKeeper for high availability and Azure Site Recovery for disaster recovery. If you have questions, or would like to consult with SIOS to help you design and implement high availability and disaster recovery for your business critical applications like SQL Server, SAP ASCS and ERS, SAP HANA, Oracle or other business critical applications, please reach out to us.

Reproduced with permission from SIOS

Multi-Cloud Disaster Recovery

October 30, 2021 by Jason Aw Leave a Comment

Multi-Cloud Disaster Recovery

If this topic sounds confusing, we get it. With our experts’ advice, we hope to temper your apprehensions – while also raising some important considerations for your organisation before or after going multi-cloud. Planning for disaster recovery is a common point of confusion for companies employing cloud computing, especially when it involves multiple cloud providers.

It’s taxing enough to ensure data protection and disaster recovery (DR) when all data is located on-premises. But today many companies have data on-premises as well as with multiple cloud providers, a hybrid strategy that may make good business sense but can create challenges for those tasked with data protection. Before we delve into the details, let’s define the key terms.

What is multi-cloud?

Multi-cloud is the utilization of two or more cloud providers to serve an organization’s IT services and infrastructure. A multi-cloud approach typically consists of a combination of major public cloud providers, namely Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure.

Organizations choose the best services from each cloud provider based on costs, technical requirements, geographic availability, and other factors. This may mean that a company uses Google Cloud for development/test, while using AWS for disaster recovery, and Microsoft Azure to process business analytics data.

Multi-cloud differs from hybrid cloud which refers to computing environments that mix on-premises infrastructure, private cloud services, and a public cloud.

Who uses multiple clouds?

Regulated industries – Many organizations run different business operations in different cloud environments. This may be a deliberate strategy of optimizing their IT environments based on the strengths of individual cloud providers or simply the product of a decentralized IT organization.

Media and Entertainment – Today’s media and entertainment landscape is increasingly composed of relatively small and specialized studios that meet the swelling content-production needs of the largest players, like Netflix and Hulu. Multi-cloud solutions enable these teams to work together on the same projects, access their preferred production tools from various public clouds, and streamline approvals without the delays associated with moving large media files from one site to another.

Transportation and Autonomous Driving – Connected car and autonomous driving projects generate immense amounts of data from a variety of sensors. Car manufacturers, public transportation agencies, and rideshare companies are among those motivated to take advantage of multi-cloud innovation, blending both accessibility of data across multiple clouds without the risks of significant egress charges and slow transfers, while maintaining the freedom to leverage the optimal public cloud services for each project.

Energy Sector – Multi-cloud adoption can help lower the significant costs associated with finding and drilling for resources. Engineers and data scientists can use machine learning (ML) analytics to identify places that merit more resources to prospect for oil, to gauge environmental risks of new projects, and to improve safety.

Multi-cloud disaster recovery pain points:

Not reading before you sign. Customers may face issues if they fail to read the fine print in their cloud agreements. The cloud provider is responsible for its computer infrastructure, but customers are responsible for protecting their applications and data. There are many reasons for application downtime that are not covered under cloud SLAs. Business critical workloads need high availability and disaster recovery protection software as well.

Developing a centralized protection policy. A centralized protection policy must be created to cover all data, no matter where it lives. Each cloud provider has its unique way of accessing, creating, moving and storing data, with different storage tiers. It can be cumbersome to create a disaster recovery plan that covers data across different clouds.

Reporting. This is important for ensuring protection of data in accordance with the service-level agreements that govern it. Given how quickly users can spin up cloud resources, it can be challenging to make sure you’re protecting each resource appropriately and identifying all data that needs to be incorporated into your DR plan.

Test your DR plan. Customers must fully screen and test their DR strategy. A multi cloud strategy compounds the need for testing. Some providers may charge customers for testing, which reinforces the need to read the fine print of the contract.

Resource skill sets. Finding an expert in one cloud can be challenging; with multi-cloud you will either need to find expertise in each cloud, or the rare individual with significance in multiple clouds.

Overcoming the multi-cloud DR challenge

Meeting these challenges requires companies to develop a data protection and recovery strategy that covers numerous issues. Try asking yourself the following strategic questions:

Have you defined the level of criticality for all applications and data? How much money will a few minutes of downtime for critical applications cost your organization in end user productivity, customer satisfaction, and IT labor?
Will data protection and recovery be handled by IT or application owners and creators in a self-service model?
Did you plan for data optimization, using a variety of cloud- and premises-based options?
How do you plan to recover data? Restoring data to cloud-based virtual machines or using a backup image as the source of recovery?

Obtain the right multi-cloud DR solution

The biggest key to success in data protection and recovery in a multi-cloud scenario is ensuring you have visibility into all of your data, no matter how it’s stored. Tools from companies enable you to define which data and applications should be recovered in a disaster scenario and how to do it – whether from a backup image or by moving data to a newly created VM in the cloud, for example.

The tool should help you orchestrate the recovery scenario and, importantly, test it. If the tool is well integrated with your data backup tool, it can also allow you to use backups as a source of recovery data, even if the data is stored in different locations – like multiple clouds. Our most recent SIOS webinar discusses this same point; watch it here if you’re interested. SIOS Datakeeper lets you run your business-critical applications in a flexible, scalable cloud environment, such as Amazon Web Services (AWS), Azure, and Google Cloud Platform without sacrificing performance, high availability or disaster protection. SIOS DataKeeper is available in the AWS Marketplace and the only Azure certified high availability software for WSFC offered in the Azure Marketplace.

12 Questions to Uncomplicate Your Cloud Migration

September 10, 2021 by Jason Aw Leave a Comment

12 Questions to Uncomplicate Your Cloud Migration

Cloud migration best practices

The “cloud is becoming more complicated,” it was the first statement in an hour-long webinar detailing the changes and opportunities with the boom in cloud computing and cloud migration. The presenter continued with an outline of cloud related things that traditional IT is now facing in their journey to AWS, Azure, GCP or other providers.

There were nine areas that surfaced as complications in the traditional transition to cloud:

Definitions
Pricing
Networking
Security
Users, Roles, and Profiles
Applications and Licensing
Services and Support
Availability
Backups

As VP of Customer Experience for SIOS Technology Corp I’ve seen how the following areas can impact a transition to cloud. To mitigate these complications, consumers are turning to managed service providers, cloud solution architects, contractors and consultants, and a bevy of related services, guides, blog posts and related articles. Often in the process of turning to outside or outsourced resources the complications to cloud are not entirely removed. Instead, companies and the teams they have employed to assist or to transition them to cloud still encounter roadblocks, speed bumps, hiccups and setbacks.

Most often these complications and slowdowns in migrating to the cloud come from twelve unanswered questions:

What are our goals for moving to the cloud?
What is your current on-premise architecture? Do you have a document, list, flow chart, or cookbook?
Are all of your application, database, availability and related vendors supported on your target cloud provider platform?
What are your current on-premises risks and limitations? What applications are unprotected, what are the most common issues faced on-premises?
Who is responsible for the cloud architecture and design? How will this architecture and design account for your current definitions and the definitions of the cloud provider?
Who are the key stakeholders, and what are their milestones, business drivers, and deadlines for the business project?
Have you shared your project plan and milestones with your vendors?
What are the current processes, governance, and business requirements?
What is the migration budget and does it include staff augmentation, training, and services? What are your estimates for ongoing maintenance, licensing, and operating expenses?
What are your team’s existing skills and responsibilities?
Who will be responsible for updating governance, processes, new cloud models, and the various traditional roles and responsibilities?
What are the applications, services, or functions that will move from IaaS to SaaS models?

Know Your Goals for the Cloud

So, how will answering these twelve questions will improve your cloud migration. As you can see from the questions, understanding your goals for the cloud is the first, and most important step. It is nearly universally accepted that “a cloud service provider such as AWS, Azure, or Google can provide the servers, storage, and communications resources that a particular application will require,” but for many customers, this only eliminates “he need for computer hardware and personnel to manage that hardware.” Because of this fact, often customers are focused on equipment or data center consolidation or reduction, without considering that there are additional cloud opportunities and gaps that they still need to consider. For example, cloud does eliminate management of hardware, but it “does not eliminate all the needs that an application and its dependencies will have for monitoring and recovery,” so if your goal was to get all your availability from the cloud, you may not reach that goal, or it may require more than just moving on premises to an IaaS model. Knowing your goals will go a long way in helping you map out your cloud journey.

Know Your Current On-Premises Architecture

A second critical category of questions needed for a proper migration to the cloud, (or any new platform) is understanding the current on-premises architecture. This step not only helps with the identification of your critical applications that need availability, but also their underlying dependencies, and any changes required for those applications, databases, and backup solutions based on the storage, networking, and compute changes of the cloud. Answering this question is also a key step in assessing the readiness of your applications and solutions for the cloud and quantifying your current risks.

A third area that will greatly benefit from working through these questions occurs when you discuss and quantify current limitations. Frequently, we see this phase of discovery opening the door to limitations of current solutions that do not exist in the cloud. For example, recently our services team worked with a customer impacted by performance issues in their SQL database cluster. A SIOS expert assisting with their migration inquired about the solution and architecture, and VM sizing decisions. After a few moments, a larger more application sized instance was deployed correcting limitations that the customer had accepted due to their on-premise restrictions on compute, memory, and storage. Similarly we have worked with customers who were storage sensitive. They would run applications with smaller disks and a frequent resizing policy, due to disk capacity constraints. While storage costs should be considered, running with minimal margins can become a limitation of the past.

Understand Business and Governance Changes

The final group of questions help your team understand schedules, business impacts, deadlines, and governance changes that need to be updated or replaced because they may no longer apply in the cloud. Migrating to the cloud can be a smooth transition and journey. However, failing to assess where you are on the journey and when you need to complete the journey can make it into a nightmare. Understanding timing is important and can be keenly aided by considering stakeholders, application vendors, business milestones, and business seasons. Selfishly, SIOS Technology Corp. wants customers to understand their milestones because as a Service provider it minimizes the surprises. But, we also encourage customers to answer these questions as they often uncover misalignment between departments and stakeholders. The DBAs believes that the cutover will happen on the last weekend of the month, but Finance is intent on closing the books over the final weekend of the same month; or the IT team believes that cutover can happen on Monday, but the applications team is unavailable until Wednesday, and perhaps most importantly the legal team hasn’t combed through the list of new NDAs, agreements, licensing, and governance changes necessary to pull it all together.

As customers work through the questions, with safety and empathy, what often emerges is a puzzle of pieces, ownership, processes, and decision makers that needs to be put back together using the cloud provider box top and honest conversations on budget, staffing, training, and services. The end result may not be a flawless migration, but it will definitely be a successful migration.

For help with your cloud migration strategy and high availability implementation, contact SIOS Technology Corp.

– Cassius Rhue, VP, Customer Experience

Learn more about common cloud migration challenges.

Read about some misconceptions about availability in the cloud.

Reproduced from SIOS

Cloud Migration Best Practices for High Availability

March 25, 2021 by Jason Aw Leave a Comment

Cloud Migration Best Practices for High Availability

In 2020 we have seen more enterprises migrating more of their mission-critical applications, ERPs and databases to the cloud. However, not all of these migrations have been smooth. I have personally witnessed cloud migration projects dramatically slowed and even stopped due to a lack of planning for application availability, the complexity of retrofitting ‘DIY High Availability’, misunderstanding related to what a ‘lift and shift’ entails and unexpected costs.

There are a number of best practices, cloud checklists, and other ways for organizations to prepare for the cloud. The following best practices should be factored into every migration strategy for high availability clustering for those who have either hit pause on their 2020 cloud migration, or plan to forge ahead in 2021.

Cloud Migration Best Practices

Gather the requirements

Many organizations moving to the cloud think that the cloud is an on-premises architecture moved to the cloud. This misunderstanding in cloud migration often leads to stalls and delays when networking, storage, disk speeds, and system sizes for on-premises collide with the cloud reality. A smoother transition to cloud begins by gathering the real requirements for the infrastructure, governance and compliance, security, sizing, and related controls and resources.

Design and Document

In the design phase, the architecture of on-premises environments is mapped to the cloud environment that has been chosen for maximum availability and thoroughly documented. In this phase, as the architecture takes shape and you identify the strategy for IPs, load balancers, IOPS, and data availability. Teams need to look at how availability native to the cloud needs to be augmented with a robust application and infrastructure availability solution capable of automating complexities of the cloud. At SIOS, our experts in AWS and Azure clustering and availability work with customers to swap on-premises NFS for AWS EFS, Azure ANF, or a standalone NFS cluster tier. Additionally, a key part of the successful implementation in this phase will be documenting everything. Documentation is an often-neglected, but essential element of migration success.

Plan for High Availability

Achieving high availability in the cloud requires understanding the requirements, creating the design, and documenting a plan that lays out a strategy for achieving those requirements. A basic plan should include staffing, staff training, deploying a QA system testing, pre-production steps, deployment, post deployment validation, and on-going iterations. The best outcomes for cloud migration arise from a deliberate, planned process; not an ad hoc, break-fix approach.

Staff

How well is your team staffed for the cloud migration? Traditional help desk, client/server IT, or IT teams may not be enough for the cloud migration. If your team is new to the cloud, it may be time to consider adding more resources or professional services-based solutions. Migrating to the cloud can be taxing, tedious, and difficult without the proper insight, information, or training. Does your staff need to incorporate training related to the cloud environment? And while you are looking into training and professional services to assist your IT team, check with your vendor for training related to the availability solution. Many vendors provide flexible training for the HA solution and cloud training can be obtained with the cloud vendors or popular sites such as Udemy.

Deploy QA

The QA deployment phase is the phase in which the team executes the plans for deploying the actual systems into the cloud. Successful deployment teams validate their plans and strategy, understand the data migration process, uncover any missing dependencies, and prepare for the next step in the process, especially testing. When this step is skipped or skimped on, the once-promising migrations often stall or fail. When you reach the QA system deployment phase, your team will do the heavy lifting of the initial migration and configuration of the applications, databases, and critical data in the cloud.

Test Your High Availability

Testing in your QA environment is a critical step. These tests are not a waste of time; they are a time saver. Deploying environments in the cloud is often easier than deploying on-premises. Your QA environment can be scripted with tools like Ansible, deployed quickly as templates from the cloud marketplace or a cloned image, or deployed and built from cloud formation templates. Once deployed, disaster scenarios can be ironed out and optimized before a disaster, not in them. Test scenarios can be leveraged to identify overprovisioning, under-provisioning or bottlenecks with networking or disk speeds. A full test scenario can also be used as a part of an on-boarding strategy for new staff. Additionally, testing should be performed on snapshots and backups as well.

Deploy Production

When the testing phase completes, and your team has validated the test results, the next phase is to move from QA to pre-production, and from pre-production to go-live. The testing phase is the last phase of the heavy lifting involving final user acceptance testing, a final cutover and update of the production data, and then the users.

Review, Revise, and Repeat

A successful migration does not end once you reach the go-live phase, but continues through the lifecycle phases. In the post go-live phase of the cloud migration strategy, your team continues to review, revise, and repeat the steps from ‘Gather’ through ‘Deploy Production’. In fact, your team should repeat this process again and again, based on requirements specific to releases, application updates, security updates, related system maintenance, operating system versions, disaster recovery planning, as well as the requirements from your high availability vendor’s own best practices. The cloud platform is always evolving and adding new features, functionality, and updates that can enhance your existing HA solution and architecture. Reviewing, revising, and repeating the process will be a necessary step in successful onboarding.

In 2021 we’ll see more enterprises migrating more mission-critical applications, ERPs and databases to the cloud. A key major factor in their success will be utilizing cloud migration best practices to avoid delays and failures throughout the process. Understanding your business requirements and needs, documenting the design and plan, deploying in a QA environment with purpose built clustering solutions, and executing extensive testing before go-live will be essential. Contact SIOS Technology to understand how the SIOS Protection Suite can be included in your thoughtful cloud migration best practices.

-Cassius Rhue, VP, Customer Experience

Reproduced from SIOS

Six Reasons Your Cloud Migration Has Stalled

December 22, 2020 by Jason Aw Leave a Comment

Six Reasons Your Cloud Migration Has Stalled

More and more customers are seeking to take advantage of the flexibility, scalability and performance of the cloud. As the number of applications, solutions, customers, and partners making the shift increases, be sure that your migration doesn’t stall.

Avoid the Following Six Reasons Cloud Migrations Stall

1. Incomplete cloud migration project plans

Project planning is widely thought to be a key contributor to project success. The planning plays an essential role in helping guide stakeholders, diverse implementation teams, and partners through the project phases. Planning helps identify desired goals, align resources and teams to those goals, reduce risks, avoid missed deadlines, and ultimately deliver a highly available solution in the cloud. Incomplete plans and incomplete planning are often a big cause of stalled projects. At the ninth hour a key dependency is identified. During an unexpected server reboot an application monitoring and HA hole is identified (see below). Be sure that your cloud migration has a plan, and work the plan.

2. Over-engineering on-premises

“This is how we did it on our on-premises nodes,” was the phrase that started a recent customer conversation. The customer engaged with Edmond Melkomian, Project Manager for SIOS Professional Services, when their attempts to migrate to the cloud stalled. During a discovery session, Edmond was able to uncover a number of over-engineered items related to on-premises versus cloud architecture. For some projects, reproducing what was done on premises can be a resume for bloat, complexity, and delays. Analyze your architecture and migration plans and ruthlessly eliminate over-engineered components and designs, especially with networking and storage.

3. Under-provisioning

Controlling cost and preventing sprawl are an important and critical aspect of cloud migrations. However, some customers, anxious about per hour charges and associated costs for disks and bandwidth fall into the trap of under-provisioning. In this trap, resources are improperly sized, be that disks that have the wrong speed characteristics, compute resources with the wrong CPU or memory footprint, or clusters with the wrong number of nodes. In such under-provisioned cases, issues arise when User Acceptance Test (UAT) begins and expected/anticipated workloads create a log jam on undersized resources. Or a cost optimization of a target node is unable to properly handle resources in a failover scenario. While resizing virtual machines in the cloud is a simple process, these sizing issues often create delays while architects and Chief Financial Officers try to understand the impact of re-provisioning resources.

4. Internal IT processes

Every great enterprise company has a set of internal processes, and chances are your team and company are no exception. IT processes are usually key among the processes that can have a large impact on the success of your cloud migration strategy. In the past, many companies had long requisition and acquisition processes, including bids, sizing guides, order approvals, server prep and configuration, and final deployment. The cloud process has dramatically altered the way compute, storage, and network resources, among others, are acquired and deployed. However, if your processes haven’t kept up with the speed of the cloud your migration may hit a snag when plans change.

5. Poor High Availability planning

Another reason that cloud migrations can stall involves high availability planning. High availability requires more than a bundle of tools or enterprise licenses. HA requires a careful, thorough and thoughtful system design. When deploying an HA solution your plan will need to consider capacity, redundancy, and the requirements for recovery and correction. With a plan, requirements are properly identified, solutions proposed, risks thought through, and dependencies for deployment and validation managed. Without a plan, the project and deployment are vulnerable to risks, single point of failure issues, poor fit, and missing layers and levels of application protection or recovery strategies. Often when there has been a lack of HA planning, projects stall while the requirements are sorted out.

6. Incomplete or invalid testing

Ron, a partner migrating his end customer to the cloud, planned to go-live over an upcoming three day weekend. The last decision point for ‘go/no-go’ was a batch of user acceptance testing on the staging servers. The first test failed. In order to make up for lost time due to other migration snags, Ron and team skipped over a number of test cases related to integrating the final collection of security and backup software on the latest OS with supporting patches. The simulated load, the first on the newly minted servers, tripped a series of issues within Ron’s architecture including a kernel bug, a CPU and memory provisioning issue, and storage layout and capacity issues. The project was delayed for more than four weeks to address customer confidence, proper testing and validation, resizing and architecture, and apply software and OS fixes.

The promises of the cloud are enticing, and a well planned cloud migration will position you and your team to take advantage of these benefits. Whether you are beginning or in the middle of a cloud migration, we hope this article helps you be more aware of common pitfalls so you can hopefully avoid them.

– Cassius Rhue, Vice President, Customer Experience

Reproduced from SIOS