Measuring and Improving Write Throughput Performance on GCP Using SIOS DataKeeper for Windows

April 21, 2022 by Jason Aw

Background

This post documents my findings on write performance in GCP for a disk being replicated with SIOS DataKeeper. First, some background. A customer was concerned that DataKeeper was adding a tremendous amount of overhead to their write performance when testing a synchronous mirror between Google zones in the same region. In their original test the bitmap file was on the C drive, a persistent SSD, and in that configuration they were only pushing about 70 MBps. They tried relocating the bitmap to a GCP extreme persistent disk, but performance did not improve.

Moving the Bitmap to a Local SSD

I suggested that they move the bitmap to a local SSD, but they were hesitant: they believed the extreme disk they were using for the bitmap had latency and throughput as good as or better than a local SSD, so they doubted it would make a difference. In addition, adding a local SSD is not a trivial task, since it can only be attached when the VM is originally provisioned.
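
For anyone reproducing this test, the local SSD has to be specified when the VM is created. A minimal gcloud sketch, with the instance name, zone, and image chosen as placeholders and the machine type set to the N2 type that ultimately performed best in the tests below:

gcloud compute instances create dk-test-node1 ^
  --zone=us-central1-a ^
  --machine-type=n2-standard-8 ^
  --image-family=windows-2019 --image-project=windows-cloud ^
  --create-disk=size=500GB,type=pd-ssd,auto-delete=yes ^
  --local-ssd=interface=NVME

The 500 GB pd-ssd stands in for the replicated data volume; the local SSD only needs to hold the DataKeeper bitmap.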

Selecting the Instance Type

As I set out to complete my task, the first thing I discovered was that not every instance type supports a local SSD. For instance, the E2-Standard-8 does not. For my first test I settled on a C2-Standard-8 instance type, which is considered “compute optimized”. I attached a 500 GB persistent SSD and started running some write performance tests, and quickly discovered that I could only get the disk to write at about 140 MBps rather than its maximum speed of 240 MBps. The customer confirmed that they saw the same thing. It was perplexing, but we decided to move on and try a different instance type.

The second instance type we selected was an N2-Standard-8. With this instance type we were able to push the disk to its maximum throughput speed of 240 MBps when not replicating the disk. I moved the bitmap to the local SSD I had provisioned and repeated the same tests on a synchronous mirror (DataKeeper v8.8.2) and got the results shown below.
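
For reference, relocating the intent log (bitmap) is done by pointing DataKeeper at a directory on the new volume. A sketch assuming the BitmapBaseDir registry value described in the SIOS documentation, with F: standing in as the local SSD's drive letter (apply it before creating mirrors, and check the DataKeeper documentation for the exact procedure on your version):

reg add "HKLM\SYSTEM\CurrentControlSet\Services\ExtMirr\Parameters" ^
  /v BitmapBaseDir /t REG_SZ /d "F:\DataKeeper\Bitmaps" /f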

The Results

Diskspd test parameters

diskspd.exe -c96G -d10 -r -w100 -t8 -o3 -b64K -Sh -L D:\data.dat
diskspd.exe -c96G -d10 -r -w100 -t8 -o3 -b8K -Sh -L D:\data.dat
diskspd.exe -c96G -d10 -r -w100 -t8 -o3 -b4K -Sh -L D:\data.dat
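
For readers who don't run diskspd often: each command creates a 96 GB test file (-c96G), issues 100% random writes (-r -w100) for 10 seconds (-d10) from 8 threads with 3 outstanding I/Os each (-t8 -o3), disables software and hardware write caching (-Sh), and captures latency statistics (-L). Only the block size (-b64K, -b8K, -b4K) changes between runs.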

[Chart: write throughput in MBps, mirrored vs. non-mirrored]

The Data

Write Size     MBps      Percent Overhead
64k-Mirror     240.01    0.00%
64k-NoMirror   240.02
8k-Mirror       58.87    39.18%
8k-NoMirror     96.80
4k-Mirror       29.34    21.84%
4k-NoMirror     37.54

 

Write Size     AvgLat (ms)   AvgLat Overhead
64k-Mirror      6.247        -0.02%
64k-NoMirror    6.248
8k-Mirror       3.183        39.21%
8k-NoMirror     1.935
4k-Mirror       3.194        21.88%
4k-NoMirror     2.495
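
The overhead columns work out to the relative difference between the mirrored and non-mirrored runs: throughput overhead is (NoMirror - Mirror) / NoMirror, and latency overhead is (Mirror - NoMirror) / Mirror. For the 8k case, (96.80 - 58.87) / 96.80 ≈ 39.2% and (3.183 - 1.935) / 3.183 ≈ 39.2%.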

Conclusions

The 64k and 4k write sizes both incur overhead that could be considered “acceptable” for synchronous replication. The 8k write size incurs a more significant amount of overhead, although the average latency of 3.183 ms is still quite low.

-Dave Bermingham, Director, Customer Success

Reproduced with permission from SIOS


Multi-Cloud Disaster Recovery

October 30, 2021 by Jason Aw

If this topic sounds confusing, we get it. With our experts' advice, we hope to temper your apprehensions while also raising some important considerations for your organization before or after going multi-cloud. Planning for disaster recovery is a common point of confusion for companies employing cloud computing, especially when it involves multiple cloud providers.

It’s taxing enough to ensure data protection and disaster recovery (DR) when all data is located on-premises. But today many companies have data on-premises as well as with multiple cloud providers, a hybrid strategy that may make good business sense but can create challenges for those tasked with data protection. Before we delve into the details, let’s define the key terms.

What is multi-cloud?

Multi-cloud is the utilization of two or more cloud providers to serve an organization’s IT services and infrastructure. A multi-cloud approach typically consists of a combination of major public cloud providers, namely Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure.

Organizations choose the best services from each cloud provider based on cost, technical requirements, geographic availability, and other factors. This may mean that a company uses Google Cloud for development and test, AWS for disaster recovery, and Microsoft Azure for business analytics.

Multi-cloud differs from hybrid cloud, which refers to computing environments that mix on-premises infrastructure, private cloud services, and a public cloud.

Who uses multiple clouds?

  • Regulated industries – Many organizations run different business operations in different cloud environments. This may be a deliberate strategy of optimizing their IT environments based on the strengths of individual cloud providers or simply the product of a decentralized IT organization.
  • Media and Entertainment – Today’s media and entertainment landscape is increasingly composed of relatively small and specialized studios that meet the swelling content-production needs of the largest players, like Netflix and Hulu. Multi-cloud solutions enable these teams to work together on the same projects, access their preferred production tools from various public clouds, and streamline approvals without the delays associated with moving large media files from one site to another.
  • Transportation and Autonomous Driving – Connected car and autonomous driving projects generate immense amounts of data from a variety of sensors. Car manufacturers, public transportation agencies, and rideshare companies are among those motivated to take advantage of multi-cloud innovation, combining access to data across multiple clouds, without the risks of significant egress charges and slow transfers, with the freedom to leverage the optimal public cloud services for each project.
  • Energy Sector – Multi-cloud adoption can help lower the significant costs associated with finding and drilling for resources. Engineers and data scientists can use machine learning (ML) analytics to identify places that merit more resources to prospect for oil, to gauge environmental risks of new projects, and to improve safety.

Multi-cloud disaster recovery pain points:

  • Not reading before you sign. Customers may face issues if they fail to read the fine print in their cloud agreements. The cloud provider is responsible for its computer infrastructure, but customers are responsible for protecting their applications and data. There are many reasons for application downtime that are not covered under cloud SLAs. Business critical workloads need high availability and disaster recovery protection software as well.
  • Developing a centralized protection policy. A centralized protection policy must be created to cover all data, no matter where it lives. Each cloud provider has its unique way of accessing, creating, moving and storing data, with different storage tiers. It can be cumbersome to create a disaster recovery plan that covers data across different clouds.
  • Reporting. This is important for ensuring protection of data in accordance with the service-level agreements that govern it. Given how quickly users can spin up cloud resources, it can be challenging to make sure you’re protecting each resource appropriately and identifying all data that needs to be incorporated into your DR plan.
  • Test your DR plan. Customers must thoroughly vet and test their DR strategy. A multi-cloud strategy compounds the need for testing. Some providers may charge customers for testing, which reinforces the need to read the fine print of the contract.
  • Resource skill sets. Finding an expert in one cloud can be challenging; with multi-cloud you will either need to find expertise in each cloud or the rare individual with expertise across multiple clouds.

Overcoming the multi-cloud DR challenge

Meeting these challenges requires companies to develop a data protection and recovery strategy that covers numerous issues. Try asking yourself the following strategic questions:

  • Have you defined the level of criticality for all applications and data? How much money will a few minutes of downtime for critical applications cost your organization in end-user productivity, customer satisfaction, and IT labor?
  • Will data protection and recovery be handled by IT or application owners and creators in a self-service model?
  • Did you plan for data optimization, using a variety of cloud- and premises-based options?
  • How do you plan to recover data? Restoring data to cloud-based virtual machines or using a backup image as the source of recovery?

Obtain the right multi-cloud DR solution

The biggest key to success in data protection and recovery in a multi-cloud scenario is ensuring you have visibility into all of your data, no matter how it's stored. Tools from data-protection vendors enable you to define which data and applications should be recovered in a disaster scenario and how to do it – whether from a backup image or by moving data to a newly created VM in the cloud, for example.

The tool should help you orchestrate the recovery scenario and, importantly, test it. If the tool is well integrated with your data backup tool, it can also let you use backups as a source of recovery data, even if that data is stored in different locations, such as multiple clouds. Our most recent SIOS webinar discusses this same point; watch it if you're interested. SIOS DataKeeper lets you run your business-critical applications in a flexible, scalable cloud environment such as Amazon Web Services (AWS), Azure, or Google Cloud Platform without sacrificing performance, high availability, or disaster protection. SIOS DataKeeper is available in the AWS Marketplace and is the only Azure-certified high availability software for WSFC offered in the Azure Marketplace.

Reproduced from SIOS


Webinar: SQL Server on the Google Cloud Platform

May 19, 2019 by Jason Aw

Google Cloud Platform (GCP), and specifically its Google Compute Engine (GCE) offering, is Google's answer to Microsoft Azure and Amazon AWS in the cloud IaaS space. Like other public cloud providers, Google offers on-demand compute, networking, and storage for deploying applications in the cloud. Deploying SQL Server in GCE requires some Google-specific knowledge to ensure your deployment is highly available. In this webinar, Dave Bermingham, Microsoft Cloud and Datacenter Management MVP, is joined by Google to discuss Google Cloud and how to deploy SQL Server on the Google Cloud Platform with an eye towards high availability and disaster recovery.

Register for the webinar: SQL Server on the Google Cloud Platform

