Take the hypothetical example of an organization that wants to self-manage a database cluster. For the purposes of this exercise, we’ll define a database cluster as a setup that has more than one node (e.g., a self-managed sharded MySQL cluster). Most scenarios also apply to self-managed databases in the public cloud.
1. Hardware and licensing costs
On-prem databases require upfront investments in hardware such as servers and storage disks, apart from indirect costs associated with planning, purchasing, and deployment of hardware.
Services needed to install, configure, and tune the database software also add to the upfront costs.
A clustered/sharded database adds more complexity to this process, particularly when hardware and software components are from different vendors.
Self-managed databases in the public cloud also require you to provision server machines and storage disks for peak workload demand, so you pay for peak capacity even when the databases are lightly loaded.
You will have to pay licensing fees for some database technologies (e.g. Microsoft SQL Server licensing fees).
2. Operational costs and efficiency
Operating and maintaining the database: Operating the database takes up a non-trivial amount of time and resources. You will need to plan, test, and execute operating system and database version upgrades, apply security patches, and continuously tune and optimize the database.
Depending on your database size and the database’s availability SLO, the cost can be quite high. For example, a company self-managing their scaled and sharded MySQL database may end up spending significant resources (in terms of both dollar and people costs) in operations and maintenance.
Scaling up and scaling out: If you are self-managing a database, how do you plan for special events like Black Friday, New Year’s Eve, the Super Bowl, or new product launches? If you overprovision the hardware, you pay an unnecessary cost during non-peak periods, and overprovisioned hardware also means overprovisioned licenses. If you underprovision, you risk missing business opportunities. If your product becomes an overnight success (e.g., a mobile game launch), you have to worry about your database not keeping up with demand. When your database hits a vertical scaling limit (e.g., machine size), you have to scale out using sharding, which means you incur additional costs for rearchitecting, application changes, operations, and possible downtime. Even if rearchitecting is not required, nonlinear scaling remains a concern.
Data resiliency: To prevent data loss and ensure business continuity during server failures, network failures, or natural disasters such as heat waves or hurricanes, you need to plan for data resiliency by setting up data replication across data centers in physically separated locations/regions. There’s a significant cost associated with this setup (hardware, software licensing costs, and potentially the licensing for the replication software), as well as an ongoing maintenance burden.
Geographic expansion: As your business grows to different geographical locations, you may need to replicate your database, for example to meet data residency requirements or to reduce database latency for clients and applications. Setting up replication, keeping replicas synchronized, avoiding split-brain issues, and recovering replicas after failures all introduce significant operational costs.
Efficiency: Certain SaaS application providers (who are cloud customers as well) provision one database per customer. Managing a large number of customers translates into managing a large number of individual databases. This will cause inefficiencies, leading to higher operating costs.
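The overprovisioning trade-off described under "Scaling up and scaling out" can be made concrete with a little arithmetic. The sketch below is illustrative only: the hourly demand profile, the "server unit" abstraction, and the flat hourly price are all made-up assumptions, not figures from any real deployment.

```python
# Hypothetical hourly demand over one day, in "server units" needed.
# These numbers are illustrative only, not measurements.
hourly_demand = [2, 2, 2, 2, 3, 4, 6, 8, 10, 10, 9, 8,
                 8, 9, 10, 12, 16, 20, 16, 12, 8, 5, 3, 2]

COST_PER_SERVER_HOUR = 1.50  # assumed flat price per server per hour


def provisioning_report(demand, cost_per_server_hour):
    """Compare paying for peak capacity all day with what is actually used."""
    peak = max(demand)
    provisioned_hours = peak * len(demand)  # servers kept running all day
    used_hours = sum(demand)                # server-hours actually needed
    return {
        "peak_servers": peak,
        "utilization": used_hours / provisioned_hours,
        "provisioned_cost": provisioned_hours * cost_per_server_hour,
        "idle_cost": (provisioned_hours - used_hours) * cost_per_server_hour,
    }


report = provisioning_report(hourly_demand, COST_PER_SERVER_HOUR)
print(report)
```

With this invented profile, provisioning for the peak leaves most of the paid capacity idle for most of the day. Swapping in your own demand curve and prices gives a first estimate of what static peak provisioning costs you.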
3. People costs
As database maintenance tasks and database management complexity increase, you will need more experts to maintain your databases. Sometimes, these costs could be much higher than your overall infrastructure (hardware) costs.
4. Cost of outages
Direct revenue loss: This one is obvious: if your business is not up and running, you lose revenue directly. Even a partial database outage can cause revenue loss.
Operational toil: How much toil does a database-related outage cause your organization? How many resources does it take to be on call, respond, mitigate, conduct incident postmortems, and put preventive strategies in place?
Contracts: Some business-critical workloads may have contractual and regulatory repercussions for outages or application downtime (e.g., stock exchanges, banks).
Brand trust issues: If your business is down due to database maintenance windows or database outages (low availability), you risk your brand being associated with low reliability and low availability.
Opportunity costs: Are you spending your resources just managing databases and the associated infrastructure rather than leveraging them to enhance and build new capabilities to support the core business?
5. Productivity costs
Time to production: Do you have the right CI/CD pipeline to optimize your time to production? Do you need to set up multiple dev, pre-production and staging environments with different database instances as a path to production? What is the time and cost associated with maintaining multiple environments and a non-optimized (database-dependent) time to push a change to production? How much effort do you need to spend planning and preparing for applying schema changes?
Poor performance: Have you optimized your database for best application performance? A poorly tuned database can translate into poor application performance and wasted resources.
Slower application development: Does your database enable your organization to develop and launch applications quickly? Do you have a local copy of the database that you can download on developers’ machines (or laptops) for rapid development? Does the database have a familiar interface for faster development?
Increased complexity: Do your developers have to spend significant effort working around the complexities of your database deployment? Do you need to allocate your rockstar developers to solving database scaling / reliability challenges instead of core business differentiators?
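The five cost categories above can be rolled up into a single annual figure to see which one dominates. The following is a minimal sketch, not a costing methodology: the `AnnualCosts` class and its field names are hypothetical, and every dollar amount is a placeholder you would replace with your own estimates.

```python
from dataclasses import dataclass, fields


@dataclass
class AnnualCosts:
    """Annual cost per category, in dollars (hypothetical structure)."""
    hardware_and_licensing: float
    operations: float
    people: float
    outages: float
    productivity: float

    def total(self) -> float:
        # Sum every category field to get the overall annual cost.
        return sum(getattr(self, f.name) for f in fields(self))

    def shares(self) -> dict:
        # Fraction of the total contributed by each category.
        total = self.total()
        return {f.name: getattr(self, f.name) / total for f in fields(self)}


# Placeholder figures for illustration only.
self_managed = AnnualCosts(
    hardware_and_licensing=400_000,
    operations=250_000,
    people=600_000,
    outages=150_000,
    productivity=100_000,
)

print(f"Total: ${self_managed.total():,.0f}")
for name, share in self_managed.shares().items():
    print(f"{name:>24}: {share:.0%}")
```

Breaking the total into shares makes it easier to spot cases like the one described under "People costs", where staffing outweighs infrastructure.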
Calculating business benefits
When developing applications, ask yourself whether your database allows you to iterate quickly and push changes to the production environment quickly. The time it takes to deploy code to production directly determines how fast you can launch new applications or features and how fast you can innovate.
While we took the hypothetical example of a company running their own MySQL cluster, the list above applies to many other situations as well. When selecting a database, it is important to look at the total cost of ownership, or TCO. To solidify the points above, let’s look at one of our managed database services, Cloud Spanner, and its TCO.
Introduction to Cloud Spanner
Cloud Spanner is a distributed, globally scalable, strongly consistent relational database service built specifically for the cloud. It offers the SQL and relational structure of a traditional database together with the horizontal scalability typically associated with non-relational databases, so users get the best of both. This combination is designed to deliver high-performance transactions and strong consistency across rows, data centers, regions, and continents, with an availability SLA of up to 99.999% and enterprise-grade security. Users can focus on building applications and delivering innovation, while Spanner fully manages database infrastructure, sharding, replication, and failovers, with no scheduled downtime.
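To put an availability percentage such as 99.999% in perspective, it helps to convert it into a yearly downtime budget. The sketch below is a rough conversion under simple assumptions (a 365-day year, downtime counted in whole-service minutes); the function name is made up for illustration, and real SLAs define and measure downtime in their own terms.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def downtime_budget_minutes(availability_percent: float) -> float:
    """Maximum downtime per year, in minutes, allowed by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {downtime_budget_minutes(target):.2f} minutes/year")
```

Each additional "nine" cuts the budget by a factor of ten: roughly 526 minutes per year at 99.9%, about 53 at 99.99%, and just over 5 at 99.999%.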
Spanner TCO
An ESG study on the economic benefits of Cloud Spanner concluded that hosting your data on Cloud Spanner is 78% less expensive than using on-premises servers and up to 37% less expensive than using other cloud databases.