In my last post on Vitess we talked about why sharding gets hard as data volume grows. We showed that as traffic and total data size increase, the complexity of managing the backing database cluster also increases. Vitess was introduced as a solution for dealing with these complexities. But Vitess actually does a lot more than this; in this post we will go over its main features.
Feature Group 1: Improving MySQL
This section covers features in Vitess that help your MySQL instance(s) run better. These features do not relate to sharding or cluster management at all. In fact, they make Vitess useful even for single-node database deployments.
Connection Pooling
In order to understand what connection pooling is, let's look at an example of an application that does not use connection pooling. Imagine that you are building a website which enables people to upload and download recipes, with a very simple API that looks as follows:
Upload(recipe blob) (id, error)
Download(id uuid) (blob, error)
In order to implement this API, you store user data in MySQL. On each request you create a new database connection, read or write to the database, and then close the connection. This works fine at a small scale. But once your website starts to become the next big thing, you will realize that creating and tearing down a database connection for each request is very expensive: you will hit memory issues due to the per-connection overhead and latency issues due to the round trip(s) it takes to establish each connection. So sadly your recipe website will not be the next big thing unless you come up with a better way to manage database access.
Connection pooling is the solution to this problem. Connection pooling is the practice of maintaining a set of long-lived connections that get shared between many requests. Each connection in the pool is either in an active state, meaning it is currently in use, or an idle state, meaning it is ready to be used. When a new request arrives, the pool is checked for an idle connection; if one exists it is returned, otherwise a new connection is established. This method of managing access to the database reduces memory pressure and latency.
When using Vitess you get this feature for free. Vitess will manage the database connection pool for you.
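To make the idle/active mechanics concrete, here is a minimal sketch of a connection pool in Go. This is not Vitess's implementation; the `Conn` and `Pool` types are illustrative stand-ins, and a production pool would also need locking for concurrent use.

```go
package main

import (
	"errors"
	"fmt"
)

// Conn stands in for a real database connection; in practice this
// would wrap a MySQL connection. The type is illustrative.
type Conn struct{ id int }

// Pool keeps idle connections in a buffered channel and hands them
// out to requests, dialing a new one only when no idle connection
// exists. Note: not goroutine-safe as written; a real pool would
// guard the counter with a mutex.
type Pool struct {
	idle    chan *Conn
	created int
	max     int
}

func NewPool(max int) *Pool {
	return &Pool{idle: make(chan *Conn, max), max: max}
}

// Get returns an idle connection if one exists, otherwise creates one.
func (p *Pool) Get() (*Conn, error) {
	select {
	case c := <-p.idle:
		return c, nil // reuse: no dial round trip, no new memory
	default:
		if p.created >= p.max {
			return nil, errors.New("pool exhausted")
		}
		p.created++
		return &Conn{id: p.created}, nil // expensive path: new connection
	}
}

// Put marks a connection idle again so the next request can reuse it.
func (p *Pool) Put(c *Conn) {
	p.idle <- c
}

func main() {
	p := NewPool(2)
	c1, _ := p.Get() // first request: dials a new connection
	p.Put(c1)        // request done: connection goes back to idle
	c2, _ := p.Get() // second request: reuses it, no new dial
	fmt.Println(c1 == c2, p.created)
}
```

The key property is visible in `main`: two sequential requests share one underlying connection, so only one dial ever happens.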
Query Limits
Once your recipe website has achieved a one trillion dollar valuation, you will find that your customers desire more features. In order to support them, the developers on your team will need to write more complex, and more expensive, MySQL queries. Despite this added complexity, you cannot afford for any of these queries to bring down the whole database cluster. A single very expensive query can slow down every other query on the cluster. Worse yet, if a bug ships that triggers a thundering herd of expensive queries, your whole database could fall over and your trillion dollar valuation would be in trouble.
Vitess offers a collection of features that provide tunable limits to protect your database against these bad queries.
Time Based Limits: You can set a time bound on how long any query is allowed to run.
Result Limit: You can set a limit on the number of rows that a query can return. This prevents a query from returning tens of thousands of rows, which would put pressure on both the application and the database.
Disallow List: You can add custom rules which reject certain queries before they even hit your database. This can help mitigate an ongoing outage and safeguard your database in the future based on past learnings.
Nondeterministic Query Rejection: Queries in MySQL can be nondeterministic (read this for more information). Nondeterministic queries are a bad idea because they are expensive and also make statement-based replication unsafe.
Vitess lets you enforce and tune all of these limits on your database cluster. While this will cause some queries to fail until the application developer restructures them, it is certainly better to enforce these limits than to risk your whole database falling over.
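As a rough illustration, the time and result limits are set with flags on VTTablet. The exact flag names and units vary by Vitess release, so treat this as a sketch and check the VTTablet flag reference for your version.

```shell
# Illustrative VTTablet limit flags; verify names/units for your release.
# Time limit: kill any query running longer than 30 seconds.
# Result limit: error out if a query would return more than 10,000 rows.
vttablet \
  --queryserver-config-query-timeout 30 \
  --queryserver-config-max-result-size 10000
```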
Feature Group 2: Sharding
Transparent sharding is the magic of Vitess that made it so popular, and it is still mainly what Vitess is known for. Handling sharding is not a single feature; instead, it's a set of abstractions that sit between the application and the actual data on disk and enable the application to largely forget that the data is sharded. The purpose of this section is to explain these abstractions in terms of how a Vitess user interacts with them.
Mostly Transparent Sharding
When an application is accessing data in a sharded database cluster, the fundamental question the application needs an answer to is, “Which shard does the data I am reading or writing reside in?” The answer to this question is either a single specific shard or a group of shards. Based on the answer the application can target the query at the required shards. As we have already talked about in a previous post, this seemingly simple task gets hard as the backing database topology becomes more complex.
Vitess makes this sharding mostly transparent to the application. When an application sends a query to Vitess, it arrives at a routing layer called VTGate. VTGate knows how to parse queries and figure out which shard(s) are involved in each one. VTGate then routes the query to the correct shard(s), combines results from multiple shards if needed, and returns the result to the user.
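The core of that routing step can be sketched in a few lines of Go: hash the sharding key into a 64-bit keyspace, then pick the shard whose range covers the resulting keyspace ID. The shard-naming scheme (`-80`, `80-`) mirrors how Vitess names shards by keyspace range, but the hash here is ordinary FNV for illustration, not the actual hash vindex Vitess uses.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// A shard owns a half-open range [start, end) of the 64-bit keyspace.
// Vitess names shards by their range: "-80" and "80-" split the
// keyspace in two at 0x80... .
type shard struct {
	name       string
	start, end uint64 // end == 0 means "to the top of the keyspace"
}

var shards = []shard{
	{name: "-80", start: 0, end: 0x8000000000000000},
	{name: "80-", start: 0x8000000000000000, end: 0},
}

// keyspaceID hashes the sharding key into the 64-bit keyspace.
// (Illustrative FNV hash; Vitess's hash vindex uses a different function.)
func keyspaceID(shardingKey string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(shardingKey))
	return h.Sum64()
}

// route picks the shard whose range contains the keyspace ID --
// essentially the question VTGate answers for every query.
func route(shardingKey string) string {
	id := keyspaceID(shardingKey)
	for _, s := range shards {
		if id >= s.start && (s.end == 0 || id < s.end) {
			return s.name
		}
	}
	return "" // unreachable: the ranges cover the whole keyspace
}

func main() {
	for _, user := range []string{"alice", "bob", "carol"} {
		fmt.Printf("user %s -> shard %s\n", user, route(user))
	}
}
```

Because routing is a pure function of the sharding key, the same key always lands on the same shard, which is what lets the application stay oblivious to the topology.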
For the most part the application can treat VTGate as a single unified MySQL instance that transparently handles sharding under the covers. However, there are a few ways in which the application still needs to be aware that the underlying data is actually sharded:
Declaring Sharding Strategy: The application needs to define a piece of config called the VSchema, which declares how the various tables should be sharded. While writing a VSchema is a simple matter of config, it does expose the application to the fact that the underlying data is sharded. VTGate consumes these VSchemas and uses them to determine which shards it should route queries to.
Higher Latency for Scatter Queries: Queries that span multiple shards will encounter higher latencies. This becomes WAY more noticeable if the application wants to do a cross-shard transaction. Vitess does have built-in support for cross-shard transactions, but it uses a very slow 2PC (two-phase commit) protocol. Therefore it's generally recommended that Vitess users model their data in a way that avoids scatter queries and cross-shard transactions.
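To make the VSchema declaration concrete, here is a sketch for the recipe example. The `hash` vindex type is a real Vitess vindex; the keyspace, table, and column names are just carried over from the running example.

```json
{
  "sharded": true,
  "vindexes": {
    "hash": { "type": "hash" }
  },
  "tables": {
    "recipes": {
      "column_vindexes": [
        { "column": "user_id", "name": "hash" }
      ]
    }
  }
}
```

This tells VTGate that rows of `recipes` are distributed across shards by a hash of `user_id`, so any query filtering on `user_id` can be routed to a single shard.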
The way I think about Vitess’ sharding is that the application is aware that the data is sharded, but it does not need to care about how the sharding topology is maintained or accessed.
Moving Data Around
Not only does Vitess handle sharding for the application, it also enables resharding data with minimal downtime. Resharding is a hard problem to solve correctly, and once database topologies become sufficiently complex it can become intractable. Let's name just a few of the subproblems that need to be solved when resharding:
- There must be some method to replicate data between source and targets.
- There must be a method to checkpoint progress.
- There must be a way to validate that data was copied correctly.
- There must be a way to redirect traffic.
These are each individually hard problems, and they scale up in complexity as the underlying database topology gets more complex. Vitess offers a resharding workflow that makes this whole process very simple and introduces minimal downtime.
A key insight about Vitess' resharding support is that it makes the choice of sharding key much less significant. Without the ability to reshard, an application needs to think very carefully about the sharding key: it must predict the types of queries that will need to be supported for years to come and make sure the sharding key lends itself well to those queries. With the ability to change the sharding key later, these decisions carry much less weight.
In addition to supporting resharding, Vitess also supports moving tables between databases. Fundamentally, moving a table looks a lot like resharding: in both cases there is a source and a target, data must be replicated from source to target, and then traffic must be switched over.
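The resharding workflow is driven from the Vitess control-plane CLI. The exact command shape varies across Vitess versions, so the following is illustrative (the `recipes` keyspace and `split` workflow names are made up for the example):

```shell
# Split the single shard "0" of keyspace "recipes" into two shards.
vtctldclient Reshard --workflow split --target-keyspace recipes \
  create --source-shards '0' --target-shards '-80,80-'

# Vitess copies the data and keeps the targets in sync via replication.
# Once the copy is validated, shift traffic to the new shards:
vtctldclient Reshard --workflow split --target-keyspace recipes switchtraffic

# Finally, tear down the workflow and retire the old shard:
vtctldclient Reshard --workflow split --target-keyspace recipes complete
```

Note how the workflow maps onto the subproblems listed above: the create step sets up replication and checkpointing, the validation happens before the traffic switch, and the switch itself is an explicit, reversible command.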
Feature Group 3: Reliability
The last feature group we are going to touch on is how Vitess helps improve the reliability of your database cluster. In order to understand this we must wrap our heads around some basics of what the Vitess architecture looks like.
The above diagram shows that a user interacts with Vitess by making calls to a collection of stateless nodes called VTGates. A VTGate node determines which shards need to be accessed to handle a query. A VTGate node then routes the user’s query to the applicable shard(s). Each shard is composed of multiple tablets. Each tablet host runs a single MySQL instance and a Vitess component called VTTablet. One of the responsibilities of VTTablet is to watch over the local MySQL instance and make sure it is healthy. Within a shard there is a single leader tablet and N secondary tablets. Only the leader can accept writes.
This architecture enables Vitess to offer high reliability to its users. In order to understand this let’s look at how this architecture enables recovering in the presence of failures.
What happens if a VTGate host dies?
No big deal. These nodes are stateless, so whatever compute orchestration engine is being used can simply replace the failed host.
What happens if a VTTablet host fails?
Requests from the VTGates to that host will start failing, so the VTGates will mark it unhealthy and stop routing traffic to it.
What happens if the MySQL process fails?
The VTTablet on the host will detect that the MySQL process failed and will report that to VTGate so that VTGate stops routing queries to the failed tablet.
What happens if the leader tablet for a shard dies?
Vitess offers automatic reparenting: it can be configured to automatically elect a replica as the new leader if the leader dies.
How do hosts get replaced?
This is something Vitess does not handle. An important piece of the Vitess mental model is that Vitess manages the sharding topology of the cluster; it does not acquire or deploy the compute resources backing that topology. So if some tablet hosts die, Vitess will stop routing to them, but it will not replace them.
That is all for this time — hope you enjoyed.