SQL Instances¶
This page documents how we deploy and configure SQL database instances.
We almost always make use of PostgreSQL SQL instances. PostgreSQL is best-of-breed among Relational Database Management Systems both in terms of performance and features.
Occasionally we are required to run a MySQL instance when it is the only database supported by a third-party application.
Although it can have some gnarly edges, the SQL data model is mature, well documented and there are a lot of resources on the Web which help you get to grips with it.
Example
The GitLab service as part of the University Developers' Hub stores all non-git state in a PostgreSQL database and serves over 1,000 users daily with a highly complex data model.
We make use of Cloud SQL managed SQL instances.
"Database" versus "instance"¶
There is a difference between a SQL instance and a database. A SQL instance corresponds to a single instance of a database server cluster. A single PostgreSQL cluster supports the creation of multiple databases. A given SQL instance can, therefore, have multiple databases within it.
Usually we configure one instance per product and create one or more databases within that instance for each web application.
Cloud SQL supports PostgreSQL instances, MySQL instances and SQL Server instances. We have yet to had a need for a SQL Server instance and so this facet of Cloud SQL is under-explored within the team.
PostgreSQL versions¶
Usually we use the latest version of PostgreSQL available when a product is first deployed. A notable omission in Cloud SQL is any method of automatic version upgrading. Instead, the recommended procedure is to dump the database as a SQL document in a Cloud Storage bucket, re-provision the database with the new engine version and restore the SQL dump.
Warning
This somewhat clunky upgrade procedure means it is almost impossible to easily update a service with zero downtime. In principle we could spin up a new PostgreSQL instance and use PostgreSQL's native streaming replication support to move from one instance to the other with zero downtime.
We have not yet done this and so this particular achievement is waiting for someone to unlock it.
Boilerplate support¶
In our standard boilerplate, the Cloud SQL instance is configured in sql.tf. Our boilerplate can generate configuration for PostgreSQL or MySQL instances.
SQL instance naming
Our boilerplate uses a random name for the SQL instance since, annoyingly, SQL instance names cannot easily be re-used for up to a week after deletion.
We make use of Google's terraform
module to
configure the instance. For non-production environments we usually go with a
db-f1-micro
tier instance with no failover server. For production we tune the
instance size to meet expected demand and configure PostgreSQL in a
highly-available "hot-failover" configuration. The cluster configuration is
managed for us by Cloud SQL.
Automatic backups are scheduled for 1AM each morning and we set the allowable maintenance window to start at 2AM on Sunday mornings.
"Off-site" backups¶
We have a custom tool which runs nightly to backup our production databases to a Google Cloud Storage bucket. This bucket is hosted in a Google project dedicated to this purpose. As such we have a degree of resilience against deleting the SQL instance or the host project, both of which would delete the automated nightly backups made by Cloud SQL itself.
No setup is required within our terraform for this. The service account which performs nightly backups is granted the appropriate IAM rights at the top-level Google Folder level and uses the Google API to automatically discover instances which need backing up.
The power of APIs
One of the advantages of using a Cloud-hosted product for our databases is that it is designed to be introspected, managed and provisioned via an API. Data Backup code.
Database users and service accounts¶
PostgreSQL has its own set of users and user rights. These are managed independently from service account users. We use the term database user to mean a user created in PostgreSQL and service account to mean an identity which is created within Cloud IAM.
Usually our web applications are deployed with a dedicated service account for that application. In order for the web application to be able to use the database a separate database user must be created for it.
We create database users directly using the google_sql_user
resource
with a password generated by terraform's random_password
resource.
This password can then be passed to the application securely and never leaves
the terraform state. The password can be rotated by simply terraform state
rm
-ing the password resource and running terraform apply
again.
We don't currently have a good way of configuring database user roles. Since
this happens infrequently we usually include a db_roles.sql
file at the top of
the project along with instructions on how to run the SQL against the database.
Future work
The terraform postgresql
provider
does advertise support for Google Cloud SQL instances but it currently
requires that credentials be passed in the GOOGLE_APPLICATION_CREDENTIALS
environment variable.
Although we can easily create credentials in terraform we cannot easily set the environment of the terraform process. As such passing credentials to the provider is non-trivial.
Finding a way to solve this problem cleanly is an open issue.
Connecting to SQL instances¶
Although SQL instances can, and often do, have public IP addresses, they cannot directly be connected to. Instead one needs to perform a mutual authentication dance involving the creation of temporary TLS certificates.
We could do that dance but Google provides a proxy utility which performs the required authentication and makes the database available over a local TCP or UNIX socket as if it were running locally.
The Cloud SQL proxy is where the world of service accounts meets the world of
SQL instances. The Cloud SQL proxy takes a set of credentials for a service
account. In order to be able to connect to the SQL instance, that service
account must have the cloudsql.client
role.
This role is often granted to the service account representing the web application.
Important
Being able to connect to a Cloud SQL instance is not the same as being able to connect your application to it. A database user and password must still be supplied by the web application when connecting to the instance.
Outside of the admin service accounts, only "owner" users have the right to connect directly to the database by default. There is an open issue tracking a modification to this policy.
Cloud Run¶
Cloud Run is used for deploying web applications. It has native support for connecting to Cloud SQL instances and causes the Cloud SQL instance to appear as a UNIX socket within the container.
Kubernetes¶
Google Kubernetes Engine does not directly support Cloud SQL connections but it is easy to create a sidecar container which runs the proxy making the instance available within the cluster. More information is available on the kubernetes deployment page.
Summary¶
In summary,
- We make use of a single PostgreSQL instance per product environment.
- We configure the amount of disk, CPU and memory per-environment in response to anticipated demand.
- SQL instances are configured via terraform and managed via Cloud SQL.
- We have a standard configuration for automated backup and maintenance windows along with an automated "off-site" backup mechanism.
- In order to connect your application to a SQL instance you must first create,
via terraform:
- a database using the google_sql_database resource,
- a database user and password using the google_sql_user and random_password resources, and
- an IAM binding granting the service account representing your application's
identity the
cloudsql.connect
role.
- If not using Cloud Run to deploy your application, you will need to arrange for the Cloud SQL connection proxy to be running in order to connect to the instance.