Skip to content

SQL Instances

This page documents how we deploy and configure SQL database instances.

We almost always make use of PostgreSQL SQL instances. PostgreSQL is best-of-breed among Relational Database Management Systems both in terms of performance and features.

Occasionally we are required to run a MySQL instance when it is the only database supported by a third-party application.

Although it can have some gnarly edges, the SQL data model is mature, well documented and there are a lot of resources on the Web which help you get to grips with it.

Example

The GitLab service as part of the University Developers' Hub stores all non-git state in a PostgreSQL database and serves over 1,000 users daily with a highly complex data model.

We make use of Cloud SQL managed SQL instances.

"Database" versus "instance"

There is a difference between a SQL instance and a database. A SQL instance corresponds to a single instance of a database server cluster. A single PostgreSQL cluster supports the creation of multiple databases. A given SQL instance can, therefore, have multiple databases within it.

Usually we configure one instance per product and create one or more databases within that instance for each web application.

Cloud SQL supports PostgreSQL instances, MySQL instances and SQL Server instances. We have yet to had a need for a SQL Server instance and so this facet of Cloud SQL is under-explored within the team.

PostgreSQL versions

Usually we use the latest version of PostgreSQL available when a product is first deployed. A notable omission in Cloud SQL is any method of automatic version upgrading. Instead, the recommended procedure is to dump the database as a SQL document in a Cloud Storage bucket, re-provision the database with the new engine version and restore the SQL dump.

Warning

This somewhat clunky upgrade procedure means it is almost impossible to easily update a service with zero downtime. In principle we could spin up a new PostgreSQL instance and use PostgreSQL's native streaming replication support to move from one instance to the other with zero downtime.

We have not yet done this and so this particular achievement is waiting for someone to unlock it.

Boilerplate support

In our standard boilerplate, the Cloud SQL instance is configured in sql.tf. Our boilerplate can generate configuration for PostgreSQL or MySQL instances.

SQL instance naming

Our boilerplate uses a random name for the SQL instance since, annoyingly, SQL instance names cannot easily be re-used for up to a week after deletion.

We make use of Google's terraform module to configure the instance. For non-production environments we usually go with a db-f1-micro tier instance with no failover server. For production we tune the instance size to meet expected demand and configure PostgreSQL in a highly-available "hot-failover" configuration. The cluster configuration is managed for us by Cloud SQL.

Automatic backups are scheduled for 1AM each morning and we set the allowable maintenance window to start at 2AM on Sunday mornings.

"Off-site" backups

We have a custom tool which runs nightly to backup our production databases to a Google Cloud Storage bucket. This bucket is hosted in a Google project dedicated to this purpose. As such we have a degree of resilience against deleting the SQL instance or the host project, both of which would delete the automated nightly backups made by Cloud SQL itself.

No setup is required within our terraform for this. The service account which performs nightly backups is granted the appropriate IAM rights at the top-level Google Folder level and uses the Google API to automatically discover instances which need backing up.

The power of APIs

One of the advantages of using a Cloud-hosted product for our databases is that it is designed to be introspected, managed and provisioned via an API. As such the code for our backup utility is very small: around 500 lines including documentation and comments.

Database users and service accounts

PostgreSQL has its own set of users and user rights. These are managed independently from service account users. We use the term database user to mean a user created in PostgreSQL and service account to mean an identity which is created within Cloud IAM.

Usually our web applications are deployed with a dedicated service account for that application. In order for the web application to be able to use the database a separate database user must be created for it.

We create database users directly using the google_sql_user resource with a password generated by terraform's random_password resource. This password can then be passed to the application securely and never leaves the terraform state. The password can be rotated by simply terraform state rm-ing the password resource and running terraform apply again.

We don't currently have a good way of configuring database user roles. Since this happens infrequently we usually include a db_roles.sql file at the top of the project along with instructions on how to run the SQL against the database.

Future work

The terraform postgresql provider does advertise support for Google Cloud SQL instances but it currently requires that credentials be passed in the GOOGLE_APPLICATION_CREDENTIALS environment variable.

Although we can easily create credentials in terraform we cannot easily set the environment of the terraform process. As such passing credentials to the provider is non-trivial.

Finding a way to solve this problem cleanly is an open issue.

Connecting to SQL instances

Although SQL instances can, and often do, have public IP addresses, they cannot directly be connected to. Instead one needs to perform a mutual authentication dance involving the creation of temporary TLS certificates.

We could do that dance but Google provides a proxy utility which performs the required authentication and makes the database available over a local TCP or UNIX socket as if it were running locally.

The Cloud SQL proxy is where the world of service accounts meets the world of SQL instances. The Cloud SQL proxy takes a set of credentials for a service account. In order to be able to connect to the SQL instance, that service account must have the cloudsql.client role.

This role is often granted to the service account representing the web application.

Important

Being able to connect to a Cloud SQL instance is not the same as being able to connect your application to it. A database user and password must still be supplied by the web application when connecting to the instance.

Outside of the admin service accounts, only "owner" users have the right to connect directly to the database by default. There is an open issue tracking a modification to this policy.

Cloud Run

Cloud Run is used for deploying web applications. It has native support for connecting to Cloud SQL instances and causes the Cloud SQL instance to appear as a UNIX socket within the container.

Kubernetes

Google Kubernetes Engine does not directly support Cloud SQL connections but it is easy to create a sidecar container which runs the proxy making the instance available within the cluster. More information is available on the kubernetes deployment page.

Summary

In summary,

  • We make use of a single PostgreSQL instance per product environment.
  • We configure the amount of disk, CPU and memory per-environment in response to anticipated demand.
  • SQL instances are configured via terraform and managed via Cloud SQL.
  • We have a standard configuration for automated backup and maintenance windows along with an automated "off-site" backup mechanism.
  • In order to connect your application to a SQL instance you must first create, via terraform:
  • If not using Cloud Run to deploy your application, you will need to arrange for the Cloud SQL connection proxy to be running in order to connect to the instance.