A Standard Google Cloud Deployment for the DevOps Team

Introduction

This document crystallizes an in-person discussion within the DevOps team on how to refine our Google Cloud deployments.

We wish to offer similarly structured deployments to the UIS and the wider University in future, so this standard has been written with one eye on how we might roll it out more widely.

There is a terraform configuration which implements the deployment described in this document.

Diagram

The following draw.io diagram gives a flavour of the resources described in this document:

diagram

Typography

In this document:

  • Google Cloud resources are denoted thus.
  • UIS, University or account management resources are denoted thus.

Scope

This document refers only to the resources created as part of a deployment and the resources which are bootstrapped “from within”. We do not cover how external resources are created. For example, we assume that there already exists some institutional billing account and a project in Google Cloud where administrative resources are created. This project is termed the administration project.

High-level structure

There is a billing account associated with a parent institution. (We, as ever, punt the concept of an “institution” to a Higher Authority.)

There is a folder associated with each product. The folder has a human-readable product name and numerical folder id which is generated by Google. Resources associated with the folder are:

  • A label on the folder containing the cost centres associated with the product.
  • A configuration storage bucket for the product which is created within the administration project.
  • A folder admin service account for the product which is created within the administration project.
  • A set of folder admin service account credentials.
  • A folder admin service account secret which has the folder admin service account credentials as its payload which is created within the administration project.
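As a sketch, the bucket, service account and secret above might be created with terraform along these lines. All names, the bucket location and the project reference are illustrative placeholders, not part of the standard; the replication syntax assumes a recent Google provider:

```hcl
# The configuration storage bucket, created in the administration project.
resource "google_storage_bucket" "config" {
  project  = "admin-project"            # placeholder administration project id
  name     = "example-product-config"   # placeholder bucket name
  location = "EUROPE-WEST2"
}

# The folder admin service account and a set of credentials for it.
resource "google_service_account" "folder_admin" {
  project    = "admin-project"
  account_id = "example-product-admin"
}

resource "google_service_account_key" "folder_admin" {
  service_account_id = google_service_account.folder_admin.name
}

# The folder admin service account secret, with the credentials as payload.
resource "google_secret_manager_secret" "folder_admin" {
  project   = "admin-project"
  secret_id = "example-product-admin-credentials"
  replication {
    auto {}
  }
}

resource "google_secret_manager_secret_version" "folder_admin" {
  secret      = google_secret_manager_secret.folder_admin.id
  secret_data = base64decode(google_service_account_key.folder_admin.private_key)
}
```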

The folder admin service account has the following roles:

  • on the folder:
      • Folder Admin (create, edit and delete resources in the folder)
      • Project Creator (create sub-projects)
      • Project Owner (create, edit and delete resources in all sub-projects)
  • on the billing account:
      • Billing User (link, but not unlink, projects with billing accounts)
  • on the configuration storage bucket:
      • Storage Object Admin (list, create, edit and delete objects within the bucket)
  • on the folder admin service account secret:
      • Secret Manager Secret Accessor (access the secret payload)
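A minimal terraform sketch of these grants might look like the following. The folder id, billing account id, bucket name and service account email are placeholders:

```hcl
locals {
  # Placeholder email for the folder admin service account.
  admin_member = "serviceAccount:example-product-admin@admin-project.iam.gserviceaccount.com"
}

# Folder Admin, Project Creator and Project Owner on the folder.
resource "google_folder_iam_member" "folder_roles" {
  for_each = toset([
    "roles/resourcemanager.folderAdmin",
    "roles/resourcemanager.projectCreator",
    "roles/owner",
  ])
  folder = "folders/0123456789"
  role   = each.value
  member = local.admin_member
}

# Billing User on the billing account.
resource "google_billing_account_iam_member" "billing_user" {
  billing_account_id = "ABCDEF-012345-6789AB"
  role               = "roles/billing.user"
  member             = local.admin_member
}

# Storage Object Admin on the configuration storage bucket.
resource "google_storage_bucket_iam_member" "object_admin" {
  bucket = "example-product-config"
  role   = "roles/storage.objectAdmin"
  member = local.admin_member
}

# Secret Manager Secret Accessor on the folder admin service account secret.
resource "google_secret_manager_secret_iam_member" "secret_accessor" {
  secret_id = "projects/admin-project/secrets/example-product-admin-credentials"
  role      = "roles/secretmanager.secretAccessor"
  member    = local.admin_member
}
```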

In addition, there is a set of admin users who are given the same roles as the folder admin service account. These users can examine any of the resources in the Google Cloud Console and may click-ops the rest if necessary.

We give to the product owner the following:

  • The folder id of the folder.
  • The fully qualified name of the folder admin service account secret.
  • The name of the configuration storage bucket.
  • The id of the billing account.

None of these are treated as secret.

How these resources for a given product are created is out of scope but will likely be driven initially by a Giant Terraform Configuration of Death.

By labelling the folder with the cost centre we give the billing account administrator a convenient mechanism for internally accounting for who owes what on the invoice. Note, however, that the folder admin service account can modify the folder to remove or change the cost centre label.

DevOps Team Specific Resources

In addition to the generic resources above, we want some resources which are specific to the DevOps team:

  • A DNS zone under gcp.uis.cam.ac.uk for the product.

We would need to give the folder admin service account the following roles on the project containing the DNS zone:

  • DNS Administrator
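Sketched in terraform, with the DNS project id and service account email as placeholders, this grant might be:

```hcl
# DNS Administrator on the project containing the product DNS zone.
resource "google_project_iam_member" "dns_admin" {
  project = "uis-dns-project"   # placeholder for the project containing the zone
  role    = "roles/dns.admin"
  member  = "serviceAccount:example-product-admin@admin-project.iam.gserviceaccount.com"
}
```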

How we use this

Our standard terraform configuration will start with the folder id, folder admin service account secret name and configuration storage bucket name which has been given to us.

As a bootstrap step, we need to create a “meta” project which contains:

  • A gitlab token secret with an API token for GitLab as our robot user.

Note

As a shared resource, this token secret could exist in a single meta project for all of our products, with the folder admin service account given access to the secret; alternatively, it could be copied separately into a per-product meta project.

Our terraform configuration contains the following configuration hard-coded into the repository source:

  • The parent folder id.
  • At least one GitLab project id.
  • A location for the terraform state which is an object name within the configuration storage bucket.
  • The billing account id.
  • The gitlab token secret name.
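This hard-coded configuration might be expressed along the following lines; every value shown is an illustrative placeholder:

```hcl
terraform {
  backend "gcs" {
    bucket = "example-product-config"   # the configuration storage bucket
    prefix = "terraform/state"          # object name prefix for terraform state
  }
}

locals {
  folder_id           = "folders/0123456789"
  billing_account     = "ABCDEF-012345-6789AB"
  gitlab_project_id   = 1234
  gitlab_token_secret = "projects/admin-project/secrets/gitlab-api-token"
}
```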

The terraform configuration, acting as the folder admin service account, will create:

  • A workspace-specific project. (This will be one of “development”, “staging” or “production”.)
  • A project admin service account which is given the Project Owner role on the project.
  • A workspace-specific DNS zone within the project. For example: “staging.raven.gcloud.automation.uis.cam.ac.uk”.
  • Records on the “gcloud.automation.uis.cam.ac.uk” DNS zone which delegate to the project-specific DNS zone.

As part of creating the project it will also:

  • enable any required APIs,
  • associate the project with the billing account.
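A sketch of the workspace-specific project creation, assuming the hard-coded locals above exist; the project naming scheme and API list are placeholders:

```hcl
# One project per terraform workspace ("development", "staging" or "production").
resource "google_project" "workspace" {
  name            = "example-product-${terraform.workspace}"
  project_id      = "example-product-${terraform.workspace}"
  folder_id       = local.folder_id
  billing_account = local.billing_account
}

# Enable any required APIs on the new project.
resource "google_project_service" "required" {
  for_each = toset([
    "run.googleapis.com",
    "sqladmin.googleapis.com",
    "dns.googleapis.com",
    "secretmanager.googleapis.com",
  ])
  project = google_project.workspace.project_id
  service = each.value
}

# The project admin service account, given Project Owner on the project.
resource "google_service_account" "project_admin" {
  project    = google_project.workspace.project_id
  account_id = "project-admin"
}

resource "google_project_iam_member" "project_owner" {
  project = google_project.workspace.project_id
  role    = "roles/owner"
  member  = "serviceAccount:${google_service_account.project_admin.email}"
}
```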

Note

Admin users will inherit the Project Owner role from the parent folder.

The remainder of the terraform configuration acts as the project admin service account.

The terraform configuration, acting as the project admin service account, will create:

  • A SQL instance.
  • A database within the SQL instance.
  • A database user within the SQL instance for the web application to use.
  • A project-specific webapp configuration bucket.
  • A webapp configuration secret which is a Django settings module which sets:
      • A generated Django secret key.
      • The database user password.
  • A webapp configuration object within the webapp configuration bucket which is a Django settings module which sets:
      • A path to the SQL instance as it will appear within a Cloud Run container.
      • The name of the database.
      • The username of the database user.
  • A Cloud Run service for the web application with:
      • The name of the webapp configuration secret set as an environment variable.
      • The name of the webapp configuration object set as an environment variable.
      • A container image name set to “gcr.io/cloudrun/hello”.
      • Configuration for terraform which ignores changes to the container image name.
  • A Cloud Run domain mapping for the web application which maps the service address to the Cloud Run service.
  • A DNS record set within the project DNS zone for the webapp, as returned from the Cloud Run domain mapping.
  • A GitLab CI service account and associated GitLab CI service account credentials which are added to the GitLab deployment project as CI variables.
  • The GitLab CI service account is granted the following:
      • on the Cloud Run service:
          • The run.services.update permission.
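The Cloud Run service with the terraform ignore-changes configuration could be sketched as follows. The service name, region and environment variable names are hypothetical, and the snippet assumes a `webapp_config` secret resource defined elsewhere in the configuration:

```hcl
resource "google_cloud_run_service" "webapp" {
  name     = "webapp"         # placeholder service name
  project  = google_project.workspace.project_id
  location = "europe-west1"   # placeholder region

  template {
    spec {
      containers {
        image = "gcr.io/cloudrun/hello"   # initial placeholder image
        env {
          name  = "SETTINGS_SECRET"       # hypothetical variable name
          value = google_secret_manager_secret.webapp_config.id
        }
        env {
          name  = "SETTINGS_OBJECT"       # hypothetical variable name
          value = "gs://example-product-webapp-config/settings.py"
        }
      }
    }
  }

  lifecycle {
    # The image is updated by GitLab CI, not terraform, so ignore changes to it.
    ignore_changes = [template[0].spec[0].containers[0].image]
  }
}
```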

For configuration to succeed, the following manual bootstrapping must be performed:

  • Any service address domains (e.g. “account.raven.cam.ac.uk”) must be verified and the email address of the project admin service account must be added as a delegated owner.

When updating a deployment, a GitLab CI job updates the image name of the service, which causes a re-deploy. Since terraform ignores image name changes, this will not fight with terraform.
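Such a CI job might be sketched as follows; the job name, key variable and region are assumptions, while CI_REGISTRY_IMAGE and CI_COMMIT_SHA are standard GitLab predefined variables:

```yaml
# Sketch of a GitLab CI deploy job using the GitLab CI service account
# credentials stored as a CI variable.
deploy:
  stage: deploy
  script:
    - gcloud auth activate-service-account --key-file "$GCLOUD_SERVICE_ACCOUNT_KEY"
    - gcloud run services update webapp
        --project "example-product-production"
        --region "europe-west1"
        --image "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"
  only:
    - master
```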