Serverless Web Applications¶
This page documents how we use the Cloud Run service to deploy web applications in a serverless manner.
How can there be no server?¶
Serverless is a term of art which refers to a method of executing code in the Cloud without having to directly manage the servers the code runs on. Clearly there are servers, but their management is delegated to the Cloud provider, which specifies a common interface that code running on the platform should support.
Even before the term was coined, "serverless" computing had been around since at least the 1980s. Nowadays, rather than having a broad, complex set of supported libraries and runtimes, we define a thin contract between the hosting platform and a container which packages the application code.
Where possible we architect our applications to follow the serverless computing contract. This ensures that we can easily port our applications between any hosting platform which supports the Knative specification. The serverless computing contract is also a good target to aim for even when deploying applications in Kubernetes clusters as it provides a clean, orthogonal and well-specified interface between the application and the hosting environment.
Cloud Run is a managed Knative service with some extensions which make things convenient for our needs. Aside from simply hosting a containerised application on the web with auto-scaling, Cloud Run also supports:
- Automatic exposure of a Cloud SQL instance to the container.
- Associating the workload with a Cloud IAM identity allowing the use of application default credentials within the cluster.
- Wrapping the service in a Cloud Load Balancer allowing for custom TLS certificates, HTTP to HTTPS redirect and content caching.
Our standard module configures everything about the Cloud Run application apart from the Docker image URL specifying the version of the application to deploy. Historically we have separated deployment of an application (creating the Knative revision) from deployment of infrastructure (creating the Knative service). As such the actual application deployment often happens in GitLab CI jobs rather than in terraform configuration.
As we move to a more GitOps model, this will most likely change. In a GitOps model we specify the exact version of the application image to deploy, the terraform is run by GitLab CI and "releasing" involves merging a change to master which changes the container image URL.
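Under such a GitOps model, the image pin might look like the following sketch. The resource, registry and digest are illustrative, not our actual boilerplate:

```terraform
# Illustrative sketch: pinning an exact image in the Cloud Run service so
# that a merge to master which changes this value performs the release.
# All names and the registry URL are hypothetical.
resource "google_cloud_run_service" "webapp" {
  name     = "webapp"
  location = "europe-west2"

  template {
    spec {
      containers {
        # Releasing is a merge request which bumps this URL.
        image = "registry.gitlab.example.com/product/webapp:1.2.3"
      }
    }
  }
}
```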
Our boilerplate configuration creates a dedicated service account identity for the application. This service account will be used by Google API libraries which use application-default credentials.
This service account is granted the following permissions:
- connecting to the SQL instance,
- reading the sensitive settings secret, and
- reading the non-sensitive settings storage object.
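These grants might be expressed roughly as follows. Resource and variable names are illustrative; the roles are the usual least-privilege choices for each permission:

```terraform
# Illustrative sketch of the three grants listed above; names are hypothetical.
resource "google_project_iam_member" "webapp_sql" {
  project = var.project
  role    = "roles/cloudsql.client" # connect to the SQL instance
  member  = "serviceAccount:${google_service_account.webapp.email}"
}

resource "google_secret_manager_secret_iam_member" "webapp_settings" {
  secret_id = google_secret_manager_secret.settings.id
  role      = "roles/secretmanager.secretAccessor" # read the sensitive settings
  member    = "serviceAccount:${google_service_account.webapp.email}"
}

# Sketched as a bucket-level grant; an object-level ACL would also work.
resource "google_storage_bucket_iam_member" "webapp_config" {
  bucket = google_storage_bucket.config.name
  role   = "roles/storage.objectViewer" # read the non-sensitive settings object
  member = "serviceAccount:${google_service_account.webapp.email}"
}
```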
TLS certificates will also be provisioned for any domain specified in local.webapp_custom_dns_name but DNS records will not be created. Usually this local is used to host the "friendly" .cam.ac.uk domain for the service and the DNS records for that domain must be created by other means. Similarly, the project admin terraform service account must be verified as an owner of the domain.
The Cloud Run service itself is configured in webapp.tf. This file configures:
- a Google Secret Manager secret to hold sensitive configuration,
- a Google Cloud Storage object to hold non-sensitive configuration,
- a database user and password within the SQL database instance,
- the Cloud Run service itself, and
- a DNS record for the application if it is not behind a load balancer.
There is some terraform magic in the file to only set a custom DNS name if the application is not behind a load balancer. If the application is behind a load balancer, the Cloud Run service is configured to be "internal and load balancer only" and not accessible from the public Internet.
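The "internal and load balancer only" setting corresponds to a Cloud Run ingress annotation, so the conditional might be sketched like this (the local name is illustrative):

```terraform
# Sketch of the conditional ingress setting; local.webapp_use_load_balancer
# is a hypothetical name for the boilerplate's load-balancer toggle.
resource "google_cloud_run_service" "webapp" {
  # ...
  metadata {
    annotations = {
      # Public when standalone, reachable only via the load balancer otherwise.
      "run.googleapis.com/ingress" = (
        local.webapp_use_load_balancer ? "internal-and-cloud-load-balancing" : "all"
      )
    }
  }
}
```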
We start with default values for min_scale and max_scale suitable for a lightly-used web application. In particular, min_scale is initially zero, which allows the web application to use no hosting resources when it is not being used. Generally we would increase max_scale as the application gets more use and would increase min_scale if we are seeing latency spikes due to application startup delays.
Even if the web application uses database connection pooling, there is a minimum of one connection per server process. As such, one needs to make sure that max_scale multiplied by the number of server processes in the container is less than the maximum connection count of the SQL instance. For our webapp boilerplate, there are usually four server processes per container instance.
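As a back-of-the-envelope check, the arithmetic can be kept next to the scaling settings. The numbers here are purely illustrative:

```terraform
# Illustrative worked example: four server processes per instance and a
# max_scale of 20 implies up to 80 database connections at full scale-out.
# This figure must stay below the SQL instance's connection limit.
locals {
  server_processes_per_instance = 4
  max_scale                     = 20
  peak_db_connections           = local.max_scale * local.server_processes_per_instance # 80
}
```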
To avoid secrets appearing in the environment, we have re-architected our applications to load some configuration at runtime. For our Django projects, we make use of a Python module and code in our settings which load settings from YAML-formatted documents. These documents are located at a set of comma-separated URLs passed in the EXTRA_SETTINGS_URLS environment variable. The URLs can use any schemes supported by our geddit library.
Our boilerplate passes two URLs: a gs://... URL pointing to non-sensitive settings stored in a Cloud Storage object and a sm://... URL pointing to sensitive settings stored in a Secret Manager secret.
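Wired into the container, this might look like the following sketch; the bucket, project and secret names are all hypothetical:

```terraform
# Illustrative sketch of how the two settings URLs reach the container.
resource "google_cloud_run_service" "webapp" {
  # ...
  template {
    spec {
      containers {
        image = "registry.gitlab.example.com/product/webapp:1.2.3"
        env {
          name  = "EXTRA_SETTINGS_URLS"
          value = "gs://example-config/settings.yaml,sm://example-project/webapp-settings"
        }
      }
    }
  }
}
```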
Note that the Cloud Storage object is non-public; it can only be read by the web application's service account. Despite this, it is not suitable for storing sensitive values since they will be visible to anyone browsing the bucket in the Google Cloud console.
When would we ever use the Cloud Storage object?
Secret manager secrets can support a maximum of 64KiB of content. For most of our applications the sensitive and non-sensitive configuration fits well within this limit and we put all settings within the secret for convenience. The Cloud Storage object is there to provide an "overflow" for non-sensitive values if we breach the 64KiB limit.
The settings themselves are encoded in a YAML document. We use terraform's yamlencode function to let us interpolate values without worrying about character-escaping problems. Common secrets such as database credentials and Django secret keys are managed entirely by terraform.
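A minimal sketch of this pattern, assuming terraform's random provider is used to generate the secrets (resource names are illustrative):

```terraform
# Illustrative: yamlencode renders the settings document so interpolated
# values need no manual escaping. Names are hypothetical.
resource "random_password" "django_secret_key" {
  length = 50
}

resource "google_secret_manager_secret_version" "settings" {
  secret = google_secret_manager_secret.settings.id

  secret_data = yamlencode({
    SECRET_KEY        = random_password.django_secret_key.result
    DATABASE_PASSWORD = random_password.db_user.result
  })
}
```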
We have an open issue to make use of functionality added to Cloud Run which supports loading secrets automatically. This may cause our configuration method to change in the future.
When deploying third-party applications it is usually non-trivial to modify them to load configuration from Secret Manager secrets. In this case we make use of a tool called berglas. This tool wraps the third-party application and detects environment variables which contain sm://... formatted URLs. These URLs are fetched and then, depending on their format, the content is used either to replace the environment variable or is written to a file on disk.
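A container spec fragment for this pattern might look as follows, assuming the image has the berglas binary available; the entrypoint, secret and project names are all hypothetical:

```terraform
# Illustrative: berglas wraps the third-party entrypoint and resolves any
# environment variable holding an sm://... reference before the wrapped
# application starts.
containers {
  image   = "registry.gitlab.example.com/product/thirdparty:1.0.0"
  command = ["berglas", "exec", "--"]
  args    = ["/app/start.sh"]

  env {
    name  = "DATABASE_PASSWORD"
    value = "sm://example-project/database-password"
  }
}
```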
Cloud Load Balancer¶
In our boilerplate the application can optionally be hosted behind a Cloud Load Balancer. Using a Cloud Load Balancer has the following advantages:
- We can have a static ingress IP which is occasionally useful if we need to have long-lived DNS records or if it is non-trivial to have dynamic records. (For example, the IP register database only refreshes the live configuration once per hour.)
- We can make use of Cloud Armor [sic] rules to provide dynamic protection for the application.
- Using Cloud CDN allows us to cache application static assets in Google's Content delivery network.
- We can bring our own TLS certificates if we cannot make use of Google's auto-provisioning or there is a requirement to support EV/OV certificates.
While we don't make use of the feature yet, Cloud Load Balancer allows us to weight incoming traffic and direct it to multiple backends which aids with smoothly moving load between services when using Blue-green deployment strategies.
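The text above describes weighting at the load balancer layer; Cloud Run itself offers a related mechanism via revision traffic splitting, which sketches the same idea (revision names and percentages are illustrative):

```terraform
# Illustrative sketch of weighted traffic between two revisions for a
# blue-green style rollout; revision names are hypothetical.
resource "google_cloud_run_service" "webapp" {
  # ...
  traffic {
    revision_name = "webapp-blue"
    percent       = 90
  }
  traffic {
    revision_name = "webapp-green"
    percent       = 10
  }
}
```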
Cloud Load Balancer is configured in webapp_load_balancer.tf and makes use of Google's terraform module. The configuration is pretty much a carbon-copy of the example in the upstream module. We create a DNS record for the application if Load Balancing is enabled.
Our boilerplate assumes there is a single web application named "webapp". For some products this will be fine. For others we will need multiple applications. Currently we support multiple applications by copying and renaming the various webapp*.tf files and duplicating the resources they contain. An example of this can be seen in the identity platform (DevOps only) where two applications are configured.
For the moment products with multiple web-applications are rare and the overhead associated with manual copy-and-paste is manageable. In future we'd like to provide a cleaner solution for this, possibly by means of a custom terraform module.
- We use Cloud Run to host our web applications where possible.
- Our standard boilerplate contains example terraform configuration to:
- create the Cloud Run service,
- place application configuration in a Secret Manager secret,
- connect the application to a SQL database,
- place it behind a Cloud Load Balancer, and
- provision TLS certificates.
- We use a serverless platform to allow for "scale to zero" workloads where we can tune the number of active instances, and thus the cost, automatically with demand.
- Third-party applications which cannot load their configuration directly from a Secret Manager secret are wrapped with the berglas tool.
- Creating multiple web-applications within a single product is currently a process of copy and pasting configuration.