Serverless Web Applications
This page documents how we use the Cloud Run service to deploy web applications in a serverless manner.
How can there be no server?
Serverless is a term of art referring to a method of executing code in the Cloud without having to directly manage the servers which the code runs on. Clearly there are still servers, but their management is delegated to the Cloud provider, which specifies a common interface that code running on the platform should support.
Even before the term was coined, "serverless" computing had been around since at least the 1980s. Nowadays, rather than having a broad, complex set of supported libraries and runtimes, we define a thin contract between the hosting platform and a container which packages the application code.
Where possible we architect our applications to follow the serverless computing contract. This ensures that we can easily port our applications between any hosting platform which supports the Knative specification. The serverless computing contract is also a good target to aim for even when deploying applications in Kubernetes clusters as it provides a clean, orthogonal and well-specified interface between the application and the hosting environment.
Cloud Run
Cloud Run is a managed Knative service with some extensions which make things convenient for our needs. Aside from simply hosting a containerised application on the web with auto-scaling, Cloud Run also supports:
- Automatic exposure of a Cloud SQL instance to the container.
- Associating the workload with a Cloud IAM identity allowing the use of default application credentials within the cluster.
- Wrapping the service in a Cloud Load Balancer allowing for custom TLS certificates, HTTP to HTTPS redirect and content caching.
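As a rough sketch, the first two of these features come together in terraform along the following lines. All names, project IDs and regions here are illustrative, not our real configuration:

```terraform
# Illustrative service account providing the workload's Cloud IAM identity.
resource "google_service_account" "webapp" {
  account_id = "webapp-run"
}

resource "google_cloud_run_service" "webapp" {
  name     = "webapp"
  location = "europe-west2"

  template {
    spec {
      # Workload identity: application-default credentials inside the
      # container resolve to this service account.
      service_account_name = google_service_account.webapp.email

      containers {
        image = "gcr.io/example-project/webapp:1.2.3"
      }
    }

    metadata {
      annotations = {
        # Expose a Cloud SQL instance to the container via a Unix socket.
        "run.googleapis.com/cloudsql-instances" = "example-project:europe-west2:sql"
      }
    }
  }
}
```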
Our boilerplate
We have some example code for deploying a web application within our boilerplate (Developer Hub users only). This makes use of our standard Cloud Run application terraform module.
Upcoming changes
Our standard module configures everything about the Cloud Run application apart from the Docker image URL specifying the version of the application to deploy. Historically we have separated deployment of an application (creating the Knative revision) from deployment of infrastructure (creating the Knative service). As such the actual application deployment often happens in GitLab CI jobs rather than in terraform configuration.
As we move to a more GitOps model, this will most likely change. In a GitOps model we specify the exact version of the application image to deploy, the terraform is run by GitLab CI and "releasing" involves merging a change to `master` which changes the container image URL.
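Under a GitOps model, the pinned image version might be expressed directly in the terraform configuration, along these (illustrative) lines:

```terraform
# Hypothetical: under GitOps the exact image tag lives in the terraform
# configuration itself, so "releasing" is a merge request that changes
# this one value.
locals {
  webapp_image = "registry.gitlab.com/example-product/webapp:1.4.2"
}
```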
Our boilerplate splits web-application configuration into two parts: the service configuration and the application configuration. The load balancer configuration was removed in a recent revision of the boilerplate; these resources are now managed by the Google Cloud Run terraform module.
Our boilerplate configuration creates a dedicated service account identity for the application. This service account will be used by Google API libraries which use application-default credentials.
This service account is granted the following permissions:
- connecting to the SQL instance,
- reading the sensitive settings secret, and
- reading the non-sensitive settings storage object.
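These three grants correspond to standard predefined IAM roles and might be expressed roughly as follows. The resource names here are illustrative and refer to resources assumed to be defined elsewhere in the configuration:

```terraform
# Connecting to the SQL instance.
resource "google_project_iam_member" "webapp_sql" {
  project = "example-project"
  role    = "roles/cloudsql.client"
  member  = "serviceAccount:${google_service_account.webapp.email}"
}

# Reading the sensitive settings secret.
resource "google_secret_manager_secret_iam_member" "webapp_secret" {
  secret_id = google_secret_manager_secret.settings.id
  role      = "roles/secretmanager.secretAccessor"
  member    = "serviceAccount:${google_service_account.webapp.email}"
}

# Reading the non-sensitive settings storage object.
resource "google_storage_bucket_iam_member" "webapp_settings" {
  bucket = google_storage_bucket.settings.name
  role   = "roles/storage.objectViewer"
  member = "serviceAccount:${google_service_account.webapp.email}"
}
```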
DNS records are created for the application within the project's DNS zone. TLS certificates are provisioned automatically but the domain must first have been verified.
TLS certificates will also be provisioned for any domain specified in `local.webapp_custom_dns_name` but records will not be created. Usually this local is used to host the "friendly" `.cam.ac.uk` domain for the service and records under `.cam.ac.uk` must be created by other means. Similarly the project admin terraform service account must be verified as an owner of the domain with Google.
Service configuration
The Cloud Run service itself is configured in `webapp.tf`. This file configures:
- a Google Secret Manager secret to hold sensitive configuration,
- a Google Cloud Storage object to hold non-sensitive configuration,
- a database user and password within the SQL database instance,
- the Cloud Run service itself, and
- a DNS record for the application if it is not behind a load balancer.
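For example, the database user and password from the list above might be sketched as follows, with the credentials generated entirely within terraform. The resource names are illustrative:

```terraform
# Illustrative: a database password that never leaves terraform state.
resource "random_password" "webapp_db" {
  length  = 32
  special = false
}

# A dedicated database user for the web application, assuming the SQL
# instance is defined elsewhere in the configuration.
resource "google_sql_user" "webapp" {
  name     = "webapp"
  instance = google_sql_database_instance.sql.name
  password = random_password.webapp_db.result
}
```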
There is some terraform magic in the file to only set a custom DNS name if the application is not behind a load balancer. If the application is behind a load balancer, the Cloud Run service is configured to be "internal and load balancer only" and not accessible from the public Internet.
We start with default values for `max_scale` and `min_scale` suitable for a lightly-used web application. In particular, `min_scale` is initially zero which allows the web application to use no web hosting resources when it is not being used. Generally we would increase `max_scale` as the application gets more use and would increase `min_scale` if we are seeing latency spikes due to application startup delays.
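On Cloud Run these bounds are expressed as Knative autoscaling annotations on the service template. A minimal sketch, with illustrative values:

```terraform
# Fragment of the Cloud Run service template metadata (illustrative values).
metadata {
  annotations = {
    # min_scale = 0 allows the application to scale to zero when idle.
    "autoscaling.knative.dev/minScale" = "0"
    # max_scale bounds the number of concurrently running instances.
    "autoscaling.knative.dev/maxScale" = "4"
  }
}
```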
Important
Even if the web application uses database connection pooling there is a minimum of one connection per server process. As such one needs to make sure that `max_scale` multiplied by the number of server processes in the container is less than the maximum connection count of the SQL instance. For our webapp boilerplate, there are usually four server processes per container instance.
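The arithmetic can be made explicit in the configuration. A sketch, using illustrative values:

```terraform
# Illustrative sanity check: with 4 server processes per container,
# max_scale instances can hold up to max_scale * 4 database connections,
# and this figure must stay below the SQL instance's connection limit.
locals {
  webapp_max_scale        = 10
  processes_per_container = 4

  # 10 instances * 4 processes = up to 40 connections needed.
  max_db_connections_needed = local.webapp_max_scale * local.processes_per_container
}
```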
Application configuration
To avoid secrets appearing in the environment, we have re-architected our applications to load some configuration at runtime. For our Django projects, we make use of the externalsettings Python module and code in our settings modules which load settings from YAML-formatted documents. These documents are located at a set of comma-separated URLs passed in the `EXTRA_SETTINGS_URLS` environment variable. The URLs can use any schemes supported by our geddit library.
Our boilerplate passes two URLs: a `gs://...` URL pointing to non-sensitive settings stored in a Cloud Storage object and a `sm://...` URL pointing to sensitive settings stored in a Secret Manager secret.
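In the Cloud Run container spec this looks roughly as follows. The bucket, object and secret names are illustrative, and the exact `sm://...` path format is whatever the geddit library expects:

```terraform
# Fragment of the Cloud Run container spec (illustrative names).
env {
  name  = "EXTRA_SETTINGS_URLS"
  value = "gs://example-settings-bucket/settings.yaml,sm://example-project/webapp-settings"
}
```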
Note that the Cloud Storage object is non-public; it can only be read by the web application's service account. Despite this, it is not suitable for storing sensitive values since they will be visible to anyone browsing the bucket in the Google Cloud console.
When would we ever use the Cloud Storage object?
Secret Manager secrets support a maximum of 64KiB of content. For most of our applications the sensitive and non-sensitive configuration fits well within this limit and we put all settings within the secret for convenience. The Cloud Storage object provides an "overflow" for non-sensitive values if we breach the 64KiB limit.
The settings themselves are encoded in a YAML document specified in `webapp_settings.tf`. We use terraform's `yamlencode` function to let us interpolate values without worrying about character escaping problems. Common secrets such as database credentials and Django secret keys are managed entirely by terraform using the `random_password` resource.
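A minimal sketch of this pattern, with illustrative resource and setting names:

```terraform
# Illustrative: secrets generated by terraform and never typed by a human.
resource "random_password" "django_secret_key" {
  length = 50
}

resource "random_password" "db_password" {
  length  = 32
  special = false
}

# Render the settings as YAML; yamlencode handles all character escaping.
resource "google_secret_manager_secret_version" "settings" {
  secret = google_secret_manager_secret.settings.id
  secret_data = yamlencode({
    DJANGO_SECRET_KEY = random_password.django_secret_key.result
    DATABASE_PASSWORD = random_password.db_password.result
  })
}
```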
Future changes
We have an open issue to make use of functionality which has been added to Cloud Run which supports loading secrets automatically. This may cause our configuration method to change in the future.
Third-party applications
When deploying third-party applications it is usually non-trivial to modify them to load configuration from Secret Manager secrets. In this case we make use of a tool called berglas. This tool wraps the third-party application and detects environment variables which contain `sm://...` formatted URLs. These URLs are fetched and then, depending on their format, the content is used to either replace the environment variable or is written to a file on disk.
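In the container spec this might look as follows. The variable names, project and secret names are illustrative; the `?destination=` form for writing a secret to disk follows berglas's documented reference syntax:

```terraform
# Illustrative env vars consumed by the berglas wrapper at startup.
env {
  name  = "DATABASE_PASSWORD"
  # Replaced in-place with the secret's value.
  value = "sm://example-project/db-password"
}
env {
  name  = "TLS_KEY_FILE"
  # Secret content is written to the given path instead.
  value = "sm://example-project/tls-key?destination=/tmp/tls.key"
}
```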
The UIS Technical Design Authority have an example on their site of how to use berglas with a third-party application.
Cloud Load Balancer
In our boilerplate, if `local.webapp_use_cloud_load_balancer` is `true`, the application will be hosted behind a Cloud Load Balancer.
Using a Cloud Load balancer has the following advantages:
- We can have a static ingress IP which is occasionally useful if we need to have long-lived DNS records or if it is non-trivial to have dynamic records. (For example, the IP register database only refreshes the live configuration once per hour.)
- We can make use of Cloud Armor rules to provide dynamic protection for the application.
- Using Cloud CDN allows us to cache application static assets in Google's content delivery network.
- We can bring our own TLS certificates if we cannot make use of Google's auto-provisioning or there is a requirement to support EV/OV certificates.
Future work
While we don't make use of the feature yet, Cloud Load Balancer allows us to weight incoming traffic and direct it to multiple backends which aids with smoothly moving load between services when using Blue-green deployment strategies.
Cloud Load Balancer is configured in `webapp_load_balancer.tf` and makes use of Google's terraform module. The configuration closely follows the example in the upstream module. We create a DNS record for the application if load balancing is enabled.
Multiple web-applications
Our boilerplate assumes there is a single web application named "webapp". For some products this will be fine. For others we will need multiple applications. Currently we support multiple applications by copying and renaming the various `webapp*.tf` files and duplicating the `local.webapp_...` settings.
Example
An example of this can be seen in the identity platform infrastructure (DevOps only) where two applications are configured: `card` and `photo`.
For the moment products with multiple web-applications are rare and the overhead associated with manual copy-and-paste is manageable. In future we'd like to provide a cleaner solution for this, possibly by means of a custom terraform module.
Summary
In summary,
- We use Cloud Run to host our web applications where possible.
- Our standard boilerplate contains example terraform configuration to:
- create the Cloud Run service,
- place application configuration in a Secret Manager secret,
- connect the application to a SQL database,
- place it behind a Cloud Load Balancer, and
- provision TLS certificates.
- We use a serverless platform to allow for "scale to zero" workloads where we can tune the number of active instances, and thus the cost, automatically with demand.
- Third-party applications that cannot load their configuration directly from a Secret Manager secret are wrapped with the berglas tool.
- Creating multiple web-applications within a single product is currently a process of copy and pasting configuration.