Tasks principles and the ucam-faas tooling¶

This document provides a look into the details of the ucam-faas/tasks collections developer environment, explaining how ucam-faas works and some of the guiding principles behind it.

The following documentation may be more useful if you have a specific goal or want to learn how to get started:

The reference development guide explains how to write task functions using the ucam-faas library.
The how-to section contains guides covering specific tasks.

The tasks developer environment shares many guiding principles and philosophies with the webapp developer environment. For a full rundown of the core tools, see the documentation on that page.

Why `ucam-faas`?¶

Most cloud environments already have tools to perform a small task from a triggering event. GCP has Cloud Run Functions and AWS has the equivalent Lambda. As we deploy primarily to GCP, Cloud Run Functions would be the natural choice. Unfortunately, Functions do not allow you to deploy a container directly; instead, you provide Python code files.

Specifying the actual Python code itself, rather than a container, has a number of drawbacks:

It goes against our general principle of always delivering container images.
It bypasses the scanning that is set up for our container workflows.
It can be harder to maintain dependencies if the code is not treated carefully.

None of these problems is insurmountable, but creating our own container-based solution and fitting it into our regular development patterns seemed an easier approach.

There are other benefits from this approach as well:

We can standardise logging behaviour.
We can include useful functionality in the library to deserialise messages automatically.
We can control and package the infrastructure solution in a Terraform module.

When `ucam-faas`?¶

ucam-faas is a tool to create a container that does a single job. It is specialised to do this, and should not be used for any other purpose. Technically speaking, the tool sits on top of a Flask HTTP server and so responds to HTTP requests. Even so, making synchronous requests to a server and waiting for a response is a job for our standard webapp boilerplate.

Use ucam-faas when:

You want to run asynchronous event-driven workloads.
You want to run scheduled workloads.

The Python framework¶

The ucam-faas Python framework is built on Google's Functions Framework (which is what GCP Cloud Run Functions uses under the hood).

The developer interface into the module is a set of decorator functions that allow user-written functions to be registered as task handling functions. These functions can manipulate the triggering message directly, or provide a compatible message type to validate and deserialise message contents into a Python object.
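The decorator pattern described above can be sketched in plain Python. This is a minimal illustration only: the `task` decorator, the `HANDLERS` registry and the `UserCreated` message type are hypothetical names invented for this example and are not the ucam-faas API. See the reference page for the real decorators.

```python
import dataclasses
import typing

# Hypothetical registry and decorator names, for illustration only.
# They are not part of the ucam-faas library.
HANDLERS = {}


def task(func):
    """Register func as a handler, deserialising into its annotated type."""
    hints = typing.get_type_hints(func)
    hints.pop("return", None)
    # Assumes one parameter; an unannotated parameter receives the raw dict.
    message_type = next(iter(hints.values()), dict)

    def wrapper(payload: dict):
        if dataclasses.is_dataclass(message_type):
            # Raises TypeError if fields are missing or unexpected.
            return func(message_type(**payload))
        return func(payload)

    HANDLERS[func.__name__] = wrapper
    return wrapper


@dataclasses.dataclass
class UserCreated:
    user_id: str
    email: str


@task
def welcome_user(message: UserCreated) -> str:
    # Receives a validated Python object rather than a raw dict.
    return f"Welcome {message.email}"


@task
def log_raw(message: dict) -> int:
    # Works on the raw message contents directly.
    return len(message)
```

Calling `HANDLERS["welcome_user"]({"user_id": "u1", "email": "a@example.com"})` passes a `UserCreated` instance to the function, while `log_raw` receives the dictionary unchanged.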

This is particularly valuable as GCP PubSub messages are CloudEvents containing a special GCP PubSub object, which itself contains the actual message contents. Repeating the deserialisation of this object in each function would be tedious and hard to maintain, so packaging it with our library is a significant benefit.
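To show the nesting involved, here is a minimal sketch of unwrapping such a message by hand using only the standard library. It assumes the publisher sent a JSON payload and that the event data has the usual PubSub shape, with the payload base64-encoded under `message.data`. The function name and the example values are hypothetical.

```python
import base64
import json


def unwrap_pubsub_payload(cloud_event_data: dict) -> dict:
    """Return the JSON payload carried inside a PubSub CloudEvent's data."""
    message = cloud_event_data["message"]
    # The publisher's bytes arrive base64-encoded inside the PubSub object.
    raw_bytes = base64.b64decode(message["data"])
    return json.loads(raw_bytes)


# Build an example envelope around a publisher's payload.
payload = {"user_id": "u1", "action": "created"}
example_event_data = {
    "message": {
        "data": base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii"),
        "messageId": "1234567890",
        "attributes": {},
    },
    "subscription": "projects/example-project/subscriptions/example-sub",
}

decoded = unwrap_pubsub_payload(example_event_data)
```

Every task function would otherwise need its own copy of this logic, along with error handling for malformed messages.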

For more details on using the library and developing tasks see the reference page or the ucam-faas repository itself.

The entrypoint¶

The ucam-faas module also provides a main entrypoint into the service. This is expected to be the ENTRYPOINT of the Docker image that is built to run these functions. It accepts a target function to run when triggered.

This entrypoint starts up a Flask server, and accepts HTTP POST requests at the root URL.
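The behaviour can be approximated with a small WSGI application. This is a standard-library stand-in, not the Flask app that ucam-faas builds: `make_app`, `call` and the response bodies are hypothetical, and the real entrypoint's status codes and payload handling may differ.

```python
import io
import json


def make_app(target):
    """Build a WSGI app that runs one target function for POSTs to the root URL."""

    def app(environ, start_response):
        if environ["PATH_INFO"] != "/":
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"not found"]
        if environ["REQUEST_METHOD"] != "POST":
            start_response("405 Method Not Allowed", [("Content-Type", "text/plain")])
            return [b"method not allowed"]
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = json.loads(environ["wsgi.input"].read(length) or b"{}")
        # Exactly one target is bound when the app is created.
        target(body)
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"ok"]

    return app


def call(app, method, path, payload=None):
    """Invoke the WSGI app in-process and return the response status line."""
    raw = json.dumps(payload or {}).encode("utf-8")
    environ = {
        "REQUEST_METHOD": method,
        "PATH_INFO": path,
        "CONTENT_LENGTH": str(len(raw)),
        "wsgi.input": io.BytesIO(raw),
    }
    seen = []
    app(environ, lambda status, headers: seen.append(status))
    return seen[0]


received = []
app = make_app(received.append)
```

With this sketch, `call(app, "POST", "/", {"n": 1})` runs the target once, while a `GET` or a request to any other path never reaches it.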

Note

The ucam-faas entrypoint does not support running multiple functions in a single container. This is a deliberate design choice, made to prevent people from creating an API with the tool and to allow concurrency to be controlled at the infrastructure level.

It is, however, possible to build a single image containing multiple task functions and deploy multiple copies, each run with different docker command arguments. Building entirely separate containers with different default arguments reduces confusion and the potential for error, however, so that is the recommended approach.

The Terraform module(s)¶

The ucam-faas Terraform module provides a standardised way to deploy a built task function image. It is tightly coupled to the Python library, and is expected to be the only way that images of this kind are deployed.

It deploys the images into Google's Cloud Run service, and manages the triggering through PubSub and Cloud Scheduler.

For more details on setting up a deployment see the deployment guide.

Deployment architecture¶

For tasks triggered by a PubSub topic the (simplified) architecture is:

Simplified diagram of a ucam-faas PubSub triggered task.

The Terraform module handles creating the surrounding triggering mechanism, retries, and dead-lettering in a standard manner.
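The retry and dead-letter flow can be pictured with a small in-memory simulation. This is only a sketch of the general pattern: `deliver` and the handlers are hypothetical, the attempt limit of 5 is an assumed value rather than the module's setting, and real PubSub also waits between attempts.

```python
def deliver(message, handler, max_delivery_attempts=5):
    """Try handler up to max_delivery_attempts times, then dead-letter."""
    dead_letters = []
    for attempt in range(1, max_delivery_attempts + 1):
        try:
            handler(message)
            return attempt, dead_letters
        except Exception:
            # A failed delivery is treated as unacknowledged and retried.
            continue
    # Attempts exhausted: park the message for later inspection.
    dead_letters.append(message)
    return max_delivery_attempts, dead_letters


calls = {"count": 0}


def flaky_handler(message):
    """Fail on the first two calls, then succeed."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")


def broken_handler(message):
    """Fail on every call."""
    raise RuntimeError("permanent failure")


flaky_result = deliver({"id": 1}, flaky_handler)
broken_result = deliver({"id": 2}, broken_handler)
```

Here the flaky task succeeds on its third attempt and nothing is dead-lettered, while the permanently failing task exhausts its attempts and its message is set aside.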

For tasks triggered on a schedule (with a cron-style trigger) the architecture is:

Simplified diagram of a ucam-faas scheduled/cron triggered task.

Here, the PubSub topic is created internally. This is done so that the ucam-faas functions themselves are always called in exactly the same manner, regardless of the triggering mechanism. It may seem strange to have Cloud Scheduler publish to an internal topic rather than call the function directly, but doing so keeps the surrounding alerting infrastructure consistent between both trigger mechanisms. It also means the Python function itself does not need to be declared as a scheduled or topic-based task.
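A toy in-memory model makes the point concrete. All names here (`Topic`, `scheduler_tick`, `task`) and the envelope shape are hypothetical stand-ins rather than GCP or ucam-faas APIs. One task is subscribed to both topics purely to compare what it receives; a real deployment would use one trigger or the other.

```python
class Topic:
    """In-memory stand-in for a PubSub topic with push endpoints."""

    def __init__(self, name):
        self.name = name
        self.push_endpoints = []

    def publish(self, data):
        envelope = {"topic": self.name, "data": data}
        for endpoint in self.push_endpoints:
            endpoint(envelope)


def scheduler_tick(internal_topic):
    """Stand-in for a scheduled trigger: it publishes, it never calls the task."""
    internal_topic.publish({})


seen_shapes = []


def task(envelope):
    # Record only the shape of what arrived, not its contents.
    seen_shapes.append(sorted(envelope))


events = Topic("events")
internal = Topic("internal-schedule")
events.push_endpoints.append(task)
internal.push_endpoints.append(task)

events.publish({"user_id": "u1"})  # topic-triggered path
scheduler_tick(internal)           # schedule-triggered path
```

Both invocations reach the task with the same envelope shape, so the task code needs no knowledge of which trigger fired.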