Portable Units of Work in Distributed Systems

Two problems confronting builders of distributed systems are:

  • how do you encapsulate system logic into repeatable units of work with well-defined interfaces and execution requirements?
  • how do you enable portability of these units of work, once defined?

In guidepad, both of these questions are answered by one powerful abstraction: the Operation.

Functions as Data

Operations, at their core, are functions. They take some input in a defined schema, pass that input to a handler, and return some output in another defined schema. What guidepad does differently is treat each of these facts about a function as data and store that data in a storage engine. A concrete Operation can then be constructed from the stored definition at runtime.

Here's an example of how an Operation instance could be created with guidepad's CLI:

$ guidepad entity create operation - <<EOF
{"name": "my_demo_operation", "implemented_at": "guidepad.operations.builtin.demo.echo:echo"}
EOF

This demo operation definition is pretty simple - we created an operation named my_demo_operation and pointed its implementation to a Python function echo in the guidepad.operations.builtin.demo.echo module. Importantly, creating this definition makes the same operation immediately available to every user and process with access to the guidepad instance, no matter their location.
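To make the "functions as data" idea concrete, here is a minimal sketch of how a stored "module:function" string could be turned back into a callable at runtime. This is not guidepad's actual loader: the resolve_handler helper and the basename_op definition are hypothetical, and the standard library function os.path.basename stands in for a real handler.

```python
import importlib
import json


def resolve_handler(implemented_at: str):
    """Turn a 'package.module:function' string into the callable it names."""
    module_path, _, func_name = implemented_at.partition(":")
    if not module_path or not func_name:
        raise ValueError(f"expected 'module:function', got {implemented_at!r}")
    module = importlib.import_module(module_path)
    return getattr(module, func_name)


# The definition is plain data, so it can round-trip through any JSON store.
# Hypothetical definition: os.path:basename stands in for a real handler.
stored = json.dumps({"name": "basename_op", "implemented_at": "os.path:basename"})

definition = json.loads(stored)
handler = resolve_handler(definition["implemented_at"])
result = handler("/tmp/example.txt")
print(result)
```

Because only the string is stored, any process that can read the definition and import the module can rebuild the same callable.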

Let's test out running our new operation via the CLI:

$ guidepad operation chain 'my_demo_operation(input_str="hi")'
{'_id': 'f67dd602d9664cdb9cf8516dde94c3b2', 'output_str': 'hi'}

The operation chain command in guidepad allows users to call operations with inline Python syntax, passing the input as arguments to a function named after the operation (multiple operations can be chained together into simple pipelines with the | character). If we want to know a bit more about the operation, we can fetch its definition with entity list:

$ guidepad entity list operation '{"name": "my_demo_operation"}' -a name -a input_type -a output_type
┃ _id                              ┃ name              ┃ input_type ┃ output_type ┃
│ ada0b270ee9941acaf9e4cd9b4aa629c │ my_demo_operation │ echo_input │ echo_output │

Input and Output Definitions

You'll notice in this operation definition that the input_type and output_type are set to the names of two guidepad types, but when creating the operation we only provided the name and implemented_at. The types for the input and output were pulled in from the implementation:

### contents of echo.py
from guidepad.types.base_type import BaseType
from guidepad.types import attributes

class EchoInput(BaseType):

    input_str = attributes.String()

class EchoOutput(BaseType):

    output_str = attributes.String()

def echo(op_input: EchoInput) -> EchoOutput:

    return EchoOutput(output_str=op_input.input_str)

Guidepad noticed that the parameter and return type hints were set to guidepad types, and used those as the input and output types for the operation definition. Because the input and output schemas are types, and types are also data within guidepad, we can introspect the input/output types (useful if one ever forgets the input a particular operation takes!):

$ guidepad types describe-type echo_input
                                   echo_input attributes                                   
┃ attribute ┃ data_type ┃ default_value                                                   ┃
│ _id       │ string    │ <guidepad.types.attributes.UnsetValue object at 0x7fbb7c799cf0> │
│ input_str │ string    │ <guidepad.types.attributes.UnsetValue object at 0x7fbb7c799cf0> │
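The type-hint lookup described above can be approximated with the standard library alone. The sketch below is an assumption about how such inference might work, not guidepad's implementation: plain dataclasses stand in for BaseType subclasses, and the snake_case and infer_io_types helpers are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import get_type_hints


# Stand-ins for the guidepad types in echo.py (plain dataclasses here).
@dataclass
class EchoInput:
    input_str: str


@dataclass
class EchoOutput:
    output_str: str


def echo(op_input: EchoInput) -> EchoOutput:
    return EchoOutput(output_str=op_input.input_str)


def snake_case(name: str) -> str:
    """EchoInput -> echo_input (assumed naming convention)."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def infer_io_types(handler):
    """Read the first parameter hint and the return hint of a handler."""
    hints = get_type_hints(handler)
    output_cls = hints.pop("return", None)
    input_cls = next(iter(hints.values()), None)
    return (
        snake_case(input_cls.__name__) if input_cls else None,
        snake_case(output_cls.__name__) if output_cls else None,
    )


io_types = infer_io_types(echo)
print(io_types)
```

A handler without hints yields (None, None) in this sketch, which is the case where the types would have to be supplied some other way.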

Defining IO at Runtime

One of the biggest powerups achieved by storing operation definitions as data is the ability to edit their behavior at runtime. What if a different handler was implemented that took the same input as echo, but also output the length of the input string? In guidepad we can switch between the two handlers by simply editing the operation definition:

$ guidepad entity edit operation my_demo_operation -a name
_id: ada0b270ee9941acaf9e4cd9b4aa629c
active: true
batch: false
# ... (editor lines 4-12 not shown)
group: null
implemented_at: guidepad.operations.builtin.demo.echo:echo2
input_type: echo_input
is_async: false
name: my_demo_operation
operations_api: null
output_type: echo_output
path: null
provided_by: null
type: null

entity edit opens the default editor and populates it with the YAML version of the entity in question. Here we update implemented_at to point at echo2. If we fetch the operation we can see the output type is now echo_output2, which again was pulled from the type hints on the function definition.

$ guidepad entity list operation '{"name": "my_demo_operation"}' -a name -a input_type -a output_type
┃ _id                              ┃ name              ┃ input_type ┃ output_type  ┃
│ ada0b270ee9941acaf9e4cd9b4aa629c │ my_demo_operation │ echo_input │ echo_output2 │

$ guidepad operation chain 'my_demo_operation(input_str="hi")'
{'_id': '16d2f1e0ce6348a885c0d7e64f330e10', 'output_str': 'hi', 'output_len': 2}

Executing the operation we can see that the output has changed to include the length of the input string. What if the developer of echo2 didn't use type hints? How could we change the output type of the operation to match the new return value of the handler? Since the input and output types on an operation are just pointers to types, we can define a new type inline in the attribute:

_id: ada0b270ee9941acaf9e4cd9b4aa629c
active: true
batch: false
# ... (editor lines 4-12 not shown)
group: null
implemented_at: guidepad.operations.builtin.demo.echo:echo2
input_type: echo_input
is_async: false
name: my_demo_operation
operations_api: null
output_type: |
  {"name": "echo_output_2", "attributes": {"output_str": {"data_type": "string"}, "output_len": {"data_type": "int"}}}
path: null
provided_by: null
type: null

$ guidepad operation chain 'my_demo_operation(input_str="hi")'
{'_id': 'f5eb2c8cff9d402c8c29f121f93b01e2', 'output_str': 'hi', 'output_len': 2}

Because guidepad's datastores serve as the central point of truth, editing the definition in this way changes behavior across the entire system without any code changes/pushes required.
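Since an inline type definition is just JSON, it is easy to picture how a runtime could check a handler's output against it. The following is a rough sketch under stated assumptions: the PYTHON_TYPES mapping and the validate helper are hypothetical, and only the two data_type names that appear in this post are covered.

```python
import json

# Assumed mapping from the data_type names used in this post to Python types.
PYTHON_TYPES = {"string": str, "int": int}


def validate(type_def: dict, value: dict) -> list:
    """Return a list of problems found when checking value against type_def."""
    problems = []
    for attr, spec in type_def["attributes"].items():
        expected = PYTHON_TYPES[spec["data_type"]]
        if attr not in value:
            problems.append(f"missing attribute: {attr}")
        elif not isinstance(value[attr], expected):
            problems.append(f"{attr} should be {spec['data_type']}")
    return problems


# The same inline definition that was placed in output_type above.
inline_type = json.loads(
    '{"name": "echo_output_2", "attributes": {"output_str": {"data_type": "string"}, '
    '"output_len": {"data_type": "int"}}}'
)

ok = validate(inline_type, {"output_str": "hi", "output_len": 2})
bad = validate(inline_type, {"output_str": "hi"})
print(ok, bad)
```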

Distributed Execution

The previous section detailed how operations solve the first problem presented in this blog: defining re-usable bits of logic and typing their inputs and outputs at runtime. But what about making this logic portable? That's where guidepad's support for distributed execution of operations comes into play.

For the purposes of this demonstration, we'll lean on the list_service operation - an operation that comes pre-installed with every guidepad instance. It can be described with the entity list command:

$ guidepad entity list operation '{"name": "list_service"}'
┃ _id         ┃ name        ┃ group        ┃ type        ┃ active ┃ path         ┃ operations… ┃ provided_by  ┃ implemente… ┃ input_type   ┃ output_type ┃ framework_s… ┃ batch ┃ is_async ┃ disable_au… ┃
│ list_servi… │ list_servi… │ autogenerat… │ guidepad.b… │ True   │ autogenerat… │ <guidepad.… │ <guidepad.t… │ None        │ list_servic… │ list_servi… │ runtime_typ… │ False │ False    │ False       │
│             │             │              │             │        │              │ object at   │ object at    │             │              │             │              │       │          │             │
│             │             │              │             │        │              │ 0x7f0015fe… │ 0x7f0015fe9… │             │              │             │              │       │          │             │

Of course, we can run the operation locally and get some results:

$ guidepad operation chain 'list_service(name={"op": "$eq", "value": "my_local_ecr_helper"})'
{'_id': '86e800825f96412084e12aff3b11d074', 'count': 1, 'instances': [{'_id': 'a6e8865ca4764b39873f1d027093a6d9', 'name': 'my_local_ecr_helper', 'deployed_on': None, 'requirements': [], 'log_level': 'INFO', 'environment': None, 'environment_variables': [], 'current_states': ['1fbc923a3dc84acc9e264471e4d99cc6'], 'state_machine': None, 'state_plans': ['default_k8s_ecr_helper_deploy'], 'monitor_interval': 10, 'service_type': 'k8s_ecr_helper', 'version': None, 'service_hosts': [], 'user': None, 'aws_account_id': '', 'aws_region': '', 'aws_access_key': '', 'aws_secret_key': ''}]}

But we can also run the operation asynchronously in a distributed manner:

$ guidepad entity duplicate operation list_service -s name list_service_async -s is_async true
{'ok': True, 'message': 'Made a copy of operation-list_service with new id: 0cb5c6dd330b4c6e9cea6c95f8bd831a'}

$ guidepad operation chain 'list_service_async(name={"op": "$eq", "value": "my_local_ecr_helper"})'
{'invocation_id': 'a4e262a371804a58bd1d1725ae55c856'}

$ guidepad entity list operation_execution '{"operation": "0cb5c6dd330b4c6e9cea6c95f8bd831a"}'
┃ _id                              ┃ operation                                                                ┃ caller_id                        ┃ caller_type    ┃ started_at                       ┃ ended_at                         ┃ error ┃ output_artifact                                                          ┃ user                                                                     ┃
│ 7795c04f8ff24bef95d210ac7d001a9b │ <guidepad.types.attributes.ReferenceCollection object at 0x7f364a0063e0> │ 0d3fd45321e0450e996a456916ea5157 │ job_invocation │ 2023-11-28 16:32:52.123817+00:00 │ 2023-11-28 16:32:52.131200+00:00 │ False │ <guidepad.types.attributes.ReferenceCollection object at 0x7f364a006080> │ <guidepad.types.attributes.ReferenceCollection object at 0x7f364a006290> │

$ guidepad artifact retrieve 36dcf117820949918a793653444c52da async_list_service.json 

$ cat async_list_service.json 
{"_id": "294e782857d0430a9c09be79c962afd6", "count": 1, "instances": [{"_id": "a6e8865ca4764b39873f1d027093a6d9", "name": "my_local_ecr_helper", "deployed_on": null, "requirements": [], "log_level": "INFO", "environment": null, "environment_variables": [], "current_states": ["1fbc923a3dc84acc9e264471e4d99cc6"], "state_machine": null, "state_plans": ["default_k8s_ecr_helper_deploy"], "monitor_interval": 10, "service_type": "k8s_ecr_helper", "version": null, "service_hosts": [], "user": null, "aws_account_id": "", "aws_region": "us-east-1", "aws_access_key": "", "aws_secret_key": ""}]}

In the above snippet, we:

  1. created a copy of the built-in list_service operation, with a different name and is_async set to true
  2. executed the operation in a remote environment
  3. fetched the record of the operation's execution
  4. downloaded and inspected the stored results

How and where did the operation execute? When the operation was called, guidepad dynamically created and scheduled a work plan invocation with instructions on how to execute the operation, in one of the environments configured within our guidepad instance. When the operation completed, guidepad persisted its output as an artifact and linked it to the operation_execution record.
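The invoke, record, and artifact flow described above can be modeled in a few lines. This is only an in-process illustration, not how guidepad schedules work plans: a thread pool stands in for a remote environment, the dictionaries stand in for datastores, and invoke_async, _run, and the simplified list_service handler are hypothetical.

```python
import json
import uuid
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timezone

executor = ThreadPoolExecutor(max_workers=2)  # stand-in for a remote environment
executions = {}  # invocation_id -> execution record
artifacts = {}   # artifact_id -> serialized output
futures = {}     # invocation_id -> Future (only used to wait in this demo)


def list_service(name: str) -> dict:
    """Simplified stand-in for the built-in operation's handler."""
    return {"count": 1, "instances": [{"name": name}]}


def _run(invocation_id, handler, kwargs):
    record = executions[invocation_id]
    record["started_at"] = datetime.now(timezone.utc)
    try:
        output = handler(**kwargs)
        artifact_id = uuid.uuid4().hex
        artifacts[artifact_id] = json.dumps(output)  # persist the output
        record["output_artifact"] = artifact_id      # link it to the record
    except Exception:
        record["error"] = True
    finally:
        record["ended_at"] = datetime.now(timezone.utc)


def invoke_async(handler, **kwargs) -> dict:
    """Schedule the handler and return immediately with an invocation id."""
    invocation_id = uuid.uuid4().hex
    executions[invocation_id] = {"error": False, "output_artifact": None}
    futures[invocation_id] = executor.submit(_run, invocation_id, handler, kwargs)
    return {"invocation_id": invocation_id}


receipt = invoke_async(list_service, name="my_local_ecr_helper")
futures[receipt["invocation_id"]].result()  # wait for the worker, demo only
record = executions[receipt["invocation_id"]]
stored_output = json.loads(artifacts[record["output_artifact"]])
print(stored_output)
executor.shutdown()
```

The caller only ever holds the invocation id; everything else is looked up later from the stored record and artifact, which mirrors the CLI steps above.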

As a general rule, guidepad can execute operations in any environment for which it has a control plane implemented. We currently have control planes for Kubernetes, AWS, GCloud, Azure, and bare metal servers, with the list constantly expanding. Users can use guidepad's plugin framework to author their own control planes, should there be an exotic or niche use case.

Wiring Operations to Services

Remote execution of operations through asynchronous work plan scheduling works for operations with a handler written in Python, but what about other languages? Let's say you had this simple REST API, written in Go:

package main

import (
    "net/http"

    "github.com/gin-gonic/gin"
)

// album represents data about a record album.
type album struct {
    ID     string  `json:"id"`
    Title  string  `json:"title"`
    Artist string  `json:"artist"`
    Price  float64 `json:"price"`
}

// albums slice to seed record album data.
var albums = []album{
    {ID: "1", Title: "Blue Train", Artist: "John Coltrane", Price: 56.99},
    {ID: "2", Title: "Jeru", Artist: "Gerry Mulligan", Price: 17.99},
    {ID: "3", Title: "Sarah Vaughan and Clifford Brown", Artist: "Sarah Vaughan", Price: 39.99},
}

func main() {
    router := gin.Default()
    router.GET("/list_albums", getAlbums)

    // The listen address is an assumption; the original snippet does not show one.
    router.Run("localhost:8080")
}

// getAlbums responds with the list of all albums as JSON.
func getAlbums(c *gin.Context) {
    c.IndentedJSON(http.StatusOK, albums)
}

Using guidepad, you create a service definition for this API and deploy it to your environment of choice using the state plan/state machine framework. As part of that deployment, a service exposure can be created that provides access to the deployed service from outside the service's environment (these steps are omitted for brevity; this blog is already quite long).

Then, you could create an Operation that is "provided by" the service:

$ guidepad entity create operation - <<EOF
{"name": "list_albums", "provided_by": "<UUID OF ServiceExposure>"}
EOF

When the list_albums operation is called, guidepad will fetch the current connection details from the ServiceExposure and perform RPC to execute the operation against the deployed service. The service exposure abstraction manages the negotiation of connection details in different environments, allowing for a single interface for accessing the functionality of the operation regardless of which environment(s) the providing service is deployed into.
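As a rough illustration of that lookup-then-call pattern, the sketch below resolves connection details at call time and issues a plain HTTP request. Everything here is an assumption made for the demo: a small Python server stands in for the Go API, the exposures dictionary and the demo-exposure id are hypothetical, and mapping the operation name to a URL path is a guess rather than documented guidepad behavior.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

ALBUMS = [{"id": "1", "title": "Blue Train", "artist": "John Coltrane", "price": 56.99}]


class AlbumHandler(BaseHTTPRequestHandler):
    """Local stand-in for the deployed Go service."""

    def do_GET(self):
        if self.path != "/list_albums":
            self.send_error(404)
            return
        body = json.dumps(ALBUMS).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


server = HTTPServer(("127.0.0.1", 0), AlbumHandler)  # port 0 picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# Hypothetical exposure registry: exposure id -> current connection details.
exposures = {"demo-exposure": {"host": "127.0.0.1", "port": server.server_address[1]}}
operation = {"name": "list_albums", "provided_by": "demo-exposure"}

# Ignore any proxy settings in the environment so the local call stays local.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def call_operation(op: dict):
    conn = exposures[op["provided_by"]]  # looked up on every call, never cached
    url = f"http://{conn['host']}:{conn['port']}/{op['name']}"
    with opener.open(url, timeout=5) as resp:
        return json.loads(resp.read())


albums = call_operation(operation)
print(albums[0]["title"])
server.shutdown()
server.server_close()
```

If the service moved to another host, only the exposure entry would change; the caller's code and the operation definition would stay the same.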

Serverless Operations Anywhere

In a serverless framework, functionality is provided without concern for the infrastructure that functionality depends on. Through the operation abstraction's ability to execute functionality in heterogeneous compute environments, whether as an asynchronous call or RPC to a deployed service, teams can build serverless functionality on top of their existing infrastructure. This frees organizations from the need to move to the cloud just to take advantage of offerings such as AWS Lambda and Fargate. Of course, if you'd like your operations to run on those frameworks, they can be represented as environment types within guidepad and become targets for asynchronous execution (without changing anything about how your operations are invoked!).

With guidepad's built-in state machine functionality for services, users can even design sophisticated hot/cold behavior for the services backing their operations:

In the above diagram, the diamonds represent service states, with the arrows between them being the permissible state transitions. Each transition is gated on a set of guidepad requirement entities, which must be satisfied before the transition can occur. These requirements are represented by the boxes with the white background. Guidepad is able to autonomously manage these state machines, transitioning services between states when the conditions are met. In this state machine, the service backing an operation would be automatically deployed if it is not deployed and an operation request is received, and then it would be undeployed if there were no requests within a specific time window. Another set of conditions would automatically scale the number of replicas of the service that were deployed, if a set of throughput thresholds was violated.

This diagram would be represented by a set of data entities within a guidepad instance, allowing for real-time modification of the state machine without writing any files or pushing to repositories.
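A requirement-gated state machine of this kind can be sketched as plain data plus a small step function. The example below is a simplified assumption, not guidepad's entity model: the Transition class, the metric names, and the 300-second idle window are all hypothetical, and replica scaling is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Transition:
    source: str
    target: str
    # Every requirement must be satisfied before the transition may fire.
    requirements: List[Callable[[Dict], bool]]


IDLE_WINDOW_S = 300  # hypothetical idle window

TRANSITIONS = [
    # Deploy when a request arrives for an undeployed service.
    Transition("undeployed", "deployed", [lambda m: m["pending_requests"] > 0]),
    # Undeploy once no request has been seen within the idle window.
    Transition(
        "deployed",
        "undeployed",
        [lambda m: m["seconds_since_last_request"] > IDLE_WINDOW_S],
    ),
]


def step(state: str, metrics: Dict) -> str:
    """Return the next state, or the current one if no transition is satisfied."""
    for t in TRANSITIONS:
        if t.source == state and all(req(metrics) for req in t.requirements):
            return t.target
    return state


state = step("undeployed", {"pending_requests": 1, "seconds_since_last_request": 0})
after_request = state
state = step(state, {"pending_requests": 0, "seconds_since_last_request": 10})
still_warm = state
state = step(state, {"pending_requests": 0, "seconds_since_last_request": 600})
after_idle = state
print(after_request, still_warm, after_idle)
```

Because the transitions live in a list rather than in control flow, changing the hot/cold policy means editing data, which is the property the paragraph above relies on.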
