Automating Containerized Tasks Using Argo Workflows
Lightweight introduction to Argo toolset - starting with Workflows
One of the most widely used open source tools for container orchestration in the CNCF landscape is Argo. It comprises four projects - Workflows, Rollouts, Events, and CD (Continuous Delivery). I recently got some time to play around with Argo Workflows, and that is what this post discusses.
Automation is king today in almost every walk of life. It abstracts away redundant, menial tasks and lets us focus on better things. In most automation solutions, and especially in DevOps, automation involves a series of steps and decisions undertaken by the system to achieve a certain result.
Such a series of steps and decisions is known as a workflow. Argo Workflows brings the same idea to containers: with it, you can program a workflow to be executed by multiple containers, either in sequence or in parallel.
Pre-requisites
Apart from kubectl, Docker, and minikube, working with Argo Workflows requires the Argo CLI to be installed as well. Its configuration files follow a format similar to, but not the same as, Kubernetes resources. Argo introduces CRDs (Custom Resource Definitions) into K8s, and Workflow is one of them. We will see how later in this post.
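Assuming a local minikube cluster, a minimal sketch of installing the workflow controller and verifying the setup could look like the commands below. The release manifest URL uses a version placeholder - check the Argo Workflows documentation for the current release.

# create a namespace for the workflow controller
kubectl create namespace argo
# install the controller and its CRDs (substitute a real release version)
kubectl apply -n argo -f https://github.com/argoproj/argo-workflows/releases/download/<version>/install.yaml
# confirm the Workflow CRD is registered
kubectl get crd workflows.argoproj.io
# confirm the Argo CLI is installed and on the PATH
argo version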
Why do we need workflows?
Imagine a scenario where we have to process an image (picture) in a series of steps to apply multiple effects, followed by changing the dimensions of the photograph, and so on. Let us assume that every modification is performed by an individual program which is containerised. If we know the sequence of transformations, we can use Argo Workflows to spin up the corresponding container images, pass the image as input for processing, then pass the processed image to the next container, and so on.
Basic Argo Workflow example
You might have used Docker to run single containers, or kubectl to run an instance of a container in a pod within Kubernetes. The example below shows how to achieve the same using an Argo Workflow. Note that this example is just to familiarise you with how Argo templates are used to create container instances.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: image-inverter-
spec:
  entrypoint: invertimage
  templates:
  - name: invertimage
    container:
      image: docker/invertimage
      command: [ invert ]
      args: [ "picture.jpg" ]
Here, we specify the apiVersion as argoproj.io/v1alpha1 and kind as Workflow. This lets K8s know where to get the resource definitions from and what kind of resource will be created in the K8s environment. We also supply a generateName attribute under metadata, which generates a unique name by appending a string of random alphanumeric characters.
The spec section defines an entrypoint; every workflow requires an entry point, where execution begins. The templates attribute that follows defines the entrypoint further: its container image repository location, the command to run in the container, and some arguments.
Apply this YAML template by running the Argo CLI command below.
argo submit <filename>.yaml
This will create a container instance of the invertimage image, run the invert command with the args specified, and shut down once the invert process is complete.
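While the workflow runs, the Argo CLI can also be used to follow its progress. A quick sketch is shown below; the workflow name in the last command is hypothetical - use the generated name that argo submit prints.

# submit and stream status updates until the workflow finishes
argo submit <filename>.yaml --watch
# list workflows and their current phases
argo list
# fetch the container logs of a running or finished workflow
argo logs image-inverter-abc12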
High-level image processing example
The example above is a very basic introduction to the YAML needed to use Argo's capabilities. Let's assume that we want to process the image through the following sequence of operations:
Invert operation - containerised as invertimage image
Increase the brightness operation - containerised as brightness image
Highlight the edges operation - containerised as highlight image
Compress - containerised as compress image
To achieve this, we make use of Steps in Argo Workflow, as shown below.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: image-processing-
spec:
  entrypoint: image-processor
  templates:
  - name: invertimage
    inputs:
      parameters:
      - name: imagename
    container:
      image: docker/invertimage
      command: [ invert ]
      args: [ "{{inputs.parameters.imagename}}" ]
  - name: brightness
    inputs:
      parameters:
      - name: imagename
    container:
      image: docker/brightness
      command: [ brighten ]
      args: [ "{{inputs.parameters.imagename}}" ]
  - name: highlight
    inputs:
      parameters:
      - name: imagename
    container:
      image: docker/highlight
      command: [ highlight ]
      args: [ "{{inputs.parameters.imagename}}" ]
  - name: compress
    inputs:
      parameters:
      - name: imagename
    container:
      image: docker/compress
      command: [ compress ]
      args: [ "{{inputs.parameters.imagename}}" ]
  - name: image-processor
    steps:
    - - name: invert
        template: invertimage
        arguments:
          parameters:
          - name: imagename
            value: "picture.jpg"
    - - name: brighten
        template: brightness
        arguments:
          parameters:
          - name: imagename
            value: "picture.jpg"
    - - name: highlight
        template: highlight
        arguments:
          parameters:
          - name: imagename
            value: "picture.jpg"
    - - name: compress
        template: compress
        arguments:
          parameters:
          - name: imagename
            value: "picture.jpg"
Note that the example above does NOT work - it is only for representational purposes, to understand how an Argo Workflow can be configured.
Here, after the entrypoint spec, we define a few templates. The template named in the entrypoint, image-processor, depends on further templates, which constitute the individual steps of the workflow. These individual steps are also called leaf nodes, and each of them specifies the corresponding image name based on the operation it performs.
In the image-processor template, we specify steps referring to the leaf node templates and pass a parameter argument indicating the filename of the image that needs to be processed. The sequencing is defined using dashes: a step introduced with ‘- -’ starts a new step group, meaning the workflow waits for the previous step to complete first, whereas a step introduced with a single ‘-’ belongs to the same group and runs in parallel with the step above it.
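For instance, if brightening and highlighting were independent of each other (an assumption for illustration), the image-processor template could run them in parallel within one step group and only then compress. A minimal sketch of the steps section under that assumption, with the arguments omitted for brevity (each step would still pass the imagename parameter as shown above):

  - name: image-processor
    steps:
    - - name: invert              # step group 1
        template: invertimage
    - - name: brighten            # step group 2: waits for group 1
        template: brightness
      - name: highlight           # same group: runs in parallel with brighten
        template: highlight
    - - name: compress            # step group 3: waits for group 2
        template: compress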
Learn more about Argo Workflows here: https://argo-workflows.readthedocs.io/en/latest/