Update: There is a follow-up post with some updates; you can read it here.
If you are using Azure Pipelines, then you have surely used Microsoft-hosted agents. With Microsoft-hosted agents, maintenance and upgrades are taken care of for you. However, there are times when self-hosted agents are needed (e.g. customized images, network connectivity requirements, etc.). Pipeline agents can be hosted stand-alone, on Azure virtual machine scale sets, or as Docker containers. Container-based agents are amazingly fast to spin up, which has led many to run self-hosted agents on a Kubernetes cluster. I am sure many have done this before, but I didn’t find a complete solution in my search. Therefore, I decided to build one.
The architecture can be drawn as follows:
There is a controller in a designated namespace that keeps an eye on the agent pool in Azure DevOps, and as soon as it sees new job requests queued, it spins up a container agent. It also listens for the Kubernetes events raised when pods finish executing a pipeline job; when such an event arrives, the controller cleans up the pod and unregisters the agent from Azure DevOps. Azure DevOps unfortunately doesn’t offer a service hook event such as “job queued”. Therefore, the controller uses the REST API to look for incoming job requests. Because this polling introduces latency, the controller always keeps N “standby” agents on Kubernetes. The standby count can be configured as needed.
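The scale-up decision itself is simple arithmetic: count the pool's queued job requests, then top the pool back up to the standby buffer, capped at the maximum. A minimal sketch of that calculation (these helper functions are mine, for illustration — the actual controller is a .NET service):

```shell
# Illustrative sketch of the controller's scale-up arithmetic;
# these helpers are NOT the actual controller code.

# desired_agent_count <queued-jobs> <standby-count> <max-agents>
# Enough agents for every queued job plus the standby buffer,
# capped at the configured maximum.
desired_agent_count() {
  local wanted=$(( $1 + $2 ))
  if (( wanted > $3 )); then echo "$3"; else echo "$wanted"; fi
}

# agents_to_spawn <queued-jobs> <current-agents> <standby-count> <max-agents>
# Number of new agent pods to create right now (never negative).
agents_to_spawn() {
  local desired
  desired=$(desired_agent_count "$1" "$3" "$4")
  local diff=$(( desired - $2 ))
  if (( diff > 0 )); then echo "$diff"; else echo 0; fi
}
```

With a standby count of 2 and a maximum of 25 (the values used later in controller.yaml), five queued jobs against two idle agents would yield five new pods.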
How to use
Installing the controller is straightforward. Because the controller dynamically spins up pods, deletes completed pods, and so on, it requires a cluster role. It also uses a Custom Resource Definition (CRD) to isolate the agent pod specification from the controller. The following manifest will install all the required pieces, including the Service Account and Cluster Role bindings, in a separate namespace called “octolamp-system”.
# install controller, CRD from GitHub
kubectl apply -f https://raw.githubusercontent.com/cloudoven/azdo-k8s-agents/main/src/kubernetes/install.yaml
Configure agent namespace
You need to create a separate namespace where the Azure DevOps agent pods would be created and observed.
kubectl create namespace azdo-agents
Next, we need to create the container specification for the Azure DevOps agents. Microsoft documents how to create such an agent container image. I have created my own; however, you should build your own image and install the necessary tools according to your CI/CD needs. We will define this specification as a custom resource (an instance of the CRD installed above). Let’s create a file named agent-spec.yaml with the following content:
apiVersion: "azdo.octolamp.nl/v1"
kind: AgentSpec
metadata:
  namespace: azdo-agents
  name: cloudovenagent
spec:
  prefix: "container-agent"
  image: moimhossain/azdo-agent-linux-x64:latest
The image field needs to point to the container image that you want to use as the pipeline agent. Then apply this manifest to Kubernetes:
kubectl apply -f agent-spec.yaml
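If you build your own agent image, it can follow the pattern Microsoft documents for containerized Linux agents. The sketch below assumes the start.sh entrypoint script from Microsoft’s documentation, which downloads, configures, and runs the agent when the container starts:

```dockerfile
# Sketch of a Linux agent image, following Microsoft's documented pattern.
FROM ubuntu:22.04

RUN apt-get update && apt-get install -y --no-install-recommends \
      ca-certificates curl git jq libicu70 \
 && rm -rf /var/lib/apt/lists/*

# Install any extra tooling your pipelines need here (dotnet, node, az, ...).

WORKDIR /azp
# start.sh: the agent bootstrap script from Microsoft's docs.
COPY ./start.sh .
RUN chmod +x start.sh
ENTRYPOINT [ "./start.sh" ]
```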
Next, we will deploy the controller with the details of Azure DevOps organization. Let’s create a file
controller.yaml with the following content:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: octolamp-agent-controller
  namespace: octolamp-system
spec:
  replicas: 1
  selector:
    matchLabels:
      run: octolamp-agent-controller
  template:
    metadata:
      labels:
        run: octolamp-agent-controller
    spec:
      serviceAccountName: octolamp-service-account
      containers:
        - name: octolamp-agent-controller
          image: moimhossain/octolamp-agent-controller:net6-v1.0.0
          imagePullPolicy: Always
          env:
            - name: AZDO_ORG_URI
              value: https://dev.azure.com/<Organization Name>
            - name: AZDO_TOKEN
              value: <A PAT token that can manage agent pool>
            - name: AZDO_POOLNAME
              value: "k8s-pool"
            - name: TARGET_NAMESPACE
              value: "azdo-agents"
            - name: STANDBY_AGENT_COUNT
              value: "2"
            - name: MAX_AGENT_COUNT
              value: "25"
            - name: APPINSIGHT_CONN_STR
              value: <Application Insights connection string (not the instrumentation key!)>
          resources:
            limits:
              cpu: 100m
              memory: 100Mi
This file needs to be updated with your Azure DevOps organization URL, a personal access token (PAT) that can manage the agent pool, and the name of a pool that exists in your Azure DevOps organization (create one from Organization Settings > Pipelines > Agent pools > Add pool).
You can also configure the standby agent count and the maximum number of agents the controller is allowed to create (limiting it to a threshold).
Now, apply these changes to the Kubernetes cluster.
kubectl apply -f controller.yaml
That’s it. At this point, you should see container agents spinning up, and they will show up in the Azure DevOps agent pool UI.
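You can watch the standby agents appear, and follow the controller’s own logs, with standard kubectl commands against the namespaces created above:

```shell
# watch agent pods come and go in the agent namespace
kubectl get pods -n azdo-agents --watch

# follow the controller's logs
kubectl logs -n octolamp-system deploy/octolamp-agent-controller -f
```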
I am using Azure Kubernetes Service (AKS) for this example, and AKS supports autoscaling for node pools. That means when the controller spins up more agents than the AKS node pool has capacity for, AKS will add new nodes to the pool, which in turn gives the controller capacity to spin up more agents (assuming a lot of pipelines run in a brief time window).
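If the cluster autoscaler isn’t enabled on your node pool yet, it can be turned on with the Azure CLI (the resource group, cluster, node pool name, and count bounds below are placeholders):

```shell
az aks nodepool update \
  --resource-group <resource-group> \
  --cluster-name <aks-cluster> \
  --name <nodepool-name> \
  --enable-cluster-autoscaler \
  --min-count 1 \
  --max-count 10
```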
The controller keeps track of pod-completed events in Kubernetes, and whenever a pod completes a pipeline run, it removes the pod from the cluster and unregisters it from the Azure DevOps agent pool. Therefore, if there are no pipelines waiting, the controller will scale the agents back down to the standby count within ~2 minutes, which will eventually trigger the AKS autoscaler, and nodes will be scaled down within ~10 minutes.
This article doesn’t demonstrate Windows-based agents. However, as you have seen, the controller allows you to change the agent image and pod specification (via the CRD), so you should be able to make that work without much effort.
The entire code can be found on GitHub. The source code is MIT licensed, provided as-is (without any warranties), and you can use and modify it freely. However, I would appreciate it if you acknowledged the author. You are also more than welcome to contribute directly on GitHub.
4 thoughts on “Elastic self-hosted pool for Azure DevOps (on Kubernetes)”
Thanks for nice blog.
I replicated the setup as you described. When I try to build Docker images inside the agent pod using a pipeline, I get the error below:
“Cannot connect to the Docker daemon at unix://var/run/docker.sock. Is the docker daemon running ?”
It seems like a case of Docker-in-Docker. I just need to build Docker images (or execute some commands). Can you suggest how this can be done in your setup? I want to mount the sock file.
Mounting the Docker socket into containers can be super notorious and is not a recommended approach.
I work mostly with Microsoft tech, so I would recommend offloading/outsourcing the image build to a remote service, for instance using the ‘az acr build’ command of Azure Container Registry.
The details can be found here: https://learn.microsoft.com/en-us/cli/azure/acr?view=azure-cli-latest#az-acr-build
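For example, a pipeline step could hand the build off to ACR like this (the registry and image names are placeholders):

```shell
# builds the image inside Azure Container Registry;
# no local Docker daemon is needed on the agent
az acr build --registry <registry-name> --image myapp:v1 .
```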
Hi, I like this idea… but would this work with API whitelisting enabled on the cluster? If access to the API is restricted to an office network, how does ADO communicate with the cluster? And what if the pipeline I am running makes changes to the same cluster that the agent sits on?
Azure DevOps does NOT communicate with the Kubernetes API in this scenario. The controller (container) calls the Azure DevOps API. All you need is to make sure outbound traffic on port 443 is allowed from the containers.
The pipelines running on these agents can deploy to the same cluster as well.