Azure AD Pod Identity – password-less app-containers in AKS

Background

I have liked Azure Managed Identity since its advent. The concept behind Managed Identity is clever, and it adds observable value to any DevOps team. All the concerns around password configurations in multiple places, life-cycle management of secrets and certificates, and rotation policies suddenly become irrelevant (OK, in most cases).
Leveraging Managed Identity for applications hosted in Azure Virtual Machines, Azure Web Apps, Function Apps, etc. was straightforward. Managed Identity sits on top of the Azure Instance Metadata Service (IMDS). IMDS is a REST endpoint accessible to all IaaS VMs created via Azure Resource Manager. The endpoint is available at a well-known non-routable IP address (169.254.169.254) that can be accessed only from within the VM. Under the hood, Azure VMs, VMSS and Azure PaaS resources (e.g. Web Apps, Function Apps) leverage the metadata service to retrieve Azure AD tokens. Thus a VM or Web App essentially establishes its own “Application Identity” (which is what Managed Identity is) that Azure AD authenticates.
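To make that concrete, here is a minimal sketch (in Go, since that is what I use later in this post anyway) of a process inside a VM asking IMDS for a token. The resource URI (the ARM endpoint) and the API version are just example values; adjust them for the service you actually want to call.

package main

import (
    "fmt"
    "io/ioutil"
    "net/http"
)

func main() {
    // IMDS is reachable only from inside the VM; the Metadata header is mandatory.
    url := "http://169.254.169.254/metadata/identity/oauth2/token" +
        "?api-version=2018-02-01&resource=https%3A%2F%2Fmanagement.azure.com%2F"

    req, _ := http.NewRequest("GET", url, nil)
    req.Header.Set("Metadata", "true")

    res, err := http.DefaultClient.Do(req)
    if err != nil {
        panic(err)
    }
    defer res.Body.Close()

    // The response is a JSON document containing access_token, expires_on, etc.
    body, _ := ioutil.ReadAll(res.Body)
    fmt.Println(string(body))
}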

Managed Identity in Azure Kubernetes Service

Managed Identity in Kubernetes, however, is a different ballgame. Typically multiple applications (often developed by different teams in an organization) run in a single cluster, and pods are launched and terminated frequently on different nodes. Hence, a Managed Identity associated with the VM/VMSS is not sufficient; we need a way to assign an identity to every pod of an application. If a pod moves to a different node, the identity must somehow move with it to the new node (VM).

Azure Pod Identity

The good news is that Azure Pod Identity offers exactly that capability. Azure Pod Identity is an open-source project on GitHub.

Note: Managed pod identities is an open source project and is not supported by Azure technical support.

An application can use Azure Pod Identity to access Azure resources (e.g. Key Vault, Storage, Azure SQL Database) via Managed Identity; hence there is no secret or password involved anywhere in the process. Pods fetch access tokens scoped to resources directly from Azure Active Directory.

Concept

The following two components are installed in the cluster to achieve pod identity.

1. The Node Management Identity (NMI)

The AKS cluster runs this DaemonSet on every node. It intercepts outbound calls from pods requesting access tokens and proxies those calls using the predefined Managed Identity.

2. The Managed Identity Controller (MIC)

MIC is a central pod with permissions to query the Kubernetes API server; it checks for an Azure identity mapping that corresponds to a pod.

Figure: Azure Pod Identity components. Source: GitHub project – Azure Pod Identity

When pods request access to an Azure service, network rules redirect the traffic to the Node Management Identity (NMI) server. The NMI server identifies pods that request access to Azure services based on their remote address and queries the Managed Identity Controller (MIC). The MIC checks for Azure identity mappings in the AKS cluster, and the NMI server then requests an access token from Azure Active Directory (AD) based on the pod’s identity mapping. Azure AD provides access to the NMI server, which is returned to the pod. This access token can be used by the pod to then request access to services in Azure.

Source: Microsoft documentation

Azure Pipeline to bootstrap Pod Identity

I started with an existing RBAC-enabled Kubernetes cluster that I had created earlier. Azure AD Pod Identity sets up a Service Account, custom resource definitions (CRDs), cluster roles and bindings, the DaemonSet for NMI, etc. I wanted to do this via a pipeline so I can repeat the process on demand. Here is the interesting part of azure-pod-identity-setup-pipeline.yaml:

trigger:
- master
variables:
  tag: '$(Build.BuildId)'
  containerRegistry: $(acr-name).azurecr.io
  vmImageName: 'ubuntu-latest'
stages:
- stage: Build
  displayName: Aad-Pod-Identity-Setup
  jobs:  
  - job: Build
    displayName: Setup Aad-Pod-Identity.
    pool:
      vmImage: $(vmImageName)
    environment: 'Kubernetes-Cluster-Environment.default'
    steps:
      - bash: |
          kubectl apply -f https://raw.githubusercontent.com/Azure/aad-pod-identity/master/deploy/infra/deployment-rbac.yaml
        displayName: 'Setup Service Account, CRD, DaemonSet etc'


I will be using a “user-assigned identity” for my sample application. I have written a basic .NET app with a SQL back-end for this purpose. My end goal is to allow the .NET app to talk to SQL Server with pod identity.
Following the aad-pod-identity project instructions, I created the user-assigned identity:

      - task: AzureCLI@2
        inputs:
          scriptType: 'bash'
          scriptLocation: 'inlineScript'
          inlineScript: 'az identity create -g $(rgp) -n $(uaiName) -o json'

In my repository, I created the Azure Identity definition in a file named aad-pod-identity.yaml:

apiVersion: "aadpodidentity.k8s.io/v1"
kind: AzureIdentity
metadata:
  name: <a-idname>
spec:
  type: 0
  ResourceID: /subscriptions/<sub>/resourcegroups/<resourcegroup>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<name>
  ClientID: <clientId>
---
apiVersion: "aadpodidentity.k8s.io/v1"
kind: AzureIdentityBinding
metadata:
  name: demo1-azure-identity-binding
spec:
  AzureIdentity: <a-idname>
  Selector: managed-identity


And I added a task to deploy that too:

      - bash: |
          kubectl apply -f manifests/aad-pod-identity.yaml
        displayName: 'Setup Azure Identity'

I triggered the pipeline, and Azure Pod Identity was ready to roll.

Deploying application

I have a .NET Core web app (a Razor application) that I will configure to run with pod identity and connect to Azure SQL (the back-end) using Azure Active Directory authentication, with no password configured at the application level.
Here’s the manifest file (front-end.yaml) for the application. The crucial part is the label (aadpodidbinding: managed-identity) that matches the selector of the pod identity binding we defined before:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dysnomia-frontend
spec:
  replicas: 6
  selector:
    matchLabels:
      app: dysnomia-frontend
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  minReadySeconds: 5
  template:
    metadata:
      labels:
        app: dysnomia-frontend
        aadpodidbinding: managed-identity
    spec:
      nodeSelector:
        "beta.kubernetes.io/os": linux
      containers:
      - name: dysnomia-frontend
        image: #{containerRegistry}#/dysnomia-frontend:#{Build.BuildId}#
        imagePullPolicy: "Always"
That’s pretty much it. Once my application pipeline deploys the above manifest, the .NET application can connect to the Azure SQL database with the assigned pod identity.

      - bash: |
          kubectl apply -f manifests/front-end.yaml
        displayName: 'Deploy Front-end'


Of course, I also needed to create a role assignment for the user-assigned identity and enable Azure AD authentication on my SQL server, but I am not describing those steps here; I have written about that before.

What about non-Azure resources?

The above holds true for all Azure resources that support Managed Identity. That means our application can connect to Cosmos DB, Storage Accounts, Service Bus, Key Vault, and many other Azure resources without configuring any passwords or secrets anywhere in Kubernetes.
However, there are scenarios where we might want to run a Redis container or a SQL Server container in our Kubernetes cluster. Cost-wise, it can make sense for many use cases to run these in Kubernetes (as you already have a cluster) instead of Azure PaaS (e.g. Azure SQL or Azure Cache for Redis). In those cases, we must create the SQL password and configure it into our .NET app, typically using Kubernetes secrets.
I was wondering if I could store my SQL password in Azure Key Vault and let both my SQL container and my .NET app collect the password from Key Vault during launch using Azure Pod Identity. Kubernetes has a first-class option to handle such scenarios: Kubernetes secrets.

However, today I am playing with Azure AD Pod Identity, so I really wanted to use pod identity, just for fun ;-). Here’s how I managed to make it work.

SQL container, pod identity and Azure Key vault

I created a Key Vault and defined the SQL Server password as a secret there. I configured my .NET app to use pod identity to talk to Key Vault and set a Key Vault access policy so the user-assigned identity created above can read the SQL password. So far so good.

Now I wanted to run a SQL Server instance in my cluster that would also collect the password from Key Vault, the same way the .NET app does. It turned out the SQL Server 2019 image (mcr.microsoft.com/mssql/server:2019-latest) expects the password as an environment variable at container launch:

docker run -d -p 1433:1433 `
           -e "ACCEPT_EULA=Y" `
           -e "SA_PASSWORD=P@ssw0rD" `
           mcr.microsoft.com/mssql/server:2019-latest

Initially, I thought it would be easy to use an init container to grab the password from Azure Key Vault and pass it to the application container (SQL) as an environment variable. After some failed attempts, I realized that isn’t trivial. I can of course create volume mounts (e.g. an emptyDir) to convey the password from the init container to the application container, but that is rather dirty, isn’t it?
Secondly, I thought of creating my own Docker image based on the SQL Server image; then I could run a small script that grabs the password from Key Vault and sets it as an environment variable. A simple script with a few curl commands would do the trick, you might think. Well, there are a few small issues: the SQL Server container image is a stripped-down Ubuntu core, which has neither apt nor curl, and it does not run as root either, for all good reasons.


So, I wrote a small program in Go and compiled it to a binary.

package main

import (
    "encoding/json"
    "fmt"
    "io/ioutil"
    "net/http"
    "os"
)

type TokenResponse struct {
    Token_type   string `json:"token_type"`
    Access_token string `json:"access_token"`
}

type SecretResponse struct {
    Value string `json:"value"`
    Id    string `json:"id"`
}

func main() {
    var p TokenResponse
    var s SecretResponse

    // Token endpoint of the Instance Metadata Service (proxied by NMI inside the pod),
    // scoped to Azure Key Vault.
    imds := "http://169.254.169.254/metadata/identity/oauth2/token" +
            "?api-version=2018-02-01&resource=https%3A%2F%2Fvault.azure.net"
    // Key Vault secret endpoint: os.Args[1] is the vault host, os.Args[2] the secret name.
    kvUrl := "https://" + os.Args[1] + "/secrets/" + os.Args[2] +
             "?api-version=2016-10-01"

    client := &http.Client{}

    // Step 1: get an access token for Key Vault via Managed Identity.
    req, _ := http.NewRequest("GET", imds, nil)
    req.Header.Set("Metadata", "True")
    res, _ := client.Do(req)
    b, _ := ioutil.ReadAll(res.Body)
    json.Unmarshal(b, &p)

    // Step 2: call Key Vault with the bearer token and print the secret value.
    req, _ = http.NewRequest("GET", kvUrl, nil)
    req.Header.Set("Authorization", p.Token_type+" "+p.Access_token)
    res, _ = client.Do(req)
    b, _ = ioutil.ReadAll(res.Body)
    json.Unmarshal(b, &s)
    fmt.Println(s.Value)
}

This program simply grabs the secret from Azure Key Vault using Managed Identity. I created a binary out of it:

go build -o aadtoken


Next, I created my SQL container image with the following Dockerfile:

FROM mcr.microsoft.com/mssql/server:2019-latest

ENV ACCEPT_EULA=Y
ENV MSSQL_PID=Developer
ENV MSSQL_TCP_PORT=1433 
COPY ./aadtoken /
COPY ./startup.sh /
CMD [ "/bin/bash", "./startup.sh" ] 

As you can see, I am relying on a startup.sh bash script. Here it is:

echo "Retrieving AAD Token with Managed Identity..."
export SA_PASSWORD=$(./aadtoken $KeyVault $SecretName)
echo "SQL password received and cofigured successfully"
/opt/mssql/bin/sqlservr --accept-eula

I built the image, and here’s my SQL manifest to deploy it to Kubernetes:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mssql-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mssql
  template:
    metadata:
      labels:
        app: mssql
        aadpodidbinding: managed-identity
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: mssql
        image: #{ACR.Name}#/sql-server:2019-latest
        ports:
        - containerPort: 1433
        env:
        - name: KeyVault
          value: "#{KeyVault.Name}#"
        - name: SecretName
          value: "SQL_PASSWORD"
        volumeMounts:
        - name: mssqldb
          mountPath: /var/opt/mssql
      volumes:
      - name: mssqldb
        persistentVolumeClaim:
          claimName: mssql-data

I deployed the manifest and voilà, it all works. The SQL pods and the .NET app pods all use Managed Identity to connect to Key Vault and retrieve the secret, which is stored centrally in one place, keeping password management nice and tidy.

Conclusion

I find Azure Pod Identity a neat feature and a security best practice, especially when you are using Azure Kubernetes Service together with other Azure resources (e.g. Cosmos DB, Azure SQL, Key Vault, Storage Account, Service Bus). If you didn’t know about it, I hope this makes you enthusiastic enough to investigate further.

Disclaimer 1: It’s an open-source project, so Azure technical support doesn’t apply.

Disclaimer 2: The SQL container part with the Go program is purely a fun learning exercise; don’t take it too seriously!

Terraforming Azure DevOps

Background

In many organizations, especially in large enterprises, there is a need to automate Azure DevOps projects and team memberships. Manually managing a large number of Azure DevOps projects, the teams for these projects and their users, and on-boarding and off-boarding team members is not trivial.

Besides managing the users, sometimes we just need an overview (documentation?) of the users and teams of our projects. Terraform is a great tool for Infrastructure as Code: it not only allows provisioning infrastructure on demand, but also gives us nice documentation that can be version-controlled in a source control system. The workflow looks roughly like the following:

Figure: GitOps workflow.

I am developing a Terraform provider for Azure DevOps that helps me use Terraform to provision Azure DevOps projects, teams, and members. In this article I will share how I am building it.

Note
This provider doesn't implement the complete set of
Azure DevOps REST APIs.
It's limited to projects, teams and member associations.
It's not recommended for production scenarios.

Terraform Provider

Terraform is an amazing tool that lets you define your infrastructure as code. Under the hood it’s an incredibly powerful state machine that makes API requests and marshals resources. Terraform has lots of providers, covering almost every major cloud as well as many other systems like Kubernetes, Palo Alto Networks, etc.

In a nutshell, if a system has a REST API, it can be managed with a Terraform provider. Azure DevOps also has a Terraform provider, but it doesn’t currently provide resources to create teams and members. Hence, I am writing my own, shamelessly using/stealing Microsoft’s Terraform provider (referenced above) for project creation.

Setting up GO Environment

Terraform providers and plugins are binaries that Terraform communicates with at runtime via RPC. It’s theoretically possible to write a provider in any language, but to be honest, I haven’t come across any providers written in a language other than Go. Terraform provides helper libraries in Go to aid in writing and testing providers.

I am developing on Windows 10 and didn’t want to install Go on my local machine. Containers come to the rescue, of course. I am using the “Remote Development” extension in VS Code. This extension allows me to keep the source code on my local machine while compiling and building it inside a container. Like magic!


Figure: Remote Development extension in VSCode – running container to build local repository.

Creating the provider

To create a Terraform provider, we need to write the logic for managing the creation, reading, updating, and deletion (CRUD) of a resource (in this scenario, Azure DevOps projects, teams, and members), and Terraform takes care of the rest: state, locking, the templating language, and managing the lifecycle of the resources. In this repository I have a minimal implementation that supports creating Azure DevOps projects, teams, and their members.

First of all, we define our provider and resources in the main.go file.
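The actual file lives in the repository; as a rough sketch (assuming the classic helper libraries that ship with Terraform, i.e. the plugin and terraform packages), it boils down to serving the provider over Terraform’s plugin RPC:

package main

import (
    "github.com/hashicorp/terraform/plugin"
    "github.com/hashicorp/terraform/terraform"
)

func main() {
    // Terraform launches this binary and talks to it over RPC;
    // plugin.Serve registers our provider with that protocol.
    plugin.Serve(&plugin.ServeOpts{
        ProviderFunc: func() terraform.ResourceProvider {
            return Provider() // declared in provider.go, sketched below
        },
    })
}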

Next to that, we define the provider schema (the attributes it supports as inputs and outputs, its resources, etc.).
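Roughly it looks like the sketch below. The attribute names (org_service_url, personal_access_token) and environment variables are illustrative choices of mine, not necessarily the exact ones in the repository; the idea is that the provider takes the organization URL and a personal access token as inputs and registers the resources it manages.

package main

import "github.com/hashicorp/terraform/helper/schema"

// Provider declares the provider-level inputs and the resources it manages.
func Provider() *schema.Provider {
    return &schema.Provider{
        Schema: map[string]*schema.Schema{
            "org_service_url": {
                Type:        schema.TypeString,
                Required:    true,
                DefaultFunc: schema.EnvDefaultFunc("AZDO_ORG_SERVICE_URL", nil),
                Description: "URL of the Azure DevOps organization.",
            },
            "personal_access_token": {
                Type:        schema.TypeString,
                Required:    true,
                Sensitive:   true,
                DefaultFunc: schema.EnvDefaultFunc("AZDO_PERSONAL_ACCESS_TOKEN", nil),
                Description: "Personal access token used against the REST API.",
            },
        },
        ResourcesMap: map[string]*schema.Resource{
            // An analogous "azuredevops_project" resource is registered as well;
            // only the team resource is sketched in this post.
            "azuredevops_team": resourceTeam(),
        },
        // ConfigureFunc would build the Azure DevOps API client from these inputs.
    }
}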

We use an Azure DevOps personal access token to communicate with the Azure DevOps REST API. Microsoft’s Go client for Azure DevOps, which is used as a dependency, immensely simplified the implementation and also helped in learning the flow.

Now we define the “team” resource as follows:
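Again as a sketch, with attribute names I picked for illustration: the team resource needs the project it belongs to, a name, and the list of members, plus the CRUD callbacks Terraform will invoke.

package main

import "github.com/hashicorp/terraform/helper/schema"

// resourceTeam declares the attributes of the team resource and wires up the
// CRUD callbacks Terraform invokes to reconcile desired vs. actual state.
func resourceTeam() *schema.Resource {
    return &schema.Resource{
        Create: resourceTeamCreate, // Create is sketched below; the others follow the same pattern
        Read:   resourceTeamRead,
        Update: resourceTeamUpdate,
        Delete: resourceTeamDelete,

        Schema: map[string]*schema.Schema{
            "project_id": {
                Type:     schema.TypeString,
                Required: true,
                ForceNew: true, // a team cannot move between projects
            },
            "name": {
                Type:     schema.TypeString,
                Required: true,
            },
            "members": {
                Type:     schema.TypeSet,
                Optional: true,
                Elem:     &schema.Schema{Type: schema.TypeString},
            },
        },
    }
}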

That’s all for the declarations; now we implement the CRUD methods in the resource providers. The full source code is on GitHub.
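To give an idea of their shape, here is a trimmed, hypothetical Create, not the repository’s actual code: each function reads the planned attributes from *schema.ResourceData, calls the Azure DevOps REST API via the Go client, and records the resulting ID. Read, Update and Delete follow the same pattern.

package main

import "github.com/hashicorp/terraform/helper/schema"

// resourceTeamCreate reads the desired attributes from Terraform's plan,
// would ask the Azure DevOps REST API to create the team, and stores its ID.
func resourceTeamCreate(d *schema.ResourceData, meta interface{}) error {
    projectID := d.Get("project_id").(string)
    name := d.Get("name").(string)

    // meta carries whatever the provider's ConfigureFunc returned, typically an
    // API client built from org_service_url and personal_access_token. This is
    // where the real implementation calls Microsoft's Go client for Azure DevOps.
    teamID := projectID + "/" + name // placeholder ID for this sketch

    // A non-empty ID tells Terraform the resource now exists; the real code would
    // then call the Read function to populate any computed attributes.
    d.SetId(teamID)
    return nil
}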

We can compile the provider application using the following command:

> GOOS=windows GOARCH=amd64 go build -o terraform-provider-azuredevops.exe

As I am using a Debian-based Docker image for Go, I need to specify my target OS (GOOS=windows) and CPU architecture (GOARCH=amd64) when building the provider. This produces the Terraform provider executable for Azure DevOps.

Although it’s an executable, it’s not meant to be launched directly from a command prompt. Instead, I copy it to the “%APPDATA%\terraform.d\plugins\windows_amd64” folder on my machine.

Terraform Script for Azure DevOps

Now we can write the Terraform file (.tf) that describes the Azure DevOps project, team, and members.

Figure: Terraform configuration (main.tf) describing the Azure DevOps project, team and members.

With this Terraform file in place, we can now launch the following command to initialize our Terraform environment.

Figure: terraform init output.

The terraform init command is used to initialize a working directory containing Terraform configuration files. This is the first command that should be run after writing a new Terraform configuration or cloning an existing one from version control.

Terraform plan

The terraform plan command is used to create an execution plan. Terraform performs a refresh, and then determines what actions are necessary to achieve the desired state specified in the configuration files. This command is a convenient way to check whether the execution plan for a set of changes matches your expectations without making any changes to real resources or to the state. For example, terraform plan might be run before committing a change to version control, to create confidence that it will behave as expected.


Figure: terraform plan output – shows exactly what is going to happen if we apply these changes to Azure DevOps

Terraform apply

The terraform apply command is used to apply the changes required to reach the desired state of the configuration, or the pre-determined set of actions generated by a terraform plan execution plan. We launch it with the “-auto-approve” flag to skip the approval prompt.

Figure: terraform apply output.

Now we can go to our Azure DevOps, and sure enough, there’s a new project created with the configuration we scripted in the Terraform file.

Taking it further

Now we can check the Terraform file (main.tf above) into an Azure DevOps repository and put a branch policy on it. That forces any change (such as creating new projects or adding/removing team members) to go through a pull request and be reviewed by peers (the four-eyes principle). Once a pull request is approved, a simple Azure Pipeline can be triggered that runs terraform apply. With that, my workflow is automated, and I also have a nice history in Git that records the purpose of every change made in the past.

Thanks for reading!