
AKS Workload identity – A Deeper look

Background

Recently, I found myself explaining Workload Identity Federation in Azure Kubernetes Service (AKS) to some friends. As I dug deeper into the topic, I realized it was worth documenting and summarizing for anyone else navigating the same waters, including my future self. So, let’s dive in and look at how Workload Identity Federation actually works in AKS.

I will use a Kubernetes pod running Python code that connects to an Azure SQL database to demonstrate the scenario.

Environment setup
Step 1: Creating SQL server

For this example, use the Azure Portal to create a SQL server and a SQL database. I won’t explain how to do that here; Microsoft Learn covers it. Once the SQL database is ready, we can create a test table in it:

CREATE TABLE Customer (
    customer_id INT PRIMARY KEY IDENTITY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    active BIT
);

Create some records in it as well, for testing purposes:

INSERT INTO Customer (first_name, last_name, active) VALUES ('John', 'Doe', 1);
INSERT INTO Customer (first_name, last_name, active) VALUES ('Jane', 'Smith', 0);
INSERT INTO Customer (first_name, last_name, active) VALUES ('Alice', 'Johnson', 0);
INSERT INTO Customer (first_name, last_name, active) VALUES ('Bob', 'Brown', 1);
INSERT INTO Customer (first_name, last_name, active) VALUES ('Emily', 'Davis', 0);
Step 2: Create Service Principal and grant database access

Next, we will have to create a Service Principal (in our example, let’s call it SP-AZURESQL-USER) in Microsoft Entra (Azure Active Directory) and grant database access to that Service Principal.

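To create a service principal using the Azure CLI, you can use the az ad sp create-for-rbac command. This command creates a new service principal (SP) together with its app registration. In current Azure CLI versions it assigns a role-based access control (RBAC) role only if you pass --role and --scopes; we don’t need one here, because database access is granted inside SQL. Here’s the basic syntax to create a service principal: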

az ad sp create-for-rbac --name <service_principal_name>

Replace <service_principal_name> with the desired name for your service principal.

After running the command, Azure CLI will output the details of the newly created service principal, including its appId (client ID) and password (client secret). Make sure to securely store the client secret as it will not be retrievable later.

Here’s an example of the output:

{
  "appId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "displayName": "<service_principal_name>",
  "name": "http://<service_principal_name>",
  "password": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "tenant": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
}

Note down these values; we will need them later.

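Next, we will grant SQL roles to this service principal. Run the following in the target database while connected as a Microsoft Entra admin of the SQL server, since CREATE USER ... FROM EXTERNAL PROVIDER requires a Microsoft Entra login: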

CREATE USER [SP-AZURESQL-USER] FROM EXTERNAL PROVIDER;

ALTER ROLE db_datareader ADD MEMBER [SP-AZURESQL-USER];
ALTER ROLE db_datawriter ADD MEMBER [SP-AZURESQL-USER];
ALTER ROLE db_ddladmin ADD MEMBER [SP-AZURESQL-USER];

SELECT * FROM sys.database_principals WHERE type_desc = 'EXTERNAL_USER';

Once the above is completed, you can create some environment variables to test the setup from your local machine. Here are the environment variables that we need:

    export AZ_TENANT_ID="your_tenant_id"
    export AZ_CLIENT_ID="your_client_id"
    export AZ_CLIENT_SECRET="your_client_secret"
    export AZ_SERVER="tcp:$SQL_SERVER_NAME.database.windows.net,1433"
    export AZ_DATABASE="$SQL_DBNAME"

Replace $SQL_SERVER_NAME and $SQL_DBNAME with the names you chose (or export those two variables first so the shell expands them).

Step 3: Create a test application

Create a simple Python app to check whether we can access the database using the service principal’s client ID and client secret. Later, we will move to the secret-less world with Workload Identity.

import os
import pyodbc, struct
from azure.identity import ClientSecretCredential

def connectToDatabase():
    tenant_id = os.getenv('AZ_TENANT_ID')
    client_id = os.getenv('AZ_CLIENT_ID')
    client_secret = os.getenv('AZ_CLIENT_SECRET')
    server = os.getenv('AZ_SERVER')
    database = os.getenv('AZ_DATABASE')    
    
    credential = ClientSecretCredential(tenant_id, client_id, client_secret)    
    connection_string = f"DRIVER={{ODBC Driver 18 for SQL Server}};SERVER={server};DATABASE={database};Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30"
    token_bytes = credential.get_token("https://database.windows.net/.default").token.encode("UTF-16-LE")
    token_struct = struct.pack(f'<I{len(token_bytes)}s', len(token_bytes), token_bytes)
    SQL_COPT_SS_ACCESS_TOKEN = 1256 
    conn = pyodbc.connect(connection_string, attrs_before={SQL_COPT_SS_ACCESS_TOKEN: token_struct})
    cursor = conn.cursor()
    cursor.execute("SELECT TOP (1000) * FROM [dbo].[Customer]") 
    row = cursor.fetchone()
    while row:
        print("Customer ID-->" + str(row[0]))
        row = cursor.fetchone()

connectToDatabase()

Also create a requirements file with the following content:

pyodbc==5.1.0
azure-identity

We can run the app now:

 python app.py

You should see output like below:

Customer ID-->1
Customer ID-->2
Customer ID-->3
Customer ID-->4
Customer ID-->5

At this point, we have established that the service principal (the identity) has the right access to the SQL database and can read rows from it.
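The least obvious part of app.py is how the access token is handed to the ODBC driver. As a minimal sketch (the helper name pack_access_token is my own, not part of pyodbc), the packing step can be isolated like this: the driver expects a 4-byte little-endian length followed by the token encoded as UTF-16-LE.

```python
import struct


def pack_access_token(access_token: str) -> bytes:
    """Pack a token in the layout used with SQL_COPT_SS_ACCESS_TOKEN (1256)."""
    token_bytes = access_token.encode("utf-16-le")
    # "<I" = 4-byte little-endian unsigned length, then the raw token bytes
    return struct.pack(f"<I{len(token_bytes)}s", len(token_bytes), token_bytes)


# Tiny illustration with a made-up two-character "token"
packed = pack_access_token("ab")
print(packed)  # b'\x04\x00\x00\x00a\x00b\x00'
```

The result is what gets passed as attrs_before={1256: packed} to pyodbc.connect in the app above.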

Step 4: Create Azure Kubernetes Service

We will now create a demo AKS cluster to try out Workload Identity Federation. The important thing here is that we have to enable the OIDC issuer for the cluster.

Let’s start with some environment variables to ease the process:

export RESOURCE_GROUP="AKS_SQL_WORKLOAD_IDENTITY"
export CLUSTER_NAME="DEMO_CLUSTER"

az aks create -g "${RESOURCE_GROUP}" -n $CLUSTER_NAME \
                        --node-count 1 \
                        --enable-oidc-issuer  \
                        --enable-workload-identity \
                        --generate-ssh-keys

You can now grab the OIDC issuer URI:

export AKS_OIDC_ISSUER="$(az aks show -n $CLUSTER_NAME -g "${RESOURCE_GROUP}" --query "oidcIssuerProfile.issuerUrl" -otsv)"

echo $AKS_OIDC_ISSUER 

You should see a URI somewhat like:

https://eastus.oic.prod-aks.azure.com/000000000000/0000000000000/
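The issuer URL matters because it is an OpenID Connect issuer: it publishes a discovery document that points to the public keys used to sign the cluster’s service account tokens, and that is what lets Microsoft Entra validate them later. The sketch below is only an illustration; the function names are mine, and it assumes AKS_OIDC_ISSUER is exported as shown above and that the endpoint is reachable from your machine.

```python
import json
import os
import urllib.request


def discovery_url(issuer: str) -> str:
    """Build the standard OIDC discovery URL for an issuer."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def fetch_discovery_document(issuer: str, timeout: float = 10.0) -> dict:
    """Download and parse the issuer's discovery document."""
    with urllib.request.urlopen(discovery_url(issuer), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    issuer = os.environ.get("AKS_OIDC_ISSUER")
    if issuer:  # only call out when the variable from the step above is set
        document = fetch_discovery_document(issuer)
        print("issuer:  ", document["issuer"])
        print("jwks_uri:", document["jwks_uri"])
```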
Step 5: Create namespace, service account

We will create a new namespace for the experiment and a service account that will be used for federation. Let’s create them:

kubectl create namespace workload-demo 

We will create the service account using the following YAML file:

apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    azure.workload.identity/client-id: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  name: workload-demo-service-account
  namespace: workload-demo

In the above snippet, provide the client ID (Application ID) of the service principal that you created earlier.

You can now apply this YAML spec to create the service account:

kubectl apply -f service-account.yaml 

# Verify
kubectl describe serviceaccount/workload-demo-service-account -n workload-demo
Step 5: Establish Federation

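Let’s create the federated identity credential that links the identity, the cluster’s OIDC issuer, and the service account subject. The az identity federated-credential create command shown here targets a user-assigned managed identity; if you are federating the app registration behind the service principal from Step 2 instead, the equivalent command is az ad app federated-credential create. Set FEDERATED_IDENTITY_CREDENTIAL_NAME, SERVICE_PRINCIPAL_NAME, SERVICE_ACCOUNT_NAMESPACE, and SERVICE_ACCOUNT_NAME to your own values first.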

az identity federated-credential create \
         --name ${FEDERATED_IDENTITY_CREDENTIAL_NAME} \
         --identity-name ${SERVICE_PRINCIPAL_NAME} \
         --resource-group ${RESOURCE_GROUP} \
         --issuer ${AKS_OIDC_ISSUER} \
         --subject system:serviceaccount:${SERVICE_ACCOUNT_NAMESPACE}:${SERVICE_ACCOUNT_NAME}

You can also create the federated credential from the Azure Portal.
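Whichever route you take, a federated credential boils down to three values that Entra later compares against the incoming token: issuer, subject, and audience. The following sketch assembles them for the namespace and service account used in this post. The credential name aks-sql-federation and the issuer value are placeholders of mine, and the JSON shape is, to my understanding, the one az ad app federated-credential create accepts through --parameters.

```python
import json


def service_account_subject(namespace: str, name: str) -> str:
    """Subject claim Kubernetes puts in a service account token."""
    return f"system:serviceaccount:{namespace}:{name}"


def federated_credential_parameters(credential_name: str, issuer: str,
                                    namespace: str, service_account: str) -> dict:
    """Collect the values that define a federated identity credential."""
    return {
        "name": credential_name,
        "issuer": issuer,
        "subject": service_account_subject(namespace, service_account),
        # Must match the audience requested for the projected token
        "audiences": ["api://AzureADTokenExchange"],
    }


params = federated_credential_parameters(
    credential_name="aks-sql-federation",            # hypothetical name
    issuer="https://example.invalid/oidc-issuer/",   # placeholder for $AKS_OIDC_ISSUER
    namespace="workload-demo",
    service_account="workload-demo-service-account",
)
print(json.dumps(params, indent=2))
```

If any of the three values differs from what the token carries, even by a trailing slash in the issuer, the token exchange is rejected.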

Step 6: Modify python app to use Identity Federation
import os
import pyodbc, struct
import requests
import time

def connectToAzureSQL():
    tenant_id = os.getenv('AZ_TENANT_ID')
    client_id = os.getenv('AZ_CLIENT_ID')
    server = os.getenv('AZ_SERVER')
    database = os.getenv('AZ_DATABASE')   
    connection_string = f"DRIVER={{ODBC Driver 18 for SQL Server}};SERVER={server};DATABASE={database};Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30"

    service_account_token_path = '/var/run/secrets/tokens/sa-token'
    with open(service_account_token_path, 'r') as token_file:
        service_account_token = token_file.read().strip()
    
    # print the environment variables to the console
    print("AZ_TENANT_ID: " + tenant_id)
    print("AZ_CLIENT_ID: " + client_id)
    print("AZ_SERVER: " + server)
    print("AZ_DATABASE: " + database)
    
    # print a message if the service token is not null and has a length greater than 0
    if service_account_token is not None and len(service_account_token) > 0:
        # truncate the token to 10 characters for security reasons
        print("Service Account Token: " + service_account_token[:10] + "...")        
    else:
        print("Service Account Token is null or empty")

    url = f'https://login.microsoftonline.com:443/{tenant_id}/oauth2/v2.0/token'
    payload = (
        "scope=https%3A%2F%2Fdatabase.windows.net%2F.default"
        f"&client_id={client_id}"
        "&client_assertion_type=urn%3Aietf%3Aparams%3Aoauth%3Aclient-assertion-type%3Ajwt-bearer"
        f"&client_assertion={service_account_token}"
        "&grant_type=client_credentials"
    )
    response = requests.post(url, data=payload, headers={'Content-Type': 'application/x-www-form-urlencoded'})
    if response.status_code == 200:
        print("Token obtained from Azure AD:")
        responseJson = response.json()
        access_token = responseJson['access_token']
        print("Federated token obtained fro Entra")

        token_bytes = access_token.encode("UTF-16-LE")
        token_struct = struct.pack(f'<I{len(token_bytes)}s', len(token_bytes), token_bytes)
        SQL_COPT_SS_ACCESS_TOKEN = 1256 
        conn = pyodbc.connect(connection_string, attrs_before={SQL_COPT_SS_ACCESS_TOKEN: token_struct})
        cursor = conn.cursor()
        cursor.execute("SELECT TOP (1000) * FROM [dbo].[Customer]") 
        row = cursor.fetchone()
        while row:
            print("Customer ID-->" + str(row[0]))
            row = cursor.fetchone()        
    else:
        print("Error:", response.text)
    
connectToAzureSQL()

Let’s break down the key components and functionalities of the code:

  1. Imports: The code imports the modules it needs: os, pyodbc, struct, and requests (time is imported as well but not used).
  2. connectToAzureSQL function: This function is the main entry point for connecting to the Azure SQL database. It reads the environment variables AZ_TENANT_ID, AZ_CLIENT_ID, AZ_SERVER, and AZ_DATABASE. None of these is a secret; note that the client secret is no longer among them.
  3. Retrieve service account token: The code reads the service account token from the file at /var/run/secrets/tokens/sa-token. Kubernetes projects this token into the pod, and it serves as the client assertion when requesting an access token from Microsoft Entra ID (Azure AD).
  4. Print environment variables and token: The function prints the environment variables and a truncated version of the service account token for debugging and logging purposes.
  5. Construct OAuth2 token request: It builds the token endpoint URL from the tenant ID, and a form-encoded payload with the parameters scope, client_id, client_assertion_type, client_assertion, and grant_type.
  6. Send token request: The code sends a POST request to the token endpoint (https://login.microsoftonline.com/<tenant_id>/oauth2/v2.0/token) to exchange the service account token for an access token.
  7. Handle token response: If the response status code is 200 (OK), it prints a confirmation message, extracts the access token from the JSON response, and connects to the Azure SQL database using pyodbc. It then executes a SQL query to fetch customer data. Otherwise, it prints the error response.
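The token request itself can also be built without hand-encoding the payload. This is a small sketch rather than a change to the app above: build_token_request is a name I made up, and it relies on urllib.parse.urlencode to produce the same form fields.

```python
from urllib.parse import parse_qs, urlencode


def build_token_request(tenant_id: str, client_id: str, service_account_token: str,
                        scope: str = "https://database.windows.net/.default"):
    """Return the token endpoint URL and the form-encoded request body."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "scope": scope,
        "client_id": client_id,
        "client_assertion_type": "urn:ietf:params:oauth:client-assertion-type:jwt-bearer",
        # The Kubernetes service account token takes the place of a client secret
        "client_assertion": service_account_token,
        "grant_type": "client_credentials",
    })
    return url, body


# Placeholder values, for illustration only
url, body = build_token_request("my-tenant-id", "my-client-id", "header.payload.signature")
print(url)
print(sorted(parse_qs(body).keys()))
```

With requests, the pair can be sent as requests.post(url, data=body, headers={'Content-Type': 'application/x-www-form-urlencoded'}), just like in the app.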

Let’s update our requirements file too:

pyodbc==5.1.0
requests
Step 7: Create container image and push it to registry

Let’s create a container image for this app and publish it to a container registry. I will be using Docker Hub for that.

docker build -t moimhossain/python-odbc-azure-sql:v1 .

docker push moimhossain/python-odbc-azure-sql:v1
Step 8: Create manifests

Next, we will create a pod with the image we built. Let’s see the pod manifest:

apiVersion: v1
kind: Pod
metadata:
  name: pyodbc-demo-pod
  namespace: workload-demo
spec:
  containers:
  - name: pyodbc-demo-container
    image: moimhossain/python-odbc-azure-sql:v1    
    imagePullPolicy: Always
    volumeMounts:
    - name: sa-token
      mountPath: /var/run/secrets/tokens
    env:
    - name: AZ_TENANT_ID
      value: "XXXX"
    - name: AZ_CLIENT_ID
      value: "XXXX"
    - name: AZ_SERVER
      value: "XXXX"
    - name: AZ_DATABASE
      value: "XXXX"    
  serviceAccountName: workload-demo-service-account
  volumes:
  - name: sa-token
    projected:
      sources:
      - serviceAccountToken:
          path: sa-token
          expirationSeconds: 1000 
          audience: api://AzureADTokenExchange
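The projected token requested in this manifest is an ordinary JWT, so its claims can be inspected to see exactly what Entra will compare against the federated credential. The sketch below decodes the payload without verifying the signature, which is fine for debugging only. The helper names and the sample token are made up for illustration; a real token is signed by the cluster.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Decode the payload of a JWT without verifying its signature."""
    payload_b64 = token.split(".")[1]
    padding = "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64 + padding))


def _b64url(data: dict) -> str:
    """Helper used only to build the fake sample token below."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Fake, unsigned token shaped like a projected service account token
sample_token = ".".join([
    _b64url({"alg": "none"}),
    _b64url({
        "iss": "https://example.invalid/oidc-issuer/",
        "sub": "system:serviceaccount:workload-demo:workload-demo-service-account",
        "aud": ["api://AzureADTokenExchange"],
    }),
    "",
])

claims = decode_jwt_claims(sample_token)
print(claims["iss"], claims["sub"], claims["aud"])
```

Inside the pod, the same function can be pointed at the real token: decode_jwt_claims(open("/var/run/secrets/tokens/sa-token").read().strip()).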

Step 9: Deploy container to K8s

At this point, we will deploy the manifest:

kubectl apply -f pod.yaml

We can now see the logs from the pod:

kubectl logs -f  pyodbc-demo-pod -n workload-demo

Which gives us the following output:

AZ_TENANT_ID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXX
AZ_CLIENT_ID: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXX
AZ_SERVER: tcp:XXXX.database.windows.net,1433
AZ_DATABASE: XXXX
Service Account Token: eyJhbGciOi...
Token obtained from Azure AD:
Federated token obtained fro Entra
Customer ID-->1
Customer ID-->2
Customer ID-->3
Customer ID-->4
Customer ID-->5
...
Step 10: Using MSAL libraries

We can further simplify our code, especially the token exchange part, with the Azure Identity library, which builds on MSAL.

Install the azure-identity package (and add it to the requirements file):

pip install azure-identity

Azure Identity uses credential classes to ease different authentication schemes. A credential is a class that contains or can obtain the data needed for a service client to authenticate requests. Service clients across the Azure SDK accept a credential instance when they’re constructed, and use that credential to authenticate requests.

The Azure Identity library focuses on OAuth authentication with Microsoft Entra ID. It offers various credential classes capable of acquiring a Microsoft Entra access token. See the Credential classes for further information.

In our case, we will be using the WorkloadIdentityCredential class. With workload identity authentication, applications authenticate themselves using their own identity, rather than using a shared service principal or managed identity. The WorkloadIdentityCredential supports Azure workload identity authentication on Azure Kubernetes and acquires a token using the service account credentials available in the Azure Kubernetes environment. Refer to this workload identity overview for more information.

Let’s change our app.py as below:

import os
import struct

import pyodbc
from azure.identity import WorkloadIdentityCredential

def connectToAzureSQL():
    tenant_id = os.getenv('AZ_TENANT_ID')
    client_id = os.getenv('AZ_CLIENT_ID')
    server = os.getenv('AZ_SERVER')
    database = os.getenv('AZ_DATABASE')
    connection_string = f"DRIVER={{ODBC Driver 18 for SQL Server}};SERVER={server};DATABASE={database};Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30"

    # The credential exchanges the projected service account token for an Entra access token
    credential = WorkloadIdentityCredential(
        tenant_id=tenant_id,
        client_id=client_id,
        token_file_path="/var/run/secrets/tokens/sa-token",
    )
    token_bytes = credential.get_token("https://database.windows.net/.default").token.encode("UTF-16-LE")
    token_struct = struct.pack(f'<I{len(token_bytes)}s', len(token_bytes), token_bytes)
    SQL_COPT_SS_ACCESS_TOKEN = 1256
    conn = pyodbc.connect(connection_string, attrs_before={SQL_COPT_SS_ACCESS_TOKEN: token_struct})
    cursor = conn.cursor()
    cursor.execute("SELECT TOP (1000) * FROM [dbo].[Customer]")
    row = cursor.fetchone()
    while row:
        print("Customer ID-->" + str(row[0]))
        row = cursor.fetchone()

connectToAzureSQL()

We also need to add the azure.workload.identity/use label to the pod, so that the AKS workload identity webhook picks it up.

apiVersion: v1
kind: Pod
metadata:
  name: pyodbc-demo-pod
  namespace: workload-demo
  labels:
    azure.workload.identity/use: "true"
spec:
  containers:
  - name: pyodbc-demo-container
    image: moimhossain/python-odbc-azure-sql:v1    
    imagePullPolicy: Always
    volumeMounts:
    - name: sa-token
      mountPath: /var/run/secrets/tokens

     ... (removed for brevity)

This simplifies the code significantly: all the token exchange calls are abstracted away and out of our sight. The library also takes care of the token lifetime and fetches a new token when the current one nears expiration.
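To make the lifetime handling concrete, here is a deliberately simplified illustration of the idea. It is not how azure-identity or MSAL is implemented internally; the CachedToken class, the 300-second refresh margin, and the fake fetch function are all assumptions of mine.

```python
import time
from typing import Callable, Optional, Tuple


class CachedToken:
    """Reuse a token until it is close to expiry, then fetch a new one."""

    def __init__(self, fetch: Callable[[], Tuple[str, float]],
                 refresh_margin: float = 300.0,
                 clock: Callable[[], float] = time.time):
        self._fetch = fetch            # returns (token, expires_on as epoch seconds)
        self._margin = refresh_margin  # refresh this many seconds before expiry
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_on = 0.0

    def get(self) -> str:
        if self._token is None or self._clock() >= self._expires_on - self._margin:
            self._token, self._expires_on = self._fetch()
        return self._token


# Demonstration with a fake clock and a fake token source
now = [1000.0]
calls = []


def fake_fetch() -> Tuple[str, float]:
    calls.append(now[0])
    return f"token-{len(calls)}", now[0] + 3600.0  # valid for one hour


cache = CachedToken(fake_fetch, refresh_margin=300.0, clock=lambda: now[0])
first = cache.get()    # fetched
second = cache.get()   # served from the cache
now[0] += 3400.0       # now inside the refresh margin
third = cache.get()    # fetched again
print(first, second, third)  # token-1 token-1 token-2
```

With the real credential, the fetch function could wrap credential.get_token(...) and return its token and expires_on values, although in practice the library already handles this for you.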

Conclusion

I hope this gives you a deeper look at the token exchange that happens under the hood when using Workload Identity Federation in Azure Kubernetes Service.

All the source code used here can be found in this GitHub repository.

Thanks for reading!
