Manage Kubernetes running anywhere via Azure Arc

Azure Arc (currently in preview) allows you to attach and configure Kubernetes clusters running anywhere (inside or outside of Azure). Once connected, a cluster shows up in the Azure portal, where you can apply tags and policies just like for other resources. This brings simplicity and uniformity to managing both cloud and on-premises resources in a single management pane (the Azure portal).

Azure Arc enabled Kubernetes is in preview. It’s NOT recommended for production workloads.

Following are the key scenarios where Azure Arc adds value:

  • Connect Kubernetes running outside of Azure for inventory, grouping, and tagging.
  • Apply policies by using Azure Policy for Kubernetes.
  • Deploy applications and apply configuration by using GitOps-based configuration management.
  • Use Azure Monitor for containers to view and monitor your clusters.

Connect an on-premises (or other cloud) cluster to Azure Arc

I have used a local Kubernetes cluster (Docker Desktop) for this; however, the steps are identical for any other Kubernetes cluster. All you need to do is run the following Azure CLI command from a machine that can reach both the on-premises Kubernetes cluster and Azure.

az connectedk8s connect --name <ClusterName> --resource-group <ResourceGroup>
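
If the connectedk8s commands are not available on your machine, the preview CLI extensions and resource providers likely need to be added first. A minimal sketch (extension and provider names as per the preview documentation):

# One-time setup: Arc preview CLI extensions and resource providers
az extension add --name connectedk8s
az extension add --name k8sconfiguration
az provider register --namespace Microsoft.Kubernetes --wait
az provider register --namespace Microsoft.KubernetesConfiguration --wait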

It takes a moment, and then the cluster is connected to Azure. We can see that in the Azure portal:

Once the cluster is connected to Azure, we can create and edit tags just like for any other Azure resource. Which is awesome.
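
Tags can also be applied from the CLI. A minimal sketch, assuming the connected cluster created above (the Arc resource type is Microsoft.Kubernetes/connectedClusters):

az resource tag --tags Environment=Dev Owner=TeamA \
    --resource-group <ResourceGroup> --name <ClusterName> \
    --resource-type "Microsoft.Kubernetes/connectedClusters"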

The same goes for Azure Policy: I can apply any compliance constraints to the cluster and monitor their compliance status in Azure Security Center.

GitOps on Arc enabled Kubernetes cluster

The next feature is interesting and can be very useful in many scenarios. It is much like infrastructure-as-code for your Kubernetes configuration (namespaces, deployments, etc.). The idea is that we define one or more Git repositories that keep the desired state of the cluster (namespaces, deployments, etc.) in YAML files, and Azure Resource Manager performs the necessary actions to apply that desired state to the connected cluster. The Microsoft documentation describes how this works:

The connection between your cluster and one or more Git repositories is tracked in Azure Resource Manager as a sourceControlConfiguration extension resource. The sourceControlConfiguration resource properties represent where and how Kubernetes resources should flow from Git to your cluster. The sourceControlConfiguration data is stored encrypted at rest in an Azure Cosmos DB database to ensure data confidentiality.

The config-agent running in your cluster is responsible for watching for new or updated sourceControlConfiguration extension resources on the Azure Arc enabled Kubernetes resource, deploying a flux operator to watch the Git repository, and propagating any updates made to the sourceControlConfiguration. It is even possible to create multiple sourceControlConfiguration resources with namespace scope on the same Azure Arc enabled Kubernetes cluster to achieve multi-tenancy. In such a case, each operator can only deploy configurations to its respective namespace.

An example Git repository can be found here: https://github.com/Azure/arc-k8s-demo. We can create the configuration from the portal or via the Azure CLI:

az k8sconfiguration create \
    --name cluster-config \
    --cluster-name AzureArcTest1 --resource-group AzureArcTest \
    --operator-instance-name cluster-config --operator-namespace cluster-config \
    --repository-url https://github.com/Azure/arc-k8s-demo \
    --scope cluster --cluster-type connectedClusters

That's it. We can see the configuration in the Azure portal:

With that setup, changes committed to the Git repository are now reflected in the connected cluster.
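
The state of the configuration can also be checked from the CLI, and once the flux operator has synced, the demo repository's resources should appear in the cluster. A quick sketch, reusing the names from the create command above:

az k8sconfiguration show --name cluster-config \
    --cluster-name AzureArcTest1 --resource-group AzureArcTest \
    --cluster-type connectedClusters
# the arc-k8s-demo repository creates a few demo namespaces and deployments
kubectl get namespaces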

Monitoring

Connected clusters can also be monitored with Azure Monitor for containers. It's as simple as creating a Log Analytics workspace and configuring the cluster to push metrics to it. This document describes the steps to enable monitoring.
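
A minimal sketch of the workspace part (onboarding the Arc-connected cluster itself then follows the linked document; resource names are placeholders):

az monitor log-analytics workspace create \
    --resource-group <ResourceGroup> \
    --workspace-name <WorkspaceName> \
    --location westeurope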

I have seen scenarios where teams running on-premises (or other cloud) clusters rely heavily on Prometheus and Grafana for monitoring. Good news: we can get the same on Azure Arc enabled clusters. Once the metrics are available in Azure Log Analytics, we can point Grafana at the workspace; it takes less than a minute and a few button clicks (no code required).

Isn't that awesome? Go check out Azure Arc for Kubernetes today.

Restricting Unverified Kubernetes Content with Docker Content Trust

Docker Content Trust (DCT) provides the ability to use digital signatures for data sent to and received from remote Docker registries. These signatures allow client-side or runtime verification of the integrity and publisher of specific image tags.

Signed tags
Image source: Docker Content Trust


Through DCT, image publishers can sign their images and image consumers can ensure that the images they pull are signed. Publishers could be individuals or organizations manually signing their content or automated software supply chains signing content as part of their release process.


Azure Container Registry implements Docker's content trust model, enabling pushing and pulling of signed images. Once content trust is enabled in Azure Container Registry, signing an image is as easy as shown below:

Signing an image from console
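
For reference, the same flow from a terminal looks roughly like this. A sketch with placeholder names (content trust requires the Premium SKU of Azure Container Registry):

# Enable content trust on the registry
az acr config content-trust update --registry <RegistryName> --status enabled

# Push with signing enabled; Docker prompts for the signing keys on the first push
export DOCKER_CONTENT_TRUST=1
docker push <RegistryName>.azurecr.io/my-app:1.0.0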

Problem statement

Now that we can have DCT enabled in Azure Container Registry (i.e. signed images can be pushed to the repository), we want to make sure the Kubernetes cluster denies running any image that is not signed.

However, Azure Kubernetes Service doesn't have a feature (as of writing this article) to restrict the cluster to running only signed images. This doesn't mean we're stuck until AKS releases such a feature: we can implement the content trust restriction using a custom Admission controller. In this article I want to share how you can create your own custom Admission controller to enforce DCT in an AKS cluster.

What are Admission Controllers?

In a nutshell, Kubernetes admission controllers are plugins that govern and enforce how the cluster is used. They can be thought of as gatekeepers that intercept (authenticated) API requests and may change the request object or deny the request altogether.

Admission Controller Phases
Image source: Kubernetes Documentations

The admission control process has two phases: the mutating phase is executed first, followed by the validating phase. Consequently, admission controllers can act as mutating or validating controllers or as a combination of both.

In this article we will create a validating controller that checks the Docker image signature and disallows Kubernetes from running any unsigned image from a selected Azure Container Registry. We will create our Admission controller in .NET Core.

Writing a custom Admission Controller

Writing an Admission Controller is fairly easy and straightforward: it is just a web API (webhook) that gets invoked by the API server when a resource is about to be created, updated, or deleted. The webhook responds to that request with an Allowed or Disallowed flag.

Kubernetes uses mutual TLS for all communication with Admission controllers; hence, we need to create a self-signed certificate for our Admission controller service.

Creating Self Signed Certificates

The following bash script generates a custom CA (certificate authority) and a key/certificate pair for our webhook API.

#!/usr/bin/env bash

# Generate the CA cert and private key
openssl req -nodes -new -x509 -keyout ca.key -out ca.crt -subj "/CN=TailSpin CA"
# Generate the private key for the tailspin server
openssl genrsa -out tailspin-server-tls.key 2048
# Generate a Certificate Signing Request (CSR) for the private key, and sign it with the private key of the CA.

## NOTE
## The CN should be in this format: <service name>.<namespace>.svc
openssl req -new -key tailspin-server-tls.key -subj "/CN=tailspin-admission.tailspin.svc" \
    | openssl x509 -req -CA ca.crt -CAkey ca.key -CAcreateserial -out tailspin-server-tls.crt

We will create a simple ASP.NET Core web API project; it's as simple as a File > New ASP.NET web API project. The only change is that we configure Kestrel to use the TLS certificates (created above) when listening, by modifying Program.cs as below.

public static IHostBuilder CreateHostBuilder(string[] args) =>
  Host.CreateDefaultBuilder(args)
      .ConfigureWebHostDefaults(webBuilder =>
      {
        webBuilder
          .UseStartup<Startup>()
          .UseKestrel(options =>
          {
            options.ConfigureHttpsDefaults(connectionOptions =>
            {
              connectionOptions.AllowAnyClientCertificate();
              connectionOptions.OnAuthenticate = (context, authOptions) =>
              {
                Console.WriteLine($"OnAuthenticate:: TLS Connection ID: {context.ConnectionId}");
              };
            });
            options.ListenAnyIP(443, async listenOptions =>
            {
              // Load the PEM certificate and key embedded as resources and build a PFX for Kestrel
              var certificate = await ResourceReader.GetEmbeddedStreamAsync(ResourceReader.Certificates.TLS_CERT);
              var privateKey = await ResourceReader.GetEmbeddedStreamAsync(ResourceReader.Certificates.TLS_KEY);

              Console.WriteLine("Certificate and Key received...creating PFX..");
              var pfxCertificate = CertificateHelper.GetPfxCertificate(certificate, privateKey);
              listenOptions.UseHttps(pfxCertificate);
              Console.WriteLine("Service is listening to TLS port");
            });
          });
      });

Next, we will create a middleware handler in Startup.cs to handle the admission validation requests.

        // This method gets called by the runtime. Use this method to configure the HTTP request pipeline.
        public void Configure(IApplicationBuilder app, IWebHostEnvironment env)
        {
            app.Use(GetAdmissionMiddleware());
            // ... remaining pipeline configuration
        }

The implementation of the middleware looks as follows:

public Func<HttpContext, Func<Task>, Task> GetAdmissionMiddleware()
{
    // Full resource ID of the trusted Azure Container Registry, e.g.
    // "/subscriptions/XX/resourceGroups/YY/providers/Microsoft.ContainerRegistry/registries/ZZ"
    var acrName = Environment.GetEnvironmentVariable("RegistryFullName");
    return async (context, next) =>
    {
        if (context.Request.Path.HasValue && context.Request.Path.Value.Equals("/admission"))
        {
            using var streamReader = new StreamReader(context.Request.Body);
            var body = await streamReader.ReadToEndAsync();
            var payload = JsonConvert.DeserializeObject<AdmissionRequest>(body);

            var allowed = true;
            if (payload.Request.Operation.Equals("CREATE") || payload.Request.Operation.Equals("UPDATE"))
            {
                // Every container image in the pod spec must be signed for the pod to be admitted
                foreach (var container in payload.Request.Object.Spec.Containers)
                {
                    allowed = (await RegistryHelper.VerifyAsync(acrName, container.Image)) && allowed;
                }
            }
            await GenerateResponseAsync(context, payload, allowed);
        }
        await next();
    };
}

What we are doing here is: once we receive a request from the API server about a pod to be created or updated, we intercept the request, validate whether the image has a digital signature in place (DCT), and reply to the API server accordingly. Here's how we generate the response to the API server:

private static async Task GenerateResponseAsync(HttpContext context, AdmissionRequest payload, bool allowed)
        {
            context.Response.Headers.Add("content-type", "application/json");
            await context
                .Response
                .WriteAsync(JsonConvert.SerializeObject(new AdmissionReviewResponse
                {
                    ApiVersion = payload.ApiVersion,
                    Kind = payload.Kind,
                    Response = new ResponsePayload
                    {
                        Uid = payload.Request.Uid,
                        Allowed = allowed
                    }
                }));
        }

Now let’s see how we can verify the image has a signed tag in Azure Container Registry.

public class RegistryHelper
    {
        private static HttpClient http = new HttpClient();

        public async static Task<bool> VerifyAsync(string acrName, string imageWithTag)
        {
            var imageNameWithTag = imageWithTag.Split(":".ToCharArray());
            var credentials = SdkContext
                .AzureCredentialsFactory
                .FromServicePrincipal(Environment.GetEnvironmentVariable("ADClientID"),
                Environment.GetEnvironmentVariable("ADClientSecret"),
                Environment.GetEnvironmentVariable("ADTenantID"),
                AzureEnvironment.AzureGlobalCloud);

            var azure = Azure
                .Configure()
                .WithLogLevel(HttpLoggingDelegatingHandler.Level.Basic)
                .Authenticate(credentials)
                .WithSubscription(Environment.GetEnvironmentVariable("ADSubscriptionID"));
           
            var azureRegistry = await azure.ContainerRegistries.GetByIdAsync(acrName);
            var creds = await azureRegistry.GetCredentialsAsync();           

            var authenticationString = $"{creds.Username}:{creds.AccessKeys[AccessKeyType.Primary]}";
            var base64EncodedAuthenticationString = 
                Convert.ToBase64String(ASCIIEncoding.UTF8.GetBytes(authenticationString));
            http.DefaultRequestHeaders.Add("Authorization", "Basic " + base64EncodedAuthenticationString);

            var response = await http.GetAsync($"https://{azureRegistry.Name}.azurecr.io/acr/v1/{imageNameWithTag[0]}/_tags/{imageNameWithTag[1]}");
            var repository = JsonConvert.DeserializeObject<RepositoryTag>(await response.Content.ReadAsStringAsync());

            return repository.Tag.Signed;
        }
    }

This class uses the Azure Container Registry REST API to check whether the image tag was digitally signed. That's all. We then create a Docker image (tailspin-admission:latest) for this application and deploy it to Kubernetes with the following manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: tailspin-admission
  namespace: tailspin
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tailspin-admission
  template:
    metadata:
      labels:
        app: tailspin-admission
    spec:
      nodeSelector:
        "beta.kubernetes.io/os": linux
      containers:
      - name: tailspin-admission
        image: "acr.io/tailspin-admission:latest"
        ports:
        - containerPort: 443
---
apiVersion: v1
kind: Service
metadata:
  name: tailspin-admission
  namespace: tailspin
spec:
  type: LoadBalancer
  selector:
    app: tailspin-admission
  ports:
    - port: 443
      targetPort: 443
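
Before applying that manifest, the image has to be built and pushed to a registry the cluster can pull from, and the tailspin namespace has to exist. A sketch with placeholder names (the image reference in the manifest above is a placeholder as well):

az acr login --name <RegistryName>
docker build -t <RegistryName>.azurecr.io/tailspin-admission:latest .
docker push <RegistryName>.azurecr.io/tailspin-admission:latest

kubectl create namespace tailspin
kubectl apply -f ./manifests/tailspin-admission.yaml   # hypothetical file name for the manifest above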

Now that our Admission Controller is running, we need to register it as a validating webhook with the following manifest:

apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
  name: tailspin-admission-controller
webhooks:
  - name: tailspin-admission.tailspin.svc
    admissionReviewVersions: ["v1", "v1beta1"]
    failurePolicy: Fail
    clientConfig:
      service:
        name: tailspin-admission
        namespace: tailspin
        path: "/admission"
      caBundle: ${CA_PEM_B64}
    rules:
      - operations: [ "*" ]
        apiGroups: [""]
        apiVersions: ["*"]
        resources: ["*"]

Here, ${CA_PEM_B64} needs to be replaced with the base64-encoded CA certificate that we generated above. The following bash script does that:

# Read the PEM-encoded CA certificate, base64 encode it, and replace the `${CA_PEM_B64}` placeholder in the YAML
# template with it. Then, create the Kubernetes resources.
ca_pem_b64="$(openssl base64 -A <"../certs/ca.crt")"
(sed -e 's@${CA_PEM_B64}@'"$ca_pem_b64"'@g' <"./manifests/webhook-deployment.template.yaml") > "./manifests/webhook-deployment.yaml"

We can now deploy this manifest to our AKS cluster with kubectl, and we are done! The cluster will now prevent any unsigned image from our Azure Container Registry from running.
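
For completeness, the final step is simply applying the rendered manifest produced by the script above:

kubectl apply -f ./manifests/webhook-deployment.yaml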

Summary

A critical aspect of a DCT feature for AKS is a solution that can satisfy requirements such as content moving between repositories or across registries, which may extend beyond the current scope of the Azure Container Registry content trust feature as it exists today. Also, enabling DCT on each AKS node is not a feasible solution, as many Kubernetes system images are not signed.

A custom Admission controller allows you to avoid these limitations and this complexity today, while still being able to enforce content trust for your own ACR and images only. This article shows a quick and simple way to create a custom Admission Controller in .NET Core, and how easy it is to implement your own security policy for your AKS cluster.

Hope you find it useful and interesting.

Azure AD Pod Identity – password-less app-containers in AKS

Background

I have liked Azure Managed Identity since its advent. The concept behind Managed Identity is clever, and it adds observable value to any DevOps team. All concerns with password configuration in multiple places, life-cycle management of secrets, certificates, and rotation policies suddenly become irrelevant (OK, in most cases).
Leveraging Managed Identity for applications hosted in Azure Virtual Machines, Azure Web Apps, Function Apps, etc. was straightforward. Managed Identity sits on top of the Azure Instance Metadata Service. Azure's Instance Metadata Service is a REST endpoint accessible to all IaaS VMs created via Azure Resource Manager. The endpoint is available at a well-known, non-routable IP address (169.254.169.254) that can be accessed only from within the VM. Under the hood, Azure VMs, VMSS, and Azure PaaS resources (i.e. Web Apps, Function Apps, etc.) leverage the metadata service to retrieve Azure AD tokens. Thus a VM or Web App essentially establishes its own "application identity" (which is what Managed Identity is) that Azure AD authenticates.
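
From inside such a VM, the token flow can be seen with a plain HTTP call to the metadata endpoint. A sketch (the resource parameter decides which Azure service the token is scoped to):

curl -s -H "Metadata: true" \
  "http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fmanagement.azure.com%2F"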

Managed Identity in Azure Kubernetes Service

Managed Identity in Kubernetes, however, is a different ballgame. Typically, multiple applications (often developed by different teams in an organization) run in a single cluster, and pods launch and exit frequently on different nodes. Hence, a Managed Identity associated with the VM/VMSS is not sufficient; we need a way to assign an identity to every pod in an application. If pods move to different nodes, the identity must somehow move with them to the new node (VM).

Azure Pod Identity

The good news is that Azure Pod Identity offers exactly that capability. Azure Pod Identity is an open-source project on GitHub.

Note: Managed pod identities is an open source project and is not supported by Azure technical support.

An application can use Azure Pod Identity to access Azure resources (i.e. Key Vault, Storage, Azure SQL Database, etc.) via Managed Identity; hence, there is no secret or password involved anywhere in the process. Pods can fetch access tokens scoped to resources directly from Azure Active Directory.

Concept

The following two components are installed in the cluster to achieve pod identity.

1. The Node Management Identity (NMI)

The AKS cluster runs this DaemonSet on every node. It intercepts outbound calls from pods requesting access tokens and proxies those calls using a predefined Managed Identity.

2. The Managed Identity Controller (MIC)

MIC is a central pod with permissions to query the Kubernetes API server and checks for an Azure identity mapping that corresponds to a pod.

Source: GitHub Project – Azure Pod Identity

When pods request access to an Azure service, network rules redirect the traffic to the Node Management Identity (NMI) server. The NMI server identifies pods that request access to Azure services based on their remote address and queries the Managed Identity Controller (MIC). The MIC checks for Azure identity mappings in the AKS cluster, and the NMI server then requests an access token from Azure Active Directory (AD) based on the pod’s identity mapping. Azure AD provides access to the NMI server, which is returned to the pod. This access token can be used by the pod to then request access to services in Azure.

Microsoft

Azure Pipeline to Bootstrap pod Identity

I started with an existing RBAC-enabled Kubernetes cluster that I had created earlier. Azure AD Pod Identity sets up a service account, custom resource definitions (CRDs), cluster roles and bindings, the DaemonSet for NMI, etc. I wanted to do it via a pipeline so I can repeat the process on demand. Here is the interesting part of azure-pod-identity-setup-pipeline.yaml:

trigger:
- master
variables:
  tag: '$(Build.BuildId)'
  containerRegistry: $(acr-name).azurecr.io
  vmImageName: 'ubuntu-latest'
stages:
- stage: Build
  displayName: Aad-Pod-Identity-Setup
  jobs:  
  - job: Build
    displayName: Setup Aad-Pod-Identity.
    pool:
      vmImage: $(vmImageName)
    environment: 'Kubernetes-Cluster-Environment.default'
    steps:
      - bash: |
          kubectl apply -f https://raw.githubusercontent.com/Azure/aad-pod-identity/master/deploy/infra/deployment-rbac.yaml
        displayName: 'Setup Service Account, CRD, DaemonSet etc'


I will be using a user-assigned identity for my sample application. I have written a basic .NET app with a SQL back end for this purpose. My end goal is to allow the .NET app to talk to SQL Server with a pod identity.
Following the aad-pod-identity project instructions, I created the user-assigned identity.

      - task: AzureCLI@2
        inputs:
          scriptType: 'bash'
          scriptLocation: 'inlineScript'
          inlineScript: 'az identity create -g $(rgp) -n $(uaiName) -o json'

In my repository, I created the Azure Identity definition in a file named aad-pod-identity.yaml:

apiVersion: "aadpodidentity.k8s.io/v1"
kind: AzureIdentity
metadata:
  name: <a-idname>
spec:
  type: 0
  ResourceID: /subscriptions/<sub>/resourcegroups/<resourcegroup>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<name>
  ClientID: <clientId>
---
apiVersion: "aadpodidentity.k8s.io/v1"
kind: AzureIdentityBinding
metadata:
  name: demo1-azure-identity-binding
spec:
  AzureIdentity: <a-idname>
  Selector: managed-identity


And added a task to deploy that too.

      - bash: |
          kubectl apply -f manifests/aad-pod-identity.yaml
        displayName: 'Setup Azure Identity'

Triggered the pipeline and Azure pod Identity was ready to roll.
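
A quick way to verify the setup (a sketch, assuming the default deployment-rbac.yaml manifest, which at the time deployed MIC and the NMI DaemonSet into the default namespace):

kubectl get pods -o wide | grep -E 'mic-|nmi-'
kubectl get crd | grep aadpodidentity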

Deploying application

I have a .NET Core web app (a Razor application) that I configure to run with pod identity and connect to Azure SQL (the back end) using Azure Active Directory authentication, with no password configured at the application level.
Here's the manifest file (front-end.yaml) for the application; the crucial part is defining the label (aadpodidbinding: managed-identity) that matches the pod identity binding we defined before:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dysnomia-frontend
spec:
  replicas: 6
  selector:
    matchLabels:
      app: dysnomia-frontend
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  minReadySeconds: 5
  template:
    metadata:
      labels:
        app: dysnomia-frontend
        aadpodidbinding: managed-identity
    spec:
      nodeSelector:
        "beta.kubernetes.io/os": linux
      containers:
      - name: dysnomia-frontend
        image: #{containerRegistry}#/dysnomia-frontend:#{Build.BuildId}#
        imagePullPolicy: "Always"

That's pretty much it. Once I created my application pipeline and deployed the above manifest, the .NET application could connect to the Azure SQL database with the assigned pod identity.

      - bash: |
          kubectl apply -f manifests/front-end.yaml
        displayName: 'Deploy Front-end'


Of course, I needed to create a role assignment for the user-assigned identity and enable Azure AD authentication on my SQL server. I am not describing those steps here, as I have written about them before.
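
For reference, those steps look roughly like this. A sketch with placeholder values (the exact role depends on what the application needs):

# Give the user-assigned identity access to the resources it needs (example: Reader on a resource group)
az role assignment create \
    --assignee <identity-client-id> \
    --role Reader \
    --scope /subscriptions/<sub>/resourceGroups/<resourcegroup>

# Set an Azure AD admin on the SQL server so AAD-based logins/users can be created for the identity
az sql server ad-admin create \
    --resource-group <resourcegroup> -s <sql-server-name> \
    --display-name <admin-display-name> --object-id <admin-object-id>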

What about non-Azure resources?

The above holds true for all Azure resources that support Managed Identity. That means our application can connect to Cosmos DB, Storage Accounts, Service Bus, Key Vault, and many other Azure resources without configuring any passwords or secrets anywhere in Kubernetes.
However, there are scenarios where we might want to run a Redis container or a SQL Server container in our Kubernetes cluster. Cost-wise, it might make sense for many use cases to run it in Kubernetes (as you already have a cluster) instead of Azure PaaS (i.e. Azure SQL or Azure Cache for Redis). In those cases, we must create the SQL password and configure it into our .NET app (using Kubernetes secrets).
I was wondering if I could store my SQL password in Azure Key Vault and let my SQL container and .NET app both collect the password from Key Vault during launch using Azure Pod Identity. Kubernetes has a first-class option to handle such scenarios: Kubernetes secrets.

However, today I am playing with Azure AD Pod Identity; therefore, I really wanted to use pod identity, just for fun ;-). Here's how I managed to make it work.

SQL container, pod identity and Azure Key vault

I created a Key Vault and defined the SQL Server password as a secret there. I configured my .NET app to use pod identity to talk to Key Vault, and I configured the Key Vault access policy so the user-assigned identity created above can grab the SQL password. So far so good.

Now I wanted to run a SQL Server instance in my cluster that also collects the password from Key Vault, the same way the .NET app does. It turned out the SQL Server 2019 image (mcr.microsoft.com/mssql/server:2019-latest) expects the password as an environment variable at container launch.

docker run -d -p 1433:1433 `
           -e "ACCEPT_EULA=Y" `
           -e "SA_PASSWORD=P@ssw0rD" `
           mcr.microsoft.com/mssql/server:2019-latest

Initially, I thought it would be easy to use an init container to grab the password from Azure Key Vault and then pass it to the application container (SQL) as an environment variable. After some failed attempts I realized that isn't trivial. I could of course create volume mounts (e.g. emptyDir) to convey the password from the init container to the application container, but that is rather dirty, isn't it?
Secondly, I thought of creating my own Docker image based on the SQL Server image; then I could run a small script that grabs the password from Key Vault and sets it as an environment variable. A simple script with a few curl commands would do the trick, you might think. Well, there are a few small issues: the SQL Server container image is a stripped-down Ubuntu base that does not have apt, curl, etc., and it does not run as root either, for all good reasons.


So, I have written a small program in Go and compiled it to a binary.

package main
import (
    "encoding/json"
    "fmt"
    "io/ioutil"
    "net/http"
    "os"
)
type TokenResponse struct {
    Token_type   string `json:"token_type"`
    Access_token string `json:"access_token"`
}
type SecretResponse struct {
    Value string `json:"value"`
    Id    string `json:"id"`
}
func main() {
    var p TokenResponse
    var s SecretResponse
    imds := "http://169.254.169.254/metadata/identity/oauth2/token" +
            "?api-version=2018-02-01&resource=https%3A%2F%2Fvault.azure.net"
    kvUrl := "https://" + os.Args[1] + "/secrets/" + os.Args[2] + 
             "?api-version=2016-10-01"

    client := &http.Client{}
    req, _ := http.NewRequest("GET", imds , nil)
    req.Header.Set("Metadata", "True")
    res, _ := client.Do(req)
    b, _ := ioutil.ReadAll(res.Body)
    json.Unmarshal(b, &p)

    req, _ = http.NewRequest("GET", kvUrl , nil)
    req.Header.Set("Authorization", p.Token_type+" "+p.Access_token)
    res, _ = client.Do(req)
    b, _ = ioutil.ReadAll(res.Body)
    json.Unmarshal(b, &s)
    fmt.Println(s.Value)
}

This program simply grabs the secret from Azure Key vault using Managed Identity. Created a binary out of it:

go build -o aadtoken


Next, I created my SQL container image with the following Dockerfile:

FROM mcr.microsoft.com/mssql/server:2019-latest

ENV ACCEPT_EULA=Y
ENV MSSQL_PID=Developer
ENV MSSQL_TCP_PORT=1433 
COPY ./aadtoken /
COPY ./startup.sh /
CMD [ "/bin/bash", "./startup.sh" ] 

You see, I am relying on a startup.sh bash script. Here it is:

echo "Retrieving AAD Token with Managed Identity..."
export SA_PASSWORD=$(./aadtoken $KeyVault $SecretName)
echo "SQL password received and cofigured successfully"
/opt/mssql/bin/sqlservr --accept-eula
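
Building and pushing the image looks roughly like this. A sketch; the registry name is a placeholder and must match the #{ACR.Name}# token in the manifest below (note that the aadtoken binary has to be built for linux/amd64 to run inside this image):

az acr login --name <acr-name>
docker build -t <acr-name>.azurecr.io/sql-server:2019-latest .
docker push <acr-name>.azurecr.io/sql-server:2019-latest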

With the image built and pushed, here's my SQL manifest to deploy to Kubernetes:

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: mssql-deployment
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: mssql
        aadpodidbinding: managed-identity
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: mssql
        image: #{ACR.Name}#/sql-server:2019-latest
        ports:
        - containerPort: 1433
        env:
        - name: KeyVault
          value: "#{KeyVault.Name}#"
        - name: SecretName
          value: "SQL_PASSWORD"
        volumeMounts:
        - name: mssqldb
          mountPath: /var/opt/mssql
      volumes:
      - name: mssqldb
        persistentVolumeClaim:
          claimName: mssql-data

I deployed the manifest and voilà! It all works. The SQL pod and the .NET app pods all use Managed Identity to connect to Key Vault and retrieve the secret, which is stored centrally in one place, keeping password management nice and tidy.

Conclusion

I find Azure Pod Identity a neat feature and a security best practice, especially when you are using Azure Kubernetes Service together with other Azure resources (i.e. Cosmos DB, Azure SQL, Key Vault, Storage Accounts, Service Bus, etc.). If you didn't know about it, I hope this makes you enthusiastic enough to investigate further.

Disclaimer 1: It's an open-source project, so Azure technical support doesn't apply.

Disclaimer 2: The SQL container part with the Go script is purely a fun learning attempt; don't take it too seriously!

Linkerd in Azure Kubernetes Service cluster

In this article I document my journey of setting up the Linkerd service mesh on Azure Kubernetes Service.

Background

I have a tiny Kubernetes cluster. I run some workloads there; some are useful, others are just try-outs and fun stuff. I have a few services that need to talk to each other. I do not have a lot of traffic, to be honest, but I sometimes curiously run Apache ab to simulate load and see how my services perform under stress. Until very recently I was using a messaging (basically pub-sub) pattern to create reactive service-to-service communication. That works great but often comes with latency. I can only imagine that if I were to run this kind of service-to-service communication in a mission-critical, high-traffic, performance-driven scenario (an online game, for instance), this model wouldn't fly well. There comes the need for a direct service-to-service communication pattern in the cluster.

What's the big deal? We can make REST calls between services, or even implement gRPC for that matter. The issue is that things behave differently at scale. When many services talk to many others, nodes fail in between, the network addresses of pods change, new pods show up and others go down, and figuring out where a service sits becomes quite a challenging task.

Then Kubernetes comes to the rescue: Kubernetes provides the "service" abstraction, which gives us service discovery out of the box. Which is awesome. Not all issues disappear, though. Services in a cluster need fault tolerance, traceability and, most importantly, observability. Circuit breakers, retry logic, etc.: implementing them for each service is again a challenge. This is exactly what a service mesh addresses.

Service mesh

From thoughtworks radar:

Service mesh is an approach to operating a secure, fast and reliable microservices ecosystem. It has been an important steppingstone in making it easier to adopt microservices at scale. It offers discovery, security, tracing, monitoring and failure handling. It provides these cross-functional capabilities without the need for a shared asset such as an API gateway or baking libraries into each service. A typical implementation involves lightweight reverse-proxy processes, aka sidecars, deployed alongside each service process in a separate container. Sidecars intercept the inbound and outbound traffic of each service and provide cross-functional capabilities mentioned above.

Some of us might remember aspect-oriented programming (AOP), where we used to separate cross-cutting concerns from our core business concerns. A service mesh is no different: it isolates (in a separate container) these networking and fault-tolerance concerns from the core capabilities (also running in containers).

Linkerd

There are quite a few service mesh solutions out there, all suitable to run in Kubernetes. I have used Envoy and Istio before. They work great in Kubernetes as well as in VM-hosted clusters. However, I must admit I have developed a preference for Linkerd since I discovered it. Let's briefly look at how Linkerd works. Imagine the following two services, Service A and Service B, where Service A talks to Service B.

service-2-service

When Linkerd is installed, it acts as an interceptor for all communication between services. Linkerd uses the sidecar pattern to proxy the communication by updating the pod's iptables rules.

Linkerd-architecture.png

Linkerd injects two additional containers into our pods. The init container configures the iptables rules so that incoming and outgoing TCP traffic flows through the Linkerd proxy container. The proxy container is the data plane that does the actual interception and provides all the other fault-tolerance goodies.

The primary reasons behind my Linkerd preference are performance and simplicity. Ivan Sim has done performance benchmarking of Linkerd and Istio:

Both the Linkerd2-meshed setup and Istio-meshed setup experienced higher latency and lower throughput, when compared with the baseline setup. The latency incurred in the Istio-meshed setup was higher than that observed in the Linkerd2-meshed setup. The Linkerd2-meshed setup was able to handle higher HTTP and GRPC ping throughput than the Istio-meshed setup.

Cluster provision

Spinning up AKS is easy as pie these days. We can use an Azure Resource Manager template or Terraform for that. I have used Terraform for this.

resource "azurerm_resource_group" "cloudoven" {
name = "cloudoven"
location = "West Europe"
}
resource "azurerm_kubernetes_cluster" "cloudovenaks" {
name = "cloudovenaks"
location = "${azurerm_resource_group.cloudoven.location}"
resource_group_name = "${azurerm_resource_group.cloudoven.name}"
dns_prefix = "cloudovenaks"
agent_pool_profile {
name = "default"
count = 1
vm_size = "Standard_D1_v2"
os_type = "Linux"
os_disk_size_gb = 30
}
agent_pool_profile {
name = "pool2"
count = 1
vm_size = "Standard_D2_v2"
os_type = "Linux"
os_disk_size_gb = 30
}
service_principal {
client_id = "98e758f8r-f734-034a-ac98-0404c500e010"
client_secret = "Jk==3djk(efd31kla934-=="
}
tags = {
Environment = "Production"
}
}
output "client_certificate" {
value = "${azurerm_kubernetes_cluster.cloudovenaks.kube_config.0.client_certificate}"
}
output "kube_config" {
value = "${azurerm_kubernetes_cluster.cloudovenaks.kube_config_raw}"
}


Service deployment

This is going to take a few minutes, and then we have a cluster. We will use the canonical emojivoto app (the buoyantio/emojivoto-*:v8 images) to test our Linkerd installation. Here's the Kubernetes manifest file for that.

apiVersion: v1
kind: Namespace
metadata:
  name: emojivoto
---
kind: ServiceAccount
apiVersion: v1
metadata:
  name: emoji
  namespace: emojivoto
---
kind: ServiceAccount
apiVersion: v1
metadata:
  name: voting
  namespace: emojivoto
---
kind: ServiceAccount
apiVersion: v1
metadata:
  name: web
  namespace: emojivoto
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  creationTimestamp: null
  name: emoji
  namespace: emojivoto
spec:
  replicas: 1
  selector:
    matchLabels:
      app: emoji-svc
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: emoji-svc
    spec:
      serviceAccountName: emoji
      containers:
      - env:
        - name: GRPC_PORT
          value: "8080"
        image: buoyantio/emojivoto-emoji-svc:v8
        name: emoji-svc
        ports:
        - containerPort: 8080
          name: grpc
        resources:
          requests:
            cpu: 100m
status: {}
---
apiVersion: v1
kind: Service
metadata:
  name: emoji-svc
  namespace: emojivoto
spec:
  selector:
    app: emoji-svc
  clusterIP: None
  ports:
  - name: grpc
    port: 8080
    targetPort: 8080
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  creationTimestamp: null
  name: voting
  namespace: emojivoto
spec:
  replicas: 1
  selector:
    matchLabels:
      app: voting-svc
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: voting-svc
    spec:
      serviceAccountName: voting
      containers:
      - env:
        - name: GRPC_PORT
          value: "8080"
        image: buoyantio/emojivoto-voting-svc:v8
        name: voting-svc
        ports:
        - containerPort: 8080
          name: grpc
        resources:
          requests:
            cpu: 100m
status: {}
---
apiVersion: v1
kind: Service
metadata:
  name: voting-svc
  namespace: emojivoto
spec:
  selector:
    app: voting-svc
  clusterIP: None
  ports:
  - name: grpc
    port: 8080
    targetPort: 8080
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  creationTimestamp: null
  name: web
  namespace: emojivoto
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web-svc
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: web-svc
    spec:
      serviceAccountName: web
      containers:
      - env:
        - name: WEB_PORT
          value: "80"
        - name: EMOJISVC_HOST
          value: emoji-svc.emojivoto:8080
        - name: VOTINGSVC_HOST
          value: voting-svc.emojivoto:8080
        - name: INDEX_BUNDLE
          value: dist/index_bundle.js
        image: buoyantio/emojivoto-web:v8
        name: web-svc
        ports:
        - containerPort: 80
          name: http
        resources:
          requests:
            cpu: 100m
status: {}
---
apiVersion: v1
kind: Service
metadata:
  name: web-svc
  namespace: emojivoto
spec:
  type: LoadBalancer
  selector:
    app: web-svc
  ports:
  - name: http
    port: 80
    targetPort: 80
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  creationTimestamp: null
  name: vote-bot
  namespace: emojivoto
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vote-bot
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: vote-bot
    spec:
      containers:
      - command:
        - emojivoto-vote-bot
        env:
        - name: WEB_HOST
          value: web-svc.emojivoto:80
        image: buoyantio/emojivoto-web:v8
        name: vote-bot
        resources:
          requests:
            cpu: 10m
status: {}


With this IaC in place, we can run terraform apply to provision our AKS cluster in Azure.
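
The provisioning itself is the usual Terraform workflow; a minimal sketch:

terraform init
terraform plan -out=aks.tfplan
terraform apply aks.tfplan

# merge the AKS credentials into the local kubeconfig (assumes the Azure CLI is logged in)
az aks get-credentials --resource-group cloudoven --name cloudovenaks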

Azure Pipeline

Let's create a pipeline for the service deployment. The easiest way to do that is to create a service connection to our AKS cluster. We go to the project settings of the Azure DevOps project, pick Service connections, and create a new service connection of type Kubernetes.

Azure DevOps connection

Installing Linkerd

We will create a pipeline that installs Linkerd into the AKS cluster. Azure Pipelines now offers pipeline-as-code, which is just a YAML file that describes the steps to be performed when the pipeline is triggered. We will use the following pipeline-as-code:

pool:
  name: Hosted Ubuntu 1604
steps:
- task: KubectlInstaller@0
  displayName: 'Install Kubectl latest'
- task: Kubernetes@1
  displayName: 'kubectl get'
  inputs:
    kubernetesServiceEndpoint: CloudOvenKubernetes
    command: get
    arguments: nodes
- script: |
    curl -sL https://run.linkerd.io/install | sh
    export PATH=$PATH:$HOME/.linkerd2/bin
    linkerd version
    linkerd check --pre
    linkerd install | kubectl apply -f -
    linkerd check
  displayName: 'Linkerd - Installation'

We can at this point trigger the pipeline to install Linkerd into the AKS cluster.

Linkerd installation (2)

Deployment of PODs and services

Let's create another pipeline as code that deploys all the services and deployment resources to AKS using the Kubernetes manifest file we defined earlier:

pool:
  name: Hosted Ubuntu 1604
steps:
- task: KubectlInstaller@0
  displayName: 'Install Kubectl latest'
- task: Kubernetes@1
  displayName: 'kubectl apply'
  inputs:
    kubernetesServiceEndpoint: CloudOvenKubernetes
    command: apply
    useConfigurationFile: true
    configuration: src/services/emojivoto/all.yml

In Azure Portal we can already see our services running:

Azure KS

Also in Kubernetes Dashboard:

Kub1

We have our services running, but they are not really affected by Linkerd yet. We will add another step to the pipeline to tell Linkerd to do its magic.

pool:
  name: Hosted Ubuntu 1604
steps:
- task: KubectlInstaller@0
  displayName: 'Install Kubectl latest'
- task: Kubernetes@1
  displayName: 'kubectl apply'
  inputs:
    kubernetesServiceEndpoint: CloudOvenKubernetes
    command: apply
    useConfigurationFile: true
    configuration: src/services/emojivoto/all.yml
- script: 'cat src/services/emojivoto/all.yml | linkerd inject - | kubectl apply -f -'
  displayName: 'Inject Linkerd'

Next, we trigger the pipeline and put some traffic into the services that we have just deployed. The emojivoto app simulates some service-to-service invocation scenarios, and now it's time for us to open the Linkerd dashboard to inspect the distributed traces and many other useful metrics.
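
The dashboard can be opened locally through the Linkerd CLI, which proxies it to localhost; a sketch:

linkerd dashboard &
# per-deployment live stats are also available directly from the CLI
linkerd -n emojivoto stat deployments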

linkerd-censored

We can also see a kind of application map, a graphical way to understand which service is calling which, what the request latencies are, and so on.

linkerd-graph

Even more fascinating, Linkerd provides drill-down into the communications in a Grafana dashboard.


Conclusion

I enjoyed setting this up and seeing the outcome, and I wanted to share my experience. If you are looking into service meshes and are reading this post, I strongly encourage you to give Linkerd a go; it's awesome!

Thanks for reading.