Bridge to Kubernetes – ship software with confidence

Bridge to Kubernetes is the successor of Azure Dev Spaces. Distributed software is composed of multiple services (often referred to as microservices) that depend on each other (one service invoking the APIs of another) to deliver capabilities to end users. While separating services brings flexibility in delivering features (or bug fixes) faster, it also adds complexity to the system and makes the developer workflow (the inner loop) harder. Imagine a product with three services: a backend (running a database, for instance), an API service (middleware that talks to the backend) and a frontend (that serves user interfaces to end users and invokes the API service). Running these services in a Kubernetes cluster means three different deployments, corresponding services and possibly an ingress object. When we think about the developer workflow for the API service, we immediately see the complexity: the API developers now need to run local copies of the backend and frontend services just to issue a request to debug or test their APIs. The API developers might not fully know how to set up the backend service on their machines, since it is built by a separate team. They either must fake that service (with proxies/stubs) or learn all the details of running the backend on their development workstation. This is cumbersome and still doesn't guarantee that their API service will behave exactly as expected when it runs in the test or production environment, which forces acceptance checks after deployment and increases lead time. It also makes it harder to reproduce issues on a local machine for troubleshooting.

This is where Bridge to Kubernetes (previously Azure Dev Spaces) comes to the rescue. Bridge to Kubernetes connects a development workstation to a Kubernetes cluster, eliminating the need to manually source, configure and compile external dependencies on the workstation. Environment variables, connection strings and volumes from the cluster are inherited and available to the microservice code running locally.

Setting up the environment

Let's create a simple scenario: we will create three .NET Core API applications, namely backend, API, and frontend. These apps do the bare minimum on purpose, to emphasize the specifics of Bridge to Kubernetes rather than distract with a lot of convoluted feature code. The backend looks like the following:

app.UseEndpoints(endpoints =>
{
    endpoints.MapGet("/", async context =>
    {
        await context.Response.WriteAsync("Hello from Backend!");
    });
});

Basically, the backend app exposes a single route, and any request that comes to it is served with a greeting. Next, we will look at the API app (the middleware that consumes the backend service):

app.UseEndpoints(endpoints =>
{
    endpoints.MapGet("/", async context =>
    {
        using var client = new System.Net.Http.HttpClient();
        var request = new System.Net.Http.HttpRequestMessage
        {
            RequestUri = new Uri("http://backend/")
        };
        var header = "kubernetes-route-as";
        if (context.Request.Headers.ContainsKey(header))
        {
            request.Headers.Add(header, context.Request.Headers[header] as IEnumerable<string>);
        }
        var response = await client.SendAsync(request);
        await context.Response.WriteAsync($"API bits {await response.Content.ReadAsStringAsync()}");

    });
});

It is important to notice that we check for any header with the “kubernetes-route-as” key and, when present, propagate it to the upstream invocation. This will be required later when we see Bridge to Kubernetes in action.

Lastly, we have our frontend service, which invokes the API service (just as the API invokes the backend):

app.UseEndpoints(endpoints =>
{
    endpoints.MapGet("/", async context =>
    {
        using var client = new System.Net.Http.HttpClient();
        var request = new System.Net.Http.HttpRequestMessage
        {
            RequestUri = new Uri("http://api/")
        };
        var header = "kubernetes-route-as";
        if (context.Request.Headers.ContainsKey(header))
        {
            request.Headers.Add(header, context.Request.Headers[header] as IEnumerable<string>);
        }
        var response = await client.SendAsync(request);
        await context.Response.WriteAsync($"Front End Bit --> {await response.Content.ReadAsStringAsync()}");
    });
});

Now we will build all these services and deploy them into the cluster (Azure Kubernetes Service). To keep things simple and easy to follow, we will deploy them using manifest files (as opposed to Helm charts).

The backend manifest looks like the following:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
  namespace: b2kapp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      containers:
      - name: backend
        image: private-registry/backend:beta
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: backend
  namespace: b2kapp
spec:
  type: ClusterIP
  ports:
  - port: 80
  selector:
    app: backend

The manifest for the API looks almost identical to the one above:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  namespace: b2kapp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
      - name: api
        image: moimhossain/api:beta
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: api
  namespace: b2kapp
spec:
  type: ClusterIP
  ports:
  - port: 80
  selector:
    app: api

Finally, the manifest for the frontend service:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
  namespace: b2kapp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - name: frontend
        image: moimhossain/frontend:beta1
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: frontend
  namespace: b2kapp
spec:
  type: ClusterIP
  ports:
  - port: 80
  selector:
    app: frontend
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: frontend-ingress
  namespace: b2kapp
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/ssl-redirect: "false"
    nginx.ingress.kubernetes.io/rewrite-target: /$1
spec:
  rules:
  - host: octo-lamp.nl
    http:
      paths:
      - backend:
          serviceName: frontend
          servicePort: 80
        path: /(.*)

Notice that we added an ingress resource for the frontend only, and on a specific host name. I have used the NGINX ingress controller for the cluster and mapped the external IP of the ingress controller to an Azure DNS zone.
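
To reproduce that mapping, a minimal sketch could look like this (assuming the NGINX ingress controller was installed in the ingress-nginx namespace with the default service name, and that an Azure DNS zone exists for octo-lamp.nl; adjust the names to your setup):

# Find the public IP assigned to the ingress controller's LoadBalancer service
kubectl get service ingress-nginx-controller -n ingress-nginx \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}'

# Point the apex record of the DNS zone at that IP
az network dns record-set a add-record \
  --resource-group <dns-resource-group> \
  --zone-name octo-lamp.nl \
  --record-set-name "@" \
  --ipv4-address <EXTERNAL-IP>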

Applying all these manifests delivers the application at http://octo-lamp.nl.

Debugging scenario

Now that we have the application running, let's say we want to debug the middleware service, the API. Bridge to Kubernetes is a hybrid solution that requires installation on the Kubernetes cluster and on our local machine. In the cluster, we install only the routing manager component of Bridge to Kubernetes, using the following command:

kubectl apply -n b2kapp -f https://raw.githubusercontent.com/microsoft/mindaro/master/routingmanager.yml

At this point, if we list the pods in the namespace b2kapp, we should see the following:

K9s view of the AKS cluster
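
If you are not using K9s, the same check with plain kubectl against the b2kapp namespace is:

kubectl get pods -n b2kapp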

To debug the API service locally, we need to install the Bridge to Kubernetes extension for Visual Studio or VS Code (whichever you prefer); I will be using Visual Studio in this case. Open the API project in Visual Studio and you will notice a new launch profile: Bridge to Kubernetes. Select that profile and hit F5. You will be asked to configure Bridge to Kubernetes:

We select the correct namespace and service (in this case the API) to debug. One important option here is routing isolation mode. If checked, B2K offers a dynamic sub-route (with its own URL), and only traffic coming via that sub-route-specific URL is routed to our machine; regular traffic continues uninterrupted while we are debugging. Once you press OK, B2K sets up the cluster with a few Envoy proxies to route traffic to our local machine and hit any breakpoints we have set.

The routing magic is done by two processes running in the background on the local machine.

DSC.exe is the process that dynamically allocates ports on the local machine and uses Kubernetes port forwarding to bind those ports to an agent running in Kubernetes; that is how traffic is forwarded from the cloud to our local machine.

One thing to point out: we are not building any Docker images or running any Docker containers during debugging; it all happens directly on the local machine (the very typical way of debugging .NET or Node apps). This makes the setup fast and debugging a service lightweight.

The other process is EndpointManager.exe. This is the process that requires elevated permissions, because it modifies the hosts file on the local machine. That, in turn, allows the API app to resolve a non-existent backend URI (http://backend) on the local machine and route that traffic back to the cluster where the service is running. If you open C:\Windows\System32\drivers\etc\hosts while running the debugger, you will see these changes:

# Added by Bridge To Kubernetes
127.1.1.8 frontend frontend.b2kapp frontend.b2kapp.svc frontend.b2kapp.svc.cluster.local
127.1.1.7 backend backend.b2kapp backend.b2kapp.svc backend.b2kapp.svc.cluster.local
127.1.1.6 api api.b2kapp api.b2kapp.svc api.b2kapp.svc.cluster.local
# End of section

Running a pull request workflow

One can also run a pull request (PR) workflow using the capabilities of Bridge to Kubernetes. It allows a team to take a feature that is still in a feature branch (not yet merged to the release/master/main branch) and deploy it to Kubernetes using isolation mode. That way, you can test a single service with new features (or bug fixes) by visiting it through the sub-domain URI and see how it behaves in the cluster. Of course, all the dependent services are the real instances running in the cluster. This can really boost the confidence of releasing a feature or bug fix for any DevOps team.

The way you do that is to deploy a clone of the service (the API service in this example) and its pods with some specific labels and annotations. Let's say I have a manifest for the API service written specifically for the PR flow; it would look like below:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-PRBRANCENAME
  namespace:  b2kapp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: api-PRBRANCENAME
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  minReadySeconds: 5 
  template:
    metadata:
      annotations:
        routing.visualstudio.io/route-on-header: kubernetes-route-as=PRBRANCENAME
      labels:
        app: api-PRBRANCENAME
        routing.visualstudio.io/route-from: api
    spec:
      nodeSelector:
        "beta.kubernetes.io/os": linux
      containers:
      - name: api
        image: DOCKER_REGISTRY_NAME/b2k8s-api:DOCKER_IMAGE_VERSION
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: api-PRBRANCENAME
  namespace: b2kapp
spec:
  type: ClusterIP
  ports:
  - port: 80
  selector:
    app: api-PRBRANCENAME

All I need to do in the pipeline that builds the PR is to come up with a branch name (branch names are typically provided by tools like Jenkins or Azure DevOps in environment variables), replace the word PRBRANCENAME with that branch name, and then simply apply the manifest to the same namespace (a sketch of this step follows the list below). Once you do that, the routing manager does the following:

  • Duplicates all ingresses (including load balancer ingresses) found in the namespace using the PRBRANCENAME for the subdomain.
  • Creates an envoy pod for each service associated with duplicated ingresses with the PRBRANCENAME subdomain.
  • Creates an additional envoy pod for the service you are working on in isolation. This allows requests with the subdomain to be routed to your development computer.
  • Configures routing rules for each envoy pod to handle routing for services with the subdomain.
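
A minimal sketch of that substitution step, as an Azure DevOps pipeline task (the file name api-pr-template.yaml is a hypothetical placeholder for the manifest above; Build.SourceBranchName is a predefined variable, and depending on the trigger you may need System.PullRequest.SourceBranch instead):

- bash: |
    # Derive a DNS/label-safe branch name from the predefined pipeline variable
    BRANCH=$(echo "$(Build.SourceBranchName)" | tr '[:upper:]' '[:lower:]' | tr -cd 'a-z0-9-')
    # Substitute the placeholder and apply the PR clone into the same namespace
    sed "s/PRBRANCENAME/${BRANCH}/g" api-pr-template.yaml | kubectl apply -n b2kapp -f -
  displayName: 'Deploy isolated PR clone of the API service'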

Therefore, if we now visit PRBRANCENAME.octo-lamp.nl, we will see that requests are routed through the newly deployed API service (where the feature was built) while the rest of the traffic remains unchanged. A great way to build release confidence.

Conclusion

That's all for today. I seriously think it's a neat approach to build confidence for any DevOps team that runs services on Kubernetes.

Thanks for reading!

How to use ADFS/SAML2.0 as Identity provider with Azure AD B2C

Azure Active Directory B2C (Azure AD B2C) provides support for SAML 2.0 identity providers. With this capability, you can create a technical profile in Azure AD B2C to federate with a SAML-based identity provider, such as ADFS, and thus allow users to sign in with their existing enterprise identities. Microsoft has good docs on this topic; however, a few steps are currently not adequately documented and might lead to errors, unless you're an identity pro.

In this blog post, I will write down the steps you need to take to set up ADFS federation with Azure AD B2C, and the steps you should watch out for.

Setting up ADFS in Azure VM

We need an ADFS server for this exercise, so we will spin up a virtual machine in Azure with ADFS installed. This article explains the steps to stand up a virtual machine with ADFS in it. However, that article uses a self-signed certificate in the ADFS configuration, and Azure AD B2C won't appreciate that: Azure AD B2C only accepts certificates from well-known certificate authorities (CAs).

Creating certificate

We will use a free certificate from Let's Encrypt. I have used win-acme for that. Download win-acme from the site and launch it on a Windows machine. The crucial step is when you provide the domain name for the certificate: make sure you provide the DNS name of your virtual machine here.

One problem though: you won't be able to export the certificate with the private key, which is required by ADFS. You need additional steps to export the certificate with its private key, and here's an article that shows how to do that. With the certificate (including the private key) you can complete the ADFS setup as described in this article.
Once created, you might want to check the federation setup with a simple ASP.NET MVC app, just to make sure users defined in AD DS are able to sign in with your ADFS. I have a simple application in GitHub that allows you to do that. Just change the web.config file with the domain name of your ADFS.

<appSettings>
	<add key="ida:ADFSMetadata" value="https://woodbine.westeurope.cloudapp.azure.com/FederationMetadata/2007-06/FederationMetadata.xml" />
	<add key="ida:Wtrealm" value="https://localhost:44380/" />
</appSettings>

You also need to create a relying party trust in ADFS for this app. Make sure you have the following LDAP attributes mapped as outgoing claims in the claim rules of your relying party trust configuration.

Creating Azure AD B2C Tenant

Create a tenant for Azure AD B2C from the Azure Portal as described here.

Creating certificate

You need one more certificate, which Azure AD B2C will use to sign the SAML requests sent to ADFS (the SAML identity provider). For production scenarios you should use a proper CA certificate, but for non-production scenarios a self-signed certificate will suffice. You can generate one in PowerShell like below:

$tenantName = "<YOUR TENANT NAME>.onmicrosoft.com"
$pwdText = "<YOUR PASSWORD HERE>"

$Cert = New-SelfSignedCertificate -CertStoreLocation Cert:\CurrentUser\My -DnsName "$tenantName" -Subject "B2C SAML Signing Cert" -HashAlgorithm SHA256 -KeySpec Signature -KeyLength 2048
$pwd = ConvertTo-SecureString -String $pwdText -Force -AsPlainText
Export-PfxCertificate -Cert $Cert -FilePath .\B2CSigningCert.pfx -Password $pwd

Next, you need to upload the certificate to Azure AD B2C. In the Azure portal go to Azure AD B2C, then the Identity Experience Framework blade on the left, then Policy Keys (also in the left menu) and add a new policy key. Choose the upload option and provide the PFX file generated earlier. Name the policy key 'ADFSSamlCert' (Azure AD B2C will add a prefix on save, so it will look like B2C_1A_ADFSSamlCert).

Add signing and encryption keys

Follow the Microsoft documentation to accomplish the following:

  • Create the signing key.
  • Create the encryption key.
  • Register the IdentityExperienceFramework application.
  • Register the ProxyIdentityExperienceFramework application.

Creating custom policy for ADFS

Microsoft has a custom policy starter pack, which is super handy. However, you can also find the custom policy files, already customized with the technical profile for ADFS and with the user journeys defined in them, in my GitHub repository. Download the files from GitHub, search for the word woodbineb2c (the Azure AD B2C tenant that I have used) and replace it with your Azure AD B2C tenant name. Next, search for the word woodbine.westeurope.cloudapp.azure.com (the Azure VM where I have installed ADFS) and replace it with your ADFS FQDN.
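
If you prefer not to edit the six files by hand, a quick PowerShell sketch for that replacement (run from the folder containing the policy XML files; the replacement values are placeholders for your own tenant and ADFS host name):

Get-ChildItem -Filter *.xml | ForEach-Object {
    (Get-Content $_.FullName -Raw) `
        -replace 'woodbineb2c', 'yourtenant' `
        -replace 'woodbine\.westeurope\.cloudapp\.azure\.com', 'your-adfs-host.example.com' |
        Set-Content $_.FullName
}
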
Now, upload these XML files (6 files in total) in the Custom Policy blade of Azure AD B2C's Identity Experience Framework. Upload them in the following order:

1. TrustFrameworkBase.xml
2. TrustFrameworkExtensions.xml
3. SignUpOrSignInADFS.xml
4. SignUpOrSignin.xml
5. ProfileEdit.xml
6. PasswordReset.xml

Configure an AD FS relying party trust

To use AD FS as an identity provider in Azure AD B2C, you need to create an AD FS relying party trust with the Azure AD B2C SAML metadata. The SAML metadata URL of the Azure AD B2C technical profile typically looks like below:

https://your-tenant-name.b2clogin.com/your-tenant-name.onmicrosoft.com/your-policy/samlp/metadata?idptp=your-technical-profile

Replace the following values:

  • your-tenant-name with your tenant name, such as your-tenant.onmicrosoft.com.
  • your-policy with your policy name, for example B2C_1A_signup_signin_adfs.
  • your-technical-profile with the name of your SAML identity provider technical profile, for example Contoso-SAML2.

Follow this Microsoft document to configure the relying party trust. However, when defining the claim rules, map the DisplayName attribute to display_name or name (lower-case) in the outgoing claim.

Request Signing algorithm

You need to make sure that the SAML requests sent by Azure AD B2C are signed with the signature algorithm expected by AD FS. By default, Azure AD B2C seems to sign requests with rsa-sha1; therefore, make sure the same algorithm is selected in the AD FS relying party trust properties.
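
If you prefer PowerShell over the AD FS management console for this, a sketch (assuming the default rsa-sha1 behaviour described above; replace <RP Name> with your relying party trust name):

# Inspect the currently configured signature algorithm
Get-AdfsRelyingPartyTrust -Name "<RP Name>" | Select-Object Name, SignatureAlgorithm

# Align it with what Azure AD B2C signs requests with
Set-AdfsRelyingPartyTrust -TargetName "<RP Name>" -SignatureAlgorithm "http://www.w3.org/2000/09/xmldsig#rsa-sha1"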

Enforce signatures for SAML assertions in ADFS

Lastly, ADFS by default won't sign the assertions it sends to Azure AD B2C, and Azure AD B2C will throw errors when that's the case. You can resolve this by forcing ADFS to sign both the message and the assertions, by running the following PowerShell command on the ADFS server:

Set-AdfsRelyingPartyTrust -TargetName <RP Name> -SamlResponseSignature MessageAndAssertion

Create Application to test

I have a Node.js application that uses MSAL 2.0 to test this setup. I have customized the sample from a Microsoft quickstart sample. You can find the source code here. You need to modify the B2C endpoints before you launch it; here are the instructions on how to modify those endpoints.

Conclusion

There are quite a few steps to set things up and get the correct behavior. I hope this helps you if you are trying to set up AD FS as an identity provider for your Azure AD B2C tenant.

Thank you!

Azure Resource Governance with Template Specs & Biceps

All the example code is available in GitHub.

Background

Governance of cloud estates is challenging for businesses. It's crucial to enforce security policies, workload redundancy, uniformity (such as naming conventions), simplified deployments with packaged artifacts (i.e., ARM templates) and Azure role-based access control (Azure RBAC) across the enterprise.

Generally, the idea is that a centralized team (sometimes referred to as the platform team) builds and publishes Infrastructure-as-Code artifacts, and a number of product development teams consume them, providing only their own parameters.

Azure offers native capabilities like Azure Policy, Blueprints and management groups to address this problem, but there is a wide range of external solutions (Terraform, Pulumi, etc.) available too.

One attribute of Terraform that strikes me is the ability to store a versioned module in a registry and consume it from that registry. The same principle is familiar to engineers and widely used in programming languages, such as NuGet for .NET, Maven for Java and npm for Node.

With ARM templates it's rather unpleasant. If you currently keep your templates in an Azure Repos or GitHub repository, or in a storage account, you run into several challenges when trying to share and use them. For a user to deploy a template, it must either be local or reachable through a publicly accessible URL. To get around this limitation, you might share copies of the template with users who need to deploy it, or open up access to the repo or storage account. When users own local copies of a template, these copies can eventually diverge from the original. When you make a repo or storage account publicly accessible, you may allow unintended users to access the template.

Azure Resource Manager – Template Spec

Microsoft delivered some cool new features for Resource Manager templates recently. One of these features is named Template Spec. A Template Spec is a first-class Azure resource type, but it really is just a regular ARM template. The best part is that you can version it, persist it in Azure (just like a Terraform registry), share it across the organization with RBAC and consume it from the repository.

Template Specs is currently in preview. To use it, you must install the latest version of PowerShell or Azure CLI. For Azure PowerShell, use version 5.0.0 or later. For Azure CLI, use version 2.14.2 or later.

The benefit of using template specs is that you can create canonical templates and share them with teams in your organization. The template specs are secure because they’re available to Azure Resource Manager for deployment, but not accessible to users without Azure RBAC permission. Users only need read access to the template spec to deploy its template, so you can share the template without allowing others to modify it.
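
For example, granting a team read access to a published template spec could look like the sketch below (the group object ID and template spec resource ID are placeholders; the built-in Template Spec Reader role can be swapped for plain Reader if you prefer):

az role assignment create \
  --assignee <team-group-object-id> \
  --role "Template Spec Reader" \
  --scope <template-spec-resource-id>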

The templates you include in a template spec should be verified by the platform team (or administrators) in your organization to follow the organization’s requirements and guidance.

How does Template Spec work?

If you are familiar with ARM templates, Template Specs are not new to you. They are just typical ARM templates, stored in an Azure resource group as a "template spec" with a version number. That means you can take any ARM template (the template JSON file only, without any parameter files) and publish it as a Template Spec using PowerShell, Azure CLI or the REST API.

Publishing Template Spec

Let’s say I have a template that defines just an Application Insight component.

{
    "contentVersion": "1.0.0.0",
    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
    "parameters": {
        "appInsights": { "type": "string" },
        "location": { "type": "string" }
    },
    "resources": [{
            "type": "microsoft.insights/components",
            "apiVersion": "2020-02-02-preview",
            "name": "[parameters('appInsights')]",
            "location": "[parameters('location')]",
            "properties": {
                "ApplicationId": "[parameters('appInsights')]",
                "Application_Type": "web"
            }
        }
    ],
    "outputs": {
        "instrumentationKey": {
            "type": "string",
            "value": "[reference(parameters('appInsights')).InstrumentationKey]"
        }
    }
}

We can now publish this as a Template Spec using Azure CLI:

az ts create \
    --name "cloudoven-appInsights" \
    --version $VERSION \
    --resource-group $RESOURCE_GROUP \
    --location $LOCATION \
    --template-file "component.json" \
    --yes --query 'id' -o json

Once published, you can see it in the Azure Portal; it appears as a new resource type, Microsoft.Resources/templateSpecs.

Consuming Template Spec

Every published Template Spec has a unique ID. To consume a Template Spec, all you need is that ID. You can retrieve it with Azure CLI:

APPINS_TSID=$(az ts show --resource-group $TSRGP --name $TSNAME --version $VERSION --query 'id' -o json)
echo Template Spec ID: $APPINS_TSID

You can now deploy Azure resources with the Template Spec ID and, optionally, your own parameters, like the following:

az deployment group create \
  --resource-group $RESOURCEGROUP \
  --template-spec $APPINS_TSID \
  --parameters "parameters.json"
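
For the Application Insights template above, such a parameter file only needs to supply the two parameters the template declares (the values below are, of course, just examples):

{
    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "appInsights": { "value": "cloudoven-appins-dev" },
        "location": { "value": "westeurope" }
    }
}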

Linked Templates & Modularizations

Over time, Infrastructure-as-Code tends to become a big monolithic file containing numerous resources. ARM templates (thanks to all their verbosity) are especially known for growing big fast and becoming difficult to comprehend as one large JSON file. You could address this issue before by using linked templates, but with the caveat that linked templates had to be accessible via a URL on the public internet, which is far from ideal.

The good news is that Template Specs have this covered. If the main template for your Template Spec references linked templates, the PowerShell and CLI commands can automatically find and package the linked templates from your local drive.

Example

Here I have an example ARM template that defines multiple resources (Application Insights, a server farm and a web app) in small files, and finally a main template that brings everything together. One can then publish the main template as a Template Spec, so any consumer can provision their web app just by pointing to the ID of the Template Spec. Here's the interesting bit of the main template:

"resources": [
        {
            "type": "Microsoft.Resources/deployments",
            "apiVersion": "2020-06-01",
            "name": "DeployAppInsights",
            "properties": {
                "mode": "Incremental",
                "parameters": {
                    "appInsights": { "value": "[parameters('appInsights')]"},
                    "location": {"value": "[parameters('location')]"}
                },
                "templateLink": {                    
                    "relativePath": "../appInsights/component.json"
                }
            }
        },
        {
            "type": "Microsoft.Resources/deployments",
            "apiVersion": "2020-06-01",
            "name": "DeployHostingplan",
            "properties": {
                "mode": "Incremental",      
                "templateLink": {
                    "relativePath": "../server-farm/component.json"
                }
            }
        },
        {
            "type": "Microsoft.Resources/deployments",
            "apiVersion": "2020-06-01",
            "name": "DeployWebApp",
            "dependsOn": [ "DeployHostingplan" ],
            "properties": {
                "mode": "Incremental",             
                "templateLink": {
                    "relativePath": "../web-app/component.json"
                }
            }
        }
    ]

You see, Template Specs natively offer modularity and a centralized registry; however, they are still ARM JSON files. One common criticism of ARM templates is that they are too verbose, and JSON is not particularly famous for readability.

Microsoft is aiming to address these concerns with a new Domain Specific Language (DSL) named Azure Bicep.

What is Bicep?

Bicep aims to drastically simplify the authoring experience with a cleaner syntax and better support for modularity and code re-use. Bicep is a transparent abstraction over ARM and ARM templates, which means anything that can be done in an ARM Template can be done in bicep (outside of temporary known limitations).

If we take the same Application Insights component (above) and rewrite it in Bicep, it looks like the following:

param appInsights string
param location string = resourceGroup().location
resource appIns 'Microsoft.Insights/components@2020-02-02-preview' = {
  name: appInsights
  location: location
  kind: appInsights
  properties: {
    Application_Type: 'web'
  }
}
output InstrumentationKey string = appIns.properties.InstrumentationKey

Very clean and concise compared to the ARM JSON version. If you are coming from Terraform, you might already feel at home, because Bicep took a lot of inspiration from Terraform's HCL (HashiCorp Configuration Language). You save Bicep scripts with the .bicep file extension.

An important thing to understand: Bicep is a client-side language layer that sits on top of ARM JSON. The idea is that you write your code in Bicep, then compile the script with the Bicep compiler (or transpiler) to produce ARM JSON as the compiled artifact, and you still deploy an ARM template (JSON) to Azure. Here's how you compile a Bicep file to produce the ARM JSON:

bicep build ./main.bicep
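
By default the build produces a main.json next to the .bicep file, and that compiled JSON is what you ultimately deploy, for example (resource group and parameter values below are placeholders):

az deployment group create \
  --resource-group <resource-group> \
  --template-file ./main.json \
  --parameters appInsights=<app-insights-name> location=westeurope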

Bicep is currently in an experimental state and is not recommended for use in production.

Creating Template Spec in Bicep

In the example above, you've seen how to create a Template Spec that is modularized into linked templates. Let's rewrite that in Bicep and see how clean and simple it looks:

param webAppName string = ''
param appInsights string = ''
param location string = resourceGroup().location
param hostingPlanName string = ''
param containerSpec string =''
param costCenter string
param environment string

module appInsightsDeployment '../appinsights/component.bicep' = {
  name: 'appInsightsDeployment'
  params:{
    appInsights: '${appInsights}'
    location: '${location}'
    costCenter: costCenter
    environment: environment
  }
}

module deployHostingplan '../server-farm/component.bicep' = {
  name: 'deployHostingplan'
  params:{
    hostingPlanName:  '${hostingPlanName}'
    location: '${location}'
    costCenter: costCenter
    environment: environment    
  }   
}

module deployWebApp '../web-app/component.bicep' = {
  name: 'deployWebApp'
  params:{
    location: '${location}'
    webAppName: '${webAppName}'
    instrumentationKey: appInsightsDeployment.outputs.InstrumentationKey
    serverFarmId: deployHostingplan.outputs.hostingPlanId
    containerSpec: '${containerSpec}'
    costCenter: costCenter
    environment: environment    
  }   
}

Notice that Bicep comes with a nice module keyword to address the linked-template scenario. A Bicep module is an opaque set of one or more resources to be deployed together. It only exposes parameters and outputs as a contract to other Bicep files, hiding the details of how its internal resources are defined. This allows you to abstract away complex details of the raw resource declarations from the end user, who now only needs to be concerned with the module contract. Parameters and outputs are optional.

CI/CD – GitHub Action

Let's create a delivery pipeline in GitHub Actions to compile our Bicep files and publish them as Template Specs in Azure. The following GitHub workflow installs the Bicep tools, compiles the scripts, and finally publishes them as Template Specs in Azure. You can see the complete repository in GitHub.

jobs:  
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Install Bicep CLI
        working-directory: ./src/template-spec-bicep/paas-components
        run: |
          chmod +x ./install-bicep.sh
          ./install-bicep.sh
      - name: Compile Bicep Scripts
        working-directory: ./src/template-spec-bicep/paas-components
        run: |
          chmod +x ./build-templates.sh
          ./build-templates.sh
      - name: Azure Login
        uses: Azure/login@v1.1
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}
      - name: Deploy Template Specs to Azure
        working-directory: ./src/template-spec-bicep/paas-components        
        run: |          
          chmod +x ./deploy-templates.sh
          ./deploy-templates.sh

Consuming the Template Spec doesn't change regardless of whether you choose Bicep or ARM templates. Consumers just create a parameter file and deploy resources by specifying only the Template Spec ID. You can see an example of a consumer workflow (also a GitHub Action) here.

Conclusion

I am excited about how the Azure team is offering new tools like Bicep and Template Specs to simplify cloud governance and self-service. It's important to understand that the Bicep team is not competing with the available tools in this space (like Terraform, etc.); rather, it is offering more options to folks in their cloud journey.

Bicep is in an experimental phase now and Template Specs are still in preview; therefore, don't use them in production just yet.

Azure DevOps Security & Permissions REST API

Every few months I notice the following saga repeat: I face a challenge where I need to programmatically manage the security aspects of Azure DevOps resources (like repositories, pipelines, environments, etc.). I look up the Azure DevOps REST API documentation and realize that the Permissions & Security APIs are notoriously complicated and inadequately documented. So, I hit F12 to open the browser's developer tools and start intercepting HTTP requests, trying to guess what payloads are exchanged and coming up with the appropriate HTTP requests myself. However strange it might sound, this method usually works for me (it has actually worked almost every time), but it's a painful and time-consuming process. Recently I had to go through it one more time, and I promised myself that once I was done, I would write a blog post about it and put the code in a GitHub repository, so next time I will save myself some time and pain. That's exactly what this post is all about.

Security & Permission REST API

As I have said, the security REST API is relatively complicated and inadequately documented. Typically, each family of resources (work items, Git repositories, etc.) is secured using a different namespace. The first challenge is to find out the namespace IDs.

Then each security namespace contains zero or more access control lists. Each access control list contains a token, an inherit flag and a set of zero or more access control entries. Each access control entry contains an identity descriptor, an allowed permissions bitmask and a denied permissions bitmask.

Tokens are arbitrary strings representing resources in Azure DevOps. The token format differs per resource type; however, hierarchy and separator characters are common to all tokens. Now, where do you find these token formats? Well, I mostly find them by intercepting the browser's HTTP payloads. To save myself future effort, I have created a .NET object model around the security namespace IDs, permissions and tokens, so when I consume those libraries I can ignore these lower-level elements and work with higher-order APIs to manage permissions. You can look into the GitHub repository to learn about it. However, just to make it more fun to use, I have spent a bit of time creating a manifest file format (yes, stolen from the Kubernetes world) so that I can get my future jobs done just by writing YAML files, as opposed to .NET/C# code.
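
As a starting point, the security namespaces (and their IDs) can be listed with a single documented REST call; here is a sketch with curl (assuming ORG and PAT environment variables hold your organization name and a personal access token, and that jq is available):

curl -s -u ":${PAT}" \
  "https://dev.azure.com/${ORG}/_apis/securitynamespaces?api-version=6.0" \
  | jq '.value[] | {namespaceId, name}'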

Instructions to use

The repository contains two projects (one is a library that produces a DLL, and the other is a console application); the console executable is named azdoctl.exe.

The idea is to create a manifest file (in YAML format) and apply the changes via azdoctl.exe:

> azdoctl apply -f manifest.yaml

Manifest file

You need to create a manifest file to describe your Azure DevOps project and permissions. The format of the manifest file is YAML (the idea is borrowed from Kubernetes manifest files).

Schema

Here’s the schema of the manifest file:

apiVersion: apps/v1
kind: Project
metadata:
  name: Bi-Team-Project
  description: Project for BI Engineering team
template:
  name: Agile
  sourceControlType: Git

The manifest file starts with the team project name and description. Each manifest file can have only one team project definition.

Teams

Next, we can define teams for the project with following yaml block:

teams:
  - name: Bi-Core-Team
    description: The core team that run BI projects
    admins:
      - name: Timothy Green
        id: 4ae3c851-6ef3-4748-bef9-4f809736d538
      - name: Linda
        id: 9c5918c7-ef03-4059-a49e-aa6e6d761423
    membership:
      groups:
        - name: 'UX Specialists'
          id: a2931c86-e975-4220-aa89-dc3f952290f4
      users:
        - name: Timothy Green
          id: 4ae3c851-6ef3-4748-bef9-4f809736d538
        - name: Linda
          id: 9c5918c7-ef03-4059-a49e-aa6e6d761423

Here we can create teams and assign admins and members to them. All the references (names and IDs) must be valid in Azure Active Directory; the IDs are the object IDs of the groups or users in Azure Active Directory.

Repository

Next, we can define the repositories that must be created, along with the permissions to assign to them.

repositories:
  - name: Sample-Git-Repository
    permissions:
      - group: 'Data-Scientists'
        origin: aad
        allowed:
          - GenericRead
          - GenericContribute
          - CreateBranch
          - PullRequestContribute
      - group: 'BI-Scrum-masters'
        origin: aad
        allowed:
          - GenericRead
          - GenericContribute
          - CreateBranch
          - PullRequestContribute
          - PolicyExempt

Again, you can grant an Azure AD group very fine-grained permissions on each repository that you want to create.

List of all the allowed permissions:

        Administer,
        GenericRead,
        GenericContribute,
        ForcePush,
        CreateBranch,
        CreateTag, 
        ManageNote,    
        PolicyExempt,   
        CreateRepository, 
        DeleteRepository,
        RenameRepository,
        EditPolicies,
        RemoveOthersLocks,
        ManagePermissions,
        PullRequestContribute,
        PullRequestBypassPolicy

Environment

You can create environments and assign permissions to them with the following YAML block:

environments:
  - name: Development-Environment
    description: 'Deployment environment for Developers'
    permissions:
      - group: 'Bi-Developers'
        origin: aad
        roles: 
          - Administrator
  - name: Production-Environment
    description: 'Deployment environment for Production'
    permissions:
      - group: 'Bi-Developers'
        origin: aad
        roles: 
          - User        

Build and Release (pipeline) folders

You can also create folders for build and release pipelines and apply specific permissions during bootstrap. That way, teams can have fine-grained permissions on these folders.

Build Pipeline Folders

Here’s the snippet for creating build folders.

buildFolders:
  - path: '/Bi-Application-Builds'
    permissions:
      - group: 'Bi-Developers'
        origin: aad
        allowed:
          - ViewBuilds
          - QueueBuilds
          - StopBuilds
          - ViewBuildDefinition
          - EditBuildDefinition
          - DeleteBuilds

And, for the release pipelines:

releaseFolders:
  - path: '/Bi-Application-Relases'
    permissions:
      - group: 'Bi-Developers'
        origin: aad
        allowed:
          - ViewReleaseDefinition
          - EditReleaseDefinition
          - ViewReleases
          - CreateReleases
          - EditReleaseEnvironment
          - DeleteReleaseEnvironment
          - ManageDeployments

Once you have the yaml file defined, you can apply it as described above.

Conclusion

That’s pretty much it for today. By the way,

The code is provided as-is, with an MIT license. You can use it, replicate it and modify it as much as you wish. I would appreciate it if you acknowledge its usefulness, but that's not enforced. You are free to use it any way you want.

And that also means the author takes no responsibility for providing any guarantees.

Thanks!

Manage Kubernetes running anywhere via Azure Arc

Azure Arc (currently in preview) allows you to attach and configure Kubernetes clusters running anywhere (inside or outside of Azure). Once connected, the clusters show up in the Azure portal, where you can apply tags and policies like for any other resource. This brings simplicity and uniformity to managing both cloud and on-premises resources in a single management pane (the Azure Portal).

Azure Arc enabled Kubernetes is in preview. It’s NOT recommended for production workloads.

The following are the key scenarios where Azure Arc adds value:

  • Connect Kubernetes running outside of Azure for inventory, grouping, and tagging.
  • Apply policies by using Azure Policy for Kubernetes.
  • Deploy applications and apply configuration by using GitOps-based configuration management.
  • Use Azure Monitor for containers to view and monitor your clusters.

Connect an on-premises (or other cloud) cluster to Azure Arc

I have used a local Kubernetes cluster (Docker Desktop) for this; however, the steps are identical for any other Kubernetes cluster. All you need is to run the following Azure CLI command from a machine that can reach both the on-premises Kubernetes cluster and Azure.

az connectedk8s connect --name <ClusterName> --resource-group <ResourceGroup>

It takes a moment, and then the cluster is connected to Azure. We can see that in the Azure portal:

Once we have the cluster connected to Azure, we can create and edit tags just like for any other Azure resource. Which is awesome.
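
For example, tagging the connected cluster from the CLI could look like the sketch below (tag names and values are just examples):

az resource tag \
  --resource-group <ResourceGroup> \
  --name <ClusterName> \
  --resource-type "Microsoft.Kubernetes/connectedClusters" \
  --tags costcenter=research team=data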

The same goes for Azure Policy: I can apply compliance constraints to the cluster and monitor their compliance status in Azure Security Center.

GitOps on Arc enabled Kubernetes cluster

The next feature is interesting and can be very useful in many scenarios. It is much like infrastructure-as-code for your Kubernetes configuration (namespaces, deployments, etc.). The idea is that we define one or more Git repositories that keep the desired state of the cluster (i.e., namespaces, deployments, etc.) in YAML files, and Azure Resource Manager does the necessary actions to apply that desired state to the connected cluster. The Microsoft documentation describes how this works:

The connection between your cluster and one or more Git repositories is tracked in Azure Resource Manager as a sourceControlConfiguration extension resource. The sourceControlConfiguration resource properties represent where and how Kubernetes resources should flow from Git to your cluster. The sourceControlConfiguration data is stored encrypted at rest in an Azure Cosmos DB database to ensure data confidentiality.

The config-agent running in your cluster is responsible for watching for new or updated sourceControlConfiguration extension resources on the Azure Arc enabled Kubernetes resource, deploying a flux operator to watch the Git repository, and propagating any updates made to the sourceControlConfiguration. It is even possible to create multiple sourceControlConfiguration resources with namespace scope on the same Azure Arc enabled Kubernetes cluster to achieve multi-tenancy. In such a case, each operator can only deploy configurations to its respective namespace.

An example Git repository can be found here: https://github.com/Azure/arc-k8s-demo. We can create the configuration from the portal or via Azure CLI:

az k8sconfiguration create \
    --name cluster-config \
    --cluster-name AzureArcTest1 --resource-group AzureArcTest \
    --operator-instance-name cluster-config --operator-namespace cluster-config \
    --repository-url https://github.com/Azure/arc-k8s-demo \
    --scope cluster --cluster-type connectedClusters

That's it. We can see it in the Azure Portal:

With that setup, committing changes to the Git repository will now be reflected in the connected cluster.

Monitoring

Connected clusters can also be monitored with Azure Monitor for containers. It's as simple as creating a Log Analytics workspace and configuring the cluster to push metrics to it. This document describes the steps to enable monitoring.

I have seen scenarios where people running on-premises (or other cloud) clusters heavily use Prometheus and Grafana for monitoring. Good news: we can get the same on Azure Arc enabled clusters. Once we have the metrics available in Azure Log Analytics, we can point Grafana to the workspace; it takes less than a minute and a few button clicks (no-code configuration required).

Isn't it awesome? Go check out Azure Arc for Kubernetes today.

Azure DevOps Multi-Stage pipelines for Enterprise AKS scenarios

Background

Multi-stage Azure Pipelines enable writing the build (continuous integration) and deploy (continuous delivery) process as Pipeline-as-Code (YAML) that gets stored in version control (a Git repository). However, deploying to multiple environments (test, acceptance, production, etc.) needs approvals/control gates. Often different stakeholders (product owners/operations folks) are involved in that approval process. In addition, restricting secrets/credentials for higher-order stages (i.e., production) from developers is not uncommon.

The good news is that Azure DevOps allows all of that, with notions called Environments and Resources. The idea is that environments (e.g., Production) are configured with resources (e.g., Kubernetes, virtual machines, etc.) in them, and then an approval policy is configured for the environment. When a pipeline targets an environment in a deployment stage, it pauses pending approval from the responsible authorities (i.e., groups or users). Azure DevOps offers an awesome UI to create environments and set up approval policies.

The problem begins when we want to automate environment creation to scale the process.

Problem statement

As of today (while writing this article), provisioning environments and setting up approval policies for them via the REST API is not documented and not publicly available; there is a feature request awaiting.
In this article, I will share some code that can be used to automate environment provisioning and approval policy management.

Scenario

It's fairly common (in fact, a best practice) to logically isolate AKS clusters for separate teams and projects, to minimize the number of physical AKS clusters we deploy while still isolating teams or applications.

With logical isolation, a single AKS cluster can be used for multiple workloads, teams, or environments. Kubernetes Namespaces form the logical isolation boundary for workloads and resources.

When we set up such isolation for multiple teams, it's crucial to automate the bootstrap of team projects in Azure DevOps: setting up scoped environments, service accounts so teams can't deploy to the namespaces of other teams, and so on. The need for automation is right there, and that's what this article is all about.

The process I am trying to establish is as follows:

  1. Cluster administrators provision a namespace for a team (GitOps)
  2. Automatically create an Environment for the team’s namespace and configure approvals
  3. Team 1 will use the environment in their multi-stage pipeline

Let’s do this!

Provision namespace for teams

It all begins with a demand from a team: they need a namespace for development/deployment. The cluster administrators keep a Git repository that contains the Kubernetes manifest files describing these namespaces, and a pipeline applies them to the cluster each time a new file is added or modified. This repository is restricted to the cluster administrators (operations folks) only. Developers can issue a pull request, but PR approvals and commits to master should only be accepted by a cluster administrator or people with similar responsibility.

After that, we will create a service account for each of the namespaces. These are the accounts that will be used later when we define an Azure DevOps environment for each team.

Now, the pipeline for this repository essentially applies all the manifests (both for namespaces and service accounts) to the cluster.

trigger:
- master
stages:
- stage: Build
  displayName: Provision namespace and service accounts
  jobs:  
  - job: Build
    displayName: Update namespace and service accounts
    steps:
      <… omitted irrelevant codes …>
      - bash: |
          kubectl apply -f ./namespaces 
        displayName: 'Update namespaces'
      - bash: |
          kubectl apply -f ./ServiceAccounts 
        displayName: 'Update service accounts'   
      - bash: |
          dotnet ado-env-gen.dll
        displayName: 'Provision Azure DevOps Environments'       

At this point, we have a service account configured for each namespace, which we will use to create the environment, endpoints, etc. You might notice that I have added labels to each service account (i.e., purpose=ado-automation); this is how I tag the Azure DevOps project name along with a service account. This will come in handy when we provision environments.
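
For illustration, such a service account manifest could look like the sketch below (the names are placeholders; the project label carries the Azure DevOps team project name that the console app reads later):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: team1-deployer
  namespace: team1
  labels:
    purpose: ado-automation
    project: Team1-Project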

The last task runs a .NET Core console app (ado-env-gen.dll), which I will describe in detail later in this article.

Provisioning Environment in Azure DevOps

NOTE: Provisioning environments via the REST API is currently undocumented and might change in the future; beware of that.

It takes multiple steps to create an environment in Azure DevOps. The steps are below:

  1. Create a Service endpoint with Kubernetes Service Account
  2. Create an empty environment (with no resources yet)
  3. Connect the service endpoint to the environment as Resource

I've used .NET (C#) for this, but any REST client technology could do it.

Creating Service Endpoint

The following method creates a service endpoint in Azure DevOps that uses a service account scoped to a given namespace.

        public async Task<Endpoint> CreateKubernetesEndpointAsync(
            Guid projectId, string projectName,
            string endpointName, string endpointDescription,
            string clusterApiUri,
            string serviceAccountCertificate, string apiToken)
        {
            return await GetAzureDevOpsDefaultUri()
                .PostRestAsync<Endpoint>(
                $"{projectName}/_apis/serviceendpoint/endpoints?api-version=6.0-preview.4",
                new
                {
                    authorization = new
                    {
                        parameters = new
                        {
                            serviceAccountCertificate,
                            isCreatedFromSecretYaml = true,
                            apitoken = apiToken
                        },
                        scheme = "Token"
                    },
                    data = new
                    {
                        authorizationType = "ServiceAccount"
                    },
                    name = endpointName,
                    owner = "library",
                    type = "kubernetes",
                    url = clusterApiUri,
                    description = endpointDescription,
                    serviceEndpointProjectReferences = new List<Object>
                    {
                        new
                        {
                            description = endpointDescription,
                            name =  endpointName,
                            projectReference = new
                            {
                                id =  projectId,
                                name =  projectName
                            }
                        }
                    }
                }, await GetBearerTokenAsync());
        }

We will find out how to invoke this method in a moment. Before that, step 2: let's create the empty environment.

Creating Environment in Azure DevOps

        public async Task<PipelineEnvironment> CreateEnvironmentAsync(
            string project, string envName, string envDesc)
        {
            var env = await GetAzureDevOpsDefaultUri()
                .PostRestAsync<PipelineEnvironment>(
                $"{project}/_apis/distributedtask/environments?api-version=5.1-preview.1",
                new
                {
                    name = envName,
                    description = envDesc
                },
                await GetBearerTokenAsync());

            return env;
        }

Now we have an environment, but it is still empty. We need to add a resource to it, and that will be the service endpoint, so the environment comes to life.

        public async Task<string> CreateKubernetesResourceAsync(
            string projectName, long environmentId, Guid endpointId,
            string kubernetesNamespace, string kubernetesClusterName)
        {
            var link = await GetAzureDevOpsDefaultUri()
                            .PostRestAsync(
                            $"{projectName}/_apis/distributedtask/environments/{environmentId}/providers/kubernetes?api-version=5.0-preview.1",
                            new
                            {
                                name = kubernetesNamespace,
                                @namespace = kubernetesNamespace,
                                clusterName = kubernetesClusterName,
                                serviceEndpointId = endpointId
                            },
                            await GetBearerTokenAsync());
            return link;
        }

Of course, the environment needs approval policies configured. The following method configures an Azure DevOps group as an approver for the environment. Hence, any pipeline that references this environment will pause and wait for approval from one of the members of the group.

        public async Task<string> CreateApprovalPolicyAsync(
            string projectName, Guid groupId, long envId, 
            string instruction = "Please approve the Deployment")
        {
            var response = await GetAzureDevOpsDefaultUri()
                .PostRestAsync(
                $"{projectName}/_apis/pipelines/checks/configurations?api-version=5.2-preview.1",
                new
                {
                    timeout = 43200,
                    type = new
                    {                                   
                        name = "Approval"
                    },
                    settings = new
                    {
                        executionOrder = 1,
                        instructions = instruction,
                        blockedApprovers = new List<object> { },
                        minRequiredApprovers = 0,
                        requesterCannotBeApprover = false,
                        approvers = new List<object> { new { id = groupId } }
                    },
                    resource = new
                    {
                        type = "environment",
                        id = envId.ToString()
                    }
                }, await GetBearerTokenAsync());
            return response;
        }

So far so good, but we need to stitch all of this together. Before we do so, one last item needs attention. We want to create a service connection to the Azure Container Registry so the teams can push/pull images to it, and we will do that using service principals designated to the teams, instead of the admin keys of the ACR.

Creating Container Registry connection

The following snippet allows us to provision a service connection to Azure Container Registry with service principals, which can have fine-grained RBAC roles (i.e., AcrPush or AcrPull) that make sense for the team.

        public async Task<string> CreateAcrConnectionAsync(
            string projectName, string acrName, string name, string description,
            string subscriptionId, string subscriptionName, string resourceGroup,
            string clientId, string secret, string tenantId)
        {
            var response = await GetAzureDevOpsDefaultUri()
                .PostRestAsync(
                $"{projectName}/_apis/serviceendpoint/endpoints?api-version=5.1-preview.2",
                new
                {
                    name,
                    description,
                    type = "dockerregistry",
                    url = $"https://{acrName}.azurecr.io",
                    isShared = false,
                    owner = "library",
                    data = new
                    {
                        registryId = $"/subscriptions/{subscriptionId}/resourceGroups/{resourceGroup}/providers/Microsoft.ContainerRegistry/registries/{acrName}",
                        registrytype = "ACR",
                        subscriptionId,
                        subscriptionName
                    },
                    authorization = new
                    {
                        scheme = "ServicePrincipal",
                        parameters = new
                        {
                            loginServer = $"{acrName}.azurecr.io",
                            servicePrincipalId = clientId,
                            tenantId,
                            serviceprincipalkey = secret
                        }
                    }
                },
                await GetBearerTokenAsync());
            return response;
        }

We are pretty close to a wrap. We'll stitch all the methods above together. The plan is to create a simple console application that wires everything up (using the above methods). Here are the pseudo-steps:

  1. Find all Service Accounts created for this purpose
  2. For each Service Account: determine the correct Team Project and
    • Create a Service Endpoint with the account
    • Create an Environment
    • Connect the Service Endpoint to the Environment (adding a resource)
    • Configure Approval policies
    • Create the Azure Container Registry connection

The first step needs to communicate with the cluster – obviously. I have used the official .NET client for Kubernetes for that.

Bringing it all together

All the above methods are invoked from a simple C# console application. Below is the relevant part of the main method that brings all the above together:

        private static async Task Main(string [] args)
        {
            var clusterApiUrl = Environment.GetEnvironmentVariable("AKS_URI");
            var adoUrl = Environment.GetEnvironmentVariable("AZDO_ORG_SERVICE_URL");
            var pat = Environment.GetEnvironmentVariable("AZDO_PERSONAL_ACCESS_TOKEN");
            var adoClient = new AdoClient(adoUrl, pat);
            var groups = await adoClient.ListGroupsAsync();

            var config = KubernetesClientConfiguration.BuildConfigFromConfigFile();
            var client = new Kubernetes(config);

We start by collecting some secrets and configuration data – all from environment variables – so we can run this console app as part of a pipeline task and use pipeline variables with ease.

        var accounts = await client
            .ListServiceAccountForAllNamespacesAsync(labelSelector: "purpose=ado-automation");

This gets us the list of all the service accounts we have provisioned specifically for this purpose (filtered using the labels).

            foreach (var account in accounts.Items)
            {
                var project = await GetProjectAsync(account.Metadata.Labels["project"], adoClient);
                var secretName = account.Secrets[0].Name;
                var secret = await client
                    .ReadNamespacedSecretAsync(secretName, account.Metadata.NamespaceProperty);

We are iterating over all the accounts and retrieving their secrets from the cluster. Next step: creating the environment with these secrets.

                var endpoint = await adoClient.CreateKubernetesEndpointAsync(
                    project.Id,
                    project.Name,
                    $"Kubernetes-Cluster-Endpoint-{account.Metadata.NamespaceProperty}",
                    $"Service endpoint to the namespace {account.Metadata.NamespaceProperty}",
                    clusterApiUrl,
                    Convert.ToBase64String(secret.Data["ca.crt"]),
                    Convert.ToBase64String(secret.Data["token"]));

                var environment = await adoClient.CreateEnvironmentAsync(project.Name,
                    $"Kubernetes-Environment-{account.Metadata.NamespaceProperty}",
                    $"Environment scoped to the namespace {account.Metadata.NamespaceProperty}");

                await adoClient.CreateKubernetesResourceAsync(project.Name, 
                    environment.Id, endpoint.Id,
                    account.Metadata.NamespaceProperty,
                    account.Metadata.ClusterName);

That will give us the environment – correctly configured with the appropriate Service Accounts. Let’s set up the approval policy now:

                var group = groups.FirstOrDefault(g => g.DisplayName
                    .Equals($"[{project.Name}]\\Release Administrators", StringComparison.OrdinalIgnoreCase));
                await adoClient.CreateApprovalPolicyAsync(project.Name, group.OriginId, environment.Id);

We are taking a designated project group, “Release Administrators”, and setting them as approvers.

            await adoClient.CreateAcrConnectionAsync(project.Name, 
                Environment.GetEnvironmentVariable("ACRName"), 
                $"ACR-Connection", "The connection to the ACR",
                Environment.GetEnvironmentVariable("SubId"),
                Environment.GetEnvironmentVariable("SubName"),
                Environment.GetEnvironmentVariable("ResourceGroup"),
                Environment.GetEnvironmentVariable("ClientId"), 
                Environment.GetEnvironmentVariable("Secret"),
                Environment.GetEnvironmentVariable("TenantId"));

Lastly, we create the ACR connection as well.

The entire project is on GitHub – in case you want to have a read!

Verify everything

We have got our orchestration completed. Every time we add a new team, we create one manifest for their namespace and Service account and create a PR to the repository described above. A cluster admin approves the PR and a pipeline gets kicked off.

The pipeline ensures:

  1. All the namespaces and service accounts are created
  2. An environment with the appropriate service accounts is created in the correct team project.

Now a team can create their own pipeline in their repository – referring to the environment. Voilà, everything starts working nicely. All they need is to reference the name of the environment that's provisioned for their team (for instance “team-1”), as in the following example:

- stage: Deploy
  displayName: Deploy stage
  dependsOn: Build
  jobs:
  - deployment: Deploy
    condition: and(succeeded(), not(startsWith(variables['Build.SourceBranch'], 'refs/pull/')))
    displayName: Deploy
    pool:
      vmImage: $(vmImageName)
    environment: 'Kubernetes-Cluster-Environment.team-1'
    strategy:
      runOnce:
        deploy:
          steps:
          - download: current
            artifact: kube-manifests
          - task: KubernetesManifest@0
            displayName: Deploy to Kubernetes cluster
            inputs:
              action: deploy
              manifests: |
                $(Pipeline.Workspace)/kube-manifests/all-template.yaml

Now the multi-stage pipeline knows how to talk to the correct namespace in AKS, with approvals in place.

Conclusion

This might appear to be overkill for small-scale projects, as it involves quite some development and maintenance overhead. However, on multiple occasions (especially within large enterprises), I have experienced the need for orchestrations via the REST API to onboard teams in Azure DevOps, bootstrap configurations across multiple teams’ projects, etc. If you’re in the same boat, this article might be an interesting read for you!

Thanks for reading!

Azure AD App via ARM Template Deployment Scripts

Background

ARM templates offer a great way to define resources and deploy them. However, ARM templates didn’t have any support for invoking or running scripts. If we wanted to carry out some operations as part of the deployment (Azure AD app registrations, certificate generation, copying data to/from another system, etc.) we had to create pre- or post-deployment scripts (using Azure PowerShell or Azure CLI). Microsoft recently announced the preview of Deployment Scripts (a new resource type, Microsoft.Resources/deploymentScripts) – which brings a way to run a script as part of an ARM template deployment.

I have a few web apps using OpenID Connect for user authentication and they’re running as Azure App Services. I always wanted to automate (preferably in a declarative and idempotent way) the required app registrations in Azure AD and deploy them together with the ARM templates of the web apps.

Since we now have deployment script capability, I wanted to leverage it for Azure AD app registrations. In this article I will share my experience doing exactly that.

What are deployment scripts?

Deployment scripts allow running custom scripts (either Azure PowerShell or Azure CLI) as part of an ARM template deployment. They can be used to perform custom steps that can’t be done by ARM templates alone.


A simple deployment template that runs a bash command (echo) looks like the following:

Figure: Simple example of Deployment Scripts
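Since the screenshot above is not copy-friendly, here is a minimal sketch of such a template. Treat the resource name, the identity resource and the API version as assumptions from the preview at the time of writing, not as the exact template shown in the figure:

{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "resources": [
    {
      "type": "Microsoft.Resources/deploymentScripts",
      "apiVersion": "2019-10-01-preview",
      "name": "sayHelloScript",
      "location": "[resourceGroup().location]",
      "kind": "AzureCLI",
      "identity": {
        "type": "UserAssigned",
        "userAssignedIdentities": {
          "[resourceId('Microsoft.ManagedIdentity/userAssignedIdentities', 'my-script-identity')]": {}
        }
      },
      "properties": {
        "azCliVersion": "2.0.80",
        "scriptContent": "echo 'Hello from a deployment script!'",
        "timeout": "PT30M",
        "cleanupPreference": "Always",
        "retentionInterval": "P1D"
      }
    }
  ]
}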

Microsoft described the benefits of deployment scripts as follows:

– Easy to code, use, and debug. You can develop deployment scripts in your favorite development environments. The scripts can be embedded in templates or in external script files.
– You can specify the script language and platform. Currently, Azure PowerShell and Azure CLI deployment scripts on the Linux environment are supported.
– Allow specifying the identities that are used to execute the scripts. Currently, only Azure user-assigned managed identity is supported.
– Allow passing command-line arguments to the script.
– Can specify script outputs and pass them back to the deployment.

Source

Registering Azure AD app

We can write a small script (with Azure CLI), like the sample above, that registers the Azure AD app – that’s quite straightforward. However, first we need to address the identity aspect: which account would run the script, and how can app-registration permission be granted to that account? The answer is Managed Identity.

User Assigned Managed Identity

Managed identities for Azure resources provide Azure services with a managed identity in Azure Active Directory. We can use this identity to authenticate to services that support Azure AD authentication, without needing credentials in our code. There are two types of Managed Identity: system-assigned and user-assigned.

Deployment Scripts currently support user-assigned identities only; hence, we need to create a User Assigned Managed Identity that will run the CLI script. This identity is used to execute the deployment scripts. We will also grant Azure AD app-registration permissions to this identity. Creating a User Assigned Identity is straightforward and the steps are nicely described here.

Figure: User Assigned Managed Identity in Azure Portal

Next to that, we have to grant permissions to the identity. The following PowerShell script grants the required permissions to the Managed Identity.

Figure: Grant permissions (Click to Open in window to copy)
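In case the screenshot is hard to copy from, the gist of it is assigning an application-administration app role to the managed identity’s service principal. A heavily simplified sketch using the AzureAD PowerShell module might look like the following – treat the role name and the target API (Microsoft Graph here; older Azure CLI versions register apps through the legacy Azure AD Graph) as assumptions to verify for your tenant:

# $principalId = object id of the user-assigned managed identity's service principal
Connect-AzureAD

# Service principal representing Microsoft Graph in the tenant
$graph = Get-AzureADServicePrincipal -Filter "appId eq '00000003-0000-0000-c000-000000000000'"

# App role that allows creating/updating app registrations
$role = $graph.AppRoles | Where-Object { $_.Value -eq 'Application.ReadWrite.All' }

# Assign that app role to the managed identity
New-AzureADServiceAppRoleAssignment -ObjectId $principalId `
    -PrincipalId $principalId -ResourceId $graph.ObjectId -Id $role.Id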

ARM template

We will now write the ARM template that leverages deployment scripts to register our app in Azure AD.

Figure: Deployment Script (Click to Open in window to copy)

I won’t explain each of the settings/config options here. The most important part is the scriptContent property – which takes a string value containing the script (PowerShell or Bash). You can also point to an external script file instead of an embedded script.

Another important property is cleanupPreference. It specifies the preference for cleaning up the deployment resources when the script execution reaches a terminal state. The default setting is Always, which means the resources are deleted regardless of the terminal state (Succeeded, Failed, Canceled).

You can find more details on each of the configuration properties for Deployment Script in this document.

I have used some variable references that are defined in the same template JSON file.

Figure: Variables (Click to open new window to copy)

Notice the cliArg variable here. It holds the arguments that we pass as inputs to our CLI/bash script. The catch is that the arguments need to be separated by white spaces.
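For illustration, a variables block along those lines might look like this (the names and values are made up for this sketch; note the single string of space-separated arguments):

"variables": {
  "identityName": "my-script-identity",
  "adAppName": "[concat('demo-web-app-', uniqueString(resourceGroup().id))]",
  "replyUrl": "[concat('https://', variables('adAppName'), '.azurewebsites.net/signin-oidc')]",
  "cliArg": "[concat(variables('adAppName'), ' ', variables('replyUrl'))]"
}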

Finally, we want to grab the newly registered app ID and configure it as an App Settings entry in our web app – so the web app’s OpenID Connect authentication can work right after the deployment.

Figure: Variables (Click to open new window to copy)
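Assuming the CLI script writes the new application ID to the deployment script outputs (via the AZ_SCRIPTS_OUTPUT_PATH file), the hand-over to the web app could be sketched like this – the resource names, the script name and the output key are assumptions:

{
  "type": "Microsoft.Web/sites/config",
  "apiVersion": "2019-08-01",
  "name": "[concat(variables('webAppName'), '/appsettings')]",
  "dependsOn": [
    "[resourceId('Microsoft.Resources/deploymentScripts', 'registerAadApp')]"
  ],
  "properties": {
    "AzureAd:ClientId": "[reference('registerAadApp').outputs.appId]"
  }
}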

At this point we deploy the template, and after the deployment completes, we will see that the app has been registered in Azure AD:

Figure: Azure AD App

Also, we can verify that the newly created App ID is nicely configured into the web app’s app-settings.

Figure: App settings configured

That’s all there is to it!

I haven’t defined any API permission scopes for the app registration in this example; however, with the Azure CLI script in place, defining further API scopes is trivial.
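For example, one could extend the embedded script with something along these lines to add a delegated Microsoft Graph permission (a sketch; the scope GUID is a placeholder, and $appId is assumed to hold the application ID returned by az ad app create):

az ad app permission add \
  --id $appId \
  --api 00000003-0000-0000-c000-000000000000 \
  --api-permissions <scope-guid>=Scope

# Create the corresponding OAuth2 permission grant
az ad app permission grant --id $appId --api 00000003-0000-0000-c000-000000000000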

How did it work?

If we log in to the Azure Portal we will see the following:

Figure: Azure Portal resources

We see a new resource of type Deployment Script besides our Web App (and its Service Plan) – that much is expected. However, we also see a Container Instance and a Storage Account. Where did they come from?

Well, the Azure Resource Manager deployment created them while deploying the Deployment Script. The storage account and the container instance are created in the same resource group for script execution and troubleshooting. These resources are usually deleted by the script service when the script execution reaches a terminal state. Important to know: we are billed for these resources until they are deleted.

The container instance runs a Docker image as a sandbox for our Deployment Script. You can see the image name Microsoft uses for execution in the portal. This can come in handy for trying out the script locally – for development purposes.

Conclusion

I have mixed feelings about deployment scripts in ARM templates. They obviously have benefits, but they shouldn’t replace every pre- or post-deployment script, because sometimes it is cleaner and easier to create a pre- or post-deployment task in the continuous delivery pipeline than to compose everything in ARM templates.

Key Vault as backing store of Azure Functions

If you have used Azure Functions, you are probably aware that Azure Functions leverages a Storage Account underneath to support the file storage (where the function app code resides as an Azure File share) and also as a backing store to keep the function keys (the secrets that are used in function invocations).


Figure: Storage Account containers – “azure-webjobs-secrets”

If you look inside the container, there are files with the following contents:


Figure: These JSON files contain the function keys


Figure: Encrypted master keys and other function keys

I have been in a conversation where it was not appreciated to see the keys stored in the storage account. The security and governance team was seeking a better place to keep these keys – somewhere the secrets could be further restricted from developer access.

Of course, we can create a VNET around the storage account and use Private Link, but that has other consequences, as the content (the function implementation artifacts) is also stored in the same storage account. Configuring two separate storage accounts can address this better; however, it can make the setup more complicated than it has to be.
A better option is to store these keys in a Key Vault as the backing store – which is a great feature of Azure Functions, but I’ve found few people are aware of it due to the lack of documentation. In this article I will show you how to move these secrets to a Key Vault.

To do so, we need to configure a few Application Settings on the Function App. They are given below:

App Settings name                                     Value
AzureWebJobsSecretStorageType                         keyvault
AzureWebJobsSecretStorageKeyVaultName                 <Key Vault Name>
AzureWebJobsSecretStorageKeyVaultConnectionString     <Connection String, or leave it empty when Managed Identity is configured on the Function App>
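If you prefer to script these settings rather than set them in the portal, a quick sketch with the Azure CLI could look like this (the function app, resource group and Key Vault names are placeholders):

az functionapp config appsettings set \
  --name my-function-app \
  --resource-group my-resource-group \
  --settings \
    AzureWebJobsSecretStorageType=keyvault \
    AzureWebJobsSecretStorageKeyVaultName=my-key-vault \
    AzureWebJobsSecretStorageKeyVaultConnectionString=""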

Once you have configured the above settings, you need to enable Managed Identity on your Azure Function. You can accomplish that in the Identity section under the Platform features tab. That is a much better option in my opinion, as we don’t need to maintain any additional secrets to talk to Key Vault securely. Go ahead and turn the system-assigned identity toggle on. This will create a service principal with the same name as your Azure Function application.


Figure: Enabling system-assigned managed identity on the Function app

The next step is to add a rule to the Key Vault’s access policies for the service principal created in the earlier step.


Figure: Key Vault access policy
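If you would rather do these two steps (enabling the identity and granting it access) with the Azure CLI, a sketch could look like the following – the names are placeholders and the principal ID is returned by the identity assignment:

# Turn on the system-assigned managed identity of the function app
az functionapp identity assign \
  --name my-function-app \
  --resource-group my-resource-group

# Allow that identity to manage secrets in the Key Vault
az keyvault set-policy \
  --name my-key-vault \
  --object-id <principalId-from-previous-command> \
  --secret-permissions get list set delete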
That’s it. Hit your function app now and you will see the keys stored inside the Key Vault. You can safely delete the container from the storage account now.


Figure: Secrets are stored in Key Vault

Hope this will save you time if you are concerned about keeping the keys in a storage account.
The Azure Function is open source and is on GitHub. You can have a look at the sources and find other interesting ideas that you may play with.

Access Control management via REST API – Azure Data Lake Gen 2

Background

A while ago, I built a web-based self-service portal that facilitated multiple teams in the organisation in setting up their Access Controls (ACLs) for the corresponding data lake folders.

The portal application was targeting Azure Data Lake Gen 1. Recently I wanted to achieve the same on Azure Data Lake Gen 2. At the time of writing this post, there’s no official NuGet package for ACL management targeting Data Lake Gen 2. One must rely on the REST API only.

Read about known issues and limitations of Azure Data Lake Storage Gen 2

Furthermore, the REST API documentation does not provide example snippets like many other Azure resources do. Therefore, it takes time to demystify the REST APIs for manipulating ACLs. The good news is, I have done that for you and will share a straightforward C# class that wraps the details and issues the correct REST API calls to a Data Lake Store Gen 2.

About Azure Data Lake Store Gen 2

Azure Data Lake Storage Gen2 is a set of capabilities dedicated to big data analytics. Data Lake Storage Gen2 is significantly different from its earlier version, Azure Data Lake Storage Gen1; Gen2 is entirely built on Azure Blob storage.

Data Lake Storage Gen2 is the result of converging the capabilities of two existing Azure storage services, Azure Blob storage and Azure Data Lake Storage Gen1. Gen1 features such as file system semantics, directory- and file-level security, and scale are combined with the low-cost, tiered storage and high availability/disaster recovery capabilities of Azure Blob storage.

Let’s get started!

Create a Service Principal

First, we need a service principal. We will use this principal to authenticate to Azure Active Directory (using the OAuth 2.0 protocol) in order to authorize our REST calls. We will use the Azure CLI to do that.

az ad sp create-for-rbac --name ServicePrincipalName

Add required permissions

Now you need to grant your application permission to access Azure Storage. In the portal:

  • Click on the application Settings
  • Click on Required permissions
  • Click on Add
  • Click Select API
  • Filter on Azure Storage
  • Click on Azure Storage
  • Click Select
  • Click the checkbox next to Access Azure Storage
  • Click Select
  • Click Done
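For completeness, the same permission can be added with the Azure CLI instead of clicking through the portal – a sketch, where e406a681-f3d4-42a8-90b6-c2b029497af1 is the well-known application ID of the Azure Storage API and the scope ID placeholder has to be looked up in your tenant:

# Add the delegated 'user_impersonation' permission of the Azure Storage API
az ad app permission add \
  --id <appId-of-our-service-principal> \
  --api e406a681-f3d4-42a8-90b6-c2b029497af1 \
  --api-permissions <user_impersonation-scope-id>=Scope

# Create the corresponding permission grant
az ad app permission grant \
  --id <appId-of-our-service-principal> \
  --api e406a681-f3d4-42a8-90b6-c2b029497af1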

Figure: Required permissions on the Azure AD application

Now we have the Client ID, Client Secret and Tenant ID (take the Tenant ID from the Properties tab of Azure Active Directory – listed as Directory ID).

Access Token from Azure Active Directory

Let’s write some C# code to get an Access Token from Azure Active Directory:

public class OAuthTokenProvider
{
    private readonly string tenantId;
    private readonly string clientId;
    private readonly string secret;
    private readonly string scopeUri;
    private const string IdentityEndpoint = "https://login.microsoftonline.com";
    private const string DEFAULT_SCOPE = "https://management.azure.com/";
    private const string MEDIATYPE = "application/x-www-form-urlencoded";

    public OAuthTokenProvider(string tenantId, string clientId, string secret, string scopeUri = DEFAULT_SCOPE)
    {
        this.tenantId = tenantId;
        this.clientId = WebUtility.UrlEncode(clientId);
        this.secret = WebUtility.UrlEncode(secret);
        this.scopeUri = WebUtility.UrlEncode(scopeUri);
    }

    public async Task<Token> GetAccessTokenV2EndpointAsync()
    {
        // Statics.Http is a shared HttpClient instance defined elsewhere in the project.
        var url = $"{IdentityEndpoint}/{this.tenantId}/oauth2/v2.0/token";
        var Http = Statics.Http;
        Http.DefaultRequestHeaders.Accept.Clear();
        Http.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue(MEDIATYPE));
        var body = $"grant_type=client_credentials&client_id={clientId}&client_secret={secret}&scope={scopeUri}";
        var response = await Http.PostAsync(url, new StringContent(body, Encoding.UTF8, MEDIATYPE));
        if (response.IsSuccessStatusCode)
        {
            var tokenResponse = await response.Content.ReadAsStringAsync();
            return JsonConvert.DeserializeObject<Token>(tokenResponse);
        }
        return default(Token);
    }

    public class Token
    {
        public string access_token { get; set; }
        public string token_type { get; set; }
        public int expires_in { get; set; }
        public int ext_expires_in { get; set; }
    }
}


Creating ADLS Gen 2 REST client

Once we have the token provider, we can jump in implementing the REST client for Azure Data Lake.

public class FileSystemApi
{
    private readonly string storageAccountName;
    private readonly OAuthTokenProvider tokenProvider;
    private readonly Uri baseUri;
    private const string ACK_HEADER_NAME = "x-ms-acl";
    private const string API_VERSION_HEADER_NAME = "x-ms-version";
    private const string API_VERSION_HEADER_VALUE = "2018-11-09";
    private int Timeout = 100;

    public FileSystemApi(string storageAccountName, OAuthTokenProvider tokenProvider)
    {
        this.storageAccountName = storageAccountName;
        this.tokenProvider = tokenProvider;
        this.baseUri = new Uri($"https://{this.storageAccountName}.dfs.core.windows.net");
    }


Data Lake ACLs and POSIX permissions

The security model for Data Lake Gen2 supports ACLs and POSIX permissions along with some extra granularity specific to Data Lake Storage Gen2. Settings may be configured through Storage Explorer or through frameworks like Hive and Spark. We will do it via the REST API in this post.

There are two kinds of access control lists (ACLs), Access ACLs and Default ACLs.

  • Access ACLs: These control access to an object. Files and folders both have Access ACLs.
  • Default ACLs: A “template” of ACLs associated with a folder that determines the Access ACLs for any child items created under that folder. Files do not have Default ACLs.

Here’s the table of allowed grant types:

Figure: Allowed grant types

While we define ACLs we need to use a short form of these grant types. The Microsoft documentation explains these short forms in the table below:

Figure: POSIX short forms (r = read, w = write, x = execute, - = no permission)

However, in our code we will also simplify the POSIX ACL notation by using some supporting classes, shown below. That way, REST client consumers do not need to spend time building the short form of their intended grant criteria.

public enum AclType
{
    User,
    Group,
    Other,
    Mask
}

public enum AclScope
{
    Access,
    Default
}

[FlagsAttribute]
public enum GrantType : short
{
    None = 0,
    Read = 1,
    Write = 2,
    Execute = 4
};

public class AclEntry
{
    public AclEntry(AclScope scope, AclType type, string upnOrObjectId, GrantType grant)
    {
        Scope = scope;
        AclType = type;
        UpnOrObjectId = upnOrObjectId;
        Grant = grant;
    }

    public AclScope Scope { get; private set; }
    public AclType AclType { get; private set; }
    public string UpnOrObjectId { get; private set; }
    public GrantType Grant { get; private set; }

    public string GetGrantPosixFormat()
    {
        return $"{(this.Grant.HasFlag(GrantType.Read) ? 'r' : '-')}{(this.Grant.HasFlag(GrantType.Write) ? 'w' : '-')}{(this.Grant.HasFlag(GrantType.Execute) ? 'x' : '-')}";
    }

    public override string ToString()
    {
        return $"{(this.Scope == AclScope.Default ? "default:" : string.Empty)}{this.AclType.ToString().ToLowerInvariant()}:{this.UpnOrObjectId}:{GetGrantPosixFormat()}";
    }
}


Now we can create methods to perform the different REST calls. Let’s start by creating a file system.

public async Task<bool> CreateFileSystemAsync(string fileSystemName)
{
    var tokenInfo = await tokenProvider.GetAccessTokenV2EndpointAsync();
    var jsonContent = new StringContent(string.Empty);
    var headers = Statics.Http.DefaultRequestHeaders;
    headers.Clear();
    headers.Add("Authorization", $"Bearer {tokenInfo.access_token}");
    headers.Add(API_VERSION_HEADER_NAME, API_VERSION_HEADER_VALUE);
    var response = await Statics.Http.PutAsync($"{baseUri}{WebUtility.UrlEncode(fileSystemName)}?resource=filesystem", jsonContent);
    return response.IsSuccessStatusCode;
}

Here we are retrieving an Access Token and then issuing a REST call to the Azure Data Lake Storage Gen 2 API to create a new file system. Next, we will create a folder and a file in it, and then set some Access Controls on them.

Let’s create the folder:

public async Task<bool> CreateDirectoryAsync(string fileSystemName, string fullPath)
{
    var tokenInfo = await tokenProvider.GetAccessTokenV2EndpointAsync();
    var jsonContent = new StringContent(string.Empty);
    var headers = Statics.Http.DefaultRequestHeaders;
    headers.Clear();
    headers.Add("Authorization", $"Bearer {tokenInfo.access_token}");
    headers.Add(API_VERSION_HEADER_NAME, API_VERSION_HEADER_VALUE);
    var response = await Statics.Http.PutAsync($"{baseUri}{WebUtility.UrlEncode(fileSystemName)}{fullPath}?resource=directory", jsonContent);
    return response.IsSuccessStatusCode;
}


And then we create a file in it. Now, file creation (ingestion into the Data Lake) is not that straightforward; at least, one can’t do it with a single call. We first have to create an empty file, then we can write some content into it. We can also append content to an existing file. Finally, we need to flush the buffer so the new content gets persisted.

Let’s do that, first we will see how to create an empty file:

public async Task<bool> CreateEmptyFileAsync(string fileSystemName, string path, string fileName)
{
    var tokenInfo = await tokenProvider.GetAccessTokenV2EndpointAsync();
    var jsonContent = new StringContent(string.Empty);
    var headers = Statics.Http.DefaultRequestHeaders;
    headers.Clear();
    headers.Add("Authorization", $"Bearer {tokenInfo.access_token}");
    headers.Add(API_VERSION_HEADER_NAME, API_VERSION_HEADER_VALUE);
    var response = await Statics.Http.PutAsync($"{baseUri}{WebUtility.UrlEncode(fileSystemName)}{path}{fileName}?resource=file", jsonContent);
    return response.IsSuccessStatusCode;
}


The above snippet creates an empty file. Now we will read the content of a local file (from the PC) and write it into the empty file we just created in Azure Data Lake.

public async Task<bool> CreateFileAsync(string filesystem, string path,
    string fileName, Stream stream)
{
    var operationResult = await this.CreateEmptyFileAsync(filesystem, path, fileName);
    if (operationResult)
    {
        var tokenInfo = await tokenProvider.GetAccessTokenV2EndpointAsync();
        var headers = Statics.Http.DefaultRequestHeaders;
        headers.Clear();
        headers.Add("Authorization", $"Bearer {tokenInfo.access_token}");
        headers.Add(API_VERSION_HEADER_NAME, API_VERSION_HEADER_VALUE);
        using (var streamContent = new StreamContent(stream))
        {
            var resourceUrl = $"{baseUri}{filesystem}{path}{fileName}?action=append&timeout={this.Timeout}&position=0";
            var msg = new HttpRequestMessage(new HttpMethod("PATCH"), resourceUrl);
            msg.Content = streamContent;
            var response = await Statics.Http.SendAsync(msg);
            // flush the buffer to commit the file
            var flushUrl = $"{baseUri}{filesystem}{path}{fileName}?action=flush&timeout={this.Timeout}&position={msg.Content.Headers.ContentLength}";
            var flushMsg = new HttpRequestMessage(new HttpMethod("PATCH"), flushUrl);
            response = await Statics.Http.SendAsync(flushMsg);
            return response.IsSuccessStatusCode;
        }
    }
    return false;
}


Right! Now it’s time to set Access Control on the directory, or on files inside a directory. Here’s the method we will use to do that.

public async Task<bool> SetAccessControlAsync(string fileSystemName, string path, AclEntry[] acls)
{
    var targetPath = $"{WebUtility.UrlEncode(fileSystemName)}{path}";
    var tokenInfo = await tokenProvider.GetAccessTokenV2EndpointAsync();
    var jsonContent = new StringContent(string.Empty);
    var headers = Statics.Http.DefaultRequestHeaders;
    headers.Clear();
    headers.Add("Authorization", $"Bearer {tokenInfo.access_token}");
    headers.Add(API_VERSION_HEADER_NAME, API_VERSION_HEADER_VALUE);
    headers.Add(ACK_HEADER_NAME, string.Join(',', acls.Select(a => a.ToString()).ToArray()));
    var response = await Statics.Http.PatchAsync($"{baseUri}{targetPath}?action=setAccessControl", jsonContent);
    return response.IsSuccessStatusCode;
}


The entire file system REST API class can be found here. Here’s an example of how we can use these methods from a console application.

var tokenProvider = new OAuthTokenProvider(tenantId, clientId, secret, scope);
var hdfs = new FileSystemApi(storageAccountName, tokenProvider);

var response = hdfs.CreateFileSystemAsync(fileSystemName).Result;
hdfs.CreateDirectoryAsync(fileSystemName, "/demo").Wait();
hdfs.CreateEmptyFileAsync(fileSystemName, "/demo/", "example.txt").Wait();

var stream = new FileStream(@"C:\temp.txt", FileMode.Open, FileAccess.Read);
hdfs.CreateFileAsync(fileSystemName, "/demo/", "mytest.txt", stream).Wait();

var acls = new AclEntry[]
{
    new AclEntry(
        AclScope.Access,
        AclType.Group,
        "2dec2374-3c51-4743-b247-ad6f80ce4f0b",
        (GrantType.Read | GrantType.Execute)),
    new AclEntry(
        AclScope.Access,
        AclType.Group,
        "62049695-0418-428e-a5e4-64600d6d68d8",
        (GrantType.Read | GrantType.Write | GrantType.Execute)),
    new AclEntry(
        AclScope.Default,
        AclType.Group,
        "62049695-0418-428e-a5e4-64600d6d68d8",
        (GrantType.Read | GrantType.Write | GrantType.Execute))
};
hdfs.SetAccessControlAsync(fileSystemName, "/", acls).Wait();


Conclusion

Until there’s an official client package released, if you’re into Azure Data Lake Store Gen 2 and wondering how to accomplish these REST calls – I hope this post helps you move forward!

Thanks for reading.

 

Linkerd in Azure Kubernetes Service cluster

In this article I will document my journey of setting up the Linkerd service mesh on Azure Kubernetes Service.

Background

I have a tiny Kubernetes cluster. I run some workloads there; some are useful, others are just try-outs and fun stuff. I have a few services that need to talk to each other. I do not have a lot of traffic to be honest, but I sometimes curiously run Apache ab to simulate load and see how my services perform under stress. Until very recently I was using a messaging (basically pub-sub) pattern to create reactive service-to-service communication. That works great, but often comes with latency. I can only imagine that if I were to run this service-to-service communication for a mission-critical, high-traffic, performance-driven scenario (an online game, for instance), this model wouldn’t fly well. Hence the need for a direct service-to-service communication pattern in the cluster.

What’s the big deal? We can have REST calls between services, or even implement gRPC for that matter. The issue is that things behave differently at scale. When many services talk to many others, nodes fail in between, the network addresses of pods change, new pods show up and others go down – figuring out where a service sits becomes quite a challenging task.

Then Kubernetes comes to the rescue: Kubernetes provides the “Service” resource, which gives us service discovery out of the box. Which is awesome. Not all issues disappear, though. Services in a cluster need fault tolerance, traceability and, most importantly, “observability”. Circuit breakers, retry logic, etc. – implementing them for each service is again a challenge. This is exactly what a service mesh addresses.

Service mesh

From thoughtworks radar:

Service mesh is an approach to operating a secure, fast and reliable microservices ecosystem. It has been an important steppingstone in making it easier to adopt microservices at scale. It offers discovery, security, tracing, monitoring and failure handling. It provides these cross-functional capabilities without the need for a shared asset such as an API gateway or baking libraries into each service. A typical implementation involves lightweight reverse-proxy processes, aka sidecars, deployed alongside each service process in a separate container. Sidecars intercept the inbound and outbound traffic of each service and provide cross-functional capabilities mentioned above.

Some of us might remember Aspect Oriented Programming (AOP) – where we used to separate cross-cutting concerns from our core business concerns. A service mesh is no different. It isolates (in a separate container) these networking and fault-tolerance concerns from the core capabilities (also running in a container).

Linkerd

There are quite a few service mesh solutions out there – all suitable to run in Kubernetes. I have used Envoy and Istio before. They work great in Kubernetes as well as in VM-hosted clusters. However, I must admit, I have developed a preference for Linkerd since I discovered it. Let’s briefly look at how Linkerd works. Imagine the following two services, Service A and Service B, where Service A talks to Service B.

Figure: Service A calling Service B

When Linkerd is installed, it works as an interceptor for all the communication between services. Linkerd uses the sidecar pattern to proxy the communication by updating the pod’s iptables rules.

Figure: Linkerd architecture

Linkerd injects two additional containers into our pods. The init container configures the iptables rules so that incoming and outgoing TCP traffic flows through the Linkerd proxy container. The proxy container is the data plane that does the actual interception and provides all the other fault-tolerance goodies.

The primary reasons behind my Linkerd preference are performance and simplicity. Ivan Sim has done performance benchmarking of Linkerd and Istio:

Both the Linkerd2-meshed setup and Istio-meshed setup experienced higher latency and lower throughput, when compared with the baseline setup. The latency incurred in the Istio-meshed setup was higher than that observed in the Linkerd2-meshed setup. The Linkerd2-meshed setup was able to handle higher HTTP and GRPC ping throughput than the Istio-meshed setup.

Cluster provision

Spinning up AKS is as easy as pie these days. We can use an Azure Resource Manager template or Terraform for that. I have used Terraform.

resource "azurerm_resource_group" "cloudoven" {
name = "cloudoven"
location = "West Europe"
}
resource "azurerm_kubernetes_cluster" "cloudovenaks" {
name = "cloudovenaks"
location = "${azurerm_resource_group.cloudoven.location}"
resource_group_name = "${azurerm_resource_group.cloudoven.name}"
dns_prefix = "cloudovenaks"
agent_pool_profile {
name = "default"
count = 1
vm_size = "Standard_D1_v2"
os_type = "Linux"
os_disk_size_gb = 30
}
agent_pool_profile {
name = "pool2"
count = 1
vm_size = "Standard_D2_v2"
os_type = "Linux"
os_disk_size_gb = 30
}
service_principal {
client_id = "98e758f8r-f734-034a-ac98-0404c500e010"
client_secret = "Jk==3djk(efd31kla934-=="
}
tags = {
Environment = "Production"
}
}
output "client_certificate" {
value = "${azurerm_kubernetes_cluster.cloudovenaks.kube_config.0.client_certificate}"
}
output "kube_config" {
value = "${azurerm_kubernetes_cluster.cloudovenaks.kube_config_raw}"
}


Service deployment

This is going to take a few minutes, and then we have a cluster. We will use the canonical emojivoto app (the buoyantio/emojivoto-*:v8 images) to test our Linkerd installation. Here’s the Kubernetes manifest file for that.

apiVersion: v1
kind: Namespace
metadata:
  name: emojivoto
---
kind: ServiceAccount
apiVersion: v1
metadata:
  name: emoji
  namespace: emojivoto
---
kind: ServiceAccount
apiVersion: v1
metadata:
  name: voting
  namespace: emojivoto
---
kind: ServiceAccount
apiVersion: v1
metadata:
  name: web
  namespace: emojivoto
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  creationTimestamp: null
  name: emoji
  namespace: emojivoto
spec:
  replicas: 1
  selector:
    matchLabels:
      app: emoji-svc
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: emoji-svc
    spec:
      serviceAccountName: emoji
      containers:
      - env:
        - name: GRPC_PORT
          value: "8080"
        image: buoyantio/emojivoto-emoji-svc:v8
        name: emoji-svc
        ports:
        - containerPort: 8080
          name: grpc
        resources:
          requests:
            cpu: 100m
status: {}
---
apiVersion: v1
kind: Service
metadata:
  name: emoji-svc
  namespace: emojivoto
spec:
  selector:
    app: emoji-svc
  clusterIP: None
  ports:
  - name: grpc
    port: 8080
    targetPort: 8080
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  creationTimestamp: null
  name: voting
  namespace: emojivoto
spec:
  replicas: 1
  selector:
    matchLabels:
      app: voting-svc
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: voting-svc
    spec:
      serviceAccountName: voting
      containers:
      - env:
        - name: GRPC_PORT
          value: "8080"
        image: buoyantio/emojivoto-voting-svc:v8
        name: voting-svc
        ports:
        - containerPort: 8080
          name: grpc
        resources:
          requests:
            cpu: 100m
status: {}
---
apiVersion: v1
kind: Service
metadata:
  name: voting-svc
  namespace: emojivoto
spec:
  selector:
    app: voting-svc
  clusterIP: None
  ports:
  - name: grpc
    port: 8080
    targetPort: 8080
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  creationTimestamp: null
  name: web
  namespace: emojivoto
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web-svc
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: web-svc
    spec:
      serviceAccountName: web
      containers:
      - env:
        - name: WEB_PORT
          value: "80"
        - name: EMOJISVC_HOST
          value: emoji-svc.emojivoto:8080
        - name: VOTINGSVC_HOST
          value: voting-svc.emojivoto:8080
        - name: INDEX_BUNDLE
          value: dist/index_bundle.js
        image: buoyantio/emojivoto-web:v8
        name: web-svc
        ports:
        - containerPort: 80
          name: http
        resources:
          requests:
            cpu: 100m
status: {}
---
apiVersion: v1
kind: Service
metadata:
  name: web-svc
  namespace: emojivoto
spec:
  type: LoadBalancer
  selector:
    app: web-svc
  ports:
  - name: http
    port: 80
    targetPort: 80
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  creationTimestamp: null
  name: vote-bot
  namespace: emojivoto
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vote-bot
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: vote-bot
    spec:
      containers:
      - command:
        - emojivoto-vote-bot
        env:
        - name: WEB_HOST
          value: web-svc.emojivoto:80
        image: buoyantio/emojivoto-web:v8
        name: vote-bot
        resources:
          requests:
            cpu: 10m
status: {}


With this IaC – we can run Terraform apply to provision our AKS cluster in Azure.

Azure Pipeline

Let’s create a pipeline for the service deployment. The easiest way to do that is to create a service connection to our AKS cluster. We go to the project settings of the Azure DevOps project, pick Service connections and create a new service connection of type “Kubernetes connection”.

Figure: Creating the Kubernetes service connection in Azure DevOps

Installing Linkerd

We will create a pipeline that installs Linkerd into the AKS cluster. Azure Pipelines now offers “pipeline-as-code” – which is just a YAML file that describes the steps that need to be performed when the pipeline is triggered. We will use the following pipeline-as-code:

pool:
  name: Hosted Ubuntu 1604
steps:
- task: KubectlInstaller@0
  displayName: 'Install Kubectl latest'
- task: Kubernetes@1
  displayName: 'kubectl get'
  inputs:
    kubernetesServiceEndpoint: CloudOvenKubernetes
    command: get
    arguments: nodes
- script: |
    curl -sL https://run.linkerd.io/install | sh
    export PATH=$PATH:$HOME/.linkerd2/bin
    linkerd version
    linkerd check --pre
    linkerd install | kubectl apply -f -
    linkerd check
  displayName: 'Linkerd - Installation'

We can at this point trigger the pipeline to install Linkerd into the AKS cluster.

Figure: Linkerd installation pipeline run

Deployment of PODs and services

Let’s create another pipeline-as-code that deploys all the services and deployment resources to AKS using the Kubernetes manifest file shown above:

pool:
  name: Hosted Ubuntu 1604
steps:
- task: KubectlInstaller@0
  displayName: 'Install Kubectl latest'
- task: Kubernetes@1
  displayName: 'kubectl apply'
  inputs:
    kubernetesServiceEndpoint: CloudOvenKubernetes
    command: apply
    useConfigurationFile: true
    configuration: src/services/emojivoto/all.yml

In the Azure Portal we can already see our services running:

Figure: Services running in AKS (Azure Portal)

Also in the Kubernetes dashboard:

Figure: Services in the Kubernetes dashboard

We have got our services running – but they are not really affected by Linkerd yet. We will add another step to the pipeline to tell Linkerd to do its magic.

pool:
  name: Hosted Ubuntu 1604
steps:
- task: KubectlInstaller@0
  displayName: 'Install Kubectl latest'
- task: Kubernetes@1
  displayName: 'kubectl apply'
  inputs:
    kubernetesServiceEndpoint: CloudOvenKubernetes
    command: apply
    useConfigurationFile: true
    configuration: src/services/emojivoto/all.yml
- script: 'cat src/services/emojivoto/all.yml | linkerd inject - | kubectl apply -f -'
  displayName: 'Inject Linkerd'
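As a side note, instead of piping the manifest through linkerd inject in the pipeline, Linkerd also supports automatic proxy injection through an annotation – a sketch, assuming the proxy injector is enabled in the installed Linkerd version:

apiVersion: v1
kind: Namespace
metadata:
  name: emojivoto
  annotations:
    linkerd.io/inject: enabled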

Next, we trigger the pipeline and put some traffic into the services we have just deployed. The emojivoto app simulates some service-to-service invocation scenarios, and now it’s time for us to open the Linkerd dashboard to inspect the distributed traces and many other useful metrics.

Figure: Linkerd dashboard

We can also see a kind of application map – a graphical way to understand which service is calling whom, what the request latencies are, and so on.

Figure: Linkerd service graph

Even more fascinating, Linkerd provides drill-downs into the communication in a Grafana dashboard.

Figure: Drill-down into the communication in the Grafana dashboard

Conclusion

I have enjoyed setting this up and seeing the outcome, and wanted to share my experience with it. If you are looking into service meshes and read this post, I strongly encourage you to give Linkerd a go – it’s awesome!

Thanks for reading.