Azure Resource Governance with Template Specs & Bicep

All the example code is available on GitHub.

Background

Governance of cloud estates is challenging for businesses. It’s crucial to enforce security policies, workload redundancy, uniformity (such as naming conventions) and Azure role-based access control (Azure RBAC) across the enterprise, and to simplify deployments with packaged artifacts (e.g., ARM templates).

Generally, the idea is that a centralized team (sometimes referred to as a platform team) builds and publishes Infrastructure-as-Code artifacts, and a number of product development teams consume them, providing only their own parameters.

Azure offers native capabilities like Azure Policy, Blueprints and Management Groups to address this problem. But there is also a wide range of external solutions (Terraform, Pulumi, etc.) available.

One attribute of Terraform that strikes me is the ability to store a versioned module in a registry and consume it from that registry. It’s the same principle that is familiar to engineers and widely used in programming languages – NuGet for .NET, Maven for Java, npm for Node.js.

With ARM templates it’s rather unpleasant. If you currently have your templates in an Azure repo, GitHub repo or storage account, you run into several challenges when trying to share and use the templates. For a user to deploy it, the template must either be local or the URL for the template must be publicly accessible. To get around this limitation, you might share copies of the template with users who need to deploy it, or open access to the repo or storage account. When users own local copies of a template, these copies can eventually diverge from the original template. When you make a repo or storage account publicly accessible, you may allow unintended users to access the template.

Azure Resource Manager – Template Spec

Microsoft delivered some cool new features for Resource Manager templates recently. One of these features is named Template Spec. A Template Spec is a first-class Azure resource type, but it really is just a regular ARM template. The best part is that you can version it, persist it in Azure – just like a Terraform registry – share it across the organization with RBAC, and consume it from the repository.

Template Specs are currently in preview. To use them, you must install the latest version of PowerShell or Azure CLI. For Azure PowerShell, use version 5.0.0 or later. For Azure CLI, use version 2.14.2 or later.

The benefit of using template specs is that you can create canonical templates and share them with teams in your organization. The template specs are secure because they’re available to Azure Resource Manager for deployment, but not accessible to users without Azure RBAC permission. Users only need read access to the template spec to deploy its template, so you can share the template without allowing others to modify it.

The templates you include in a template spec should be verified by the platform team (or administrators) in your organization to follow the organization’s requirements and guidance.

How do Template Specs work?

If you are familiar with ARM templates, Template Specs are nothing new to you. They are just typical ARM templates, stored in an Azure resource group as a “template spec” with a version number. That means you can take any ARM template (the template JSON file only – without any parameter files) and publish it as a Template Spec using PowerShell, Azure CLI or the REST API.

Publishing Template Spec

Let’s say I have a template that defines just an Application Insights component.

{
    "contentVersion": "1.0.0.0",
    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
    "parameters": {
        "appInsights": { "type": "string" },
        "location": { "type": "string" }
    },
    "resources": [{
            "type": "microsoft.insights/components",
            "apiVersion": "2020-02-02-preview",
            "name": "[parameters('appInsights')]",
            "location": "[parameters('location')]",
            "properties": {
                "ApplicationId": "[parameters('appInsights')]",
                "Application_Type": "web"
            }
        }
    ],
    "outputs": {
        "instrumentationKey": {
            "type": "string",
            "value": "[reference(parameters('appInsights')).InstrumentationKey]"
        }
    }
}

We can now publish this as a “template spec” using the Azure CLI:

az ts create \
    --name "cloudoven-appInsights" \
    --version $VERSION \
    --resource-group $RESOURCE_GROUP \
    --location $LOCATION \
    --template-file "component.json" \
    --yes --query 'id' -o json

Once published, you can see it in the Azure Portal – it appears as a new resource type, Microsoft.Resources/templateSpecs.

Consuming Template Spec

Every published Template Spec has a unique ID. To consume a Template Spec, all you need is that ID. You can retrieve it with the Azure CLI:

APPINS_TSID=$(az ts show --resource-group $TSRGP --name $TSNAME --version $VERSION --query 'id' -o json)
echo Template Spec ID: $APPINS_TSID

You can now deploy Azure resources with the Template Spec ID and, optionally, your own parameters, like the following:

az deployment group create \
  --resource-group $RESOURCEGROUP \
  --template-spec $APPINS_TSID \
  --parameters "@parameters.json"

Linked Templates & Modularizations

Over time, Infrastructure as Code tends to become a big monolithic file containing numerous resources. ARM templates (thanks to all their verbosity) are especially known for growing big fast, and a large JSON file becomes difficult to comprehend by reading. You could previously address the issue by using linked templates – with the caveat that linked templates needed to be accessible via a URL on the public internet, which is far from ideal.

The good news is that Template Specs have this covered. If the main template for your template spec references linked templates, the PowerShell and CLI commands can automatically find and package the linked templates from your local drive.

Example

Here I have an example ARM template that defines multiple resources (Application Insights, a server farm and a web app) in small files, with a main template that finally brings everything together. One can then publish the main template as a Template Spec – so any consumer can provision their web app just by pointing to the ID of that Template Spec. Here’s the interesting bit of the main template:

"resources": [
        {
            "type": "Microsoft.Resources/deployments",
            "apiVersion": "2020-06-01",
            "name": "DeployAppInsights",
            "properties": {
                "mode": "Incremental",
                "parameters": {
                    "appInsights": { "value": "[parameters('appInsights')]"},
                    "location": {"value": "[parameters('location')]"}
                },
                "templateLink": {                    
                    "relativePath": "../appInsights/component.json"
                }
            }
        },
        {
            "type": "Microsoft.Resources/deployments",
            "apiVersion": "2020-06-01",
            "name": "DeployHostingplan",
            "properties": {
                "mode": "Incremental",      
                "templateLink": {
                    "relativePath": "../server-farm/component.json"
                }
            }
        },
        {
            "type": "Microsoft.Resources/deployments",
            "apiVersion": "2020-06-01",
            "name": "DeployWebApp",
            "dependsOn": [ "DeployHostingplan" ],
            "properties": {
                "mode": "Incremental",             
                "templateLink": {
                    "relativePath": "../web-app/component.json"
                }
            }
        }
    ]

You see, Template Specs natively offer modularity and a centralized registry; however, they are still ARM JSON files. One common criticism of ARM templates is that they are too verbose, and JSON is not particularly famous for readability.

Microsoft is aiming to address these concerns with a new Domain-Specific Language (DSL) named Azure Bicep.

What is Bicep?

Bicep aims to drastically simplify the authoring experience with a cleaner syntax and better support for modularity and code re-use. Bicep is a transparent abstraction over ARM and ARM templates, which means anything that can be done in an ARM template can be done in Bicep (outside of temporary known limitations).

If we take the same Application Insights component (above) and rewrite it in Bicep, it looks like the following:

param appInsights string
param location string = resourceGroup().location
resource appIns 'Microsoft.Insights/components@2020-02-02-preview' = {
  name: appInsights
  location: location
  kind: appInsights
  properties: {
    Application_Type: 'web'
  }
}
output InstrumentationKey string = appIns.properties.InstrumentationKey

Very clean and concise compared to the ARM JSON version of it. If you are coming from Terraform, you might already find yourself at home – because Bicep took a lot of inspiration from Terraform’s HCL (HashiCorp Configuration Language). You save Bicep scripts with the .bicep file extension.

An important thing to understand: Bicep is a client-side language layer that sits on top of ARM JSON. The idea is that you write Bicep, then compile the script using the Bicep compiler (or transpiler) to produce ARM JSON as the compiled artifact, and you still deploy an ARM template (JSON) to Azure. Here’s how you compile a Bicep file to produce the ARM JSON:

bicep build ./main.bicep
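
The build emits a main.json file next to the source, which you can then deploy like any other ARM template – a quick sketch (the resource group name is a placeholder):

# Deploy the ARM JSON produced by the Bicep build (resource group name is illustrative)
az deployment group create \
  --resource-group my-rg \
  --template-file ./main.json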

Bicep is currently in an experimental state and is not recommended for use in production.

Creating Template Spec in Bicep

In the example above, you’ve seen how you could create a Template Spec that is modularized into linked templates. Let’s rewrite that in Bicep and see how clean and simple it looks:

param webAppName string = ''
param appInsights string = ''
param location string = resourceGroup().location
param hostingPlanName string = ''
param containerSpec string = ''
param costCenter string
param environment string

module appInsightsDeployment '../appinsights/component.bicep' = {
  name: 'appInsightsDeployment'
  params:{
    appInsights: '${appInsights}'
    location: '${location}'
    costCenter: costCenter
    environment: environment
  }
}

module deployHostingplan '../server-farm/component.bicep' = {
  name: 'deployHostingplan'
  params:{
    hostingPlanName:  '${hostingPlanName}'
    location: '${location}'
    costCenter: costCenter
    environment: environment    
  }   
}

module deployWebApp '../web-app/component.bicep' = {
  name: 'deployWebApp'
  params:{
    location: '${location}'
    webAppName: '${webAppName}'
    instrumentationKey: appInsightsDeployment.outputs.InstrumentationKey
    serverFarmId: deployHostingplan.outputs.hostingPlanId
    containerSpec: '${containerSpec}'
    costCenter: costCenter
    environment: environment    
  }   
}

Notice that Bicep introduces a nice module keyword to address the linked-template scenario. A Bicep module is an opaque set of one or more resources to be deployed together. It exposes only parameters and outputs as a contract to other Bicep files, hiding the details of how the internal resources are defined. This allows you to abstract away complex details of the raw resource declarations from the end user, who now only needs to be concerned with the module contract. Parameters and outputs are optional.

CI/CD – GitHub Action

Let’s create a delivery pipeline in GitHub Actions to compile our Bicep files and publish them as Template Specs in Azure. The following GitHub workflow installs the Bicep tools, compiles the scripts, and finally publishes them as Template Specs in Azure (a sketch of the deploy script follows the workflow). You can see the complete repository in GitHub.

jobs:  
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Install Bicep CLI
        working-directory: ./src/template-spec-bicep/paas-components
        run: |
          chmod +x ./install-bicep.sh
          ./install-bicep.sh
      - name: Compile Bicep Scripts
        working-directory: ./src/template-spec-bicep/paas-components
        run: |
          chmod +x ./build-templates.sh
          ./build-templates.sh
      - name: Azure Login
        uses: Azure/login@v1.1
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}
      - name: Deploy Template Specs to Azure
        working-directory: ./src/template-spec-bicep/paas-components        
        run: |          
          chmod +x ./deploy-templates.sh
          ./deploy-templates.sh
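
The heavy lifting sits in the small shell scripts the workflow calls. Purely as an illustration, a minimal deploy-templates.sh could look like the sketch below (the resource group, location and version values are assumptions; the real script lives in the repository):

#!/usr/bin/env bash
set -euo pipefail

# Assumed variables – adjust to your environment
RESOURCE_GROUP="template-specs-rg"
LOCATION="westeurope"
VERSION="1.0.0"

# Publish every compiled ARM template as a versioned Template Spec
for template in ./dist/*.json; do
  name=$(basename "$template" .json)
  az ts create \
    --name "$name" \
    --version "$VERSION" \
    --resource-group "$RESOURCE_GROUP" \
    --location "$LOCATION" \
    --template-file "$template" \
    --yes
done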

Consuming the Template Spec doesn’t change regardless of whether you choose Bicep or ARM templates. Consumers just create a parameter file and deploy resources by specifying only the Template Spec ID. You can see an example of a consumer workflow (also a GitHub Action) here.

Conclusion

I am excited about how the Azure team is offering new tools like Bicep and Template Specs to simplify cloud governance and self-service. It’s important to understand that the Bicep team is not competing with the available tools in this space (like Terraform); rather, it is offering more options to folks on their cloud journey.

Bicep is in an experimental phase now and Template Specs are still in preview; therefore, don’t use them in production just yet.

Azure AD Pod Identity – password-less app-containers in AKS

Background

I have liked Azure Managed Identity since its advent. The concept behind Managed Identity is clever, and it adds observable value to any DevOps team. All concerns with password configuration in multiple places, lifecycle management of secrets and certificates, and rotation policies suddenly become irrelevant (OK, in most cases).

Leveraging Managed Identity for applications hosted in Azure Virtual Machines, Azure Web Apps, Function Apps, etc. was straightforward. Managed Identity sits on top of the Azure Instance Metadata Service. Azure’s Instance Metadata Service is a REST endpoint accessible to all IaaS VMs created via Azure Resource Manager. The endpoint is available at a well-known, non-routable IP address (169.254.169.254) that can be accessed only from within the VM. Under the hood, Azure VMs, VMSS and Azure PaaS resources (i.e., Web Apps, Function Apps, etc.) leverage the metadata service to retrieve Azure AD tokens. Thus a VM or Web App establishes its own “application identity” (which is what Managed Identity essentially is) that Azure AD authenticates.

Managed Identity in Azure Kubernetes Service

Managed Identity in Kubernetes, however, is a different ballgame. Typically, multiple applications (often developed by different teams in an organization) run in a single cluster, and pods launch and exit frequently on different nodes. Hence, a Managed Identity associated with the VM/VMSS is not sufficient; we need a way to assign an identity to every pod in an application. If pods move to different nodes, the identity must somehow move with them to the new node (VM).

Azure Pod Identity

The good news is that Azure Pod Identity offers that capability. Azure Pod Identity is an open-source project on GitHub.

Note: Managed pod identities is an open source project and is not supported by Azure technical support.

An application can use Azure Pod Identity to access Azure resources (e.g., Key Vault, Storage, Azure SQL Database, etc.) via Managed Identity; hence, there’s no secret/password involved anywhere in the process. Pods can fetch access tokens scoped to resources directly from Azure Active Directory.

Concept

The following two components are installed in the cluster to achieve pod identity.

1. The Node Management Identity (NMI)

The AKS cluster runs this DaemonSet on every node. It intercepts outbound calls from pods requesting access tokens and proxies those calls using a predefined Managed Identity.

2. The Managed Identity Controller (MIC)

The MIC is a central pod with permissions to query the Kubernetes API server; it checks for an Azure identity mapping that corresponds to a pod.

Source: GitHub Project – Azure Pod Identity

When pods request access to an Azure service, network rules redirect the traffic to the Node Management Identity (NMI) server. The NMI server identifies pods that request access to Azure services based on their remote address and queries the Managed Identity Controller (MIC). The MIC checks for Azure identity mappings in the AKS cluster, and the NMI server then requests an access token from Azure Active Directory (AD) based on the pod’s identity mapping. Azure AD provides access to the NMI server, which is returned to the pod. This access token can be used by the pod to then request access to services in Azure.

Microsoft
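
To make the flow concrete: from inside a pod, requesting a token is an ordinary IMDS call, which the NMI intercepts and answers on behalf of the mapped identity. A minimal sketch with curl:

# Request a Key Vault-scoped token from the metadata endpoint (intercepted by NMI in AKS)
curl -s -H "Metadata: true" \
  "http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fvault.azure.net"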

Azure Pipeline to Bootstrap pod Identity

I started with an existing RBAC-enabled Kubernetes cluster that I had created before. Azure AD Pod Identity sets up the service account, custom resource definitions (CRDs), cluster roles and bindings, the DaemonSet for NMI, etc. I wanted to do it via a pipeline, so I can repeat the process on demand. Here are the interesting parts of azure-pod-identity-setup-pipeline.yaml:

trigger:
- master
variables:
  tag: '$(Build.BuildId)'
  containerRegistry: $(acr-name).azurecr.io
  vmImageName: 'ubuntu-latest'
stages:
- stage: Build
  displayName: Aad-Pod-Identity-Setup
  jobs:  
  - job: Build
    displayName: Setup Aad-Pod-Identity.
    pool:
      vmImage: $(vmImageName)
    steps:
      - bash: |
          kubectl apply -f https://raw.githubusercontent.com/Azure/aad-pod-identity/master/deploy/infra/deployment-rbac.yaml
        displayName: 'Setup Service Account, CRD, DaemonSet etc'


I will be using a “user-assigned identity” for my sample application. I have written a basic .NET app with a SQL back-end for this purpose. My end goal is to allow the .NET app to talk to SQL Server with a pod identity.
Following the aad-pod-identity project instructions, I created the user-assigned identity:

      - task: AzureCLI@2
        inputs:
          scriptType: 'bash'
          scriptLocation: 'inlineScript'
          inlineScript: 'az identity create -g $(rgp) -n $(uaiName) -o json'

In my repository, I created the Azure Identity definition in a file named aad-pod-identity.yaml:

apiVersion: "aadpodidentity.k8s.io/v1"
kind: AzureIdentity
metadata:
  name: <a-idname>
spec:
  type: 0
  ResourceID: /subscriptions/<sub>/resourcegroups/<resourcegroup>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<name>
  ClientID: <clientId>
---
apiVersion: "aadpodidentity.k8s.io/v1"
kind: AzureIdentityBinding
metadata:
  name: demo1-azure-identity-binding
spec:
  AzureIdentity: <a-idname>
  Selector: managed-identity


And added a task to deploy that too.

      - bash: |
          kubectl apply -f manifests/aad-pod-identity.yaml
        displayName: 'Setup Azure Identity'

Triggered the pipeline and Azure pod Identity was ready to roll.

Deploying application

I have a .NET Core web app (a Razor application) which I configure to run with pod identity and connect to Azure SQL (the back-end) using Azure Active Directory authentication – with no password configured at the application level.
Here’s the manifest file (front-end.yaml) for the application; the crucial part is the label (aadpodidbinding: managed-identity) that matches the selector of the pod identity binding we defined before:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dysnomia-frontend
spec:
  replicas: 6
  selector:
    matchLabels:
      app: dysnomia-frontend
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  minReadySeconds: 5
  template:
    metadata:
      labels:
        app: dysnomia-frontend
        aadpodidbinding: managed-identity
    spec:
      nodeSelector:
        "beta.kubernetes.io/os": linux
      containers:
      - name: dysnomia-frontend
        image: #{containerRegistry}#/dysnomia-frontend:#{Build.BuildId}#
        imagePullPolicy: "Always"

That’s pretty much it. Once my application pipeline deployed the above manifest, the .NET application could connect to the Azure SQL database with the assigned pod identity.

      - bash: |
          kubectl apply -f manifests/front-end.yaml
        displayName: 'Deploy Front-end'


Of course, I needed to do a role assignment for the user-assigned identity and enable Azure AD authentication on my SQL server. I am not describing those steps in detail here, as I have written about them before.
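
For orientation only, enabling Azure AD authentication on the server boils down to setting an AD admin; a hedged Azure CLI sketch (all names below are placeholders):

# Placeholder names – adjust to your environment
az sql server ad-admin create \
  --resource-group my-rg \
  --server-name my-sql-server \
  --display-name my-uai-name \
  --object-id "$(az identity show -g my-rg -n my-uai-name --query principalId -o tsv)"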

What about non-Azure resources?

The above holds true for all Azure resources that support Managed Identity. That means our application can connect to Cosmos DB, Storage Accounts, Service Bus, Key Vaults, and many other Azure resources without configuring any passwords or secrets anywhere in Kubernetes.

However, there are scenarios where we might want to run a Redis container or a SQL Server container in our Kubernetes cluster. Cost-wise, it might make sense for many use cases to run it in Kubernetes (as you already have a cluster) instead of Azure PaaS (i.e., Azure SQL or Azure Redis). In those cases, we must create the SQL password and configure it into our .NET app (using Kubernetes secrets).

I was wondering if I could store my SQL password in an Azure Key Vault and let my SQL container and .NET app both collect the password from Key Vault during launch, using Azure Pod Identity. Kubernetes has a first-class option to handle such scenarios – Kubernetes secrets.

However, today I am playing with Azure AD pod Identity – therefore, I really wanted to use pod identity – for fun ;-). Here’s how I managed to make it work.

SQL container, pod identity and Azure Key vault

I created a Key Vault and defined the SQL Server password as a secret there. I configured my .NET app to use pod identity to talk to Key Vault, and configured the Key Vault access policy so the user-assigned identity created above can grab the SQL password. So far so good.

Now, I wanted to run a SQL Server instance in my cluster that would also collect the password from Key Vault – the same way the .NET app does. It turned out the SQL Server 2019 image (mcr.microsoft.com/mssql/server:2019-latest) expects the password as an environment variable at container launch.

docker run -d -p 1433:1433 `
           -e "ACCEPT_EULA=Y" `
           -e "SA_PASSWORD=P@ssw0rD" `
           mcr.microsoft.com/mssql/server:2019-latest

Initially, I thought it would be easy to use an init container to grab the password from Azure Key Vault and then pass it to the application container (SQL) as an environment variable. After some failed attempts I realized that isn’t trivial. I could of course create volume mounts (e.g., emptyDir) to convey the password from the init container to the application container – but that’s rather dirty, isn’t it?
Secondly, I thought of creating my own Docker image based on the SQL container; then I could run a piece of script that grabs the password from Key Vault and sets it as an environment variable. A simple script with a few curl commands would do the trick, you might think. Well, there are a few small issues: the SQL container image is a stripped-down Ubuntu core, which has no apt or curl, and it doesn’t run as root either – for all good reasons.


So, I have written a small program in Go and compiled it to a binary.

package main

import (
    "encoding/json"
    "fmt"
    "io/ioutil"
    "net/http"
    "os"
)

// TokenResponse models the relevant fields of the IMDS token reply.
type TokenResponse struct {
    Token_type   string `json:"token_type"`
    Access_token string `json:"access_token"`
}

// SecretResponse models the relevant fields of the Key Vault secret reply.
type SecretResponse struct {
    Value string `json:"value"`
    Id    string `json:"id"`
}

func main() {
    var p TokenResponse
    var s SecretResponse

    // Token endpoint of the Azure Instance Metadata Service (proxied by NMI in AKS).
    imds := "http://169.254.169.254/metadata/identity/oauth2/token" +
            "?api-version=2018-02-01&resource=https%3A%2F%2Fvault.azure.net"
    // Key Vault secret URL: vault host (os.Args[1]) and secret name (os.Args[2]).
    kvUrl := "https://" + os.Args[1] + "/secrets/" + os.Args[2] +
             "?api-version=2016-10-01"

    client := &http.Client{}

    // Fetch an access token for Key Vault using the pod's managed identity.
    req, _ := http.NewRequest("GET", imds, nil)
    req.Header.Set("Metadata", "True")
    res, _ := client.Do(req)
    b, _ := ioutil.ReadAll(res.Body)
    json.Unmarshal(b, &p)

    // Use the token to read the secret and print its value to stdout.
    req, _ = http.NewRequest("GET", kvUrl, nil)
    req.Header.Set("Authorization", p.Token_type+" "+p.Access_token)
    res, _ = client.Do(req)
    b, _ = ioutil.ReadAll(res.Body)
    json.Unmarshal(b, &s)
    fmt.Println(s.Value)
}

This program simply grabs the secret from Azure Key Vault using Managed Identity. I created a binary out of it (targeting Linux, since it will run inside the SQL Server container):

go build -o aadtoken


Next, I created my SQL container image with the following Dockerfile:

FROM mcr.microsoft.com/mssql/server:2019-latest

ENV ACCEPT_EULA=Y
ENV MSSQL_PID=Developer
ENV MSSQL_TCP_PORT=1433 
COPY ./aadtoken /
COPY ./startup.sh /
CMD [ "/bin/bash", "./startup.sh" ] 

You see, I am relying on a startup.sh bash script. Here it is:

echo "Retrieving AAD Token with Managed Identity..."
export SA_PASSWORD=$(./aadtoken $KeyVault $SecretName)
echo "SQL password received and cofigured successfully"
/opt/mssql/bin/sqlservr --accept-eula
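
From there, building the image and pushing it to a container registry is the usual routine; assuming an Azure Container Registry named myacr (a placeholder), something like:

# Build and push the custom SQL Server image (registry name is a placeholder)
az acr login --name myacr
docker build -t myacr.azurecr.io/sql-server:2019-latest .
docker push myacr.azurecr.io/sql-server:2019-latest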

I created the image, and here’s my SQL manifest to deploy to Kubernetes:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mssql-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mssql
  template:
    metadata:
      labels:
        app: mssql
        aadpodidbinding: managed-identity
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: mssql
        image: #{ACR.Name}#/sql-server:2019-latest
        ports:
        - containerPort: 1433
        env:
        - name: KeyVault
          value: "#{KeyVault.Name}#"
        - name: SecretName
          value: "SQL_PASSWORD"
        volumeMounts:
        - name: mssqldb
          mountPath: /var/opt/mssql
      volumes:
      - name: mssqldb
        persistentVolumeClaim:
          claimName: mssql-data

I deployed the manifest and voilà – it all works. The SQL pods and the .NET app pods all use Managed Identity to connect to Key Vault and retrieve the secret, which is stored centrally in one place, keeping password management nice and tidy.

Conclusion

I find Azure Pod Identity a neat feature and a security best practice, especially when you are using Azure Kubernetes Service together with Azure resources (e.g., Cosmos DB, Azure SQL, Key Vault, Storage Account, Service Bus, etc.). If you didn’t know it yet, I hope this makes you enthusiastic enough to investigate further.

Disclaimer 1: It’s an open-source project – so Azure technical support doesn’t apply.

Disclaimer 2: The SQL container part with the Go script is totally a fun learning attempt; don’t take it too seriously!

Azure AD App via ARM Template Deployment Scripts

Background

ARM templates offer a great way to define resources and deploy them. However, ARM templates didn’t have any support for invoking or running scripts. If we wanted to carry out some operations as part of the deployment (Azure AD app registrations, certificate generation, copying data to/from another system, etc.), we had to create pre- or post-deployment scripts (using Azure PowerShell or Azure CLI). Microsoft recently announced the preview of Deployment Scripts (a new resource type, Microsoft.Resources/deploymentScripts) – which brings a way to run a script as part of an ARM template deployment.

I have a few web apps using OpenID Connect for user authentication, and they run as Azure App Services. I always wanted to automate (preferably in a declarative and idempotent way) the required app registrations in Azure AD and deploy them together with the ARM templates of the web apps.

Since we now have deployment script capability, I wanted to leverage it for Azure AD app registrations. In this article I will share my experience doing exactly that.

What are deployment scripts?

Deployment scripts allow running custom scripts (either Azure PowerShell or Azure CLI) as part of an ARM template deployment. They can be used to perform custom steps that can’t be done by ARM templates alone.


A simple deployment template that runs a bash command (echo) looks like the following:

Figure: Simple example of Deployment Scripts

Microsoft describes the benefits of deployment scripts as follows:

– Easy to code, use, and debug. You can develop deployment scripts in your favorite development environments. The scripts can be embedded in templates or in external script files.

– You can specify the script language and platform. Currently, Azure PowerShell and Azure CLI deployment scripts on the Linux environment are supported.

– Allow specifying the identities that are used to execute the scripts. Currently, only Azure user-assigned managed identity is supported.

– Allow passing command-line arguments to the script.

– Can specify script outputs and pass them back to the deployment.

Source

Registering Azure AD app

We can write a small script (with Azure CLI) like the sample above that registers the Azure AD app – that’s quite straightforward. However, first we need to address the identity aspect: what account will run the script, and how can app-registration permission be granted to that account? The answer is Managed Identity.

User Assigned Managed Identity

Managed identities for Azure resources provide Azure services with a managed identity in Azure Active Directory. We can use this identity to authenticate to services that support Azure AD authentication, without needing credentials in our code. There are two types of Managed Identity: system-assigned and user-assigned.

Deployment Scripts currently support user-assigned identities only; hence, we need to create a user-assigned managed identity that will execute the deployment scripts. We will also grant Azure AD app-registration permissions to this identity. Creating a user-assigned identity is straightforward, and the steps are nicely described here.
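
For reference, creating one with the Azure CLI is a one-liner (names below are placeholders):

# Placeholder resource group and identity name
az identity create --resource-group my-rg --name deployment-script-identity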

Figure: User Assigned Managed Identity in Azure Portal

Next, we have to grant permissions to the identity. The following PowerShell script grants the required permissions to the managed identity.

Figure: Grant permissions

ARM template

We will now write the ARM template that will leverage the deployment scripts to register our app in Azure AD.

Figure: Deployment Script

I won’t explain each of the settings/config options here. The most important part is the scriptContent property – which can hold, as a string, any script (PowerShell or bash). You can also point to an external script file instead of an embedded script.

Another important property is cleanupPreference. It specifies the preference for cleaning up deployment resources when the script execution reaches a terminal state. The default setting is Always, which means the resources are deleted regardless of the terminal state (Succeeded, Failed, Canceled).

You can find more details on each of the configuration properties for Deployment Script in this document.

I have used some variable references that are defined in the same template json file.

Figure: Variables

Notice the cliArg variable here. This is the argument that we pass as input to our CLI/bash script. The catch is that the arguments need to be separated by whitespace.

Finally, we want to grab the newly registered app ID and configure an entry in the app settings of our web app – so the web app’s OpenID authentication works right after the deployment.

Figure: Variables

At this point we deploy the template, and after the deployment completes, we see the app has been registered in Azure AD:

Figure: Azure AD App

Also, we can verify that the newly created app ID is nicely configured in the web app’s app settings.

Figure: App settings configured

That’s all there is to it!

I haven’t defined any API permission scopes for the app registration in this example; however, with the Azure CLI script in place, defining further API scopes is trivial.

How did it work?

If we log in to the Azure Portal, we will see the following:

Figure: Azure Portal resources

We see a new resource of type Deployment Script besides our web app (and its service plan) – that much is expected. However, we also see a container instance and a storage account. Where did they come from?

Well, the Azure RM deployment created them while deploying the deployment script. The storage account and the container instance are created in the same resource group for script execution and troubleshooting. These resources are usually deleted by the script service when the script execution reaches a terminal state. Important to know: we are billed for the resources until they are deleted.

The container instance runs a Docker image as a sandbox for our deployment script. You can see the image name from the portal that Microsoft uses for execution. This can come in handy for trying out the script locally – for development purposes.

Conclusion

I have mixed feelings about deployment scripts in ARM templates. They obviously have benefits, but they shouldn’t replace all pre- or post-deployment scripts, because sometimes it is cleaner and easier to create a pre- or post-deployment script task in a continuous delivery pipeline than to compose everything in ARM templates.

Terraforming Azure DevOps

Background

In many organizations, especially in large enterprises, there’s a need to automate Azure DevOps projects and team members. Manually managing a large number of Azure DevOps projects, the teams for those projects and the users in those teams, and onboarding and offboarding team members is not trivial.

Besides managing the users, sometimes we just need an overview (documentation?) of the users and teams of projects. Terraform is a great tool for Infrastructure as Code – it not only allows provisioning infrastructure on demand, but also gives us nice documentation that can be version-controlled in a source control system. The workflow looks something like the following:

Figure: GitOps workflow

I am developing a Terraform provider for Azure DevOps that helps me use Terraform for provisioning Azure DevOps projects, teams and members. In this article I will share how I am building it.

Note
This provider doesn't implement the complete set of 
Azure DevOps REST APIs. 
Its limited to only projects, teams and member associations. 
It's not recommended to use it in production scenarios.

Terraform Provider

Terraform is an amazing tool that lets you define your infrastructure as code. Under the hood it’s an incredibly powerful state machine that makes API requests and marshals resources. Terraform has providers for almost every major cloud out there, as well as for many other systems – like Kubernetes, Palo Alto Networks, etc.

In a nutshell, if a system has a REST API, it can be manipulated with a Terraform provider. Azure DevOps also has a Terraform provider – which currently doesn’t provide resources to create teams and members. Hence, I am writing my own – shamelessly using/stealing Microsoft’s Terraform provider (referenced above) for project creation.

Setting up the Go Environment

Terraform providers and plugins are binaries that Terraform communicates with at runtime via RPC. It’s theoretically possible to write a provider in any language, but to be honest, I haven’t come across any providers written in a language other than Go. Terraform provides helper libraries in Go to aid in writing and testing providers.

I am developing on Windows 10 and didn’t want to install Go on my local machine. Containers come to the rescue, of course. I am using the “Remote Development” extension in VS Code, which allows me to keep the source code on the local machine and compile and build it in a container – like magic!


Figure: Remote Development extension in VSCode – running container to build local repository.

Creating the provider

To create a Terraform provider, we need to write the logic for managing the creation, reading, updating and deletion (CRUD) of a resource (i.e., an Azure DevOps project, team and its members in this scenario), and Terraform takes care of the rest: state, locking, the templating language and managing the lifecycle of the resources. Here in this repository I have a minimal implementation that supports creating Azure DevOps projects, teams and their members.

First of all, we define our provider and resources in the main.go file.

package main

import (
    "github.com/hashicorp/terraform/plugin"
    "github.com/hashicorp/terraform/terraform"
)

func main() {
    plugin.Serve(&plugin.ServeOpts{
        ProviderFunc: func() terraform.ResourceProvider {
            return Provider()
        },
    })
}


Next, we define the provider schema (the attributes it supports as inputs and outputs, resources, etc.) in provider.go:

package main

import (
    "github.com/hashicorp/terraform/helper/schema"
)

// Provider – The top level Azure DevOps Provider definition.
func Provider() *schema.Provider {
    p := &schema.Provider{
        ResourcesMap: map[string]*schema.Resource{
            "azuredevops_pipeline":        resourcePipeline(),
            "azuredevops_project":         resourceProject(),
            "azuredevops_team":            resourceTeam(),
            "azuredevops_serviceendpoint": resourceServiceEndpoint(),
        },
        Schema: map[string]*schema.Schema{
            "org_service_url": {
                Type:        schema.TypeString,
                Required:    true,
                DefaultFunc: schema.EnvDefaultFunc("AZDO_ORG_SERVICE_URL", nil),
                Description: "The url of the Azure DevOps instance which should be used.",
            },
            "personal_access_token": {
                Type:        schema.TypeString,
                Required:    true,
                DefaultFunc: schema.EnvDefaultFunc("AZDO_PERSONAL_ACCESS_TOKEN", nil),
                Description: "The personal access token which should be used.",
            },
        },
    }
    p.ConfigureFunc = providerConfigure(p)
    return p
}

func providerConfigure(p *schema.Provider) schema.ConfigureFunc {
    return func(d *schema.ResourceData) (interface{}, error) {
        client, err := getAzdoClient(d.Get("personal_access_token").(string), d.Get("org_service_url").(string))
        return client, err
    }
}


We are using an Azure DevOps personal access token to communicate with the Azure DevOps REST API. The Go client for Azure DevOps from Microsoft – which is used as a dependency – immensely simplified the implementation and also helped in learning the flow.

Now we define the “team” resource in resource_team.go, as follows:

package main

import (
    "fmt"
    "log"
    "os"

    "github.com/google/uuid"
    "github.com/hashicorp/terraform/helper/schema"
    "github.com/microsoft/azure-devops-go-api/azuredevops/core"
    "github.com/microsoft/azure-devops-go-api/azuredevops/graph"
    "github.com/microsoft/terraform-provider-azuredevops/utils/converter"
    "github.com/microsoft/terraform-provider-azuredevops/utils/tfhelper"
)

func resourceTeam() *schema.Resource {
    return &schema.Resource{
        Create: resourceTeamCreate,
        Read:   resourceTeamRead,
        Update: resourceTeamUpdate,
        Delete: resourceTeamDelete,
        // https://godoc.org/github.com/hashicorp/terraform/helper/schema#Schema
        Schema: map[string]*schema.Schema{
            "project_id": &schema.Schema{
                Type:             schema.TypeString,
                Required:         true,
                DiffSuppressFunc: tfhelper.DiffFuncSupressCaseSensitivity,
            },
            "name": &schema.Schema{
                Type:             schema.TypeString,
                ForceNew:         true,
                Required:         true,
                DiffSuppressFunc: tfhelper.DiffFuncSupressCaseSensitivity,
            },
            "description": &schema.Schema{
                Type:     schema.TypeString,
                Optional: true,
                Default:  "",
            },
            "members": {
                Type:     schema.TypeSet,
                Optional: true,
                Computed: true,
                Elem: &schema.Resource{
                    Schema: map[string]*schema.Schema{
                        "name": {
                            Type:     schema.TypeString,
                            Required: true,
                        },
                        "admin": {
                            Type:      schema.TypeBool,
                            Required:  true,
                            Sensitive: true,
                        },
                    },
                },
            },
        },
    }
}


That’s all for the declaration; what remains is implementing the CRUD methods in the resource provider. The full source code is on GitHub.

We can compile the provider application using the following command:

> GOOS=windows GOARCH=amd64 go build -o terraform-provider-azuredevops.exe

As I am using the Debian Docker image for Go, I need to specify my target OS (GOOS=windows) and CPU architecture (GOARCH=amd64) when I build the provider. This produces the Terraform provider executable for Azure DevOps.

Although it’s an executable, it’s not meant to be launched directly from the command prompt. Instead, I copy it to the “%APPDATA%\terraform.d\plugins\windows_amd64” folder of my machine.
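
From a Git Bash prompt on the Windows host, that copy step might look like this (a sketch; adjust paths to your machine):

# Place the provider binary where Terraform discovers plugins on Windows
mkdir -p "$APPDATA/terraform.d/plugins/windows_amd64"
cp terraform-provider-azuredevops.exe "$APPDATA/terraform.d/plugins/windows_amd64/"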

Terraform Script for Azure DevOps

Now we can write the Terraform file (.tf) that describes the Azure DevOps project, team and members.

Figure: Terraform configuration (main.tf)

With this Terraform file, we can now launch the following command to initialize our Terraform environment.

Figure: terraform init output

The terraform init command is used to initialize a working directory containing Terraform configuration files. This is the first command that should be run after writing a new Terraform configuration or cloning an existing one from version control.

Terraform plan

The terraform plan command is used to create an execution plan. Terraform performs a refresh, and then determines what actions are necessary to achieve the desired state specified in the configuration files. This command is a convenient way to check whether the execution plan for a set of changes matches your expectations without making any changes to real resources or to the state. For example, terraform plan might be run before committing a change to version control, to create confidence that it will behave as expected.


Figure: terraform plan output – shows exactly what is going to happen if we apply these changes to Azure DevOps

Terraform apply

The terraform apply command is used to apply the changes required to reach the desired state of the configuration, or the pre-determined set of actions generated by a terraform plan execution plan. We will launch it with the -auto-approve flag to skip the interactive approval prompt.

Figure: terraform apply output
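
In summary, the full loop from a terminal looks like this (a minimal sketch):

terraform init                  # download plugins, set up the working directory
terraform plan                  # preview the changes against Azure DevOps
terraform apply -auto-approve   # apply without the interactive prompt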

Now we can go to our Azure DevOps organization, and sure enough, there’s a new project created with the configuration we scripted in the Terraform file.

Taking it further

Now we can check the Terraform file (main.tf above) into an Azure DevOps repository and put a branch policy on it. That forces any changes (such as creating new projects, or adding and removing team members) to go through a pull request that is reviewed by peers (four-eyes principle). Once pull requests are approved, a simple Azure Pipeline can trigger the terraform apply. With that, I have my workflow automated, and I also have a nice history in Git – which records the purpose of every change made in the past.

Thanks for reading!

CloudOven – Terraform at ease!

TL;DR:

  • URL: CloudOven 

  • Use Google account or sign-up 
  • Google Chrome please! (I’ve not tested on other browsers yet)


Background

In recent years I have spent a fair amount of time on the design and implementation of Infrastructure as Code in larger enterprise contexts. Terraform seems to be the tool of choice when it comes to preserving uniformity in Infrastructure as Code targeting multiple cloud providers. It is rapidly becoming the de facto choice for creating and managing cloud infrastructures by writing declarative definitions. It’s popular because the syntax of its files is quite readable and because it supports several cloud providers while making no attempt to provide an artificial abstraction across those providers. The active community quickly adds support for the latest features from most cloud providers.

However, rolling out Terraform in many enterprises has its own barriers. Albeit the syntax (HCL) is neat, not every developer or infrastructure operator in an organization finds it easy. There’s a learning curve, and many of us lose momentum upon discovering the learning effort. I believe that if we could make the initial ramp-up easier, more people would play with it.

That’s one of my motivations for this post; the following is the other one.

Blazor meets Terraform

Lately I have been learning Blazor – the new client-side technology from Microsoft. Like many others, I find that one effective way of learning a new technology is building a solution to a problem. I decided to build a user interface that makes creating Terraform scripts easier. I will share my journey in this post.

Resource Discovery in Terraform Providers

Terraform is powerful because of its providers. You will find Terraform providers for all major cloud providers (Azure, AWS, Google, etc.). The providers allow us to define “resources” and “data sources” in Terraform scripts. These resources and data sources have arguments and attributes that one must know while creating Terraform files. Luckily, they are documented nicely on the Terraform site. However, it still requires us to jump back and forth between the documentation site and the Terraform file editor (i.e., VS Code).

Figure: Azure provider discovery

To make this experience easier, I wrote a crawler application that downloads the Terraform providers (I am doing it for Azure, AWS and Google for now) and discovers the attributes and arguments for each and every resource and data source. I also try to extract the documentation for every attribute and argument from the Terraform documentation site with some layman parsing (not 100% accurate, but it works for the majority – something I will improve soon).

Figure: AWS and Google provider discovery

This process generates a JSON structure for each resource and data source, enriches them with the documentation, and stores them in Azure Blob Storage.

Building Infrastructure as code

Now that I have a structured data store with all resources and data sources for any Terraform provider, I can leverage it to build a user interface on top. To keep things a bit organized, I started with the concept of a “project”.

Figure: The workflow

Project

I can start by creating a project (well, it could be a product too, but let’s not get into that debate). A project is merely a logical boundary here.

Blueprint

Within a project I can create blueprints. A blueprint is the entity that retains the elements of the infrastructure we are aiming to create. For instance, a blueprint targets a cloud provider (i.e., Azure). Then I can create the elements (resources and data sources) within the blueprint (i.e., Azure Web App, Cosmos DB, etc.).

Figure: Provider configuration

A blueprint keeps the base structure of all the infrastructure elements. It allows defining variables (plain and simple Terraform variables) so the actual values can vary across environments (dev, test, pre-production, production, etc.).

Once I am happy with the blueprint, I can download it as a zip that contains the Terraform scripts (main.tf and variable.tf). That’s it – we have our Infrastructure as Code in Terraform. I can execute it on a local development machine or check it in to source control – whatever I prefer.

Figure: Storage account

One can stop here and keep using the blueprint feature to generate Infrastructure as Code – that’s what it is for. However, the next features exist just to make the overall experience of running Terraform a bit easier.

Environments

Next to blueprints, we can create as many environments as we want. Again, an environment is just a logical entity to keep actual deployments isolated per environment.

Deployments

The deployment entity is the glue that ties a blueprint to a specific environment. For instance, I can define a blueprint for an “order management” service (or micro-service, maybe?), create an environment named “test”, and then create a “deployment” for “order management” on “test”. This is where I can define constant values for the blueprint variables that are specific to the test environment.

Terraform State

Perhaps the most important aspect the deployment entity holds is Terraform state management. Terraform must store state about your managed infrastructure and configuration. This state is used by Terraform to map real-world resources to your configuration, keep track of metadata, and improve performance for large infrastructures. This state is stored by default in a local file named terraform.tfstate, but it can also be stored remotely, which works better in a team environment. Defining the state properties (they vary across cloud providers) in the deployment entity makes remote state management easier – specifically in a team environment. It configures the remote state with the appropriate backend: when the blueprint’s cloud provider is set to Azure, it configures an Azure Storage account as the Terraform remote state backend; for AWS it picks S3 automatically.


Terraform plan

Once we have the deployment entity configured, we can run “terraform plan” directly from the user interface. The terraform plan command creates an execution plan. Unless explicitly disabled, it performs a refresh and then determines what actions are necessary to achieve the desired state specified in the blueprint. This command is a convenient way to check whether the execution plan for a set of changes matches your expectations without making any changes to real resources or to the state. For example, terraform plan might be run before committing a change to version control, to create confidence that it will behave as expected.

Terraform apply

The terraform apply command is used to apply the changes required to reach the desired state of the configuration, or the pre-determined set of actions generated by a terraform plan execution plan. Like “plan”, the “apply” command can also be issued directly from the user interface.

Terraform plan and apply are both issued in an isolated Docker container, and the output is captured and displayed back in the user interface. However, there’s a cost associated with running Docker containers in the cloud; therefore, it’s disabled on the public site.

Final thoughts

It was fun to write a tool like this, and I recommend you give it a go – especially if you are stepping into Terraform. It can also be helpful for experienced Terraform developers – specifically the on-screen documentation, type inference and discovery features.

Some features I have in progress:

  • Ability to define policies for each resource and data type
  • Save a blueprint as a custom module

Stay tuned!

 

Azure template to provision Docker swarm mode cluster

What is a swarm?

The cluster management and orchestration features embedded in the Docker Engine are built using SwarmKit. Docker engines participating in a cluster are running in swarm mode. You enable swarm mode for an engine by either initializing a swarm or joining an existing swarm. A swarm is a cluster of Docker engines, or nodes, where you deploy services. The Docker Engine CLI and API include commands to manage swarm nodes (e.g., add or remove nodes), and deploy and orchestrate services across the swarm.
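
The swarm-mode commands themselves are simple; for instance, bootstrapping a cluster boils down to the following (the advertise address is an illustrative placeholder):

# On the first manager node – the advertise address is a placeholder
docker swarm init --advertise-addr 10.0.0.4

# Prints the exact 'docker swarm join ...' command the workers must run
docker swarm join-token worker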

I was recently trying to come up with a script that generates a Docker swarm cluster ready to take container workloads on Microsoft Azure. I thought Azure Container Service (ACS) would already support that; however, I figured out that’s not the case. Azure doesn’t support Docker swarm mode in ACS yet – at least as of today (25th July 2017) – which forced me to come up with my own RM template that does the job.

What’s in it?

The RM template will provision the following resources:

  • A virtual network
  • An availability set for the manager nodes
  • 3 virtual machines in the availability set created above (the numbers and names can be parameterized as per your needs)
  • A load balancer (with a public port that round-robins to the 3 VMs on port 80, and inbound NAT from ports 5000, 5001 and 5002 to SSH port 22 on the 3 machines)
  • Configuration of the 3 VMs as Docker swarm mode managers
  • A virtual machine scale set (VMSS) in the same VNET
  • 3 nodes joined as workers into the above swarm
  • A load balancer for the VMSS (allowing inbound NAT from the port range starting at 50000 to SSH port 22 on the VMSS)

The design can be visualized with the following diagram:

There’s a handy PowerShell script that can help automate provisioning these resources. But you can also just click the “Deploy to Azure” button below, or run a CLI deployment like the sketch that follows.
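
For reference, a CLI-based deployment of such a template might look like this (the template and parameter file names are illustrative – see the GitHub repo for the real ones):

# Hypothetical file and resource group names
az group create --name swarm-rg --location westeurope
az group deployment create \
  --resource-group swarm-rg \
  --template-file azuredeploy.json \
  --parameters @azuredeploy.parameters.json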

Thanks!

The entire set of scripts can be found in this GitHub repo. Feel free to use them as needed!

IAC – Using Azure RM templates

As cloud software development heavily leverages virtualized systems and developers have started using Continuous Integration (CI), many things have started to change. The number of environments developers have to deal with has gone up significantly. Developers now release much more frequently – in many cases, multiple times in a single day. All these releases have to be tested and validated, which brings up a new requirement: spinning up an environment fast that is identical to production.

The need for an automated way of provisioning such environments fast (in a repeatable manner) became obvious, and hence IaC (Infrastructure as Code) kicked in.

There are numerous tools (Puppet, Ansible, Vagrant, etc.) that help build such coded environments. The Azure Resource Manager template brings a new way of doing IaC when an application is targeted to build and run on Azure. Most of these tools (including RM templates) are even idempotent, which ensures that you can run the same configuration multiple times while achieving the same result.

From Microsoft Azure web site:

Azure applications typically require a combination of resources (such as a database server, database, or website) to meet the desired goals. Rather than deploying and managing each resource separately, you can create an Azure Resource Manager template that deploys and provisions all of the resources for your application in a single, coordinated operation. In the template, you define the resources that are needed for the application and specify deployment parameters to input values for different environments. The template consists of JSON and expressions which you can use to construct values for your deployment.

I was excited the first time I saw this in action in one of the Channel 9 videos, and I couldn’t wait to give it a go. The idea of having a template that describes all the Azure resources (Service Bus, SQL Azure, VMs, Web Apps, etc.) in a template file, with the capability to parameterize it with different values that vary across environments, can be very handy for CI/CD scenarios. Templates can be nested, which also makes them more modular and manageable.

Lately I have had the pleasure of digging deeper into Azure RM templates, as we are using them in the project I am working on these days. I wanted to come up with a sample template that shows how to use RM templates to construct resources, which allows me to share my learnings. The scripts can be found in this GitHub repo.

One problem I didn’t know how to handle yet was the credentials needed to provision the infrastructure – for instance, the VM passwords, SQL passwords, etc. I don’t think anybody wants to check their passwords into a source control system, visible in Azure RM parameter JSON files.

To address this issue, the solution I came up with for now is this: I uploaded the RM parameter JSON files into a private container of a Blob storage account (note that the storage account is in the same Azure subscription as the infrastructure I intend to provision). A PowerShell script then downloads a Shared Access Signature (SAS) token for that Blob storage container and uses it to download the parameter JSON blob into a PSCustomObject, removing the locally downloaded JSON file afterwards. Next, it converts the PSCustomObject into a hash table, which is passed to the Azure RM cmdlet to kick off the provisioning process. That way, no file with credentials needs to be checked in to the source control system. The administrators who manage the Azure subscription can create a private Blob container and use the Azure Storage Explorer to create and update their credentials in the RM parameter JSON files. A CI process downloads the parameter files just in time before provisioning the infrastructure.
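
For illustration, a rough Azure CLI equivalent of that flow might look like the sketch below (all account, container and file names are placeholders; the original implementation is PowerShell-based):

# Generate a short-lived, read-only SAS for the private parameters container
end=$(date -u -d '30 minutes' '+%Y-%m-%dT%H:%MZ')
sas=$(az storage container generate-sas --account-name mystorage \
      --name parameters --permissions r --expiry "$end" -o tsv)

# Download the parameter blob just in time, deploy, then remove the local copy
az storage blob download --account-name mystorage --container-name parameters \
   --name dev.parameters.json --file dev.parameters.json --sas-token "$sas"
az group deployment create --resource-group my-rg \
   --template-file azuredeploy.json --parameters @dev.parameters.json
rm dev.parameters.json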