AI-Powered SLA Calculator from Architecture Diagrams

Introduction

Designing a resilient cloud-native application architecture requires careful consideration of many aspects. Availability and recovery metrics can ease the design process. Microsoft publishes guidance with recommendations for defining availability and recovery target metrics for critical workloads and flows. Reliability targets are derived through workshop exercises with business stakeholders and then refined through monitoring and testing.

Technical discussions shouldn’t drive how you define reliability targets for your critical flows. Instead, business stakeholders should focus on customers as they define a workload’s requirements. Technical experts help the stakeholders assign realistic numerical values that correlate to those requirements. As they share knowledge, technical experts allow for negotiation and mutual consensus about realistic SLOs.

Recently, I was iterating on solution diagrams to see the SLA impact of specific configurations. While doing so, I felt the need for a tool that calculates the SLA quickly from a diagram and shows the effect of certain choices on it. To address this need, I’ve developed an application that uses Azure OpenAI and other technologies to calculate SLAs directly from architecture diagrams.

About the Tool

Key Features

The tool I built offers several features:

Image-Based Input: Users can simply paste an existing architecture diagram onto a canvas within the app. The application then processes this image to identify and list individual components.
Component-wise SLA Calculation: For each identified component, the app calculates the SLA based on predefined criteria and Azure services’ SLAs.
Composite SLA Calculation: The application aggregates SLAs for groups of resources or components to provide a composite SLA, reflecting the reliability of interconnected services.
Total SLA Calculation: It also calculates the total SLA for the entire application, offering a holistic view of its reliability.
Multi-Region Analysis: Users can extend their architecture across multiple regions in real time and observe how these changes impact the overall SLA.
Real-time SLA Impact: Any adjustments in the architecture diagram or region selection instantly update the SLA calculations, providing immediate feedback.
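
To make the multi-region effect concrete, here is a minimal sketch of the arithmetic commonly used for redundant deployments. It assumes regions fail independently and that any one healthy region can serve all traffic; it is not taken from the tool's source code. The function names and the SLA figures are hypothetical and only illustrate the formula.

```python
def multi_region_sla(single_region_sla: float, regions: int) -> float:
    """Availability of N redundant regions, assuming independent failures.

    The deployment is down only when every region is down at once,
    so availability = 1 - (1 - A) ** N.
    """
    if not 0.0 <= single_region_sla <= 1.0:
        raise ValueError("SLA must be a fraction between 0 and 1")
    if regions < 1:
        raise ValueError("at least one region is required")
    return 1.0 - (1.0 - single_region_sla) ** regions


def with_global_router(deployment_sla: float, router_sla: float) -> float:
    """Account for the global load balancer that sits in front of the regions.

    The router is a serial dependency, so its SLA multiplies the result.
    """
    return deployment_sla * router_sla


if __name__ == "__main__":
    regional = 0.9994  # illustrative per-region composite SLA, not an official figure
    router = 0.9999    # illustrative SLA for a global traffic router
    for n in (1, 2, 3):
        total = with_global_router(multi_region_sla(regional, n), router)
        print(f"{n} region(s): {total * 100:.4f}%")
```

Note that once a second region is added, the router in front of the regions tends to become the limiting factor, which is why adding a third region changes the total very little in this sketch.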

How It Works

The application uses Azure OpenAI, specifically the GPT-4o model, which handles a range of generative AI tasks, including image processing and response generation. Here’s a breakdown of the core functionalities:

Image Processing: The GPT-4o model analyzes the pasted architecture diagram, identifying and categorizing components.
SLA Association: Through Retrieval-Augmented Generation (RAG), the application associates each component with its respective SLA based on Azure’s extensive service offerings.
Comprehensive SLA Calculation: By combining individual SLAs, the app provides a composite SLA for grouped resources and a total SLA for the entire application.
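
The last step above can be sketched as follows. This is a minimal illustration of the usual composite SLA arithmetic for components that all must be available (serial dependencies), not the tool's actual implementation. The group names, helper names, and SLA values are hypothetical placeholders; real values come from each service's published SLA.

```python
from math import prod


def composite_sla(slas: list[float]) -> float:
    """Composite SLA of serial dependencies: the product of their SLAs."""
    return prod(slas)


def monthly_downtime_minutes(sla: float, days: int = 30) -> float:
    """Downtime budget implied by an SLA over a month of `days` days."""
    return (1.0 - sla) * days * 24 * 60


if __name__ == "__main__":
    # Hypothetical component groups with illustrative SLA fractions.
    groups = {
        "web tier": [0.9995],
        "data tier": [0.9999, 0.999],
    }

    # Composite SLA per group of resources.
    per_group = {name: composite_sla(values) for name, values in groups.items()}
    for name, value in per_group.items():
        print(f"{name}: {value * 100:.4f}%")

    # Total SLA for the whole application: every group must be up.
    total = composite_sla(list(per_group.values()))
    print(f"total: {total * 100:.4f}%")
    print(f"downtime budget: {monthly_downtime_minutes(total):.1f} min/month")
```

Because the values multiply, the total is always lower than the weakest component's SLA, which is why every additional serial dependency in a diagram reduces the result.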

Accessing the Application

The application is publicly accessible at https://sla.octo-lamp.nl/. However, due to the costs associated with Azure OpenAI usage, users must sign in using a Gmail or custom account to access the features. This measure prevents anonymous users from exploiting the OpenAI endpoint and keeps usage of the service sustainable.

Source Code Availability

For those interested in the technical aspects or looking to contribute, the source code of the application is available on GitHub: https://github.com/MoimHossain/azure-sla. This repository includes detailed documentation on setting up and running the application, as well as insights into its architectural design.

Conclusion

Developing this application has been a fun ride, blending advanced AI capabilities with practical cloud computing needs. By simplifying SLA calculations and providing real-time insights, this tool aims to help cloud architects and developers design more reliable and resilient applications. I invite you to explore the application, provide feedback, and contribute to its ongoing development.

Feel free to reach out with any questions or suggestions. Your input is invaluable in refining this tool and making it even more beneficial for the cloud computing community.

Published by Moim Hossain
