Azure.Pillar.Reliability

v1.35.0

Microsoft Azure Well-Architected Framework - Reliability pillar specific baseline.

Rules

The following rules are included within the Azure.Pillar.Reliability baseline.

This baseline includes a total of 96 rules.

Name Synopsis Severity Maturity
Azure.ACR.GeoReplica Applications or infrastructure relying on a container image may fail if the registry is not available at the time they start. Important -
Azure.ACR.MinSku The Basic SKU provides limited performance and features for production container registry workloads. Important -
Azure.ADX.SLA Use SKUs that include an SLA when configuring Azure Data Explorer (ADX) clusters. Important -
Azure.AKS.AvailabilityZone AKS clusters deployed with virtual machine scale sets should use availability zones in supported regions for high availability. Important -
Azure.AKS.CNISubnetSize AKS clusters using Azure CNI should use large subnets to reduce IP exhaustion issues. Important -
Azure.AKS.MaintenanceWindow Configure customer-controlled maintenance windows for AKS clusters. Important -
Azure.AKS.MinNodeCount AKS clusters should have a minimum number of system nodes for failover and updates. Important -
Azure.AKS.MinUserPoolNodes User node pools in an AKS cluster should have a minimum number of nodes for failover and updates. Important -
Azure.AKS.PoolVersion AKS node pools should match Kubernetes control plane version. Important -
Azure.AKS.UptimeSLA AKS clusters should have Uptime SLA enabled for a financially backed SLA. Important -
Azure.AKS.Version Older versions of Kubernetes may have known bugs or security vulnerabilities, and may have limited support. Important -
Azure.APIM.AvailabilityZone API Management instances should use availability zones in supported regions for high availability. Important -
Azure.APIM.CertificateExpiry Renew certificates used for custom domain bindings. Important -
Azure.APIM.MultiRegion Enhance service availability and resilience by deploying API Management instances across multiple regions. Important -
Azure.APIM.MultiRegionGateway API Management instances should have multi-region deployment gateways enabled. Important -
Azure.AppConfig.GeoReplica Replicate app configuration store across all points of presence for an application. Important -
Azure.AppConfig.PurgeProtect Consider purge protection for app configuration store to ensure store cannot be purged in the retention period. Important -
Azure.AppConfig.SKU App Configuration should use a minimum size of Standard. Important -
Azure.AppGw.AvailabilityZone Application Gateway (App Gateway) should use availability zones in supported regions for improved resiliency. Important -
Azure.AppGw.MigrateWAFPolicy Migrate to Application Gateway WAF policy. Critical -
Azure.AppGw.MinInstance Application Gateways should use a minimum of two instances. Important -
Azure.AppService.AlwaysOn Configure Always On for App Service apps. Important -
Azure.AppService.AvailabilityZone Deploy app service plan instances using availability zones in supported regions to ensure high availability and resilience. Important -
Azure.AppService.PlanInstanceCount App Service Plan should use a minimum number of instances for failover. Important -
Azure.AppService.WebProbe Configure and enable instance health probes. Important -
Azure.AppService.WebProbePath Configure a dedicated path for health probe requests. Important -
Azure.ASE.AvailabilityZone Deploy app service environments using availability zones in supported regions to ensure high availability and resilience. Important -
Azure.AVD.ScheduleAgentUpdate Define a window for agent updates to minimize disruptions to users. Important -
Azure.ContainerApp.AvailabilityZone Use Container Apps environments that are zone redundant to improve reliability. Important -
Azure.ContainerApp.MinReplicas Use multiple replicas to remove a single point of failure. Important -
Azure.ContainerApp.Storage Use Azure Files volume mounts to persist container data to storage. Awareness -
Azure.Cosmos.ContinuousBackup Enable continuous backup on Cosmos DB accounts. Important -
Azure.Cosmos.SLA Use a paid tier to qualify for a Service Level Agreement (SLA). Important -
Azure.DataFactory.Version Consider migrating to DataFactory v2. Awareness -
Azure.EntraDS.MinReplicas Applications or infrastructure relying on a managed domain may fail if the domain is not available. Important -
Azure.EntraDS.SKU The default SKU for Microsoft Entra Domain Services supports resiliency in a single region. Important -
Azure.Firewall.AvailabilityZone Deploy firewall instances using availability zones in supported regions to ensure high availability and resilience. Important -
Azure.FrontDoor.Probe Use health probes to check the health of each backend. Important -
Azure.FrontDoor.ProbeMethod Configure health probes to use HEAD requests to reduce performance overhead. Important -
Azure.FrontDoor.ProbePath Configure a dedicated path for health probe requests. Important -
Azure.Grafana.Version Grafana workspaces should be on Grafana version 10. Important -
Azure.KeyVault.PurgeProtect Enable Purge Protection on Key Vaults to prevent early purge of vaults and vault items. Important -
Azure.KeyVault.SoftDelete Enable Soft Delete on Key Vaults to protect vaults and vault items from accidental deletion. Important -
Azure.LB.AvailabilityZone Load balancers deployed with Standard SKU should be zone-redundant for high availability. Important -
Azure.LB.Probe Use a specific probe for web protocols. Important -
Azure.LB.StandardSKU Load balancers should be deployed with Standard SKU for production workloads. Important -
Azure.Log.Replication Log Analytics workspaces should have workspace replication enabled to improve service availability. Important -
Azure.MariaDB.GeoRedundantBackup Azure Database for MariaDB should store backups in geo-redundant storage. Important -
Azure.Monitor.ServiceHealth Configure Service Health alerts to notify administrators. Important -
Azure.MySQL.GeoRedundantBackup Azure Database for MySQL should store backups in geo-redundant storage. Important -
Azure.MySQL.MaintenanceWindow Configure a customer-controlled maintenance window for Azure Database for MySQL servers. Important -
Azure.MySQL.UseFlexible Use Azure Database for MySQL Flexible Server deployment model. Important -
Azure.MySQL.ZoneRedundantHA Deploy Azure Database for MySQL servers using zone-redundant high availability (HA) in supported regions to ensure high availability and resilience. Important -
Azure.NIC.UniqueDns Network interfaces (NICs) should inherit DNS from virtual networks. Awareness -
Azure.NSG.DenyAllInbound When all inbound traffic is denied, some functions that affect the reliability of your service may not work as expected. Important -
Azure.PostgreSQL.GeoRedundantBackup Azure Database for PostgreSQL should store backups in geo-redundant storage. Important -
Azure.PostgreSQL.MaintenanceWindow Configure a customer-controlled maintenance window for Azure Database for PostgreSQL servers. Important -
Azure.PostgreSQL.ZoneRedundantHA Deploy Azure Database for PostgreSQL servers using zone-redundant high availability (HA) in supported regions to ensure high availability and resilience. Important -
Azure.PublicIP.AvailabilityZone Public IP addresses deployed with Standard SKU should use availability zones in supported regions for high availability. Important -
Azure.PublicIP.StandardSKU The basic SKU is being retired on 30 September 2025, and does not include several reliability and security features. Important -
Azure.Redis.AvailabilityZone Premium Redis cache should be deployed with availability zones for high availability. Important -
Azure.Redis.Version Azure Cache for Redis should use the latest supported version of Redis. Important -
Azure.RedisEnterprise.Zones Enterprise Redis cache should be zone-redundant for high availability. Important -
Azure.RSV.ReplicationAlert Recovery Services Vaults (RSV) without replication alerts configured may be at risk. Important -
Azure.RSV.StorageType Recovery Services Vaults (RSV) not using geo-replicated storage (GRS) may be at risk. Important -
Azure.Search.IndexSLA Use a minimum of 3 replicas to receive an SLA for query and index updates. Important -
Azure.Search.QuerySLA Use a minimum of 2 replicas to receive an SLA for index queries. Important -
Azure.SignalR.SLA Use SKUs that include an SLA when configuring SignalR Services. Important -
Azure.SQL.MaintenanceWindow Configure a customer-controlled maintenance window for Azure SQL databases. Important -
Azure.SQLMI.MaintenanceWindow Configure a customer-controlled maintenance window for Azure SQL Managed Instances. Important -
Azure.Storage.ContainerSoftDelete Enable container soft delete on Storage Accounts. Important -
Azure.Storage.FileShareSoftDelete Enable soft delete on Storage Accounts file shares. Important -
Azure.Storage.SoftDelete Enable blob soft delete on Storage Accounts. Important -
Azure.Storage.UseReplication Storage Accounts using the LRS SKU are only replicated within a single zone. Important -
Azure.Template.LocationDefault Set the default value for the location parameter within an ARM template to resource group location. Awareness -
Azure.TrafficManager.Endpoints Traffic Manager should use at least two enabled endpoints. Important -
Azure.VM.ASAlignment Use availability sets aligned with managed disks fault domains. Important -
Azure.VM.ASDistributeTraffic Ensure high availability by distributing traffic among members in an availability set. Important -
Azure.VM.ASMinMembers Availability sets should be deployed with at least two virtual machines (VMs). Important -
Azure.VM.BasicSku Virtual machines (VMs) should not use Basic sizes. Important -
Azure.VM.MaintenanceConfig Use a maintenance configuration for virtual machines. Important -
Azure.VM.Standalone Single instance VMs are a single point of failure; however, reliability can be improved by using premium storage. Important -
Azure.VMSS.AutoInstanceRepairs Applications or infrastructure relying on a virtual machine scale set may fail if VM instances are unhealthy. Important -
Azure.VMSS.AvailabilityZone Deploy virtual machine scale set instances using availability zones in supported regions to ensure high availability and resilience. Important -
Azure.VMSS.ZoneBalance Deploy virtual machine scale set instances using the best-effort zone balance in supported regions. Important -
Azure.VNET.BastionSubnet VNETs with a GatewaySubnet should have an AzureBastionSubnet to allow for out of band remote access to VMs. Important -
Azure.VNET.FirewallSubnetNAT Zonal-deployed Azure Firewalls should consider using an Azure NAT Gateway for outbound access. Awareness -
Azure.VNET.LocalDNS Virtual networks (VNETs) should use DNS servers deployed within the same Azure region. Important -
Azure.VNET.SingleDNS Virtual networks (VNETs) should have at least two DNS servers assigned. Important -
Azure.VNG.ERAvailabilityZoneSKU Use availability zone SKU for virtual network gateways deployed with ExpressRoute gateway type. Important -
Azure.VNG.ERLegacySKU Migrate from legacy SKUs to improve reliability and performance of ExpressRoute (ER) gateways. Critical -
Azure.VNG.MaintenanceConfig Use a customer-controlled maintenance configuration for virtual network gateways. Important -
Azure.VNG.VPNActiveActive Use VPN gateways configured to operate in an Active-Active configuration to reduce connectivity downtime. Important -
Azure.VNG.VPNAvailabilityZoneSKU Use availability zone SKU for virtual network gateways deployed with VPN gateway type. Important -
Azure.VNG.VPNLegacySKU Migrate from legacy SKUs to improve reliability and performance of VPN gateways. Critical -
Azure.WebPubSub.SLA Use SKUs that include an SLA when configuring Web PubSub Services. Important -
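
Each row in the table above follows the layout Name, Synopsis, Severity, Maturity. As a rough illustration of how such rows can be split and tallied by severity, here is a minimal Python sketch. The `Rule` type, the `parse_row` helper, and the `SAMPLE_ROWS` list are hypothetical names introduced only for this example (they are not part of PSRule), and the sketch assumes the name contains no spaces and that severity and maturity are each a single token at the end of the line.

```python
from collections import Counter
from typing import NamedTuple


class Rule(NamedTuple):
    """One row of the baseline table (hypothetical helper type)."""
    name: str
    synopsis: str
    severity: str
    maturity: str


def parse_row(line: str) -> Rule:
    """Split a flattened row: first token is the rule name,
    the last two tokens are severity and maturity,
    and everything in between is the synopsis."""
    name, rest = line.split(" ", 1)
    synopsis, severity, maturity = rest.rsplit(" ", 2)
    return Rule(name, synopsis, severity, maturity)


# A few rows copied from the table above, used as sample input.
SAMPLE_ROWS = [
    "Azure.ACR.MinSku The Basic SKU provides limited performance and features for production container registry workloads. Important -",
    "Azure.AppGw.MigrateWAFPolicy Migrate to Application Gateway WAF policy. Critical -",
    "Azure.DataFactory.Version Consider migrating to DataFactory v2. Awareness -",
    "Azure.VNG.VPNLegacySKU Migrate from legacy SKUs to improve reliability and performance of VPN gateways. Critical -",
]

rules = [parse_row(row) for row in SAMPLE_ROWS]

# Count how many sample rules fall under each severity level.
severity_counts = Counter(rule.severity for rule in rules)

for severity, count in severity_counts.most_common():
    print(f"{severity}: {count}")
```

Applied to all 96 rows of the table, the same tally gives 88 Important, 5 Awareness, and 3 Critical rules.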