Last updated: May 2026

Azure VM Scale Sets

What happens when your web app suddenly gets 10x the normal traffic? With individual VMs, you'd have to manually create more VMs, configure them, add them to the load balancer, and hope you do it fast enough. VM Scale Sets (VMSS) automate all of this — they automatically add or remove VMs based on real-time demand. This is how modern elastic applications handle variable load.

What you'll learn: What VM Scale Sets are · Orchestration modes (Uniform vs Flexible) · Auto-scaling policies (metric-based, schedule-based, predictive) · Scale-in and scale-out · Scaling limits · Zone-spanning scale sets · Creating a scale set · Real-world scaling scenarios

What Are VM Scale Sets?

A VM Scale Set is a group of identical VMs managed together as a single unit. You define the VM configuration once (image, size, OS, software), and the Scale Set creates and manages multiple copies. The number of copies scales automatically based on rules you define.

ℹ️ Key Benefit: Instead of manually managing 10 individual VMs, you manage one Scale Set. Azure handles creating, deleting, updating, and load balancing the VMs automatically. With autoscale rules in place (for example, scale out at 80% CPU and scale in at 20%), Azure adds VMs when load rises and removes them when it falls, so you stop paying for capacity you no longer need.

What Scale Sets Give You

  • Elasticity: Scale automatically up to 1,000 VMs per scale set
  • High Availability: Spreads VMs across fault domains and, when configured, Availability Zones
  • Consistent Configuration: VMs are created from one shared definition (same image, same software, same config)
  • Integrated Load Balancing: Works natively with Azure Load Balancer and Application Gateway
  • Rolling Updates: Update VMs in batches so the remaining instances keep serving traffic

Orchestration Modes

Azure VMSS offers two orchestration modes:

| | Uniform Mode | Flexible Mode |
|---|---|---|
| VM identity | All VMs are identical instances | Each VM is a full standalone VM |
| Management | Managed as a group | Can manage individual VMs independently |
| Max instances | 1,000 | 1,000 |
| Mixed VM sizes | No (all same size) | Yes (different sizes allowed) |
| Best for | Stateless web tiers, microservices | Stateful workloads, mixed workloads |

💡 Which to Use? For most web and application tiers, use Flexible mode. It's the newer, more powerful option. Uniform mode is the original mode, still used for large homogeneous workloads.

Scaling Policies

There are three ways to configure when and how your Scale Set scales:

1. Metric-Based Scaling (Most Common)

Scale based on real-time metrics — CPU usage, memory, queue length, custom metrics.

| Rule Example | Action |
|---|---|
| CPU > 75% for 5 minutes | Add 2 VMs (scale out) |
| CPU < 25% for 10 minutes | Remove 1 VM (scale in) |
| Queue length > 100 messages | Add 3 VMs |
| Memory > 80% | Add 1 VM |
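
To make the rules concrete, here is a minimal Python sketch of how a rule such as "CPU > 75% for 5 minutes" can be evaluated. It is an illustration only, not Azure's autoscale engine: the `Rule` class, the `evaluate` function, and the one-sample-per-minute assumption are hypothetical simplifications introduced for this example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Rule:
    threshold: float  # CPU percentage to compare against
    above: bool       # True: fire when average > threshold; False: when average < threshold
    window: int       # number of one-minute samples to average (assumed sampling rate)
    change: int       # instances to add (positive) or remove (negative)


def evaluate(rule: Rule, cpu_samples: list[float]) -> int:
    """Return the instance change if the rule fires, otherwise 0."""
    if len(cpu_samples) < rule.window:
        return 0  # not enough data to cover the whole window
    avg = mean(cpu_samples[-rule.window:])
    fired = avg > rule.threshold if rule.above else avg < rule.threshold
    return rule.change if fired else 0


# The first two rules from the examples above
scale_out = Rule(threshold=75, above=True, window=5, change=+2)
scale_in = Rule(threshold=25, above=False, window=10, change=-1)

sustained_load = [70, 78, 82, 85, 90]   # average is 81
single_spike = [60, 60, 60, 60, 99]     # average is 67.8

print(evaluate(scale_out, sustained_load))  # 2
print(evaluate(scale_out, single_spike))    # 0
```

Because the rule compares the average over the whole window, a single one-minute spike does not trigger a scale-out on its own.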

2. Schedule-Based Scaling

Scale based on a fixed schedule — useful for predictable patterns.

  • Scale to 20 VMs every weekday at 8 AM (business hours peak)
  • Scale in to 2 VMs every weekday at 8 PM (after hours)
  • Scale to 50 VMs on the first day of each month (batch processing day)
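
The schedule above can be read as a lookup from "current time" to "target instance count". The sketch below shows that reading in Python. The function name `scheduled_count` is hypothetical, and two details are assumptions the list does not state: the first-of-month rule takes precedence over the weekday rules, and weekends stay at 2 VMs.

```python
from datetime import datetime


def scheduled_count(now: datetime) -> int:
    """Target instance count for the example schedule (local time assumed)."""
    if now.day == 1:
        return 50  # first day of the month: batch processing day (assumed to win over other rules)
    is_weekday = now.weekday() < 5  # Monday is 0, Friday is 4
    if is_weekday and 8 <= now.hour < 20:
        return 20  # weekday business hours, 8 AM to 8 PM
    return 2       # weekday evenings and (assumed) weekends


print(scheduled_count(datetime(2026, 5, 5, 9, 0)))   # Tuesday morning -> 20
print(scheduled_count(datetime(2026, 5, 5, 21, 0)))  # Tuesday evening -> 2
```

In Azure itself, schedules like these are expressed as autoscale profiles rather than code; the sketch only illustrates the decision logic.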

3. Predictive Scaling

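Uses machine learning to forecast load from historical patterns and scales out before the demand arrives, rather than reacting after CPU has already spiked. It forecasts on the Percentage CPU metric and needs at least 7 days of history before it can make predictions. Available on Flexible mode scale sets.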

Scale-Out vs Scale-In

| Action | Meaning | When |
|---|---|---|
| Scale Out | Add more VMs (horizontal scaling) | Load increases |
| Scale In | Remove VMs (horizontal scaling) | Load decreases |
| Scale Up | Increase VM size (vertical scaling) | Need more power per VM |
| Scale Down | Decrease VM size (vertical scaling) | Over-provisioned |

ℹ️ Scale Sets Do Horizontal Scaling: VM Scale Sets handle scale-out and scale-in (adding/removing VMs). Vertical scaling (changing VM size) is not automated; it requires manual resizing and a VM restart.

Scale-In Policy

When scaling in, Azure needs to decide which VMs to remove. For zonal scale sets, every policy first tries to keep instances balanced across Availability Zones, then applies its own ordering. You can configure:

  • Default: Balance across zones and fault domains, then remove the VM with the highest instance ID
  • OldestVM: Remove the oldest VM first (good for keeping fresh instances)
  • NewestVM: Remove the newest VM first
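
The following Python sketch shows how the three policies can pick different VMs from the same scale set. It is a simplified model, not Azure's implementation: the `Instance` class and `pick_for_removal` function are hypothetical, fault domains are ignored, and it assumes zone balancing is applied before each policy's own ordering.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    instance_id: int
    zone: int
    created: int  # creation time as a simple increasing number


def pick_for_removal(instances: list[Instance], policy: str = "Default") -> Instance:
    """Choose one VM to remove under the given scale-in policy."""
    # Step 1 (assumed for all policies): only consider the most populated zone(s)
    per_zone = Counter(vm.zone for vm in instances)
    fullest = max(per_zone.values())
    candidates = [vm for vm in instances if per_zone[vm.zone] == fullest]

    # Step 2: apply the policy's own ordering
    if policy == "OldestVM":
        return min(candidates, key=lambda vm: vm.created)
    if policy == "NewestVM":
        return max(candidates, key=lambda vm: vm.created)
    return max(candidates, key=lambda vm: vm.instance_id)  # Default


vms = [
    Instance(instance_id=0, zone=1, created=100),
    Instance(instance_id=1, zone=2, created=110),
    Instance(instance_id=2, zone=1, created=120),
    Instance(instance_id=3, zone=3, created=130),
]

print(pick_for_removal(vms, "Default").instance_id)   # 2
print(pick_for_removal(vms, "OldestVM").instance_id)  # 0
print(pick_for_removal(vms, "NewestVM").instance_id)  # 2
```

In this model, instance 3 has the highest ID but is not removed under the Default policy, because zone 1 holds two VMs and is trimmed first.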

Scaling Limits

| Setting | Description | Example |
|---|---|---|
| Minimum instances | Always keep at least this many VMs running | 2 (never scale below 2) |
| Maximum instances | Never exceed this many VMs | 50 (cap at 50 to control costs) |
| Default instances | Start with this many if no metric data | 3 |
| Cooldown period | Wait this long between scaling actions | 5 minutes (avoid rapid flapping) |

⚠️ Always Set a Maximum! Without a maximum instance limit, a misconfigured scaling rule could spin up hundreds of VMs and generate a massive unexpected bill. Always set a sensible maximum based on your budget and expected load.
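
As a rough illustration of how these limits interact, the sketch below applies a requested change, then clamps the result to the minimum and maximum, and ignores the request entirely while the cooldown is still running. The function `next_count` and its parameters are hypothetical; the numbers come from the example column above.

```python
def next_count(current: int, change: int, minimum: int, maximum: int,
               seconds_since_last_action: int, cooldown_seconds: int = 300) -> int:
    """Return the instance count after applying a scaling request."""
    if seconds_since_last_action < cooldown_seconds:
        return current  # still cooling down: no scaling action is taken
    # Clamp the requested count into the [minimum, maximum] range
    return max(minimum, min(maximum, current + change))


print(next_count(49, +5, minimum=2, maximum=50, seconds_since_last_action=600))  # 50
print(next_count(2, -1, minimum=2, maximum=50, seconds_since_last_action=600))   # 2
print(next_count(10, +2, minimum=2, maximum=50, seconds_since_last_action=60))   # 10
```

The first call shows the cost cap at work: the rule asked for 54 VMs, but the maximum holds the scale set at 50.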

Zone-Spanning Scale Sets

VM Scale Sets can span multiple Availability Zones — giving you both elasticity and zone-level high availability in one service. Azure automatically distributes VM instances across zones as it scales out.

Azure CLI: Create a zone-spanning Scale Set
```bash
az vmss create \
  --resource-group myResourceGroup \
  --name myScaleSet \
  --image Ubuntu2204 \
  --vm-sku Standard_B2s \
  --instance-count 3 \
  --zones 1 2 3 \
  --admin-username azureuser \
  --generate-ssh-keys \
  --orchestration-mode Flexible \
  --load-balancer myLoadBalancer
```
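
To picture what an even spread across three zones looks like as the instance count grows, here is a small Python sketch. It uses simple round-robin placement, which is an assumption made for illustration; Azure's actual placement is best-effort and also depends on zone capacity. The function name `spread_across_zones` is hypothetical.

```python
def spread_across_zones(instance_count: int, zones: tuple[int, ...] = (1, 2, 3)) -> dict[int, int]:
    """Assign instances to zones round-robin and return the count per zone."""
    placement = {zone: 0 for zone in zones}
    for i in range(instance_count):
        placement[zones[i % len(zones)]] += 1
    return placement


print(spread_across_zones(3))  # {1: 1, 2: 1, 3: 1}
print(spread_across_zones(7))  # {1: 3, 2: 2, 3: 2}
```

With 7 instances, losing any single zone still leaves at least 4 VMs running, which is the practical benefit of spanning zones.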

Creating a VM Scale Set

Azure CLI: Create a Scale Set with auto-scaling
```bash
# Create the Scale Set
az vmss create \
  --resource-group myResourceGroup \
  --name myScaleSet \
  --image Ubuntu2204 \
  --vm-sku Standard_B2s \
  --instance-count 2 \
  --admin-username azureuser \
  --generate-ssh-keys

# Create the autoscale setting: minimum 2, maximum 10, default 2 instances
az monitor autoscale create \
  --resource-group myResourceGroup \
  --resource myScaleSet \
  --resource-type Microsoft.Compute/virtualMachineScaleSets \
  --name autoscale-vmss \
  --min-count 2 \
  --max-count 10 \
  --count 2

# Add scale-out rule: add 2 VMs when average CPU > 70% over 5 minutes
az monitor autoscale rule create \
  --resource-group myResourceGroup \
  --autoscale-name autoscale-vmss \
  --condition "Percentage CPU > 70 avg 5m" \
  --scale out 2

# Add scale-in rule: remove 1 VM when average CPU < 25% over 10 minutes
az monitor autoscale rule create \
  --resource-group myResourceGroup \
  --autoscale-name autoscale-vmss \
  --condition "Percentage CPU < 25 avg 10m" \
  --scale in 1
```

Real-World Scaling Scenarios

E-Commerce Sale Event

An online store normally runs 5 VMs but expects 20x traffic during a sale. Configure:

  • Schedule: Scale to 30 VMs 30 minutes before the sale starts
  • Metric: If CPU exceeds 80%, add 5 more VMs (up to max 100)
  • Schedule: Scale back to 5 VMs 2 hours after the sale ends
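
A quick back-of-the-envelope check shows where the maximum of 100 comes from and why pre-scaling matters. The sketch assumes capacity scales linearly with VM count, which is a simplification; the helper names are hypothetical.

```python
import math


def vms_needed(baseline_vms: int, traffic_multiplier: float) -> int:
    """Estimate VMs required, assuming load per VM stays constant."""
    return math.ceil(baseline_vms * traffic_multiplier)


def scale_outs_required(start: int, target: int, step: int) -> int:
    """Number of scale-out actions needed to go from start to target."""
    return max(0, math.ceil((target - start) / step))


peak = vms_needed(baseline_vms=5, traffic_multiplier=20)
actions = scale_outs_required(start=30, target=peak, step=5)

print(peak)     # 100
print(actions)  # 14
```

Going from the pre-scaled 30 VMs to the full 100 takes 14 separate scale-out actions of 5 VMs each. With a cooldown between actions, that can take on the order of an hour, so the scheduled pre-scale is what absorbs the initial surge.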

Business Hours Application

An internal HR application used only during office hours:

  • Schedule: Scale to 10 VMs at 8 AM weekdays
  • Schedule: Scale to 2 VMs at 7 PM weekdays
  • Schedule: Scale to 2 VMs all day Saturday and Sunday
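
To see what this schedule saves, the sketch below counts VM-hours per week and compares them with running 10 VMs around the clock. It counts hours only, not prices, and the function name `weekly_vm_hours` is hypothetical.

```python
def weekly_vm_hours(peak_vms: int = 10, off_vms: int = 2,
                    peak_start: int = 8, peak_end: int = 19) -> int:
    """VM-hours per week for the business-hours schedule above."""
    peak_hours = peak_end - peak_start  # 8 AM to 7 PM is 11 hours
    weekday = peak_hours * peak_vms + (24 - peak_hours) * off_vms
    weekend_day = 24 * off_vms
    return 5 * weekday + 2 * weekend_day


scheduled = weekly_vm_hours()
always_on = 10 * 24 * 7

print(scheduled)   # 776
print(always_on)   # 1680
```

Under these assumptions the schedule uses 776 VM-hours a week instead of 1,680, a reduction of a little over half.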

💡 AZ-104 Exam Tip: Know the difference between scale-out (more VMs) and scale-up (bigger VMs). Know that Scale Sets support metric-based, schedule-based, and predictive scaling. Know that zone-spanning Scale Sets provide both elasticity and high availability.

📝 Practice Questions
AZ-104 style questions.
Q1. What is the primary purpose of Azure VM Scale Sets?
A To run multiple different workloads on a single VM
B To automatically scale the number of identical VMs based on demand
C To automatically resize VMs to larger sizes when load increases
D To create automatic backups of virtual machines
Q2. What is "scale-out" in the context of VM Scale Sets?
A Increasing the size (vCPU/RAM) of existing VMs
B Adding more VM instances to handle increased load
C Removing VM instances when load decreases
D Moving VMs to a different Azure region
Q3. Why should you always configure a maximum instance count on a VM Scale Set?
A To improve VM performance
B To prevent runaway scaling that could generate unexpected huge costs
C Because Azure Load Balancer requires it
D To enable Availability Zone spanning
Q4. An application has predictable high traffic every weekday between 9 AM and 6 PM. Which scaling policy is most appropriate?
A Metric-based scaling on CPU utilisation
B Schedule-based scaling to scale out at 9 AM and scale in at 6 PM weekdays
C Manual scaling by updating the instance count twice a day
D Predictive scaling based on historical data
Q5. What is the maximum number of VM instances a Scale Set can contain?
A 100
B 500
C 1,000
D Unlimited
Disclaimer: RedKite Cloud is an independent educational resource and is not affiliated with, endorsed by, or officially connected to Microsoft Corporation. All product names, logos, and trademarks are property of their respective owners. Content is written independently for educational purposes only.