Flexible and Simple Solution to Start and Stop VMs

This post has been republished via RSS; it originally appeared at: Core Infrastructure and Security Blog articles.

Hi folks! My name is Felipe Binotto, Cloud Solution Architect, based in Australia.

This post is about a solution I recently deployed for a customer to start and stop VMs on a schedule. You may be asking yourself why we need another solution if two official solutions are already available for the same purpose. The answer is straightforward – the solution I propose is simple and flexible and, in my opinion, the existing solutions are not.

The two official solutions still do a great job and may be a better fit for your environment. You can check them in the following links:

The solution I propose is centralized for management and operations teams (admins) but decentralized for the consumers (users). The following are some of the core benefits:

  • Flexibility – You choose the VMs you want to start/stop
  • Simplicity – The only action required is to tag the VMs to onboard with the service
  • Multi schedule – Many options for scheduling
  • Integration – Integrate other solutions to fit your requirements
  • Low cost – Runbooks are cheap, and the cost will be much smaller than the savings
  • Parallelism – VMs can be started/stopped in parallel

I will not cover all the code in this post; I will just highlight what I deem important. You can download the full runbook from my GitHub.

Pre-Requisites

The following are the prerequisites which I will not cover in this post, and you should already have them in place before you start:

  • Azure Subscription
  • Automation Account
  • Virtual Machines

High-Level Steps

The following are the high-level steps on what we will do and the order we will do it:

  • Clone repo which contains the runbook
  • Create a user assigned identity
  • Provide the identity with the required access
  • Customize runbook
  • Import runbook to Automation Account
  • Tag VM
  • Validate solution

Tags

Before we get to the nitty-gritty of the solution, let’s look at the tags. The following are the tags, whether they are required, and their use.

  • vm-start-stop-enable (Required) – this is the main tag. If a VM doesn’t have this tag or the value is anything other than ‘true’, it is not even considered in the main logic of the runbook
  • vm-start-stop-schedule (Required) – this is the tag where you specify the schedule when the VM should be started or stopped. The following are the possible formats for this tag. All times are UTC

 

mon=S12:00-E13:00 – this means the VM will Start every Monday at 12:00 and End (Stop) at 13:00

mon=S12:00-E13:00|tue=S14:00-E18:00 – this means the VM will Start every Monday at 12:00 and End (Stop) at 13:00 and Start every Tuesday at 14:00 and End (Stop) at 18:00

mon,tue,wed,thu,fri=S12:00-E13:00|sat,sun=S10:00-E15:00 – this means the VM will Start every weekday (Monday to Friday) at 12:00 and End (Stop) at 13:00, and Start every Saturday and Sunday at 10:00 and End (Stop) at 15:00

weekdays=S12:00-E13:00 – this means the VM will Start every weekday (Monday to Friday) at 12:00 and End (Stop) at 13:00

weekends=S12:00-E13:00 – this means the VM will Start every Saturday and Sunday at 12:00 and End (Stop) at 13:00

sun,mon=S12:00-E13:00 – this means the VM will Start every Sunday and Monday at 12:00 and End (Stop) at 13:00

mon=S12:00 – this means the VM will Start every Monday at 12:00 if not already started

mon=E12:00 – this means the VM will End (Stop) every Monday at 12:00 if not already stopped

 

  • vm-start-stop-sequence (Optional) – this is the tag where you specify the order for the VM to be started. When stopped, the sequence will be considered in reverse order. For example, let’s say you have a 3-tier application: 1 VM for Web, 1 for App and 1 for DB. You always want the DB to be the first to start and the last to stop; therefore, in this case, you would tag your VMs as follows: DB (1), App (2), Web (3).
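
To make the schedule formats above more concrete, here is a minimal sketch of how a vm-start-stop-schedule value could be expanded into one entry per day. The function name ConvertFrom-ScheduleTag and its output shape are my own illustration and are not part of the runbook, which does its parsing differently.

```powershell
# Hypothetical helper (not from the runbook): expands a schedule tag value
# into one object per day with optional Start and Stop times (UTC).
function ConvertFrom-ScheduleTag {
    param([Parameter(Mandatory)][string]$Value)

    # Aliases described in the tag formats above
    $aliases = @{
        weekdays = 'mon', 'tue', 'wed', 'thu', 'fri'
        weekends = 'sat', 'sun'
    }

    # Entries are separated by '|', days and times by '='
    foreach ($entry in $Value.Split('|')) {
        $days, $times = $entry.Split('=')
        $start = $null
        $stop = $null
        foreach ($t in $times.Split('-')) {
            if ($t.StartsWith('S')) { $start = $t.Substring(1) }
            if ($t.StartsWith('E')) { $stop = $t.Substring(1) }
        }
        foreach ($day in $days.Split(',')) {
            $expanded = if ($aliases.ContainsKey($day)) { $aliases[$day] } else { @($day) }
            foreach ($d in $expanded) {
                [pscustomobject]@{ Day = $d; Start = $start; Stop = $stop }
            }
        }
    }
}

# Example: five weekday entries plus a stop-only entry for Saturday
ConvertFrom-ScheduleTag -Value 'weekdays=S12:00-E13:00|sat=E15:00'
```

A stop-only or start-only entry simply leaves the other time empty, which matches the "if not already started/stopped" formats above.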

Getting Started

Clone the repo by running the following command:

git clone https://github.com/fbinotto/startstopvm.git

Create a new user-assigned managed identity.

$id = (New-AzUserAssignedIdentity -Name vmstartstop -ResourceGroupName REPLACE_WITH_YOUR_RG -Location australiaeast).principalId

 

Assign the identity the Virtual Machine Contributor role in your subscription(s) so it can start and stop your VMs.

New-AzRoleAssignment -ObjectId $id -RoleDefinitionName 'Virtual Machine Contributor' -Scope /subscriptions/<subscriptionId>

 

Import the runbook to your automation account. Make sure you run the next command from the folder which was cloned.

Import-AzAutomationRunbook -Path ".\startstopvm.ps1" -Name StartStopVM -Published:$true -ResourceGroupName REPLACE_WITH_YOUR_RG -AutomationAccountName REPLACE_WITH_YOUR_AA -Type PowerShellWorkflow

 

Open the script in VS Code, or edit it directly in your Automation Account. Now I will highlight some of the important sections, so you have a clear understanding of what is going on.

At the very top you can see we are using PowerShell workflow.

workflow startstopvm

 

Next, you will see how you can exclude subscriptions from the scope of the runbook.

$excludedSubs = @((Get-AutomationVariable -Name 'excludedSubs').split(","))

 

This is just to be on the safe side, because any VMs which don’t have the required tags will not be in scope anyway. You just need to create a variable in your Automation Account with the subscription IDs separated by a comma.
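
As a rough illustration of how the comma-separated variable turns into an exclusion list, here is a small sketch. The subscription IDs are made up, the literal string stands in for the value returned by Get-AutomationVariable, and the filtering step is just one way the list could be applied, not necessarily what the runbook does.

```powershell
# Stand-in for: Get-AutomationVariable -Name 'excludedSubs'
$excludedSubsVariable = '00000000-0000-0000-0000-000000000001,00000000-0000-0000-0000-000000000002'

# Same split as in the runbook: one array element per subscription ID
$excludedSubs = @($excludedSubsVariable.Split(','))

# Hypothetical list of subscription IDs the identity has access to
$allSubs = @(
    '00000000-0000-0000-0000-000000000001'
    '00000000-0000-0000-0000-000000000003'
)

# Keep only the subscriptions which are not excluded
$inScope = @($allSubs | Where-Object { $_ -notin $excludedSubs })
$inScope
```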

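The following is how we connect to Azure. Replace the placeholder in the following command with the ID of the user-assigned managed identity. Note that, for a user-assigned identity, the -AccountId parameter of Connect-AzAccount expects the identity's client ID (or its resource ID), not the principal ID we stored in $id earlier.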

$null = Connect-AzAccount -Identity -AccountId REPLACE_WITH_USER_MANAGED_IDENTITY_ID

 

Here is where it gets more interesting. We run an Azure Resource Graph query (which is super-fast) to retrieve all Virtual Machines which match the following criteria:

  • Has a tag vm-start-stop-enable with value true
  • And the day specified in the vm-start-stop-schedule matches the current day

$query = "resources | where type in~ ('microsoft.compute/virtualmachines') | where tags['vm-start-stop-enable'] == 'true' | where tags['vm-start-stop-schedule'] contains '$day' or (tags['vm-start-stop-schedule'] contains 'weekends' and 'sat-sun' contains '$day') or (tags['vm-start-stop-schedule'] contains 'weekdays' and 'mon-tue-wed-thu-fri' contains '$day')"

 

We do this to minimize the number of VMs which the query will return. We don’t want to be evaluating VMs which are not in scope. Therefore, this is the first thing we do. This also ensures we only start and stop VMs which have the required tags and don’t unintentionally start or stop VMs which shouldn’t be.
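
For context, here is a simplified sketch of how the $day value could be derived and dropped into a query. I am assuming $day is the lower-case, three-letter UTC day name (which is what Get-DayOfWeek is described as returning), and the query below is a shortened version of the real one. The Search-AzGraph call is commented out because it needs the Az.ResourceGraph module and an authenticated session.

```powershell
# Three-letter, lower-case day name in UTC, e.g. 'mon' (assumed format)
$day = (Get-Date).ToUniversalTime().DayOfWeek.ToString().Substring(0, 3).ToLowerInvariant()

# Simplified query: enabled VMs whose schedule tag mentions today
$query = "resources " +
    "| where type =~ 'microsoft.compute/virtualmachines' " +
    "| where tags['vm-start-stop-enable'] == 'true' " +
    "| where tags['vm-start-stop-schedule'] contains '$day'"

Write-Output $query

# With Az.ResourceGraph installed and after Connect-AzAccount:
# $vms = Search-AzGraph -Query $query -First 1000
```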

 

Next, we group the VMs per Resource Group. The reason here is to be able to start and stop in sequence. If the VMs are not grouped in RGs, we could have many VMs with the same order to be started or stopped. As per best practices, you should have all the VMs for an application in the same RG, because they share the same lifecycle.

$groupVMs = $vms | Group-Object resourceGroup
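
If you have not used Group-Object before, the following sketch shows what the grouping produces. The VM names and resource groups are invented, and the objects only mimic the shape of the Resource Graph results.

```powershell
# Hypothetical sample shaped like the query results
$vms = @(
    [pscustomobject]@{ name = 'web01';  resourceGroup = 'rg-app1' }
    [pscustomobject]@{ name = 'db01';   resourceGroup = 'rg-app1' }
    [pscustomobject]@{ name = 'jump01'; resourceGroup = 'rg-mgmt' }
)

$groupVMs = $vms | Group-Object resourceGroup

foreach ($group in $groupVMs) {
    # Name is the resource group; Group holds the VMs which belong to it
    "{0}: {1}" -f $group.Name, ($group.Group.name -join ', ')
}
```

Each element of $groupVMs is what the runbook later refers to as $group, and $group.Group is the list of VMs it iterates over.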

 

The next several lines of code are some functions which are used throughout the script. The following is their functionality:

  • Start-Validation: this function performs the following validations:
    • Ensures the vm-start-stop-sequence tag is populated and the value is an integer
    • Ensures the vm-start-stop-schedule tag exists, is not empty, contains no blank spaces and is in the required format

 

  • Get-DayOfWeek: returns the first three letters of the day of the week
  • Get-StartStopTag: returns the value of the vm-start-stop-schedule so we know if it is time to start/stop the VM or not
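
To give an idea of what a format check on the schedule tag could look like, here is a minimal sketch. Test-ScheduleTag and its regular expression are my own illustration based on the formats listed earlier; the actual Start-Validation function in the runbook may be implemented differently.

```powershell
# Hypothetical check (not the runbook's Start-Validation function)
function Test-ScheduleTag {
    param([string]$Value)

    # Must exist, must not be empty and must not contain blank spaces
    if ([string]::IsNullOrEmpty($Value) -or $Value -match '\s') { return $false }

    $day   = '(mon|tue|wed|thu|fri|sat|sun|weekdays|weekends)'
    $time  = '([01]\d|2[0-3]):[0-5]\d'
    # One or more days, then Start and End, Start only, or End only
    $entry = "^$day(,$day)*=(S$time(-E$time)?|E$time)$"

    foreach ($part in $Value.Split('|')) {
        # Case-sensitive match: days are lower case, S and E are upper case
        if ($part -cnotmatch $entry) { return $false }
    }
    return $true
}

Test-ScheduleTag -Value 'mon,tue=S12:00-E13:00|sat=E15:00'
Test-ScheduleTag -Value 'tue=S14:00-E18: 00'
```

The second call contains a blank space in the time, which is exactly the kind of typo this validation is meant to catch.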

From this point, the main logic of the script starts. Let’s break it down and understand what is going on. We will first cover the IF statement which is for VMs which will be started/stopped in parallel (not in sequence).

# Iterate through each group of VMs in the same RG in parallel
foreach -Parallel ($group in $groupVMs) {
  # If there is 1 or less VMs which have the sequence tag in the same RG then sequence is not required
  if(($group.group.tags -match "vm-start-stop-sequence").count -le 1){
    # Iterate through each VM in parallel
    foreach -Parallel ($vm in $group.Group){
      $valid = Start-Validation -vm $vm
      if($valid -eq $true){
        $currentDate = (Get-Date).ToUniversalTime()
        $time = $currentDate.TimeOfDay.TotalMinutes
        # Get VM schedule
        $arrayOfDays = Get-StartStopTag -vm $vm
        # Get just the time(s)
        $schedules = ($arrayOfDays.split("=")[1].split("-"))
        # If there is only a single time and it starts with S, then that is the Start Time
        if($schedules.count -eq 1 -and $schedules -match "S"){
          $utcStartTime = [DateTime]$schedules.replace("S","")
          Write-Output "$($vm.Name) set to start at $utcStartTime"
          $singleTime = $true
        }
        # If there is only a single time and it starts with E, then that is the Stop Time
        if($schedules.count -eq 1 -and $schedules -match "E"){
          $utcStopTime = [DateTime]$schedules.replace("E","")
          Write-Output "$($vm.Name) set to stop at $utcStopTime"
          $singleTime = $true
        }
        # If there are two times, then the first is the Start Time and the second is the Stop Time
        if($schedules.count -eq 2){
          $utcStartTime = [DateTime]$schedules[0].replace("S","")
          $utcStopTime = [DateTime]$schedules[1].replace("E","")
          Write-Output "$($vm.Name) set to start at $utcStartTime"
          Write-Output "$($vm.Name) set to stop at $utcStopTime"
        }
        # Transform the time in Total Minutes
        $utcStartTimeTotalMinutes = $utcStartTime.TimeOfDay.TotalMinutes
        $utcStopTimeTotalMinutes = $utcStopTime.TimeOfDay.TotalMinutes
        # Work out duration of downtime
        if($singleTime -ne $true){
          if(($utcStartTime-$utcStopTime).TotalHours -is [int]){
            $duration = ($utcStartTime-$utcStopTime).TotalHours
          }
          else {
            $duration = ($utcStartTime-$utcStopTime).TotalHours + 24
          }
        }
        # If current time is greater or equal the (time to start - 15 minutes) and current time is less or equal the (time to start + 15 minutes) and VM is not running or starting
        # This means the VM may start 15 minutes earlier but, in theory, never later than the schedule
        if ($time -ge ($utcStartTimeTotalMinutes - 15) -and
            $time -le ($utcStartTimeTotalMinutes + 15) -and
            $vm.properties.extended.instanceView.powerState.displayStatus -notmatch "running" -and
            $vm.properties.extended.instanceView.powerState.displayStatus -notmatch "starting") {
          # Select VM subscription
          $currentSub = Select-AzSubscription -SubscriptionId $vm.SubscriptionId
          Write-Output "Starting VM $($vm.Name) at $(Get-Date)..."
          Start-AzVM -Name $vm.Name -ResourceGroupName $vm.resourceGroup -NoWait
        }
        # If current time is greater or equal the time to stop and current time is less than the (time to stop + 15 minutes) and VM is not deallocated or stopping
        # This means the VM may stop 15 minutes later, but never earlier than the schedule
        if ($time -ge $utcStopTimeTotalMinutes -and
            $time -lt ($utcStopTimeTotalMinutes + 15) -and
            $vm.properties.extended.instanceView.powerState.displayStatus -notmatch "deallocated" -and
            $vm.properties.extended.instanceView.powerState.displayStatus -notmatch "deallocating") {
          # Select VM subscription
          $currentSub = Select-AzSubscription -SubscriptionId $vm.SubscriptionId
          Write-Output "Stopping VM $($vm.Name) at $(Get-Date)..."
          # Stop VM
          Stop-AzVM -Name $vm.Name -ResourceGroupName $vm.ResourceGroup -Force -NoWait
        }
      }
      else{
        Write-Output "$($vm.name): $($valid.Values)"
      }
    }
  }

 

As you can see, the code has many comments, but let me go a bit deeper here.

  1. We start iterating through each Resource Group
  2. We then count the VMs in that RG which have the vm-start-stop-sequence tag. If only one VM or none of the VMs have that tag, the VMs don’t have to be started/stopped in sequence, for obvious reasons
  3. We run the validation function which I covered before
  4. We get the current time
  5. We run the Get-StartStopTag function which I covered before
  6. We do some manipulation on the value from the tag to get the start and stop time
  7. If the schedules variable has a single value, it means the VM was only tagged to be started or stopped. If it has two values, then it was tagged with a start and a stop value
  8. Now it is the last part of the IF statement. Here is where we decide if the VM should be started or stopped.

    First, we check if the current time is greater than or equal to the start time set in the tag minus 15 minutes, and less than or equal to the start time set in the tag plus 15 minutes. If it is, we START the VM. We do this so that the VM can start a bit before the schedule but, in theory, never later. Worth mentioning, we also check that the VM is not already running or starting, in which case no action is required.

    Second, we check if the current time is greater than or equal to the stop time set in the tag, and less than the stop time set in the tag plus 15 minutes. If it is, we STOP the VM. We do this to ensure the VM never stops earlier than the schedule, but it is ok if it stops a bit later. Worth mentioning, we also check that the VM is not already deallocated or deallocating, in which case no action is required.
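
To make the two time-window checks easier to reason about, here is a small standalone sketch which mirrors them without touching Azure. Get-VMAction and its parameters are hypothetical names I am using for illustration; in the runbook the same conditions are written inline.

```powershell
# Hypothetical helper mirroring the start and stop conditions described above
function Get-VMAction {
    param(
        [double]$NowMinutes,               # current UTC time of day, in minutes
        [Nullable[double]]$StartMinutes,   # scheduled start, in minutes (optional)
        [Nullable[double]]$StopMinutes,    # scheduled stop, in minutes (optional)
        [string]$PowerState                # e.g. 'VM running' or 'VM deallocated'
    )

    # Start window: from 15 minutes before to 15 minutes after the start time
    if ($null -ne $StartMinutes -and
        $NowMinutes -ge ($StartMinutes - 15) -and
        $NowMinutes -le ($StartMinutes + 15) -and
        $PowerState -notmatch 'running|starting') {
        return 'Start'
    }

    # Stop window: from the stop time up to (not including) 15 minutes later
    if ($null -ne $StopMinutes -and
        $NowMinutes -ge $StopMinutes -and
        $NowMinutes -lt ($StopMinutes + 15) -and
        $PowerState -notmatch 'deallocated|deallocating') {
        return 'Stop'
    }

    return 'None'
}

# Schedule S12:00-E13:00 expressed in minutes since midnight
$start = ([DateTime]'12:00').TimeOfDay.TotalMinutes
$stop  = ([DateTime]'13:00').TimeOfDay.TotalMinutes

Get-VMAction -NowMinutes 710 -StartMinutes $start -StopMinutes $stop -PowerState 'VM deallocated'
Get-VMAction -NowMinutes 785 -StartMinutes $start -StopMinutes $stop -PowerState 'VM running'
```

Because the windows are 15 to 30 minutes wide, how often you schedule the runbook determines how close to the tagged time the action actually happens.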

 

Awesome, we have covered the code for the VMs which will start in parallel. Now, it is time to cover the code for the VMs which will start in sequence, which is part of the main ELSE statement.

  else{
    $arrayofVMsToStart = @()
    $arrayofVMsToStop = @()
    foreach($vm in $group.Group){
      $valid = Start-Validation -vm $vm
      if($valid -eq $true){
        $currentDate = (Get-Date).ToUniversalTime()
        $time = $currentDate.TimeOfDay.TotalMinutes
        $dw = Get-DayOfWeek
        # Get schedules
        $arrayOfDays = Get-StartStopTag -vm $vm
        # Get just the time(s)
        $schedules = ($arrayOfDays.split("=")[1].split("-"))
        # If there is only a single time and it starts with S, then that is the Start Time
        if($schedules.count -eq 1 -and $schedules -match "S"){
          $utcStartTime = ([DateTime]$schedules.replace("S","")).ToUniversalTime()
          Write-Output "$($vm.Name) set to start at $utcStartTime"
          $singleTime = $true
        }
        # If there is only a single time and it starts with E, then that is the Stop Time
        if($schedules.count -eq 1 -and $schedules -match "E"){
          $utcStopTime = ([DateTime]$schedules.replace("E","")).ToUniversalTime()
          Write-Output "$($vm.Name) set to stop at $utcStopTime"
          $singleTime = $true
        }
        # If there are two times, then the first is the Start Time and the second is the Stop Time
        if($schedules.count -eq 2){
          $utcStartTime = ([DateTime]$schedules[0].replace("S","")).ToUniversalTime()
          $utcStopTime = ([DateTime]$schedules[1].replace("E","")).ToUniversalTime()
          Write-Output "$($vm.Name) set to start at $utcStartTime"
          Write-Output "$($vm.Name) set to stop at $utcStopTime"
        }
        # Transform the time in Total Minutes
        $utcStartTimeTotalMinutes = $utcStartTime.TimeOfDay.TotalMinutes
        $utcStopTimeTotalMinutes = $utcStopTime.TimeOfDay.TotalMinutes
        if($singleTime -ne $true){
          # Work out duration of downtime
          if(($utcStartTime-$utcStopTime).TotalHours -is [int]){
            $duration = ($utcStartTime-$utcStopTime).TotalHours
          }
          else {
            $duration = ($utcStartTime-$utcStopTime).TotalHours + 24
          }
        }
        Write-Output "Current time in TotalMinutes is: $time"
        Write-Output "Start time in TotalMinutes is: $utcStartTimeTotalMinutes"
        Write-Output "Stop time in TotalMinutes is: $utcStopTimeTotalMinutes"
        Write-Output "$($vm.Name) downtime duration will be $duration"
        # If current time is greater or equal the (time to start - 15 minutes) and current time is less or equal the (time to start + 15 minutes) and VM is not running or starting
        # This means the VM may start 15 minutes earlier but, in theory, never later than the schedule
        if ($time -ge ($utcStartTimeTotalMinutes - 15) -and
            $time -le ($utcStartTimeTotalMinutes + 15) -and
            $vm.properties.extended.instanceView.powerState.displayStatus -notmatch "running" -and
            $vm.properties.extended.instanceView.powerState.displayStatus -notmatch "starting") {
          # Select VM subscription
          $currentSub = Select-AzSubscription -SubscriptionId $vm.SubscriptionId
          Write-Output "Found VM to Start."
          # If VM should start in sequence
          if ($vm.tags."vm-start-stop-sequence") {
            # Create array of VMs to control sequence
            [array]$arrayofVMsToStart += $vm
            Write-Output "$($vm.Name) will be started in the sequence: $($vm.tags.'vm-start-stop-sequence')"
          }
          # Else, just start the VM
          else {
            Write-Output "Starting VM $($vm.Name) at $(Get-Date)..."
            Start-AzVM -Name $vm.Name -ResourceGroupName $vm.resourceGroup -NoWait
            Start-Sleep 15
          }
        }
        # If current time is greater or equal the time to stop and current time is less than the (time to stop + 15 minutes) and VM is not deallocated or stopping
        # This means the VM may stop 15 minutes later, but never earlier than the schedule
        if ($time -ge $utcStopTimeTotalMinutes -and
            $time -lt ($utcStopTimeTotalMinutes + 15) -and
            $vm.properties.extended.instanceView.powerState.displayStatus -notmatch "deallocated" -and
            $vm.properties.extended.instanceView.powerState.displayStatus -notmatch "deallocating") {
          # Select VM subscription
          $currentSub = Select-AzSubscription -SubscriptionId $vm.SubscriptionId
          Write-Output "Found VM to Stop."
          # If VM should stop in sequence
          if ($vm.tags."vm-start-stop-sequence") {
            # Create array of VMs to control sequence
            [array]$arrayofVMsToStop += $vm
            Write-Output "$($vm.Name) will be stopped in the sequence: $($vm.tags.'vm-start-stop-sequence')"
          }
          # Else, just stop the VM
          else {
            Write-Output "Stopping VM $($vm.Name) at $(Get-Date)..."
            # Stop VM
            Stop-AzVM -Name $vm.Name -ResourceGroupName $vm.ResourceGroup -Force -NoWait
          }
        }
      }
      else{
        Write-Output "$($vm.name): $($valid.Values)"
      }
    }

Again, let me break it down.

  1. We start iterating through each Resource Group. We already know that more than 1 VM in those RGs have the sequence tag
  2. We run the validation function which I covered before
  3. We get the current time
  4. We run the Get-StartStopTag function which I covered before
  5. We do some manipulation on the value from the tag to get the start and stop time
  6. If the schedules variable has a single value, it means the VM was only tagged to be started or stopped. If it has two values, then it was tagged with a start and a stop value
  7. Now it is the last part of the ELSE statement. Here is where we decide if the VM should be started or stopped in sequence. However, we are not stopping or starting them yet. We add them to an array to be able to sort it and define the right sequence

    First, we check if the current time is greater than or equal to the start time set in the tag minus 15 minutes, and less than or equal to the start time set in the tag plus 15 minutes. If it is, we add the VM to the $arrayofVMsToStart. We do this so that the VM can start a bit before the schedule but, in theory, never later. Worth mentioning, we also check that the VM is not already running or starting, in which case no action is required.

    Second, we check if the current time is greater than or equal to the stop time set in the tag, and less than the stop time set in the tag plus 15 minutes. If it is, we add the VM to the $arrayofVMsToStop. We do this to ensure the VM never stops earlier than the schedule, but it is ok if it stops a bit later. Worth mentioning, we also check that the VM is not already deallocated or deallocating, in which case no action is required.

 

And now for the last bit of code, we sort those arrays in ascending or descending order and Start or Stop them.

    # Start VMs in sequence (ascending order)
    # The tag value is cast to [int] so that, for example, 10 sorts after 2
    foreach($vm in ($arrayofVMsToStart | Sort-Object -property @{e={[int]$_.tags.'vm-start-stop-sequence'}})){
      Write-Output "Starting VM $($vm.Name) - Sequence $($vm.tags.'vm-start-stop-sequence') at $(Get-Date)..."
      Start-AzVM -Name $vm.Name -ResourceGroupName $vm.ResourceGroup -NoWait
      Start-Sleep 15
    }
    # Stop VMs in sequence (descending order)
    foreach($vm in ($arrayofVMsToStop | Sort-Object -property @{e={[int]$_.tags.'vm-start-stop-sequence'}} -Descending)){
      Write-Output "Stopping VM $($vm.Name) at $(Get-Date)..."
      # Stop VM
      Stop-AzVM -Name $vm.Name -ResourceGroupName $vm.ResourceGroup -Force -NoWait
      Start-Sleep 15
    }
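
If you want to see the ordering on its own, the following sketch sorts a few made-up VM objects by their sequence tag. The VM names and the fourth VM are invented for the example. The sketch casts the tag value to [int], because tag values are strings and a plain string sort would place 10 before 2.

```powershell
# Hypothetical VMs: DB (1), App (2), Web (3) plus a reporting VM (10)
$vms = @(
    [pscustomobject]@{ name = 'web01';    tags = [pscustomobject]@{ 'vm-start-stop-sequence' = '3' } }
    [pscustomobject]@{ name = 'report01'; tags = [pscustomobject]@{ 'vm-start-stop-sequence' = '10' } }
    [pscustomobject]@{ name = 'db01';     tags = [pscustomobject]@{ 'vm-start-stop-sequence' = '1' } }
    [pscustomobject]@{ name = 'app01';    tags = [pscustomobject]@{ 'vm-start-stop-sequence' = '2' } }
)

# Sort key: the sequence tag as a number
$bySequence = { [int]$_.tags.'vm-start-stop-sequence' }

# Ascending for start, descending for stop
$startOrder = ($vms | Sort-Object -Property $bySequence).name
$stopOrder  = ($vms | Sort-Object -Property $bySequence -Descending).name

"Start order: $($startOrder -join ', ')"
"Stop order:  $($stopOrder -join ', ')"
```

The DB comes first when starting and last when stopping, as described in the Tags section.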

 

Conclusion

Congratulations, you got to the end. As I mentioned before, there are solutions out there which can perform start and stop of VMs on a schedule. However, in my view, this is the solution that provides the most simplicity and flexibility to fit your requirements.

As a bonus, look at my previous post to learn how you can integrate this solution with something that allows you to perform other tasks which cannot be performed as part of the runbook.

I hope this was informative to you and thanks for reading!

 

Disclaimer

The sample scripts are not supported under any Microsoft standard support program or service. The sample scripts are provided AS IS without warranty of any kind. Microsoft further disclaims all implied warranties including, without limitation, any implied warranties of merchantability or of fitness for a particular purpose. The entire risk arising out of the use or performance of the sample scripts and documentation remains with you. In no event shall Microsoft, its authors, or anyone else involved in the creation, production, or delivery of the scripts be liable for any damages whatsoever (including, without limitation, damages for loss of business profits, business interruption, loss of business information, or other pecuniary loss) arising out of the use of or inability to use the sample scripts or documentation, even if Microsoft has been advised of the possibility of such damages.
