Citrix Virtual Desktop provides a good experience for both users and administrators, with many enhanced features. Communication between the client VM (the VDA) and the server infrastructure is required for health checks and client registration.
A VDA that fails to register with the Delivery Controller is marked as Unregistered, and the VM becomes inaccessible to the user.
The Problem of unregistered VM
Checking the VM status in Desktop Studio, it will look like this.
Checking the VM via PowerShell shows the following status for the failed VM.
You can download the script from the link below. After downloading it, rename it to Recover-XenDesktopFailVM.ps1
Make sure that the Virtual Desktop snap-ins are installed and loaded. For details, see "How to configure PowerShell SDK and execute commands remotely in XenApp/XenDesktop 7.x" on Citrix.com.
There are multiple reasons why a VDA may fail to register, such as a network problem or a problem with the VDA installation. This tutorial focuses only on how to force a reboot of a failed, unregistered VM.
It is normal to see some Unregistered VMs in Studio, namely those shown without the exclamation mark or with a FaultState other than Unregistered. Whether such VMs appear depends on your setup and configuration, and they do not need a reboot.
Finding Unregistered VM using Powershell
To make a long story short, this is the main filter I use. This PowerShell command gets all the VDAs with:
- RegistrationState = Unregistered
- FaultState = Unregistered
- SummaryState = Unregistered
If a VM matches all three conditions, it is Unregistered and must be rebooted.
Add-PSSnapin Citrix*
Get-BrokerMachine -AdminAddress 'DeliveryControllerServer' | where {($_.RegistrationState -like "*Unregistered*") -and ($_.FaultState -like "*Unregistered*") -and ($_.SummaryState -like "*Unregistered*")} | select RegistrationState,MachineInternalState,LastConnectionFailure,FaultState,SummaryState,MachineName,LastConnectionUser
The command above lists all the VMs that Studio shows with the orange Unregistered triangle. For these VMs, a forced reboot must be triggered.
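To see how the three-way filter behaves without a Delivery Controller at hand, the sketch below applies the same condition to hand-made objects. The helper name `Select-UnregisteredMachine` and the sample machines are hypothetical and not part of the script; on a real site the input would come from `Get-BrokerMachine`.

```powershell
# Hypothetical helper: passes through only machines where all three
# state properties report Unregistered (same condition as the filter above).
function Select-UnregisteredMachine {
    param(
        [Parameter(ValueFromPipeline = $true)]
        $Machine
    )
    process {
        if (($Machine.RegistrationState -like '*Unregistered*') -and
            ($Machine.FaultState -like '*Unregistered*') -and
            ($Machine.SummaryState -like '*Unregistered*')) {
            $Machine
        }
    }
}

# Made-up sample data standing in for Get-BrokerMachine output.
$sample = @(
    [pscustomobject]@{ MachineName = 'LAB\VDI-01'; RegistrationState = 'Unregistered'; FaultState = 'Unregistered'; SummaryState = 'Unregistered' }
    [pscustomobject]@{ MachineName = 'LAB\VDI-02'; RegistrationState = 'Unregistered'; FaultState = 'None';         SummaryState = 'Off' }
    [pscustomobject]@{ MachineName = 'LAB\VDI-03'; RegistrationState = 'Registered';   FaultState = 'None';         SummaryState = 'Available' }
)

# @() keeps the result an array even when only one machine matches.
$failed = @($sample | Select-UnregisteredMachine)
$failed.MachineName
```

Only LAB\VDI-01 passes the filter. LAB\VDI-02 is unregistered too, but its FaultState is not Unregistered, so it is treated as one of the "normal" cases and left alone.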
Setting a Threshold and protection from a mass reboot
There is, however, a big risk that all the VMs get rebooted at once if the database fails, or if a network problem blocks the VDAs from reaching the registration server. Such a failure can mark the FaultState of every VDA as Unregistered.
To prevent a bulk reboot, the script uses a safe reboot threshold. You set a limit for the allowed number of failed VMs, and if the number of failed VMs is higher than this threshold, the reboot is not triggered.
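The threshold rule can be sketched as a small function. `Test-SafeToReboot` is a hypothetical name used only for illustration; the script itself performs the same comparison inline against its MaxFailed value.

```powershell
# Hypothetical helper: returns $true only when there is something to reboot
# and the number of failed VMs does not exceed the safe limit.
function Test-SafeToReboot {
    param(
        [int]$FailedCount,
        [int]$MaxFailed = 6   # same default as the script's MaxFailprotection
    )
    return ($FailedCount -gt 0) -and ($FailedCount -le $MaxFailed)
}

$fewFailures  = Test-SafeToReboot -FailedCount 3    # within the limit
$atTheLimit   = Test-SafeToReboot -FailedCount 6    # exactly the limit, still allowed
$bulkFailure  = Test-SafeToReboot -FailedCount 40   # looks like a site-wide outage

"3 failed: $fewFailures, 6 failed: $atTheLimit, 40 failed: $bulkFailure"
```

With the default of 6, three or six failed VMs are rebooted, while forty failed VMs are treated as an infrastructure problem and left untouched.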
How to use the Powershell Script
Creating the Powershell Configuration.
CreateConfig: This starts a PowerShell wizard that creates the required configuration, such as the IP of the Delivery Controller, the ESXi hypervisor, the Safe Reboot Lock (MaxFailed), and the other settings the script needs. This should be the first thing you run.
Calling -CreateConfig generates the two files the script requires:
- psxenfailvm.cfg: Contains the configuration.
- psxenfailvm.crd: Contains the encrypted password for your hypervisor account. The password is encrypted so that only the user who ran the script to generate the config can read it. If you need to run the script on a different server, re-run it there with the -CreateConfig parameter.
.\Prepare-XenDesktopFailedVM.ps1 -CreateConfig
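The way the wizard stores the hypervisor password in psxenfailvm.crd can be sketched as a round trip through `ConvertFrom-SecureString` and `ConvertTo-SecureString`. The user name, password, and temporary file name below are made up for illustration. On Windows, calling `ConvertFrom-SecureString` without a key relies on the Data Protection API, which is why the saved file is only readable by the same user on the same machine.

```powershell
# Illustrative values only; the wizard collects the real ones with Get-Credential.
$userName = 'administrator@vsphere.local'
$password = ConvertTo-SecureString 'P@ssw0rd!' -AsPlainText -Force

# Save: serialise the SecureString to a protected text blob on disk.
$crdFile = Join-Path ([System.IO.Path]::GetTempPath()) 'demo-psxenfailvm.crd'
$password | ConvertFrom-SecureString | Set-Content -Path $crdFile

# Load: turn the text back into a SecureString and rebuild the credential object.
$restored   = Get-Content -Path $crdFile | ConvertTo-SecureString
$credential = New-Object System.Management.Automation.PSCredential($userName, $restored)

$credential.UserName

# Clean up the demo file.
Remove-Item -Path $crdFile
```

The resulting `$credential` is the kind of object that can be passed to a `-Credential` parameter, which is how the script hands it to `Connect-VIServer`.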
Reading the Powershell Configuration
ReadFromConfig: After creating the configuration, all you need to do is call the script with the -ReadFromConfig parameter; no additional parameters should be provided. The script reads the configuration files from its own directory, $PSScriptRoot.
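The configuration file is a plain text file with one value per line, in the order the wizard asks for them. A minimal sketch of turning those lines into named, typed settings is shown below; `ConvertFrom-FailVmConfig` and the property names are hypothetical, and the sample values are made up. The script itself simply indexes into the array returned by `Get-Content`.

```powershell
# Hypothetical helper: maps the seven lines of psxenfailvm.cfg to named settings.
function ConvertFrom-FailVmConfig {
    param([string[]]$Lines)

    if ($Lines.Count -lt 7) {
        throw "Expected 7 configuration lines, found $($Lines.Count)."
    }

    [pscustomobject]@{
        ESXiServer         = $Lines[0]
        HypervisorUser     = $Lines[1]
        DeliveryController = $Lines[2]
        AdminEmail         = $Lines[3]
        SmtpServer         = $Lines[4]
        MaxFailed          = [int]$Lines[5]   # cast so comparisons are numeric
        RecheckSeconds     = [int]$Lines[6]
    }
}

# Made-up values, in the order the wizard writes them.
$settings = ConvertFrom-FailVmConfig -Lines @(
    '192.0.2.10', 'administrator@vsphere.local', '192.0.2.20',
    'admin@example.com', 'smtp.example.com', '6', '460'
)

$settings | Format-List
```

Against a real file, the lines would come from `Get-Content "$PSScriptRoot\psxenfailvm.cfg"`. Casting the last two values to integers avoids surprises when they are later compared with counts or passed to `Start-Sleep`.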
Running the Script with Parameters
The script can also run in parameter mode, where the configuration is passed as parameters and arguments. You can use the following parameters:
- ESXiServer: ESXi Server IP address or name (Required, String)
- ESXiCred: Hypervisor credentials as a PSCredential object, for example from Get-Credential (Required)
- CitrixDeliveryControler: Delivery Controller IP or name (Required, String)
- MaxFailprotection: The maximum number of failed unregistered VMs for which a reboot is still allowed. For example, if this value is set to 5 and 5 VMs are unregistered, these VMs are rebooted. If more than 5 VMs have failed, no reboot is triggered and a notification is sent to the admin to report an unexpected failure. (Not required, Default=6, Int)
- AdminNotificationAddress: Your email address to receive notifications. (Required, String)
- SMTPAddress: The SMTP server address to use. (Required, String)
- Rechecktime: The checking cycle in seconds. The script automatically checks for failed VMs every Rechecktime seconds. (Not required, Default=460, Int)
The Script in Action
The script checks whether there are any failed VMs. If there are none, it writes the following to the console:
First Check round started at 10/18/2020 22:08:33 and a recheck will occur every 460.
After the 460 seconds have passed (the value defined in the configuration or given with Rechecktime), the script runs another check. If there is still no failure, it writes a console message that another check was triggered. If there is a failure, it writes the details of the failed VMs.
The information about each failed VM is written to the console, an email with this information is sent to the admin, and an event is added to the Application log with XEN Scripting as the source.
<#PSScriptInfo
.VERSION 1.0
.GUID 34b9b481-062b-41af-afbf-5df26edfb1c9
.AUTHOR Faris Malaeb
.COMPANYNAME Faris Malaeb
.COPYRIGHT 2020
.PROJECTURI https://www.powershellcenter.com/
.Usage
    To create a config use:
        .\Prepare-XenDesktopFailedVM -CreateConfig
    To load the config use:
        .\Prepare-XenDesktopFailedVM -ReadFromConfig
    To load via parameters and arguments use:
        .\Prepare-XenDesktopFailedVM.ps1 -ESXiServer X.X.X.X -ESXiCred (Get-Credential) -CitrixDeliveryControler X.X.X.X -MaxFailprotection 6 -AdminNotificationAddress Admin@Domain.com -SMTPAddress SMTP.Server.com -Rechecktime 460
#>

#Requires -Modules VMware.VimAutomation.Core
#Requires -Modules Citrix.Broker.Commands

<#
.DESCRIPTION
Parameter list:

################## Creating a config file
CreateConfig: This will start a PS wizard to create the required configuration such as the IP of the
Delivery Controller, ESXi Hypervisor, Safe Reboot Lock (MaxFailed), and other config for the script to work.
This should be the first thing to run.

################## Load config file
ReadFromConfig: After creating the configuration, all that you need is to call the script using the
ReadFromConfig parameter, no additional argument should be provided.
The script will read the configuration files stored in the same directory, $PSScriptRoot.
Two files are required, both generated by calling -CreateConfig:
- psxenfailvm.cfg: Contains the configuration
- psxenfailvm.crd: Contains the encrypted credential for your hypervisor. These credentials are encrypted
  and only the user who ran the script to generate the config can read the password.

################## Running using parameters and arguments
ESXiServer: ESXi Server IP address or name (Required - String)
ESXiCred: Hypervisor credentials as a PSCredential object, e.g. from Get-Credential (Required)
CitrixDeliveryControler: Delivery Controller IP or name (Required - String)
MaxFailprotection: The maximum number of failed unregistered VMs which is allowed for a reboot. For example,
  if this value is set to 5 and 5 VMs got unregistered, these VMs will be rebooted, but if the number of
  failed VMs is more than 5 a notification will be sent to the admin informing that there is an unexpected failure.
AdminNotificationAddress: Your email address to receive notifications.
SMTPAddress: The SMTP address to use.
Rechecktime: The checking cycle. The script keeps checking for failed VMs, so you run it once and keep it
  in the background; it will automatically recheck for a failed VM every Rechecktime seconds.
#>
[cmdletbinding(DefaultParameterSetName='LoadConfig')]
param(
    [parameter(mandatory=$True,ParameterSetName='createconfig',Position=0)][switch]$CreateConfig,
    [parameter(mandatory=$True,ParameterSetName='loadconfig',Position=0)][switch]$ReadFromConfig,
    [parameter(mandatory=$True,ParameterSetName='Loadparam')]$ESXiServer,
    [parameter(mandatory=$true,ParameterSetName='Loadparam')]$ESXiCred,
    [parameter(mandatory=$true,ParameterSetName='Loadparam')]$CitrixDeliveryControler,
    [parameter(mandatory=$false,ParameterSetName='Loadparam')]$MaxFailprotection=6,
    [parameter(mandatory=$true,ParameterSetName='Loadparam')]$AdminNotificationAddress,
    [parameter(mandatory=$true,ParameterSetName='Loadparam')]$SMTPAddress,
    [parameter(mandatory=$false,ParameterSetName='Loadparam')]$Rechecktime=460
)

Function CreateConfig {
    Write-Host "This script will create XenDesktop Config file to be used with the PS Script"
    Write-Host "This config file will be used by Recover-XenDesktopVM.PS1 file"
    Write-Host "Please make sure that you fill all the information correctly"
    Write-Host "You can rerun the script to correct any value"
    Write-Host "If you need to run the script in another machine, please ensure to rerun this configuration wizard again"
    Write-Host "as the encrypted password are valid on this machine (machine key)"
    Write-Host "*********************(0/7)******************"
    Write-Host "Please make sure that this script will Run As ADMIN" -ForegroundColor Green
    Write-Host ""
    if (Test-Path "$($PSScriptRoot)\psxenfailvm.cfg"){
        Write-Host "Configuration file already found in $($PSScriptRoot) and will be overwritten" -ForegroundColor Red
    }

    # Line 1 of the config file: ESXi server (Set-Content starts a fresh file)
    Write-Host "Step (1/7) - Tell me the VMware ESXi Server which hosting the Citrix Environment" -ForegroundColor Green
    $RHESXIP=Read-Host "Please type the IP Address"
    Set-Content -Value $RHESXIP -Path "$($PSScriptRoot)\psxenfailvm.cfg" -Force
    Write-Host ""
    Write-Host ""

    # Line 2: hypervisor user name; the password goes encrypted into the .crd file
    Write-Host "Step (2/7) - Please type the username and password for which account that have access to your Hypervisor" -ForegroundColor Green
    $RHcred=Get-Credential
    $RHcred.Password | ConvertFrom-SecureString | Set-Content -Path "$($PSScriptRoot)\psxenfailvm.crd" -Force
    Add-Content -Value $RHcred.UserName -Path "$($PSScriptRoot)\psxenfailvm.cfg"
    Write-Host ""
    Write-Host ""

    # Line 3: Delivery Controller
    Write-Host "Step (3/7) - Please type the IP Address of your Citrix Delivery Controller" -ForegroundColor Green
    $RHDCIP=Read-Host "Please type the address"
    Add-Content -Value $RHDCIP -Path "$($PSScriptRoot)\psxenfailvm.cfg"
    Write-Host ""
    Write-Host ""

    # Line 4: admin email
    Write-Host "Step (4/7) - Please type your email address to receive notification" -ForegroundColor Green
    $RHEMAIL=Read-Host "your email address"
    Add-Content -Value $RHEMAIL -Path "$($PSScriptRoot)\psxenfailvm.cfg"
    Write-Host ""
    Write-Host ""

    # Line 5: SMTP server
    Write-Host "Step (5/7) - Please type your SMTP Server IP address, BTW, the script is not configured to authenticate before sending" -ForegroundColor Green
    $RHSMTP=Read-Host "your SMTP address"
    Add-Content -Value $RHSMTP -Path "$($PSScriptRoot)\psxenfailvm.cfg"

    # Line 6: safe reboot threshold
    Write-Host "Step (6/7) - The MaxAllowedFail will prevent the script from doing a bulk reboot in case there is an unexpected failure in your infra" -ForegroundColor Green
    Write-Host "This can happen for a reason such as database failure or network block which prevent the vm from register it self with the Delivery Controller" -ForegroundColor Green
    Write-Host "which cause to be considered as unregistered." -ForegroundColor Green
    Write-Host "I recommend to set this value to 6, it was fine for me so in case of 6 or less VM failed to register together a reboot will be triggered for each VM" -ForegroundColor Green
    Write-Host "But, if more than 6 VM failed, the script will quit and reboot wont be placed." -ForegroundColor Green
    Write-Host "Usually when there is a failure in DB or network block and it got resolved, the VDA will register it self again." -ForegroundColor Green
    $RHMAXFail=Read-Host "Soooo, what value should I use, are you happy with 6, type 6"
    Add-Content -Value $RHMAXFail -Path "$($PSScriptRoot)\psxenfailvm.cfg"
    Write-Host ""
    Write-Host ""
    Write-Host ""

    # Line 7: recheck interval in seconds
    Write-Host "Step (7/7) - every when in Seconds the script should check for failed VM." -ForegroundColor Green
    Write-Host "I recommend to set this number to between 250 second and 600 second" -ForegroundColor Green
    $RHSleepTime=Read-Host "Type the number of Seconds to check "
    Add-Content -Value $RHSleepTime -Path "$($PSScriptRoot)\psxenfailvm.cfg"

    Write-Host "Creating Eventlog Source ... Please wait"
    try{
        [System.Diagnostics.EventLog]::CreateEventSource("XEN Scripting", "Application")
    }
    catch{
        Write-Host "Unable to create the Event Source, Maybe its already there and this error can be ignored" -ForegroundColor Green
        Write-Host $_.Exception.message
    }
    Write-Host
    Write-Host "All Done, feel free to run this configuration wizard again to reconfigure the environment, or simply edit the psxenfailvm.cfg file" -ForegroundColor Yellow
    Write-Host "Please note that if you want to run this script in another server, make sure to rerun this configuration wizard again to reregister and store the password with that server machine key" -ForegroundColor Yellow
}

############# Variables
$VIServer=""
$VICred=""
$XenDCAddr=""
$MaxFailed=""
$AdminEmail=""
$SMTPAddrs=""
$CheckAgain=0
[System.Collections.ArrayList]$FailedServers=@()
#######################

########### Checking how the script was called, Load Config or via parameters
if ($PSCmdlet.ParameterSetName -like "createconfig"){
    Write-Host "Switching to Create config file" -ForegroundColor Green
    CreateConfig
    Write-Host "Config File are created, Please re-run the script with -ReadFromConfig Parameter" -ForegroundColor Green
    return
}

if ($PSCmdlet.ParameterSetName -like "loadconfig"){
    if (-not (Test-Path "$($PSScriptRoot)\psxenfailvm.cfg")){
        Write-Host "Configuration file NOT found or not presented in the following directory $($PSScriptRoot)\psxenfailvm.cfg" -ForegroundColor Red
        Write-Host "Please make sure that you run the script with -CreateConfig first" -ForegroundColor Red
        return
    }
    if (-not (Test-Path "$($PSScriptRoot)\psxenfailvm.crd")){
        Write-Host "Credential file NOT found or not presented in the following directory $($PSScriptRoot)\psxenfailvm.crd" -ForegroundColor Red
        Write-Host "Please make sure that you run the script with -CreateConfig first" -ForegroundColor Red
        return
    }
    $Config=Get-Content "$($PSScriptRoot)\psxenfailvm.cfg"
    $VIServer=$Config[0]
    # Not named $PWD: that is PowerShell's automatic "current location" variable
    $SecurePassword=Get-Content "$($PSScriptRoot)\psxenfailvm.crd" | ConvertTo-SecureString
    $VICred = New-Object System.Management.Automation.PsCredential(($Config)[1],$SecurePassword)
    $XenDCAddr=$Config[2]
    $AdminEmail=$Config[3]
    $SMTPAddrs=$Config[4]
    $MaxFailed=[int]$Config[5]
    $CheckAgain=[int]$Config[6]
    $Config
}
Else{
    $VIServer=$ESXiServer
    $VICred = $ESXiCred
    $XenDCAddr=$CitrixDeliveryControler
    $AdminEmail=$AdminNotificationAddress
    $SMTPAddrs=$SMTPAddress
    $MaxFailed=$MaxFailprotection
    $CheckAgain=$Rechecktime
}

####### Setting up..
Write-Host "preparing to load the required modules..."
Import-Module Citrix.Broker.Commands
Import-Module VMware.VimAutomation.Core
Connect-VIServer $VIServer -Credential $VICred
# Use $CheckAgain so the message reflects the value loaded from the config file as well
Write-Host "First Check round started at $(get-date) and a recheck will occur every $($CheckAgain)"

Try{
    while ($True) {
        # @() forces an array so .Count is reliable even when a single VM is returned
        $FailedVM=@(Get-BrokerMachine -AdminAddress $XenDCAddr | where {($_.RegistrationState -like "*Unregistered*") -and ($_.FaultState -like "*Unregistered*") -and ($_.SummaryState -like "*Unregistered*")} | select RegistrationState,MachineInternalState,LastConnectionFailure,FaultState,SummaryState,MachineName,LastConnectionUser)
        $FailedVM

        if ($FailedVM.Count -gt 0){
            # Safe reboot lock: too many failures at once points to an infrastructure problem
            if ($FailedVM.Count -gt $MaxFailed){
                Write-Host "WARNING, BULK FAILURE... Go and Check, I will not do anything... quitting"
                Send-MailMessage -From 'BULKVDIFail@xendesktop.com' -to $AdminEmail -Body "BULK VDI Failure, Please check the Database or other resources, Reboot will not be placed" -SmtpServer $SMTPAddrs -Subject "BULK VDI FAilure"
                Return
            }
            foreach ($SingleFailedVM in $FailedVM){
                if (($FailedServers.IndexOf($SingleFailedVM.MachineName)) -ge 0){
                    # The failed VM is already in the list and should not be processed again
                }
                else{
                    Write-EventLog -LogName Application -EventId 8763 -Message $SingleFailedVM.MachineName -Source "XEN Scripting" -EntryType Error
                    Write-Verbose "I wrote to Eventlog "
                    Send-MailMessage -From 'VDIFailure@xendesktop.com' -to $AdminEmail -Body "The VDI Image $($SingleFailedVM.MachineName) has Failed, Reboot will be initiated" -SmtpServer $SMTPAddrs -Subject "VDI FAilure"
                    Write-Host "Restart initiated for $($SingleFailedVM.MachineName)"
                    ## Connect to VI ##
                    # MachineName is DOMAIN\Name; the hypervisor only knows the name part
                    $VMName=$SingleFailedVM.MachineName.Split('\')[1]
                    if ((Get-Module).name -like "VMware.VimAutomation.Core"){
                        if (-not $DefaultVIServer){
                            Connect-VIServer $VIServer -Credential $VICred
                        }
                        Get-VM $VMName | Restart-VM -Confirm:$false
                    }
                    # Remember this VM (not the whole list) so it is not rebooted again next round
                    [void]$FailedServers.Add($SingleFailedVM.MachineName)
                }
            }
        }
        else{
            Write-Verbose "Clearing the variable "
            $FailedServers.Clear()
        }
        Start-Sleep -Seconds $CheckAgain
        Write-Host "Another Check triggered on " (get-date)
    }
}
Catch{
    Write-Host $_.Exception.Message
    Send-MailMessage -From 'BULKVDIFail@xendesktop.com' -to $AdminEmail -Body $_.Exception.Message -SmtpServer $SMTPAddrs -Subject "VDI Script Failure"
}
Conclusion
This script should save you some time and let you troubleshoot the real cause, instead of being distracted by users calling to ask for a reboot.