SysAdmin Monkey

Technical tidbits from the sysadmin world...
 

Azure VPN Gateway OpenVPN P2S with RADIUS

The Azure VPN Gateway supports a variety of connection methods for Point to Site VPN’s, one of the more interesting options that came out not too long ago is support for OpenVPN which is compelling as it ticks off a lot of boxes in terms of support such as:

  • Tunnels over HTTPS so traverses firewalls easily
  • Multi platform support – Windows, Mac OSX, iOS, Android, Linux
  • RADIUS Authentication
  • Can be used without Admin rights when setup correctly on Windows

The RADIUS authentication option is really interesting if you use Network Policy Server (NPS) included with Windows Server as you can hook in the Azure MFA Module to provide Multi factor Authentication. Unfortunately the Azure documentation does not outline the required NPS settings to support OpenVPN with RADIUS so after a support ticket, here is the required NPS configuration required to get this working:

Configure Azure

  • Enable Point to Site VPN on your VPN Gateway and Select OpenVPN
  • Pick an unused IP Range that is not part of a VNET’s range for your VPN Clients to use
  • Set the IP of your RADIUS/NPS Server and the secret you will use

Configure RADIUS Client

In Network Policy Server, add a RADIUS client – the IP Address should be the subnet range of your GatewaySubnet

Configure Connection Request Policy

The key item in here is to Override the Authentication Methods, it must be configured as the picture below (Unencrypted PAP/SPAP)

Configure Network Policy

Primary change is the constraints should be set to Microsoft: Secured Password (EAP-MSCHAP v2) if you are doing password based auth (which can then fire off to Azure MFA if setup)

Connect with OpenVPN

Download the VPN Configuration in the Azure Portal and load the OpenVPN connection profile – you should now be able to connect! I highly recommend setting up the Azure MFA module to secure the connection as well.

Errors with February 2018 SQL Connector for Azure Key Vault

Microsoft released an updated version of the SQL Server Connector for Microsoft Azure Key Vault on 26/2/2018 (version 15.0.300.96) which seemed to be unable to connect to the Key Vault on existing or new SQL Servers.

Typically you run this first step in creating the Asymmetric Key

CREATE ASYMMETRIC KEY SAM_TDE FROM PROVIDER [AzureKeyVault_EKM_Prov] with PROVIDER_KEY_NAME = 'SAM_TDE', CREATION_DISPOSITION = OPEN_EXISTING
GO

With the latest version of the SQL Connector you get the following error message

Msg 33049, Level 16, State 2, Line 54

Key with name 'SAM_TDE' does not exist in the provider or access is denied. Provider error code: 2058.  (Provider Error - No explanation is available, consult EKM Provider for details)

Not very descriptive is it?  If you check the Application logs in Event Viewer you should see an Event ID 2 for the Key with the following details:

Operation: SQLCryptGetKeyInfoByName

Message: Error when accessing registry:5

What causes this?

Well it turns out there is a new undocumented requirement for a registry key.  The only problem is neither the SQL connector installer nor the connector DLL or SQL Server has the rights to create it!

To fix this do the following:

  • In the registry, navigate to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft
  • Create a new Key called SQL Server Cryptographic Provider
  • Give Full Control permissions to this key to the Windows service account that runs SQL Server
  • Try the operation again and it should work

If you check back in that registry key after creating the Asymmetric key, you should see a registry entry for the key vault name.

So far I haven’t seen this documented anywhere online, Azure support was able to provide these details.  Hopefully this helps for those that encounter the same issue.

 

 

 

Multi-site MSSQL Availability Group & Disaster Recovery

There is a lot of information scattered around different blogs and Microsoft documentation about running SQL Availability Groups across multiple sites for DR or reporting requirements, this post outlines how to get a 3 node cluster working again when you have to failover to your DR site.

 

Assumptions:

  • Failover Cluster created with all Cluster members, Cloud Witness or Fileshare Witness (in DR site or 3rd site) created
  • SQL AG Listener IPs configured for both regions

 

The following guide is based on a 3 node cluster across two sites (PRI and DR):

  • PRI-SQL01
    • Synchronous Commit
    • Automatic Failover
    • Read Only Replica
  • PRI-SQL02
    • Synchronous Commit
    • Automatic Failover
    • Read Only Replica
  • DR-SQL01
    • Synchronous Commit (if network permits)/Asynchronous Commit
    • Manual Failover
    • No Read Only Replica
    • Failover cluster vote removed so it does not interfere with Quorum
(get-clusternode -name "DR-SQL01").NodeWeight=0

 

Disaster Occurs! Time to Bring up the Disaster Recovery Site

First issue – the Cluster will most likely have lost quorum and the cluster has stopped – so you can’t failover, now what? On the DR SQL Server you do the following (Source for these steps):

  • Stop the Cluster Service with the following Powershell
Stop-Service ClusSvc
  • Start the Cluster, fixing/ignoring Quorum
Start-ClusterNode -name DR-SQL01 -FixQuorum
  • Check the status of the Cluster with the following command, it will transition from Joining to Up (good to go)
Get-ClusterNode -Name DR-SQL01
  • Return the DR SQL server a vote in the Cluster Quorum
(Get-ClusterNode -name "DR-SQL01").NodeWeight=1
  • Remove votes from Offline Primary site servers
(get-clusternode -name "PRI-SQL01").NodeWeight=0
(get-clusternode -name "PRI-SQL02").NodeWeight=0
  • Confirm the votes assigned to the Cluster
Get-ClusterNode | fl Name,NodeWeight
  • The Failover Cluster should now be up, but the Role(s) for the SQL Availability Group will be offline or failed
  • Start SQL Server Management Studio and run the following to check the health of the databases
    • If Failover Ready = 1 – there should be no dataloss
    • If Failover Ready = 0 – there will be dataloss (will always be this value for Asynchronous Commit)
    • Keep a record of Recovery_lsn and Truncation_lsn to compare against the PRI servers when they come back up to see how much was missing
select dharcs.replica_server_name, dhdrcs.is_failover_ready, dhdrcs.database_name, dhdrcs.recovery_lsn, dhdrcs.truncation_lsn
from sys.dm_hadr_database_replica_cluster_states dhdrcs join sys.dm_hadr_availability_replica_cluster_states dharcs
on(dhdrcs.replica_id = dharcs.replica_id)
where dharcs.replica_server_name = @@servername
  • Failover to the DR Server inside SQL Availability Groups
    • I’ve found this works best in this scenario with T-SQL instead of using the Wizard
//** In this example the AG name is SYSMONKEY-AG **//
ALTER AVAILABILITY GROUP "SYSMONKEY-AG" FORCE_FAILOVER_ALLOW_DATA_LOSS
  • Database should now be active, check that it shows as Primary in SSMS

Primary Site is back up and running, lets move back to there

Failing back without dataloss won’t be available at first because the returning Cluster Members are now in a paused state as they recognise the change in Primary for the Availability Group and Failover Cluster (source)

  • Return node votes to the other cluster members
(get-clusternode -name "PRI-SQL01").NodeWeight=1
(get-clusternode -name "PRI-SQL02").NodeWeight=1
  • Resume the replication on each server by running the following T-SQL on each per database
//** Run this against each database on each of the servers in the Primary site that should now show as 'secondary' database servers **//
 
alter database dbname set hadr resume
  • Let the servers true up their data, if there has been a high rate of change or they have been offline for too long then you may have to remove them from the Availability Group and reseed them.
  • Once up to date, you will be able to failover without dataloss back to the Primary Site.
  • Make sure to remove the Vote for the DR site again from the Cluster
(get-clusternode -name "DR-SQL01").NodeWeight=0

Hopefully if you get to here you should be back to normal operations!

 

Reduce Failover time between subnets – planned & unplanned

One key thing to remember is if your Apps support it, you should use the MultiSubnetFailover=True SQL Connection String option to ensure speedy failover between sites.

.NET Framework 4 is required for this, and essentially when set to True apps will query DNS for all IPs associated with the Availability Group and try to connect on the Database Server port (1433 or other…) on all IPs at the same time.

Since only one of the Sites IPs is active it will immediately use that one.  This greatly speeds up failover during normal operations as well.

Should your apps not support this then you should follow the information in this Microsoft Document on how to disable this feature and reduce the DNS TTL instead.

 

Azure Recovery Services Vault – Restore disks via powershell

The long overdue update to Azure’s Recovery Vault to support ARM virtual machines has finally arrived.  With it are some nice changes to how you backup and recover your VMs, however as is typical with Azure on release of a new or updated feature the documentation is not the best!

We’ve been waiting for the update to this part of Azure for quite a while at work and dove right in once it came out.  One of our main requirements is that in case of a restore we wanted to restore just the disks so that we can redeploy the server from Powershell with a more ‘advanced’ configuration (Availability Set, choose the IP, disk name etc).

Unfortunately the documentation on how to do a disk only restore is terribly incomplete but with some trial and error here is how you do it below.  A reminder that you will need at least Azure Powershell v1.4.0 to have the cmdlets.

Connect to the vault
#connect to the vault
$vault = Get-AzureRmRecoveryServicesVault -Name AzMonkey-RSV -ResourceGroupName AZMonkey
Set-AzureRmRecoveryServicesVaultContext -Vault $vault
Find which protected VM you want to restore from
#find protected VMs
Get-AzureRmRecoveryServicesBackupItem -BackupManagementType AzureVM -WorkloadType AzureVM

Which gives you an output similar to this:

Name                                     ContainerType        ContainerUniqueName                      WorkloadType         ProtectionStatus    
----                                     -------------        -------------------                      ------------         ----------------    
iaasvmcontainerv2;azmonkey;azmonkeyvm01  AzureVM              iaasvmcontainerv2;azmonkey;azmonkeyvm01  AzureVM              Healthy             
iaasvmcontainerv2;azmonkey;azmonkeyvm02  AzureVM              iaasvmcontainerv2;azmonkey;azmonkeyvm02  AzureVM              Healthy             
Select the VM (use the last portion of the name e.g. azmonkeyvm01 from the above output)
#Select the VM you wish to restore from
$backupitem=Get-AzureRmRecoveryServicesBackupItem -BackupManagementType AzureVM -WorkloadType AzureVM -Name azmonkeyvm02
See what restore points are available
#Check what restore points are available (last 7 days - adjust the -7 for more days)
$startDate = (Get-Date).AddDays(-7)
$endDate = Get-Date
Get-AzureRMRecoveryServicesBackupRecoveryPoint -Item $backupitem -StartDate $startdate.ToUniversalTime() -EndDate $enddate.ToUniversalTime()

Which gives you a list of recovery points (last 7 days from the command but you can change the start date!)

RecoveryPointId    RecoveryPointType  RecoveryPointTime      ContainerName                        ContainerType  
---------------    -----------------  -----------------      -------------                        -------------  
6653637261308      FileSystemConsi... 14/05/2016 1:06:25 AM  IaasVMContainer;iaasvmcontainerv2... AzureVM        
17221207186083     FileSystemConsi... 14/05/2016 12:33:33 AM IaasVMContainer;iaasvmcontainerv2... AzureVM        

Select the restore point (top in the list is 0 (1:06:25am), second listed is 1)
#Set the first part of the $rp variable to select a recovery point
$rp = Get-AzureRMRecoveryServicesBackupRecoveryPoint -Item $backupitem -StartDate $startdate.ToUniversalTime() -EndDate $enddate.ToUniversalTime()
#Select the restore point - latest is 0, next is 1 etc...
$rp[0]
Execute the restore
#Load the restore variable with the recovery point and storage account location to restore to (it uses a guid for the container and vhd name)
$restorejob = Restore-AzureRMRecoveryServicesBackupItem -RecoveryPoint $rp[0] -StorageAccountName azmonkeyslrs -StorageAccountResourceGroupName AZMonkey
#Execute the restore
$restorejob

Once you kick off the restore you can check in the Portal or powershell to see how the job is going (under backup jobs).

#Check progress
Get-AzureRmRecoveryServicesBackupJobDetails -job $restorejob

WorkloadName Operation Status StartTime EndTime JobID 
------------ --------- ------ --------- ------- ----- 
azmonkeyvm02 Restore InProgress 14/05/2016 4:43:53 AM 1/01/0001 12:00:00 AM c29fdfea-0986-4cee-a43f-72d5d818bee6

When it is finished you’ll see the following in the portal with the name and location of your restored vhd files:

Azure Recovery Services - Restore Complete

And with that, job done – provision your VM again with your powershell scripts.

Data disks in Azure Resource Manager Powershell

The search for how to do some of what is below was one of the main reasons I started thinking about writing a technical blog… On to the interesting stuff!

First up – check your LUNs!

If you have already added data disks via the portal you should check what their current LUNs are to avoid any errors, run the following and look for the LUN tag in the disk output.

Note that the OS Disk is on a separate controller so ignore its LUN and only look for data disks!

Get-AzurermVM -ResourceGroupName AZMONKEY -Name AZMONKEYVM01

 

Add a new (empty) Data Disk to a VM

This is relatively straight forward, you specify how big you want the disk and set it to the empty creation option – one thing to keep in mind is if you are attaching multiple data disks you will need to specify the LUN number so there are no conflicts

#Add Data Disks - Suggest only adding same caching type at once and setup in Windows to avoid confusion 

#Specify your VM Name 
$vmName="AZMONKEYVM01" 
#Specify your Resource Group 
$rgName = "AZMONKEY" 
#Specify your Storage account name
$saName="azmonkeyplrs"
#This pulls your storage account info for use later 
$storageAcc=Get-AzureRmStorageAccount -ResourceGroupName $rgName -Name $saName 
#Pulls the VM info for later 
$vmdiskadd=Get-AzurermVM -ResourceGroupName $rgname -Name $vmname 
#Sets the URL string for where to store your vhd files - converts to https://azmonkeyplrs.blob.core.windows.net/vhds
#Also adds the VM name to the beginning of the file name 
$DataDiskUri=$storageAcc.PrimaryEndpoints.Blob.ToString() + "vhds/" + $vmName 
Add-AzureRmVMDataDisk -CreateOption empty -DiskSizeInGB 1023 -Name $vmname-Data01 -VhdUri $DataDiskUri-Data01.vhd -VM $vmdiskadd -Caching ReadOnly -lun 0 
Add-AzureRmVMDataDisk -CreateOption empty -DiskSizeInGB 1023 -Name $vmName-Data02 -VhdUri $DataDiskUri-Data02.vhd -VM $vmdiskadd -Caching ReadOnly -lun 1 
Add-AzureRmVMDataDisk -CreateOption empty -DiskSizeInGB 1023 -Name $vmName-Data03 -VhdUri $DataDiskUri-Data03.vhd -VM $vmdiskadd -Caching None -lun 2 
Add-AzureRmVMDataDisk -CreateOption empty -DiskSizeInGB 1023 -Name $vmName-Data04 -VhdUri $DataDiskUri-Data04.vhd -VM $vmdiskadd -Caching ReadWrite -lun 3 
#Updates the VM with the disk config - does not require a reboot 
Update-AzureRmVM -ResourceGroupName $rgname -VM $vmdiskadd

 

Re-attach an existing disk

So say you moved some data disk vhds from premium to standard storage or visa versa – you need to reattach those disks after you detached them… how do you do it? Well its not too hard with the right syntax!

#attach existing datadisks to VM
$rgName = "AZMONKEY"
$vmname = "AZMONKEYVM01"
$vmdiskadd=Get-AzurermVM -ResourceGroupName $rgname -Name $vmname

Add-AzureRMVMDataDisk -Name AZMONKEYVM01-Data03 -VM $vmdiskadd -VhdUri https://azmonkeyplrs.blob.core.windows.net/vhds/AZMONKEYVM01-Data03.vhd -LUN 2 -Caching None -CreateOption Attach -DiskSizeInGB 1023
Add-AzureRMVMDataDisk -Name AZMONKEYVM01-Data04 -VM $vmdiskadd -VhdUri https://azmonkeyplrs.blob.core.windows.net/vhds/AZMONKEYVM01-Data04.vhd -LUN 3 -Caching None -CreateOption Attach -DiskSizeInGB 1023
Update-AzureRmVM -ResourceGroupName $rgname -VM $vmdiskadd