Sunday, September 29, 2013

Hyper-V 2012 Live Storage Migration Stops at 99% Dell Compellent Storage ODX

Recently, during a Hyper-V 2012 implementation, I came across an issue where Live Storage Migrations would stop at 99% and inevitably fail.

The customer was using Dell Compellent Shared Storage with iSCSI connectivity.
The Dell Compellent Storage supported ODX functionality and thus ODX functionality in Windows Server 2012 was enabled.

This issue was resolved with the following Microsoft Hotfix:
Offloaded Data Transfers fail on a computer that is running Windows 8 or Windows Server 2012

Hyper-V 2012 CSV IO issues when deleting or moving Virtual Disks

Recently I was involved in a Hyper-V 2012 implementation. The environment consisted of Dell M620 Blades and Dell Compellent Shared Storage over 10Gbps iSCSI connectivity.

After the implementation, we began migrating VMs from the previous VMware environment to the new Hyper-V 2012 environment. As part of this process we performed V2V conversions in Virtual Machine Manager 2012 SP1. As you may be aware, a V2V conversion produces a VHD, so to leverage the new VHDX capacity and performance features we then converted each VHD to VHDX format. The final step was removing the leftover VHD... and this is where the problems began.

While deleting a VHD from a Cluster Shared Volume (CSV), we began to see all the VMs on the hosts pause or restart. After further investigation we came across the following event entries:

A. After we deleted the VHD files, an “IO_TIMEOUT” issue was reported on the host machine.

00002e64.00001a4c::2013/07/23-13:46:37.844 INFO  [DCM] CsvFs event CsvFsVolumeStateChangeFromIO for volume NSW_NSWHYPERV-01_06_T23_DR:bcd163b1-00bd-4aa0-b524-9e3758f106ec, status STATUS_IO_TIMEOUT(c00000b5)
00002e64.00001a4c::2013/07/23-13:46:37.844 INFO  [DCM] UnmapPolicy::enter_CountingToBad(bcd163b1-00bd-4aa0-b524-9e3758f106ec): goodTimer P0...75, badTimer R0...150, badCounter 1 state CountingToBad

B. To handle the CSV c00000b5 event, the volume state was changed to ‘Draining’, then ‘Paused’:

00002e64.00001a4c::2013/07/23-13:46:37.849 INFO  [DCM] CsvFs Listener: state [volume bcd163b1-00bd-4aa0-b524-9e3758f106ec, sequence <4:9><23>, state CsvFsVolumeStateChangeFromDrain->CsvFsVolumeStateDraining, status 0x0]
00002e64.00000ac0::2013/07/23-13:46:37.849 INFO  [DCM] FilterAgent: MarkBad: volume NSW_NSWHYPERV-01_06_T23_DR, state ActiveDirectIO, csvfs state CsvFsVolumeStateDraining, status CsvFsVolumeStateDraining
00002e64.00000ac0::2013/07/23-13:46:37.849 INFO  [DCM] FilterAgent: MarkBad() Volume \\?\GLOBALROOT\Device\Harddisk3\ClusterPartition2\ has gone into draining state
00002e64.00000ac0::2013/07/23-13:46:37.851 INFO  [DCM] MappingManager::PauseVolume 'NSW_NSWHYPERV-01_06_T23_DR'
00002e64.00000ac0::2013/07/23-13:46:37.851 INFO  [DCM] FilterAgent: pausing volume bcd163b1-00bd-4aa0-b524-9e3758f106ec, target \\?\GLOBALROOT\Device\Harddisk3\ClusterPartition2\
00002e64.00000ac0::2013/07/23-13:46:37.851 INFO  [DCM] FilterAgent: ChangeCsvFsState: uniqueId bcd163b1-00bd-4aa0-b524-9e3758f106ec, state CsvFsVolumeStatePaused, sequence <4:10><24>
00002e64.00001a4c::2013/07/23-13:46:37.881 INFO  [DCM] CsvFs Listener: state [volume bcd163b1-00bd-4aa0-b524-9e3758f106ec, sequence <4:10><24>, state CsvFsVolumeStateChangeFromPause->CsvFsVolumeStatePaused, status 0x0]
00002e64.00000ac0::2013/07/23-13:46:37.881 INFO  [DCM] short name is C:\CLUSTE~1\NSW_NS~4
00002e64.00000ac0::2013/07/23-13:46:37.881 INFO  [DCM] PauseNFilter bcd163b1-00bd-4aa0-b524-9e3758f106ec, NTFS: 00000000-0000-0000-0000-000000000000
00002e64.00000e24::2013/07/23-13:46:37.881 INFO  [DCM] Set NfltPauseComplete event for  bcd163b1-00bd-4aa0-b524-9e3758f106ec
00002e64.00000ac0::2013/07/23-13:46:37.881 INFO  [DCM] PauseNFilter completed for bcd163b1-00bd-4aa0-b524-9e3758f106ec
00002e64.00000ac0::2013/07/23-13:46:37.881 INFO  [DCM] FilterAgent: PauseSnapshots completed for NSW_NSWHYPERV-01_06_T23_DR:bcd163b1-00bd-4aa0-b524-9e3758f106ec
00002e64.00000ac0::2013/07/23-13:46:37.881 INFO  [DCM] volume paused 'NSW_NSWHYPERV-01_06_T23_DR'
00002e64.000031a4::2013/07/23-13:46:37.881 INFO  [DCM] PauseDisk completed for resource 'NSW_NSWHYPERV-01_06_T23_DR'

C. After several minutes, the CSV came back online:
00002e64.00000ac0::2013/07/23-13:49:15.602 INFO  [DCM] ActivateNetworkPath bcd163b1-00bd-4aa0-b524-9e3758f106ec completed
00002e64.00000ac0::2013/07/23-13:49:15.602 INFO  [DCM] ActivateVolume: NSW_NSWHYPERV-01_06_T23_DR:bcd163b1-00bd-4aa0-b524-9e3758f106ec
00002e64.00000ac0::2013/07/23-13:49:15.602 INFO  [DCM] Bitlocker not installed or loaded
00002e64.00000ac0::2013/07/23-13:49:15.602 INFO  [DCM] FilterAgent: ChangeCsvFsState: uniqueId bcd163b1-00bd-4aa0-b524-9e3758f106ec, state CsvFsVolumeStateActive, sequence <><31>
00002e64.00001a4c::2013/07/23-13:49:15.602 INFO  [DCM] CsvFs Listener: state [volume bcd163b1-00bd-4aa0-b524-9e3758f106ec, sequence <><31>, state CsvFsVolumeStateChangeFromActivate->CsvFsVolumeStateActive, status 0x0]
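If you want to put a number on how long a volume stayed paused, the timestamps in the cluster log can be subtracted. The following is only a rough sketch: it uses two lines copied from the excerpt above, and the regular expression is my own assumption about the line layout, not an official log format definition.

```powershell
# Two ChangeCsvFsState lines copied from the cluster log excerpt above.
$logLines = @(
    '00002e64.00000ac0::2013/07/23-13:46:37.851 INFO  [DCM] FilterAgent: ChangeCsvFsState: uniqueId bcd163b1-00bd-4aa0-b524-9e3758f106ec, state CsvFsVolumeStatePaused, sequence <4:10><24>',
    '00002e64.00000ac0::2013/07/23-13:49:15.602 INFO  [DCM] FilterAgent: ChangeCsvFsState: uniqueId bcd163b1-00bd-4aa0-b524-9e3758f106ec, state CsvFsVolumeStateActive, sequence <><31>'
)

# Assumed layout: "<pid>.<tid>::<yyyy/MM/dd-HH:mm:ss.fff> ... ChangeCsvFsState: ..., state <StateName>"
$pattern = '::(?<stamp>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3}) .*ChangeCsvFsState: .*, state (?<state>CsvFsVolumeState\w+)'
$culture = [System.Globalization.CultureInfo]::InvariantCulture

# Turn each matching line into an object with a parsed timestamp and the new state.
$events = foreach ($line in $logLines) {
    if ($line -match $pattern) {
        New-Object PSObject -Property @{
            Time  = [datetime]::ParseExact($Matches['stamp'], 'yyyy/MM/dd-HH:mm:ss.fff', $culture)
            State = $Matches['state']
        }
    }
}

$paused = $events | Where-Object { $_.State -eq 'CsvFsVolumeStatePaused' } | Select-Object -First 1
$active = $events | Where-Object { $_.State -eq 'CsvFsVolumeStateActive' } | Select-Object -First 1

# Time between the volume being paused and becoming active again.
$outage = $active.Time - $paused.Time
'Volume was paused for {0:N1} seconds' -f $outage.TotalSeconds
```

For this excerpt the gap works out to a little over two and a half minutes. To run it against a real log, replace $logLines with Get-Content on the cluster log file.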

For a period of up to 15 minutes, VMs would hang or restart. Recovery of the VMs varied, and in some extreme cases we actually had data corruption.

Needless to say we got Microsoft Premier Support involved.

We were quickly advised to apply:
Disable ODX in our environment:
Set-ItemProperty hklm:\system\currentcontrolset\control\filesystem -Name "FilterSupportedFeaturesMode" -Value 1

However we were reluctant to do this as we in fact had a nice new storage array that supports this super cool functionality. We applied the hotfix and didn't disable ODX (as mentioned above).
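Before and after touching this setting it is worth checking what a host is currently set to. Here is a minimal sketch, assuming the same registry value as the command above (0 means ODX is enabled, 1 means it is disabled):

```powershell
# Registry location used by the Set-ItemProperty command above.
$path = 'HKLM:\SYSTEM\CurrentControlSet\Control\FileSystem'
$name = 'FilterSupportedFeaturesMode'

# Read the current value; $null is returned if the value is not present.
$current = (Get-ItemProperty -Path $path -Name $name -ErrorAction SilentlyContinue).$name

if ($null -eq $current) {
    Write-Output "$name is not set on this host"
} elseif ($current -eq 1) {
    Write-Output 'ODX is disabled on this host'
} else {
    Write-Output 'ODX is enabled on this host'
}

# To turn ODX back on later, set the value back to 0:
# Set-ItemProperty -Path $path -Name $name -Value 0
```

Run it on every host in the cluster so the setting stays consistent across nodes.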

Unfortunately this didn't resolve our issues...DAMN!

We then came across the following hotfixes and also applied these to all hosts:

Unfortunately this didn't resolve our issues either...DAMN!

We then tried updating the Broadcom network drivers on all hosts.
No luck either!

Finally we applied the following hotfix (KB2870270):

Horaaa!!!! This was the lucky hotfix. This took over a month to resolve and was a very time consuming and frustrating process. I hope this helps anyone else out there who is currently going through similar pain...

Wednesday, September 25, 2013

Pause SCCM Task Sequence with PowerShell

Lately I have found myself adding a lot of pauses into Task Sequences for some buggy applications. It seems to be a lot more prevalent in SCCM 2012 SP1 and Windows 7 Task Sequences.

Anyhow, if for whatever reason you need to create a pause in a Task Sequence (Sophos is an application that comes to mind), here are the steps:

1. Create a 'Run Command Line' step in the task sequence.

2. Add the following PowerShell command to the command line field in the new step:
powershell.exe -executionpolicy bypass -command "Start-Sleep -s 120"

Note: -s specifies the number of seconds to pause for. The -m switch can also be used, but it is milliseconds NOT minutes :)
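To make the difference between the two switches obvious, here is a small sketch using the full parameter names (-s is short for -Seconds and -m is short for -Milliseconds). The durations are just examples.

```powershell
# Pause for half a second and measure how long it actually took.
$elapsed = Measure-Command { Start-Sleep -Milliseconds 500 }
'Slept for roughly {0:N0} ms' -f $elapsed.TotalMilliseconds

# There is no minutes switch, so express minutes as seconds instead.
$minutes = 2
Start-Sleep -Seconds ($minutes * 60)
```

In the task sequence step the same idea would look like: powershell.exe -executionpolicy bypass -command "Start-Sleep -Seconds (2 * 60)"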

Happy Task Sequencing!

Thursday, September 12, 2013

Disable Features in SCCM Task Sequence

Back doing an SOE for a customer and thought I would share how I disable and enable features in an SCCM Task Sequence. This is a good alternative if you do not want to modify the unattend.xml file.

For a while now DISM.exe has been used to modify features and files in a WIM image. I have put together a simple command that will disable a feature. You can run this within your task sequence as a separate command or run multiple iterations within a batch script.

Here is an example of how to disable all the games in a Task Sequence:

start "" /w /d "%SystemRoot%\System32" "dism.exe" /online /disable-feature /featurename:InboxGames /Quiet /NoRestart
start "" /w /d "%SystemRoot%\System32" "dism.exe" /online /disable-feature /featurename:"More Games" /Quiet /NoRestart
start "" /w /d "%SystemRoot%\System32" "dism.exe" /online /disable-feature /featurename:Solitaire /Quiet /NoRestart
start "" /w /d "%SystemRoot%\System32" "dism.exe" /online /disable-feature /featurename:SpiderSolitaire /Quiet /NoRestart
start "" /w /d "%SystemRoot%\System32" "dism.exe" /online /disable-feature /featurename:Hearts /Quiet /NoRestart
start "" /w /d "%SystemRoot%\System32" "dism.exe" /online /disable-feature /featurename:FreeCell /Quiet /NoRestart
start "" /w /d "%SystemRoot%\System32" "dism.exe" /online /disable-feature /featurename:Minesweeper /Quiet /NoRestart
start "" /w /d "%SystemRoot%\System32" "dism.exe" /online /disable-feature /featurename:PurblePlace /Quiet /NoRestart
start "" /w /d "%SystemRoot%\System32" "dism.exe" /online /disable-feature /featurename:Chess /Quiet /NoRestart
start "" /w /d "%SystemRoot%\System32" "dism.exe" /online /disable-feature /featurename:Shanghai /Quiet /NoRestart
start "" /w /d "%SystemRoot%\System32" "dism.exe" /online /disable-feature /featurename:"Internet Games" /Quiet /NoRestart
start "" /w /d "%SystemRoot%\System32" "dism.exe" /online /disable-feature /featurename:"Internet Checkers" /Quiet /NoRestart
start "" /w /d "%SystemRoot%\System32" "dism.exe" /online /disable-feature /featurename:"Internet Backgammon" /Quiet /NoRestart
start "" /w /d "%SystemRoot%\System32" "dism.exe" /online /disable-feature /featurename:"Internet Spades" /Quiet /NoRestart

I have put these commands into a batch file named customisations.bat. I then add the batch file to a package in SCCM.

From there I create a command line step in the Task Sequence that runs the batch file from the package.
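If you would rather keep the feature list in one place, the same thing can be sketched in PowerShell as a loop. This is an alternative to the batch file, not what I used above, and the exit code handling (0 for success, 3010 for success with a restart pending) is the only extra logic.

```powershell
# Feature names taken from the batch file above (Windows 7 games features).
$features = @(
    'InboxGames', 'More Games', 'Solitaire', 'SpiderSolitaire', 'Hearts',
    'FreeCell', 'Minesweeper', 'PurblePlace', 'Chess', 'Shanghai',
    'Internet Games', 'Internet Checkers', 'Internet Backgammon', 'Internet Spades'
)

$dism = Join-Path $env:SystemRoot 'System32\dism.exe'

foreach ($feature in $features) {
    # Build the same arguments the batch file passes to dism.exe.
    $arguments = "/online /disable-feature /featurename:`"$feature`" /Quiet /NoRestart"
    $process = Start-Process -FilePath $dism -ArgumentList $arguments -Wait -PassThru -NoNewWindow

    # 0 = success, 3010 = success but a restart is required.
    if ((0, 3010) -notcontains $process.ExitCode) {
        Write-Warning "DISM returned $($process.ExitCode) for feature '$feature'"
    }
}
```

Saved as a script (for example a hypothetical customisations.ps1 in the same package), it could be called from a command line step with powershell.exe -executionpolicy bypass -file customisations.ps1.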

Monday, September 9, 2013

Testing Windows Server 2012 R2 Hyper-V Storage QoS with IOMeter

Recently I did some testing of Storage QoS in Server 2012 R2 with IOMeter. I was quite impressed by how quickly IO can be reduced on an IO-hungry VM.
For those who haven't heard of this great new functionality, here is a summary.

Storage Quality of Service (QoS)
Provides the ability to specify maximum and minimum I/O loads in terms of operations per second (IOPS) for each virtual disk on the VM.
Prevents VMs from consuming all of the available I/O bandwidth to the underlying physical resource.
Supports Dynamic, Fixed and Differencing Virtual Disks.

Here is a demo of this in action. At first I maxed out the IOPS with IOMeter; I then configured Storage QoS and straight away the IOPS were reduced.
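The same setting can be applied from PowerShell on a 2012 R2 host. This is a minimal sketch, assuming a VM named IOMeterVM (a made-up name) and example limits; pick values that suit your own storage.

```powershell
# Hypothetical VM name and limits, for illustration only.
$vmName  = 'IOMeterVM'
$maxIops = 500

# Cap every virtual disk attached to the VM at the chosen maximum IOPS.
Get-VMHardDiskDrive -VMName $vmName | Set-VMHardDiskDrive -MaximumIOPS $maxIops

# Confirm what is now configured on each virtual disk.
Get-VMHardDiskDrive -VMName $vmName |
    Select-Object VMName, Path, MinimumIOPS, MaximumIOPS

# Setting the maximum back to 0 removes the cap again:
# Get-VMHardDiskDrive -VMName $vmName | Set-VMHardDiskDrive -MaximumIOPS 0
```

As far as I am aware, Hyper-V counts these IOPS in normalised 8 KB units, so the number IOMeter reports may not line up exactly with the cap if your test uses a different block size.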

SCCM 2012 SP1 Reports not running from the Config Manager Console

Recently I had an issue running SCCM 2012 SP1 reports from within the Configuration Manager Console. They would run from SQL Reporting Services fine.
The SQL Reporting Services were running from a remote server.

I checked the smsadminui.log and came across the following error:

The issue was a result of Report Viewer not working as expected.
To resolve the issue I removed the Report Viewer and reinstalled.
You can re-run the Report Viewer installer from the following path:
C:\Program Files\Microsoft Configuration Manager\AdminConsole\bin\ReportViewer.exe

Sunday, September 8, 2013

Hyper-V Replica over a dedicated network

Recently I was involved in a Hyper-V Replica engagement with one of our large customers. One of the project requirements was to use a dedicated and isolated Replication Network.

The customer required Hyper-V Replica to replicate VMs between two Hyper-V Failover Clusters, and as a result the Hyper-V Replica Broker role was installed.

As you may have already discovered, by default the Hyper-V Replica IP will be configured on the Parent Network. This is problematic when large amounts of data need to be replicated over a shared management network.

Before beginning it is important that the following is ready:
  • Your dedicated network or team.
  • You have an internal Certificate Authority (if you are using an isolated network). Alternatively you can make your Replica Network routable and you can use Kerberos Authentication for Hyper-V Replica. If this is a test environment a self-signed certificate will suffice also :)
  • Create a Server and Client Authentication Certificate Template that allows the use of Subject Alternative Names. This is very important when replicating between Failover Clusters. This blog provides great detail on how to configure a Hyper-V Replica Certificate Template:

To configure Hyper-V Replica over a dedicated and isolated replication network there are a few things I have discovered along the way. Here are the steps I took:

1. Ensure you have an adequate teamed network to carry Hyper-V Replica traffic. In this instance I wanted to share the Hyper-V Replica traffic with the CSV Network. In an ideal world you would have a dedicated team or converged fabric for this.

2. Before running the Hyper-V Replica Broker role install, the network must be configured to ‘Allow clients to connect through this network’ in Failover Cluster Manager under Networks. Right-click the network you want to use for Hyper-V Replica traffic and click ‘Properties’.

3. Install the Hyper-V Replica Broker role with your required name and corresponding IP Address, note that by default the Parent Network is selected. You need to change this to the dedicated network team. Replica Broker Install steps are here:

4. Request a certificate from your CA. The certificate must include the following names as Subject Alternative Names (alternatively, for large environments you could use a wildcard). As mentioned, follow this great blog post on how to create the Certificate Template and request the certificate from the Hyper-V hosts.

                - All Hyper-V Replica Broker Role Fully Qualified Domain Names.

                - All Hyper-V Host Fully Qualified Domain Names.

5. Hyper-V Replica can use one of two primary methods to trust inbound/outbound replication: Kerberos authentication or certificate-based authentication. By default, Hyper-V Replica is configured using Kerberos authentication. For Kerberos authentication to function, the Hyper-V Replica Network must be able to communicate with Active Directory. In scenarios where Hyper-V is not domain-joined or the Hyper-V Replica network is isolated, certificate-based authentication is the only authentication method that can be leveraged.

6. If you are using separate networks for Hyper-V Replica traffic, you will need to configure a persistent route to ensure Hyper-V Replica traffic routes correctly. Use the route add -p command on all Hyper-V hosts.

7. Configure your Hosts file to include the FQDN and NetBIOS names of each Hyper-V host that needs to be replicated “to and from”. It was discovered the Hyper-V Replica Broker will still use DNS to resolve the names of each Hyper-V host and as a result will still resolve the Parent Network IP Address. The only option is to update the C:\windows\system32\drivers\etc\hosts file to include the replica network IP address on EVERY SINGLE HYPER-V Host.

8. You can now configure Hyper-V Replica on your Virtual Machines. Check Task Manager Network performance to ensure that the Initial Replication is copying across the dedicated Hyper-V Replica Network.

That’s it! This will need to be configured identically on the corresponding Replica Site.
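For anyone who prefers to script the host-side pieces (steps 2, 6 and 7), here is a rough sketch. Every name and address in it is made up for illustration: the cluster network name, the subnets, the gateway and the host entries all need to be replaced with your own values, and it assumes an elevated PowerShell session on a cluster node.

```powershell
Import-Module FailoverClusters

# Step 2: allow clients to connect through the replica cluster network.
# 'Replica' is a hypothetical cluster network name. Role 3 = cluster and client.
$replicaNetworkName = 'Replica'
(Get-ClusterNetwork -Name $replicaNetworkName).Role = 3

# Step 6: persistent route to the remote site's replica subnet.
# Hypothetical values: remote subnet 10.0.20.0/24 via local gateway 10.0.10.1.
route.exe add -p 10.0.20.0 mask 255.255.255.0 10.0.10.1

# Step 7: hosts file entries pointing each Hyper-V host name at its replica network IP.
# Hypothetical addresses and names (FQDN followed by NetBIOS name).
$hostsFile = Join-Path $env:SystemRoot 'System32\drivers\etc\hosts'
$entries = @(
    "10.0.20.11`thv-b1.contoso.local`thv-b1",
    "10.0.20.12`thv-b2.contoso.local`thv-b2"
)

foreach ($entry in $entries) {
    # Only append the line if it is not already in the hosts file.
    if (-not (Select-String -Path $hostsFile -SimpleMatch -Pattern $entry -Quiet)) {
        Add-Content -Path $hostsFile -Value $entry
    }
}
```

The route and hosts file parts need to be run on every Hyper-V host at both sites, with the addresses swapped around for the other side.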