Inconsistent Virtual Disk Locking in Distributed High Availability with Provisioning Services 5.6
Symptoms
When Provisioning Services 5.6 is used with a distributed High Availability (HA) model for the virtual disk stores, the environment-level view of virtual disk locks can fall out of sync with the actual status of the devices connected to specific virtual disks. In certain circumstances, this can lead to virtual disk corruption.
Note: This behavior relates to the virtual disk locking mechanism. HA failover ability is not affected by this behavior.
Scenario 1:
A target machine boots normally, but a virtual disk lock is created only on the Provisioning Services server serving the virtual disk and not on the second Provisioning Services server.
Result:
The properties of the virtual disk can now be edited on the second Provisioning Services server because it is not aware of the lock on the first server. This can lead to virtual disk corruption.
Scenario 2:
A target machine boots from one Provisioning Services server, but an HA failover event occurs that causes the target to fail over to the second Provisioning Services server. In this case, the only lock shown in the Provisioning Services console is the lock on the second Provisioning Services server, which is currently serving the virtual disk.
Result:
The properties of the virtual disk can be edited on the first Provisioning Services server, because it is not aware that the second Provisioning Services server is now streaming the virtual disk.
Scenario 3:
Following on from Scenario 2, if the target device fails over from the second Provisioning Services server back to the first, the lock is removed from the second server but never recreated on the first Provisioning Services server.
Result:
Neither server now shows a lock on the virtual disk. This means either server can be used to modify the virtual disk, leading to data corruption of the virtual disk itself.
Cause
Provisioning Services does not store virtual disk locks centrally. When HA functionality is executed in a distributed HA environment, each Provisioning Services server's view of the locks can therefore become inconsistent with the others.
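The three scenarios above can be illustrated with a toy model in which each server keeps its own lock table and consults only that table before allowing an edit. This is a sketch for illustration only: the Server class, its methods, and the names PVS-A, PVS-B, and vdisk1 are hypothetical and do not correspond to any Provisioning Services API or to its actual implementation.

```python
# Toy model of per-server (non-central) virtual disk locks.
# All names here are hypothetical; this is not Provisioning Services code.

class Server:
    def __init__(self, name):
        self.name = name
        self.locks = set()  # locks known to this server only

    def start_streaming(self, vdisk, record_lock=True):
        # record_lock=False models the failback case where no lock is recreated.
        if record_lock:
            self.locks.add(vdisk)

    def stop_streaming(self, vdisk):
        self.locks.discard(vdisk)

    def can_edit(self, vdisk):
        # Each server checks only its own lock table.
        return vdisk not in self.locks


def edit_view(first, second, vdisk):
    """Return (first.can_edit, second.can_edit) for the given virtual disk."""
    return (first.can_edit(vdisk), second.can_edit(vdisk))


a, b = Server("PVS-A"), Server("PVS-B")

# Scenario 1: the target boots from A; only A records the lock.
a.start_streaming("vdisk1")
scenario1 = edit_view(a, b, "vdisk1")

# Scenario 2: failover from A to B; only B now shows a lock.
a.stop_streaming("vdisk1")
b.start_streaming("vdisk1")
scenario2 = edit_view(a, b, "vdisk1")

# Scenario 3: failback from B to A; B's lock is removed, A's is never recreated.
b.stop_streaming("vdisk1")
a.start_streaming("vdisk1", record_lock=False)
scenario3 = edit_view(a, b, "vdisk1")

for label, view in [("Scenario 1", scenario1),
                    ("Scenario 2", scenario2),
                    ("Scenario 3", scenario3)]:
    print(label, "editable on (PVS-A, PVS-B):", view)
```

In every scenario at least one server reports the virtual disk as editable while a target is still streaming from it, which is the inconsistency described under Symptoms.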
Workaround
Use a shared storage solution for the Provisioning Services virtual disk stores. The problem occurs only with a distributed HA model and does not occur when shared storage is used.
More Information
CTX127549 – Provisioning Services 5.6 Best Practices