Recover the Management vSphere Cluster

Perform the following tasks to recover the management domain vSphere cluster.

Export the Cluster Settings from the Restored vCenter Server

Before you can restore vSphere clusters of the restored vCenter Server, you must first export the settings so they can be reapplied to the restored cluster.

Prerequisites

  • vCenter Server has been restored and powered on.

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    The values in this example are for the management domain. Replace with the values for the specific workload domain you are recovering.
    $restoredVcenterFqdn = "sfo-m01-vc01.sfo.rainpole.io"
    $restoredVcenterAdmin = "administrator@vsphere.local"
    $restoredVcenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
  3. Export the cluster settings by running the following commands in the PowerShell console.
    Connect-VIServer -server $restoredVcenterFqdn -user $restoredVcenterAdmin -password $restoredVcenterAdminPassword
    Backup-ClusterVMOverrides -clusterName $clusterName
    Backup-ClusterVMLocations -clusterName $clusterName
    Backup-ClusterDRSGroupsAndRules -clusterName $clusterName
    Backup-ClusterVMTags -clusterName $clusterName
    Disconnect-VIServer * -confirm:$false
  4. Save the JSON file output for later use.
  5. Repeat for all clusters in the vCenter Server.
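
Step 5 can be scripted rather than repeated by hand. The following is a minimal sketch, not part of the documented procedure. It assumes that the Backup-Cluster* cmdlets from step 3 are available in the session and that a Connect-VIServer session to the restored vCenter Server is still open. The function name Backup-AllClusterSettings is hypothetical.

```powershell
# Hypothetical helper (not part of any module): runs the four export cmdlets
# from step 3 once for every cluster in the connected vCenter Server.
# Assumes an active Connect-VIServer session and loaded Backup-Cluster* cmdlets.
function Backup-AllClusterSettings {
    # Collect the cluster names first so the loop works on plain strings.
    $clusterNames = @(Get-Cluster | Select-Object -ExpandProperty Name)

    foreach ($name in $clusterNames) {
        Write-Host "Exporting settings for cluster $name"
        Backup-ClusterVMOverrides -clusterName $name
        Backup-ClusterVMLocations -clusterName $name
        Backup-ClusterDRSGroupsAndRules -clusterName $name
        Backup-ClusterVMTags -clusterName $name
    }
}

# Example usage (run between Connect-VIServer and Disconnect-VIServer):
#   Backup-AllClusterSettings
```

Save the JSON output for every cluster, as described in step 4.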

Remove Non-Responsive ESXi Hosts from the Inventory

Before you can repair a failed vSphere cluster, you must first remove the non-responsive hosts from the cluster.

Prerequisites

Ensure you have exported all cluster settings before proceeding.

UI Procedure

  1. Log in to the NSX Manager for the workload domain and navigate to System > Fabric > Hosts > Clusters.
    In NSX 3.x, navigate to System > Fabric > Nodes > Host Transport Nodes and select the vCenter Server from Managed by.
  2. Select the check box next to the relevant vSphere cluster and click Remove NSX.
  3. Deselect the check box next to the relevant vSphere cluster.
  4. Expand the cluster and wait for all hosts in the cluster to go into an Orphaned state.
  5. Select the check box that selects all hosts in the cluster, without selecting the cluster object itself, and click Remove NSX.
  6. Select the Force option and submit.
    Wait until all hosts show as unconfigured.
  7. Log in to the vCenter Server with the non-responsive hosts and navigate to the cluster.
  8. Select the cluster and, in the right pane, navigate to the Hosts tab.
  9. Select the check box for each non-responsive host, right-click the selected hosts, and select Remove from Inventory.
    If the cluster uses vSphere Lifecycle Manager images, wait about a minute before proceeding to the next step so that the background tasks in NSX can finish removing the NSX solution from the cluster.
  10. Log in to the NSX Manager for the workload domain and navigate back to System > Fabric > Hosts > Clusters.
    In NSX 3.x, navigate to System > Fabric > Nodes > Host Transport Nodes and select the vCenter Server from Managed by.
  11. Select the check box next to the relevant vSphere cluster and click Configure NSX.
  12. Select the relevant Transport Node Profile and click Submit.

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    The values in this example are for the management domain. Replace with the values for the specific workload domain you are recovering.
    $restoredVcenterFqdn = "sfo-m01-vc01.sfo.rainpole.io"
    $restoredVcenterAdmin = "administrator@vsphere.local"
    $restoredVcenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    $restoredNsxManagerFqdn = "sfo-m01-nsx01.sfo.rainpole.io"
    $restoredNsxManagerAdmin = "admin"
    $restoredNsxManagerAdminPassword = "VMw@re1!VMw@re1!"
    $restoredNsxManagerRootPassword = "VMw@re1!VMw@re1!"
  3. Perform the configuration by running the command in the PowerShell console.
    Remove-NonResponsiveHosts -vCenterFQDN $restoredVcenterFqdn -vCenterAdmin $restoredVcenterAdmin -vCenterAdminPassword $restoredVcenterAdminPassword -clusterName $clusterName -NsxManagerFQDN $restoredNsxManagerFqdn -NsxManagerAdmin $restoredNsxManagerAdmin -NsxManagerAdminPassword $restoredNsxManagerAdminPassword -NsxManagerRootPassword $restoredNsxManagerRootPassword

Configure vSAN to Ignore Cluster Member Updates

To permit moving the vSAN cluster from the temporary vCenter Server to the restored vCenter Server, you must configure vSAN to ignore cluster member updates.

Manual Procedure

  1. Connect to each ESXi host by using SSH and log in as the root user.
  2. Run the following command:
    esxcli system settings advanced set --int-value=1 --option=/VSAN/IgnoreClusterMemberListUpdates

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $tempVcenterFqdn = "sfo-m01-vc02.sfo.rainpole.io"
    $tempVcenterAdmin = "administrator@vsphere.local"
    $tempVcenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    $setting = "enable"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
  3. Perform the configuration by running the command in the PowerShell console.
    Set-ClusterHostsvSanIgnoreClusterMemberList -vCenterFQDN $tempVcenterFqdn -vCenterAdmin $tempVcenterAdmin -vCenterAdminPassword $tempVcenterAdminPassword -clusterName $clusterName -setting $setting -extractedSDDCDataFile $extractedSDDCDataFile
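
To confirm that the setting was applied on every host, you can read it back with PowerCLI. The following is a minimal sketch under stated assumptions: an active Connect-VIServer session to the vCenter Server that currently manages the hosts, and the advanced setting name VSAN.IgnoreClusterMemberListUpdates, which is assumed to be the PowerCLI form of the esxcli option path /VSAN/IgnoreClusterMemberListUpdates. The function name Get-VsanIgnoreMemberListSetting is hypothetical.

```powershell
# Hypothetical helper: reports the current value of the vSAN
# IgnoreClusterMemberListUpdates advanced setting for each host in a cluster.
# Assumes an active PowerCLI session (Connect-VIServer).
function Get-VsanIgnoreMemberListSetting {
    param(
        [Parameter(Mandatory)] [string] $ClusterName
    )

    foreach ($vmHost in (Get-Cluster -Name $ClusterName | Get-VMHost)) {
        # Assumed setting name: dotted form of /VSAN/IgnoreClusterMemberListUpdates.
        $setting = Get-AdvancedSetting -Entity $vmHost -Name 'VSAN.IgnoreClusterMemberListUpdates'
        [pscustomobject]@{
            Host  = $vmHost.Name
            Value = $setting.Value
        }
    }
}

# Example usage:
#   Get-VsanIgnoreMemberListSetting -ClusterName $clusterName
```

After this section, each host should report a value of 1. The same check can be reused at the end of the recovery, when the setting is reverted to 0.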

Migrate Host Networking from vSphere Distributed Switch to vSphere Standard Switch

Before you can move the new vSAN cluster to the restored vCenter Server, you must first disconnect the hosts from the temporary vSphere Distributed Switch by using a temporary vSphere Standard Switch.

UI Procedure

  1. Disconnect a physical vmnic for each host from the vSphere Distributed Switch on the temporary vCenter Server.
    1. Log in to the temporary vCenter Server at https://<temporary_vcenter_server_fqdn>/ui by using the vSphere Client.
    2. In the Networking inventory, right-click the distributed switch and select Add and Manage Hosts.
    3. Select the Manage host networking task.
    4. On the Member hosts tab, select all hosts and click Next.
    5. On the Manage physical adapters page, select one vmnic, for example vmnic1, and then click Unassign adapter.
    6. In the Confirm Unassign Adapter dialog box, select Apply this operation to all other hosts and then click Unassign.
    7. Click Next.
    8. Click Next, Next, and Finish.
  2. Create a standard switch on each ESXi host.
    1. In the Hosts and clusters inventory, select the first ESXi host and, on the Configure tab, select Networking > Virtual switches.
    2. Click Add Standard Virtual Switch.
    3. Set the vSwitch Name to vSwitch0.
    4. Ensure the MTU is set to 9000.
    5. Repeat these steps for each ESXi host.
  3. Connect the ESXi hosts to the vSphere Standard Switch.
    1. On the Virtual switches page for the host, click Add Networking.
    2. In the Add Networking wizard, select Physical Network Adapter and click Next.
    3. On the Select target device page, select vSwitch0 and click Next.
    4. On the Add physical network adapter page, select an unassigned physical adapter, for example vmnic1, from the adapter list, move it under Active adapters, and click Next.
    5. Review the information on the Ready to complete page and click Finish.
  4. Create a temporary management port group on the temporary vSphere Standard Switch.
    1. On the Virtual switches page for the host, expand vSwitch0 and click Add Networking.
    2. In the Add Networking wizard, select Virtual Machine Port Group for a Standard Switch and click Next.
    3. On the Select target device page, the standard switch defaults to vSwitch0. Click Next.
    4. On the Connection settings page, change the Network label to temp_mgmt, set the VLAN ID to the VLAN ID of the management VLAN, and click Next.
    5. Review the information on the Ready to complete page and then click Finish.

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $tempVcenterFqdn = "sfo-m01-vc02.sfo.rainpole.io"
    $tempVcenterAdmin = "administrator@vsphere.local"
    $tempvCenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
    $vmnic = "vmnic1"
    $mtu = "9000"
  3. Perform the configuration by running the command in the PowerShell console.
    Move-ClusterHostNetworkingTovSS -vCenterFqdn $tempVcenterFqdn -vCenterAdmin $tempVcenterAdmin -vCenterAdminPassword $tempvCenterAdminPassword -clusterName $clusterName -extractedSDDCDataFile $extractedSDDCDataFile -vmnic $vmnic -mtu $mtu
    This PowerShell cmdlet is idempotent. You can run it again if you lose connectivity to the management interfaces of the ESXi hosts during this operation.

Prepare to Migrate the New vSphere Cluster to the Restored vCenter Server

Before you can migrate the new vSphere cluster to the restored vCenter Server, you must first prepare the cluster in the temporary vCenter Server.

UI Procedure

  1. Log in to the temporary vCenter Server at https://<temporary_vcenter_server_fqdn>/ui by using the vSphere Client.
  2. Set vSphere DRS to manual on the target cluster.
    1. In the Hosts and clusters inventory, select the target cluster.
    2. On the Configure tab, select vSphere DRS > Edit.
    3. Change the DRS level to Manual and click Save.
  3. Migrate all VMs to the first ESXi host in the cluster.
    1. Right-click each VM and select Migrate.
    2. Select the first host in the cluster as the target and click Migrate.

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $tempVcenterFqdn = "sfo-m01-vc02.sfo.rainpole.io"
    $tempVcenterAdmin = "administrator@vsphere.local"
    $tempvCenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
  3. Set vSphere DRS to manual by running the command in the PowerShell console.
    Connect-VIServer $tempVcenterFqdn -user $tempVcenterAdmin -password $tempvCenterAdminPassword
    Set-Cluster -cluster $clusterName -DrsAutomationLevel "Manual" -confirm:$false
    Disconnect-VIServer -Server $global:DefaultVIServers -Force -Confirm:$false
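
The PowerShell procedure above covers only the DRS change. The migration of all VMs to the first ESXi host, described in step 3 of the UI procedure, could be scripted with standard PowerCLI cmdlets. The following is a minimal sketch, not part of the documented procedure. It assumes an active Connect-VIServer session to the temporary vCenter Server and treats the first host in name order as the "first ESXi host", which is an assumption. The function name Move-ClusterVmsToFirstHost is hypothetical.

```powershell
# Hypothetical helper: migrates every VM in the cluster to the first ESXi host
# (first by name, which is an assumption). Assumes an active PowerCLI session.
function Move-ClusterVmsToFirstHost {
    param(
        [Parameter(Mandatory)] [string] $ClusterName
    )

    $hosts = @(Get-Cluster -Name $ClusterName | Get-VMHost | Sort-Object -Property Name)
    if ($hosts.Count -eq 0) {
        throw "No hosts found in cluster $ClusterName"
    }
    $target = $hosts[0]

    # Only VMs that are not already running on the target host need to move.
    $vms = @(Get-Cluster -Name $ClusterName | Get-VM |
        Where-Object { $_.VMHost.Name -ne $target.Name })

    foreach ($vm in $vms) {
        Move-VM -VM $vm -Destination $target -Confirm:$false | Out-Null
    }

    # Return the name of the host that now runs all VMs.
    return $target.Name
}

# Example usage (after Connect-VIServer, with DRS already set to Manual):
#   Move-ClusterVmsToFirstHost -ClusterName $clusterName
```

Set DRS to Manual before migrating the VMs so that DRS does not move them back to other hosts.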

Move All Management VMs to a Temporary Port Group on the vSphere Standard Switch

Before you can move the new vSAN cluster to the restored vCenter Server, you must first move all management VMs to a temporary port group on the temporary vSphere Standard Switch.

UI Procedure

  1. Log in to the temporary vCenter Server.
  2. In the VMs and templates inventory, right-click the VM for the temporary vCenter Server and select Edit Settings.
  3. Move network adapter 1 to the temp_mgmt port group on the standard switch, and click OK.
  4. Repeat for all VMs in the temporary vCenter Server.

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $tempVcenterFqdn = "sfo-m01-vc02.sfo.rainpole.io"
    $tempVcenterAdmin = "administrator@vsphere.local"
    $tempvCenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
  3. Perform the configuration by running the command in the PowerShell console.
    Move-MgmtVmsToTempPg -vCenterFQDN $tempVcenterFqdn -vCenterAdmin $tempVcenterAdmin -vCenterAdminPassword $tempvCenterAdminPassword -clusterName $clusterName

Remove the ESXi Hosts from the vSphere Distributed Switch on the Temporary vCenter Server

Before you can move the new vSAN cluster to the restored vCenter Server, you must first remove all ESXi hosts from the temporary vSphere Distributed Switch.

UI Procedure

  1. Disconnect the remaining physical adapters from the vSphere Distributed Switch.
    1. Log in to the temporary vCenter Server at https://<temporary_vcenter_server_fqdn>/ui by using the vSphere Client.
    2. In the Networking inventory, right-click the distributed switch and select Add and Manage Hosts.
    3. Select the Manage host networking task.
    4. On the Member hosts tab, select all hosts and click Next.
    5. On the Manage physical adapters page, select the remaining physical adapters, for example vmnic0, and then click Unassign adapter.
    6. In the Confirm Unassign Adapter dialog box, select Apply this operation to all other hosts and then click Unassign.
      You are returned to the Manage physical adapters page. Note that vmnic0 is marked as Unassigned.
    7. On the Manage physical adapters page, click Next.
    8. Click Next, Next, and Finish.
  2. Remove the ESXi hosts from the vSphere Distributed Switch on the temporary vCenter Server.
    1. Connect to the temporary vCenter UI and select Networking.
    2. In the Networking inventory, right-click the distributed switch and select Add and Manage Hosts.
    3. Select the Remove hosts task.
    4. On the Member hosts tab, select all hosts and click Next.
    5. Review the information on the Ready to complete page and click Finish.

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $tempVcenterFqdn = "sfo-m01-vc02.sfo.rainpole.io"
    $tempVcenterAdmin = "administrator@vsphere.local"
    $tempvCenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    $vdsName = "sfo-m01-cl01-vds01"
  3. Perform the configuration by running the command in the PowerShell console.
    Remove-ClusterHostsFromVds -vCenterFQDN $tempVcenterFqdn -vCenterAdmin $tempVcenterAdmin -vCenterAdminPassword $tempvCenterAdminPassword -clusterName $clusterName -vdsName $vdsName

Migrate ESXi Hosts from the Temporary vCenter Server to the Restored vCenter Server

You migrate the newly deployed vSAN cluster to the restored vCenter Server.

UI Procedure

  1. Log in to the restored vCenter Server for the management domain by using the vSphere Client.
  2. Right-click the restored cluster and click Add Hosts.
  3. In the Add hosts wizard, enter the ESXi host names for the management cluster, including the user name and password, and click Next.
    A security alert dialog box appears because vCenter Server is not able to verify the certificate thumbprint.
  4. In the security alert dialog box, select all hosts and click OK.
  5. Review the information on the Host summary page and click Next.
  6. Review the information on the Ready to complete page and click Finish.

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $tempvCenterFqdn = "sfo-m01-vc02.sfo.rainpole.io"
    $tempvCenterAdmin = "administrator@vsphere.local"
    $tempvCenterAdminPassword = "VMw@re1!"
    $restoredvCenterFQDN = "sfo-m01-vc01.sfo.rainpole.io"
    $restoredvCenterAdmin = "administrator@vsphere.local"
    $restoredvCenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
  3. Perform the configuration by running the command in the PowerShell console.
    Move-ClusterHostsToRestoredVcenter -tempvCenterFqdn $tempvCenterFqdn -tempvCenterAdmin $tempvCenterAdmin -tempvCenterAdminPassword $tempvCenterAdminPassword -restoredvCenterFQDN $restoredvCenterFQDN -restoredvCenterAdmin $restoredvCenterAdmin -restoredvCenterAdminPassword $restoredvCenterAdminPassword -clusterName $clusterName -extractedSDDCDataFile $extractedSDDCDataFile

Migrate the ESXi Hosts and VMkernel Adapters to the vSphere Distributed Switch on the Restored vCenter Server

You connect the new cluster hosts to the vSphere Distributed Switch and migrate the VMkernel adapters.

UI Procedure

  1. Log in to the restored vCenter Server by using the vSphere Client.
  2. Add the hosts to the vSphere Distributed Switch.
    1. In the Networking inventory, right-click the vSphere Distributed Switch and select Add and Manage Hosts.
    2. Select Add Hosts and click Next.
    3. On the Select hosts page, select all the ESXi hosts and click Next.
    4. On the Manage physical adapters page, select a free physical adapter, for example vmnic0, and click Assign Uplink.
    5. Select uplink1 and click Next.
    6. On the Manage VMkernel adapters page, assign the following VMkernel adapters to the appropriate port group on the new distributed switch.
      VMkernel Migration by Domain Type
      Management Domain                     | VI Workload Domain
      vmk0 – Management Network port group  | vmk0 – Management Network port group
      vmk1 – vMotion Network port group     | N/A - Not yet created
      vmk2 – vSAN Network port group        | N/A - Not yet created
    7. To assign a VMkernel adapter, select the adapter and, under actions for the corresponding port group, click Assign.
    8. Click Next.
    9. On the Migrate VM networking step, click Next.
    10. Review the information on the Ready to complete page and click Finish.
  3. If this is a management domain cluster, migrate the management VMs to the original management port group.
    1. Right-click the temporary management port group and select Migrate VMs to Another Network.
    2. For the destination network, select the management port group on the vSphere Distributed Switch, for example sfo-m01-vc01-vds01-management, and click Next.
    3. On the Select VMs to migrate page, select all management VMs and click Next.
    4. On the Ready to complete page, click Finish.
  4. Remove the temporary standard switch on each ESXi host.
    1. Select the first ESXi host and, on the Configure tab, select Networking > Virtual Switches.
    2. Expand vSwitch0 and click the horizontal ellipsis.
    3. Click Remove and click Yes.
    4. Repeat these steps for each ESXi host.
  5. Add additional host uplinks to the vSphere Distributed Switch.
    1. Right-click the distributed switch and select Add and Manage Hosts.
    2. Select Manage Host Networking and click Next.
    3. On the Select hosts step, select all the ESXi hosts and click Next.
    4. On the Manage physical adapters step, select the required free physical adapters, for example vmnic1, use Assign Uplink to map each adapter to the desired uplink, and click Next.
    5. Click Next and click Next again.
    6. Review the information on the Ready to complete page and click Finish.
If you are running NSX 4.1.2 or later, the NSX installation on each host in the vSphere cluster should self-heal. Before proceeding, monitor the self-healing process in the NSX Manager UI at System > Fabric > Hosts > Clusters until it completes.

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $restoredvCenterFQDN = "sfo-m01-vc01.sfo.rainpole.io"
    $restoredvCenterAdmin = "administrator@vsphere.local"
    $restoredvCenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
  3. Perform the configuration by running the command in the PowerShell console.
    New-RebuiltVdsConfiguration -vCenterFQDN $restoredVcenterFqdn -vCenterAdmin $restoredVcenterAdmin -vCenterAdminPassword $restoredVcenterAdminPassword -clusterName $clusterName -extractedSDDCDataFile $extractedSDDCDataFile
At this point, the NSX installation on each host in the vSphere cluster should self-heal. Before proceeding, monitor the self-healing process in the NSX Manager UI at System > Fabric > Hosts > Clusters until it completes. The process might take several minutes to start. If the hosts show an error stating that they are not part of the distributed switch, the self-healing process has not started yet.

Configure vSAN Back to Honour Cluster Member Updates

After the vSAN cluster has been moved to the restored vCenter Server, you must configure vSAN to honour cluster member updates again.

Manual Procedure

  1. Connect to each ESXi host by using SSH and log in as the root user.
  2. Run the following command:
    esxcli system settings advanced set --int-value=0 --option=/VSAN/IgnoreClusterMemberListUpdates

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $restoredVcenterFqdn = "sfo-m01-vc01.sfo.rainpole.io"
    $restoredVcenterAdmin = "administrator@vsphere.local"
    $restoredVcenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    $setting = "disable"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
  3. Perform the configuration by running the command in the PowerShell console.
    Set-ClusterHostsvSanIgnoreClusterMemberList -vCenterFQDN $restoredVcenterFqdn -vCenterAdmin $restoredVcenterAdmin -vCenterAdminPassword $restoredVcenterAdminPassword -clusterName $clusterName -setting $setting -extractedSDDCDataFile $extractedSDDCDataFile