Recovery Plan for a VMware Cloud Foundation Instance

This section describes the tasks required to recover an entire VMware Cloud Foundation instance.

Recovery Timeline for the VMware Cloud Foundation Instance

To minimize the overall recovery time in VMware Cloud Foundation, you can perform recovery tasks across multiple workload domains in parallel by following this timeline, adapted for your setup. The timeline is for the following example setup:
  • 3 VI workload domains.
  • VI Workload Domain 1 and VI Workload Domain 2 are in the same vCenter Single Sign-On domain as the management domain. They are in Enhanced Linked Mode (ELM).
  • VMware Cloud Foundation 5.x only. VI Workload Domain 3 is in an isolated vCenter Single Sign-On (SSO) domain.
  • The restore pattern for VI workload domains in the same SSO domain can be extended if more VI workload domains are connected to the management vCenter Single Sign-On domain.
| Task | Management Domain | VI Workload Domain 1 (ELM) | VI Workload Domain 2 (ELM) | VI Workload Domain 3 (Isolated SSO) | Notes |
| --- | --- | --- | --- | --- | --- |
| 1 | Prepare for Recovery | Prepare for Recovery | Prepare for Recovery | Prepare for Recovery | |
| 2 | Deploy ESXi Hosts | Deploy ESXi Hosts | Deploy ESXi Hosts | Deploy ESXi Hosts | |
| 3 | Deploy VMware Cloud Builder | | | | |
| 4 | Prepare for Partial Bring-Up | | | | |
| 5 | Perform Partial Bring-Up | | | | |
| 6 | Deploy vCenter Server from an OVA File; Deploy the first NSX Manager Node from an OVA File; Deploy SDDC Manager from an OVA File | Deploy vCenter Server from an OVA File; Deploy the first NSX Manager Node from an OVA File | Deploy vCenter Server from an OVA File; Deploy the first NSX Manager Node from an OVA File | Deploy vCenter Server from an OVA File; Deploy the first NSX Manager Node from an OVA File | |
| 7 | Restore vCenter Server; Restore NSX Manager | Restore NSX Manager | Restore NSX Manager | Restore NSX Manager | |
| 8 | Restore SDDC Manager | Restore vCenter Server | | Restore vCenter Server | |
| 9 | Back Up Cluster Configuration | Back Up Cluster Configuration | Restore vCenter Server | Back Up Cluster Configuration | |
| 10 | Recover the vSphere Cluster | Recover the vSphere Cluster | Back Up Cluster Configuration | Recover the vSphere Cluster | |
| 11 | Remediate the NSX Installation on the vSphere Cluster | Remediate the NSX Installation on the vSphere Cluster | Recover the vSphere Cluster | Remediate the NSX Installation on the vSphere Cluster | You can start restoring workloads that do not use NSX Edge nodes in VI Workload Domain 1 and VI Workload Domain 3. |
| 12 | Optional. Recover the NSX Manager Cluster* | Optional. Recover the NSX Manager Cluster* | Remediate the NSX Installation on the vSphere Cluster | Optional. Recover the NSX Manager Cluster* | You can start restoring workloads that do not use NSX Edge nodes in VI Workload Domain 2. |
| 13 | Recover the NSX Edge Nodes | Recover the NSX Edge Nodes | Optional. Recover the NSX Manager Cluster* | Recover the NSX Edge Nodes | |
| 14 | Conditional. Recover the NSX Manager Cluster** | Conditional. Recover the NSX Manager Cluster** | Recover the NSX Edge Nodes | Conditional. Recover the NSX Manager Cluster** | |
| 15 | Perform Post-Recovery Tasks*** | Perform Post-Recovery Tasks*** | Conditional. Recover the NSX Manager Cluster** | Perform Post-Recovery Tasks*** | |
| 16 | | | Perform Post-Recovery Tasks*** | | |
* If high availability of the NSX Manager nodes is considered less important than recovery time, you can perform this task later.
** Must be done if you skipped the deployment of the additional NSX Manager nodes earlier.
*** Restore vSphere DRS, VM Location, VM Overrides and VM Tags only after all workload VMs are restored to the cluster.

Preparing for Recovery of the VMware Cloud Foundation Instance

Preparation Tasks for VMware Cloud Foundation Recovery
  1. Locate the most recent known good file-based backups for the following components:
    • SDDC Manager
    • All vCenter Server instances
    • All NSX Manager instances
  2. Locate image-based or file-based backups of all other management VMs.
  3. Retrieve the current set of passwords and SSH keys for the components that you are recovering.
  4. Create a DNS record in your external DNS server for a temporary vCenter Server instance used during recovery.
  5. For partial bring-up operations that are manually performed by using the VMware Cloud Builder UI, either locate the original or create a new bring-up specification file, that is, a bring-up.json file or a deployment parameter workbook.
    • If additional ESXi hosts were added to the management cluster after the initial bring-up, you must perform the following tasks:
      • Perform the partial bring-up operation by using a JSON file instead of a deployment parameter workbook.
      • Add the additional ESXi hosts to the bring-up JSON file.
    • Verify that the vCenterHostName property in the bring-up specification file is set to the name of the temporary vCenter Server instance.
    • Verify that the vcenterIP property in the bring-up specification file is set to the IP address of the temporary vCenter Server instance. A quick way to locate both properties is shown in the sketch after this list.
    • Verify that the passwords and SSH keys in the bring-up specification file are updated to the current set of passwords and SSH keys retrieved in Step 3.
  6. Locate the version and build number for vCenter Server, NSX Manager, and SDDC Manager.
    • You can see the vCenter Server build number in the backup-metadata.json file in the vCenter Server file-based backup.
    • You can see the NSX Manager version and build number in the file name of the backup file.
  7. Download the OVA files of vCenter Server and NSX Manager for the required version and build number.
  8. Download the Refresh SSH Key Script from VMware Knowledge Base article 79004.
  9. Reinstall ESXi on the hosts. See the VMware Cloud Foundation Deployment Guide.
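To quickly verify the vCenterHostName and vcenterIP properties named in Task 5, you can search the specification file for them. This is a minimal sketch, assuming the specification file is named bring-up.json in the current directory:

    # Print every line of the bring-up spec that mentions the temporary vCenter
    # Server host name or IP address, so you can verify the values at a glance.
    Select-String -Path ".\bring-up.json" -Pattern 'vCenterHostName|vcenterIP'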

Post-Recovery Cleanup Plan for VMware Cloud Foundation

Management Domain Recovery

This section describes the tasks required to recover the VMware Cloud Foundation management domain.

Gather Information from the SDDC Manager Backup

The encrypted SDDC Manager backup contains information required for the recovery process.

UI Procedure

  1. Identify a backup file for the restore operation and download it from the SFTP server to your host machine.
  2. On your host machine, open a terminal and run the following command to extract the content of the backup file.
    OPENSSL_FIPS=1 openssl enc -d -aes-256-cbc -md sha256 -in filename-of-restore-file | tar -xz
  3. When prompted, enter the
    encryption_password
    .
  4. In the extracted folder, locate and open the
    metadata.json
    file in a text editor.
  5. Locate the
    sddc_manager_ova_location
    value and copy the URL.
  6. In a Web browser, paste the URL and download the SDDC Manager OVA file.
  7. In the extracted folder, locate and view the contents of the
    security_password_vault.json
    file.
  8. Write down the following information for each component:
     | Component | Information |
     | --- | --- |
     | ESXi hosts | FQDN; IP address; root user password; service account user name |
     | vCenter Server | FQDN; IP address; root user password; SSO administrator password |
     | NSX Manager nodes | FQDN; IP address; admin user password; root user password; audit user password |
     | NSX Edge nodes | FQDN; IP address; root user password; admin user password; audit user password |
     | SDDC Manager | FQDN; IP address; root user password; backup user password; FIPS configuration |

PowerShell Procedure

  1. Start Windows PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $vcfBackupFilePath = ".\vcf-backup-sfo-vcf01-sfo-rainpole-io-2024-05-21-15-51-04.tar.gz"
    $encryptionPassword = "VMw@re1!VMw@re1!"
    $managementVcenterBackupFolderPath = "10.221.78.133/F$/backup/vCenter/sn_sfo-m01-vc01.sfo.rainpole.io/M_8.0.1.00100_20231121-104120_"
  3. Perform the configuration by running the command in the PowerShell console.
    New-ExtractDataFromSDDCBackup -vcfBackupFilePath $vcfBackupFilePath -managementVcenterBackupFolderPath $managementVcenterBackupFolderPath -encryptionPassword $encryptionPassword
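The New-ExtractDataFromSDDCBackup cmdlet, like the other recovery cmdlets used throughout this guide, is assumed to come from the VMware.CloudFoundation.InstanceRecovery PowerShell module; a minimal setup sketch:

    # Assumption: the recovery cmdlets ship in the VMware.CloudFoundation.InstanceRecovery
    # module on the PowerShell Gallery. Install and import it once per session.
    Install-Module -Name VMware.CloudFoundation.InstanceRecovery
    Import-Module -Name VMware.CloudFoundation.InstanceRecovery
    # Confirm the module loaded by listing its cmdlets.
    Get-Command -Module VMware.CloudFoundation.InstanceRecovery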

Modify the VMware Cloud Builder Workflow for a Partial Bring-up of VMware Cloud Foundation

To recover the management domain of VMware Cloud Foundation, you use VMware Cloud Builder to perform a partial bring-up.

UI Procedure

  1. Open an SSH connection to the VMware Cloud Builder appliance and switch to the root user.
    ssh admin@cloudbuilder
    su -
  2. Go to the directory that contains the workflow configuration files.
    cd /opt/vmware/bringup/webapps/bringup-app/conf/workflowconfig/
  3. Make a backup copy of the workflow file and create an empty workflow file.
    cp workflowspec-ems.json workflowspec-ems.json.backup
    rm workflowspec-ems.json
    touch workflowspec-ems.json
    chown vcf_bringup:vcf workflowspec-ems.json
    chmod 740 workflowspec-ems.json
  4. Edit the workflow file.
    1. Open the file for editing.
      vi workflowspec-ems.json
    2. Paste the following contents into the newly created workflowspec-ems.json file.
      {
        "state": "Processing",
        "name": "bring-up-Ems",
        "description": "bring-up",
        "inputs": {},
        "outputs": {},
        "statetransitions": [
          { "Action": "RegisterbringupDeploymentForCEIP" },
          { "Action": "ValidateThumbprints" },
          { "Action": "TrustCertificates" },
          { "Action": "ImportSSHKeys" },
          { "Action": "InitialEnvironmentSetup" },
          { "Action": "VCDeployment" },
          { "Action": "ManagementClusterContractConfiguration", "OutputMap": { "clusterMoid": "clusterMoid", "clusterName": "clusterName" } },
          { "Action": "ManagementClusterConfiguration" },
          { "Action": "EnableVsanDedupOnCluster" },
          { "Action": "PostManagementClusterConfiguration" },
          { "Action": "EnableVsphereClusterServices", "InputMap": { "clusterMoid": "clusterMoid" } },
          { "Action": "ApplyEsxLicense" },
          { "Action": "EnableVsanMonitoring", "InputMap": { "clusterMoid": "clusterMoid" } },
          { "Action": "VCenterServiceAccountsConfiguration", "OutputMap": { "vcenterServiceAccount": "vcenterServiceAccount", "nsxtVcenterServiceAccount": "nsxtVcenterServiceAccount" } }
        ]
      }
    3. Save your changes and exit the editor.

PowerShell Procedure

  1. Start Windows PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
    $cloudBuilderFQDN = "sfo-cb01.sfo.rainpole.io"
    $cloudBuilderAdminUserPassword = "VMw@re1!"
    $cloudBuilderRootUserPassword = "VMw@re1!"
  3. Perform the configuration by running the command in the PowerShell console.
    New-PrepareforPartialBringup -extractedSDDCDataFile $extractedSDDCDataFile -cloudBuilderFQDN $cloudBuilderFQDN -cloudBuilderAdminUserPassword $cloudBuilderAdminUserPassword -cloudBuilderRootUserPassword $cloudBuilderRootUserPassword

Perform a Partial Bring-up of VMware Cloud Foundation

You perform a partial bring-up to enable the recovery of the management domain vSAN cluster by using a temporary vCenter Server instance. The deployment process finishes after vCenter Server is deployed.

UI Procedure

Follow the VMware Cloud Foundation Deployment Guide for your VMware Cloud Foundation version, using the modified bring-up JSON file or deployment parameter workbook.

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
    $tempVcenterIp = "172.17.31.170"
    $tempVcenterHostname = "sfo-m01-vc02"
    $localUserPassword = "VMw@re1!VMw@re1!"
    $rootUserPassword = "VMw@re1!"
    $basicAuthUserPassword = "VMw@re1!"
    $vcfUserPassword = "VMw@re1!"
    $transportVlanId = "2288"
    $mgmtVcenterServerSize = "small"
    $cloudBuilderFQDN = "sfo-cb01.sfo.rainpole.io"
    $cloudBuilderAdminUserPassword = "VMw@re1!"
  3. Generate a partial bring-up JSON file by running the command in the PowerShell console. When prompted, select the physical NICs to be assigned to each distributed switch.
    New-ReconstructedPartialBringupJsonSpec -extractedSDDCDataFile $extractedSDDCDataFile -tempVcenterIp $tempVcenterIp -tempVcenterHostname $tempVcenterHostname -vcfLocalUserPassword $localUserPassword -vcfRootUserPassword $rootUserPassword -vcfRestApiPassword $basicAuthUserPassword -vcfSecondUserPassword $vcfUserPassword -transportVlanId $transportVlanId -dedupEnabled $false -vcenterServerSize $mgmtVcenterServerSize
  4. Connect to VMware Cloud Builder by running the command in the PowerShell console.
    Connect-VcfCloudBuilderServer -server $cloudBuilderFQDN -User 'admin' -Password $cloudBuilderAdminUserPassword
  5. Validate the bring-up JSON file on VMware Cloud Builder by running the command in the PowerShell console.
    $extractedSDDCDataFile = "F:\sddc-manager-backup\extracted-sddc-data.json"
    $partialBringupSpecFile = (($extractedSddcData.workloadDomains | Where-Object {$_.domainType -eq "MANAGEMENT"}).domainName + "-partial-bringup-spec.json")
    $cloudBuilderFQDN = "sfo-cb01.sfo.rainpole.io"
    $cloudBuilderAdminUserPassword = "VMw@re1!VMw@re1!"
    New-PartialManagementDomainDeployment -partialBringupSpecFile $partialBringupSpecFile -extractedSDDCDataFile $extractedSDDCDataFile -cloudBuilderFQDN $cloudBuilderFQDN -cloudBuilderAdminUserPassword $cloudBuilderAdminUserPassword
  6. Monitor the bring-up progress in the VMware Cloud Builder UI.

Deploy a vCenter Server Appliance from an OVA File

To restore a vCenter Server instance, you first deploy a new appliance from an OVA file on the temporary vCenter Server, deployed during partial bring-up. You then use this appliance to restore the management domain or VI workload domain.

UI Procedure

  1. Mount the vCenter Server ISO in Windows Explorer.
  2. Log in to the temporary vCenter Server, deployed during the partial bring-up operation, and navigate to the default cluster.
  3. Right-click the cluster and select
    Deploy OVF Template
    .
  4. Navigate to the
    vcsa
    folder on the mounted ISO, select the vCenter Server OVA file, and click
    Next
    .
  5. Enter the virtual machine name and click
    Next
    .
  6. On the
    Compute resource
    page, select the cluster and click
    Next
    .
  7. On the
    Review details
    page, if warnings appear, click
    Ignore
    and click
    Next
    .
  8. On the
    License agreements
    page, click
    I accept all license agreements
    and click
    Next
    .
  9. On the
    Configuration
    page, select the correct size for the vCenter Server, and click
    Next
    .
  10. On the
    Select storage
    page, select the vSAN datastore, and click
    Next
    .
  11. On the
    Select networks
    page, select the management network, and click
    Next
    .
  12. On the Customize template page, enter the following information, and click Next.
     | Section | Setting | Value |
     | --- | --- | --- |
     | Networking Configuration | Host Network IP Address Family | ipv4 |
     | Networking Configuration | Host Network Mode | static |
     | Networking Configuration | Host Network IP Address | <vcenter_server_ip_address> |
     | Networking Configuration | Host Network Prefix | <management_network_prefix> |
     | Networking Configuration | Host Network Default Gateway | <management_network_gateway> |
     | Networking Configuration | Host Network DNS Servers | <dns_server_comma_separated_list> |
     | Networking Configuration | Host Network Identity | <vcenter_server_fqdn> |
     | SSO Configuration | N/A | N/A |
     | System Configuration | Root Password | <vcenter_server_root_password> |
     | Upgrade Configuration | N/A | N/A |
     | Miscellaneous | N/A | N/A |
     | Networking Properties | Domain Name | <dns_domain_name> |
     | Networking Properties | Domain Search Path | <domain_search_path_comma_separated_list> |
  13. On the
    Ready to complete
    page, review the details and click
    Finish
    .

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $tempVcenterFqdn = "sfo-m01-vc02.sfo.rainpole.io"
    $tempvCenterAdmin = "administrator@vsphere.local"
    $tempvCenterAdminPassword = "VMw@re1!"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
    $workloadDomain = "sfo-m01"
    $restoredvCenterDeploymentSize = "small"
    $vCenterOvaFile = "F:\OVA\VMware-vCenter-Server-Appliance-7.0.3.01400-21477706_OVF10.ova"
  3. Perform the configuration by running the command in the PowerShell console.
    New-vCenterOvaDeployment -vCenterFqdn $tempVcenterFqdn -vCenterAdmin $tempvCenterAdmin -vCenterAdminPassword $tempvCenterAdminPassword -extractedSDDCDataFile $extractedSDDCDataFile -workloadDomain $workloadDomain -restoredvCenterDeploymentSize $restoredvCenterDeploymentSize -vCenterOvaFile $vCenterOvaFile

Deploy the First NSX Manager Cluster Node from an OVA File

When the three nodes in an NSX Manager cluster are in a failed state, you begin the restore process by restoring the first cluster node from an OVA file.
The first NSX Manager node is defined as the node that took the backup of the NSX Manager cluster. To identify this node, examine the name of the folder created during backup, as it contains the node IP address as part of the folder name (see the sketch below).
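For example, if you copy the NSX backup directory from the SFTP server to your host machine, a directory listing reveals the node IP address. This is a hypothetical listing; the exact folder layout depends on your NSX backup configuration:

    # List the per-node backup folders; the folder name embeds the IP address of
    # the NSX Manager node that wrote the backup (hypothetical local path).
    Get-ChildItem -Directory ".\nsx-backups\cluster-node-backups" | Select-Object Name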

UI Procedure

  1. In a Web browser, log in to the management domain vCenter Server at
    https://<vcenter_server_fqdn>/ui
    by using the vSphere Client.
  2. Right-click the default cluster of the management domain and select
    Deploy OVF Template
    .
  3. On the
    Select an OVF template
    page, select
    Local file
    , click
    Upload files
    , navigate to the location of the NSX Manager OVA file, click
    Open
    , and click
    Next
    .
  4. On the
    Select a name and folder
    page, enter the VM name and click
    Next
    .
  5. On the
    Select a compute resource
    page, select the cluster and click
    Next
    .
  6. On the
    Review details
    page, click
    Next
    .
  7. On the
    Configuration
    page, select the appropriate size and click
    Next
    .
  8. For the management domain, select Medium; for VI workload domains, select Large, unless you changed these defaults during deployment.
  9. On the
    Select storage
    page, select the management vSAN datastore, and click
    Next
    .
  10. On the
    Select networks
    page, from the
    Destination network
    drop-down menu, select management distributed port group, and click
    Next
    .
  11. On the
    Customize template
    page, enter these values and click
    Next
    .
     | Setting | Value |
     | --- | --- |
     | System root user password | <first_nsx_cluster_node_root_password> |
     | CLI admin user password | <first_nsx_cluster_node_admin_password> (you must enter a password; do not leave the default value) |
     | CLI audit user password | <first_nsx_cluster_node_audit_password> (you must enter a password; do not leave the default value) |
     | Hostname | <first_nsx_cluster_node_fqdn> |
     | Default IPv4 gateway | <first_nsx_cluster_node_gw> |
     | Management network IPv4 address | <first_nsx_cluster_node_ip> |
     | Management network netmask | <first_nsx_cluster_node_mask> |
     | DNS server list | <dns_server_list> |
     | NTP server list | <ntp_server_list> |
     | Enable SSH | Selected |
     | Allow root SSH logins | Deselected |
  12. On the
    Ready to complete
    page, review the deployment details and click Finish.

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    The values in this example are for the management domain. Replace with the values for the specific workload domain you are recovering.
    $tempvCenterFqdn = "sfo-m01-vc02.sfo.rainpole.io"
    $tempvCenterAdmin = "administrator@vsphere.local"
    $tempvCenterAdminPassword = "VMw@re1!"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
    $workloadDomain = "sfo-m01"
    $restoredNsxManagerDeploymentSize = "medium"
    $nsxManagerOvaFile = "F:\OVA\nsx-unified-appliance-3.2.2.1.0.21487565.ova"
  3. Perform the operation by running the command in the PowerShell console.
    New-NSXManagerOvaDeployment -vCenterFqdn $tempvCenterFqdn -vCenterAdmin $tempvCenterAdmin -vCenterAdminPassword $tempvCenterAdminPassword -extractedSDDCDataFile $extractedSDDCDataFile -workloadDomain $workloadDomain -restoredNsxManagerDeploymentSize $restoredNsxManagerDeploymentSize -nsxManagerOvaFile $nsxManagerOvaFile

Deploy the SDDC Manager Appliance from an OVA File

You deploy a new SDDC Manager appliance by using the OVA file that you downloaded during the preparation for the restore.

UI Procedure

  1. In a Web browser, log in to the management domain vCenter Server at
    https://<vcenter_server_fqdn>/ui
    by using the vSphere Client.
  2. In the VMs and templates inventory, expand the management domain vCenter Server tree and expand the management domain data center.
  3. Right-click the management folder and select
    Deploy OVF Template
    .
  4. On the
    Select an OVF template
    page, select
    Local file
    , upload the SDDC Manager OVA file, and click
    Next
    .
  5. On the
    Select a name and folder
    page, in the
    Virtual machine name
    text box, enter a virtual machine name, and click Next.
  6. On the
    Select a compute resource
    page, click
    Next
    .
  7. On the Review details page, review the settings and click
    Next
    .
  8. On the
    License agreements
    page, accept the license agreement and click
    Next
    .
  9. On the
    Select storage
    page, select the vSAN datastore and click
    Next
    .
    The datastore must match the
    vsan_datastore
    value in the
    metadata.json
    file that you downloaded from the SDDC Manager backup during the preparation for the restore.
  10. On the
    Select networks
    page, from the
    Destination network
    drop-down menu, select the management network distributed port group and click
    Next
    .
    The distributed port group must match the
    port_group
    value in the
    metadata.json
    file that you downloaded during the preparation for the restore.
  11. On the Customize template page, enter the following values and click Next.
     | Setting | Description |
     | --- | --- |
     | Enter root user password | You can use the original root user password or a new password. |
     | Enter login (vcf) user password | You can use the original vcf user password or a new password. |
     | Enter basic auth user password | You can use the original admin user password or a new password. |
     | Enter backup (backup) user password | The backup password that you saved during the preparation for the restore. This password can be changed later. |
     | Enter Local user password | You can use the original Local user password or a new password. |
     | Hostname | The FQDN must match the hostname value in the metadata.json file that you downloaded during the preparation for the restore. |
     | NTP sources | The NTP server details for the appliance. |
     | Enable FIPS | The FIPS configuration in the metadata.json file that you saved during the preparation for the restore. You cannot restore a FIPS backup to a non-FIPS configured appliance. |
     | PSC Address | Leave blank. |
     | Initial PSC SSH password | Leave blank. |
     | SSO Username | Leave blank. |
     | SSO password | Leave blank. |
     | Network 1 IP address | The IP address for the appliance. |
     | Network 1 Subnet Mask | The subnet mask for the appliance. |
     | Network Default Gateway | The default gateway for the appliance. |
     | DNS Domain name | The domain name for the appliance. |
     | Domain search path | The domain search path(s) for the appliance. |
     | Domain name servers | The DNS servers for the appliance. |
  12. On the
    Ready to complete
    page, click
    Finish
    and wait for the process to complete.
  13. When the SDDC Manager appliance deployment completes, expand the management folder.
  14. Right-click the SDDC Manager appliance and select Snapshots > Take Snapshot.
  15. Right-click the SDDC Manager appliance and select Power > Power On. A scripted alternative for these two steps is shown after this procedure.
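You can also take the snapshot and power on the appliance with PowerCLI; a minimal sketch, assuming you are connected to the vCenter Server that hosts the appliance and that the appliance VM is named sfo-vcf01 (a hypothetical name; use the virtual machine name you entered in the deployment wizard):

    # Take a pre-restore snapshot of the SDDC Manager appliance, then power it on.
    Connect-VIServer -Server "sfo-m01-vc02.sfo.rainpole.io" -User "administrator@vsphere.local" -Password "VMw@re1!"
    $vm = Get-VM -Name "sfo-vcf01"
    New-Snapshot -VM $vm -Name "pre-restore"
    Start-VM -VM $vm
    Disconnect-VIServer -Server * -Confirm:$false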

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $tempvCenterFqdn = "sfo-m01-vc02.sfo.rainpole.io"
    $tempvCenterAdmin = "administrator@vsphere.local"
    $tempvCenterAdminPassword = "VMw@re1!"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
    $sddcManagerOvaFile = "F:\OVA\VCF-SDDC-Manager-Appliance-4.5.1.0-21682411.ova"
    $rootUserPassword = "VMw@re1!"
    $vcfUserPassword = "VMw@re1!"
    $localUserPassword = "VMw@re1!"
    $basicAuthUserPassword = "VMw@re1!"
  3. Perform the configuration by running the command in the PowerShell console.
    New-SDDCManagerOvaDeployment -vCenterFqdn $tempvCenterFqdn -vCenterAdmin $tempvCenterAdmin -vCenterAdminPassword $tempvCenterAdminPassword -extractedSDDCDataFile $extractedSDDCDataFile -sddcManagerOvaFile $sddcManagerOvaFile -rootUserPassword $rootUserPassword -vcfUserPassword $vcfUserPassword -localUserPassword $localUserPassword -basicAuthUserPassword $basicAuthUserPassword

Restore vCenter Server from a File-Based Backup

Restore the management domain vCenter Server from a file-based backup on the appliance you deployed from an OVA file.

UI Procedure

  1. Log in to the appliance management interface (VAMI) of the management domain vCenter Server at
    https://<vcenter_server_fqdn>:5480
    as
    root
    .
  2. Navigate to the default cluster of the management domain.
  3. Select Restore.
  4. Provide the backup server address, protocol and the path to the backup file, and click
    Next
    .
  5. Provide the SSO administrator user name and password, and click
    Next
    .
  6. On the Ready to complete page, review the details and click
    Finish
    .

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $tempvCenterFqdn = "sfo-m01-vc02.sfo.rainpole.io"
    $tempvCenterAdmin = "administrator@vsphere.local"
    $tempvCenterAdminPassword = "VMw@re1!"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
    $workloadDomain = "sfo-m01"
    $vCenterBackupPath = "10.221.78.133/F$/backup/vCenter/sn_sfo-m01-vc01.sfo.rainpole.io/M_8.0.1.00100_20231121-104120_"
    $locationtype = "SMB"
    $locationUser = "Administrator"
    $locationPassword = "VMw@re123!"
  3. Perform the configuration by running the command in the PowerShell console.
    Invoke-vCenterRestore -vCenterFqdn $tempvCenterFqdn -vCenterAdmin $tempvCenterAdmin -vCenterAdminPassword $tempvCenterAdminPassword -extractedSDDCDataFile $extractedSDDCDataFile -workloadDomain $workloadDomain -vCenterBackupPath $vCenterBackupPath -locationtype $locationtype -locationUser $locationUser -locationPassword $locationPassword

Restore NSX Manager from a File-Based Backup

You restore the file-based backup of the first NSX Manager cluster node to the newly-deployed NSX Manager instance.

UI Procedure

  1. In a Web browser, log in to the newly-deployed first NSX Manager cluster node by using the user interface as an
    Enterprise Administrator
    .
  2. On the main navigation bar, click
    System
    .
  3. In the left navigation pane, under
    Lifecycle management
    , click
    Backup & Restore
    .
  4. In the
    NSX configuration
    pane, under
    SFTP server
    , click
    Edit
    .
  5. In the
    Backup configuration
    dialog box, enter the details for your backup server, and click
    Save
    .
  6. Under
    Backup history
    , select the source backup, and click
    Restore
    .
  7. During the restore, when prompted, reject adding NSX Manager nodes by clicking
    I understand
    and
    Resume
    .
  8. If a
    Confirm CM/VC Connectivity
    dialog box appears, click
    I understand the message mentioned above and wish to proceed
    , and click
    Resume
    .
  9. If a fabric node discovery time out dialog box appears, click
    Resume
    to continue the restore.

PowerShell Procedure

  1. Start Windows PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
    $workloadDomain = "sfo-m01"
    $sftpServer = "10.50.5.66"
    $sftpUser = "svc-vcf-bkup"
    $sftpPassword = "VMw@re1!"
    $sftpServerBackupPath = "/media/backups"
    $backupPassphrase = "VMw@re1!VMw@re1!"
  3. Perform the configuration by running the command in the PowerShell console.
    Invoke-NSXManagerRestore -extractedSDDCDataFile $extractedSDDCDataFile -workloadDomain $workloadDomain -sftpServer $sftpServer -sftpUser $sftpUser -sftpPassword $sftpPassword -sftpServerBackupPath $sftpServerBackupPath -backupPassphrase $backupPassphrase

Verify the State of the First NSX Manager Cluster Node

After you restore the first NSX Manager cluster node, you verify the services state from the Web console of the restored node VM.
  1. In a Web browser, log in to the temporary vCenter Server at
    https://<temp_management_vcenter_server_fqdn>/ui
    by using the vSphere Client.
  2. Click the VM name of the newly-deployed first NSX Manager cluster node, click
    Launch Web Console
    , and log in by using administrator credentials.
  3. Run the command to view the cluster status.
    get cluster status
    The services on the single-node NSX Manager cluster appear as
    UP
    .

Restore SDDC Manager

Perform the following tasks to restore SDDC Manager.

Update vCenter Server SSH Keys in SDDC Manager Backup

To enable a restore of SDDC Manager, you must update the SSH keys for the new management vCenter Server appliance in the backup archive.

Prerequisites

Verify that the vCenter Server restore operation is complete.

Manual Procedure

  1. Copy the encrypted backup file to the
    /tmp
    folder on the newly-deployed SDDC Manager appliance.
  2. Run the following command to extract the backup archive.
    OPENSSL_FIPS=1 openssl enc -d -aes-256-cbc -md sha256 -in <backup_archive_name>.tar.gz | tar -xz
    When prompted, enter the
    encryption_password
    .
  3. Retrieve the new
    ecdsa-sha2-nistp256
    and
    ssh-rsa
    SSH keys for the management vCenter Server.
    ssh-keyscan <management_vc_fqdn>
  4. Open the known-hosts file from the backup, replace the existing SSH keys with the new SSH keys for the management vCenter Server from Step 3, and save the file.
    vi <backup_archive_folder_name>/appliancemanager_ssh_knownHosts.json
  5. Rename the original backup archive.
    mv <backup_archive_name>.tar.gz <backup_archive_name>.tar.gz.original
  6. Re-encrypt the backup folder as an archive using the same archive name.
    tar -cz <backup_archive_folder_name> | OPENSSL_FIPS=1 openssl enc -aes-256-cbc -md sha256 -out <backup_archive_name>.tar.gz

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $rootUserPassword = "VMw@re1!"
    $vcfUserPassword = "VMw@re1!"
    $backupFilePath = "F:\backup\vcf-backup-sfo-vcf01-sfo-rainpole-io-2023-09-19-10-53-02.tar.gz"
    $encryptionPassword = "VMw@re1!VMw@re1!"
    $extractedSDDCDataFile = "F:\backup\extracted-sddc-data.json"
    $tempvCenterFqdn = "sfo-m01-vc02.sfo.rainpole.io"
    $tempvCenterAdmin = "Administrator@vsphere.local"
    $tempvCenterAdminPassword = "VMw@re1!"
  3. Perform the configuration by running the command in the PowerShell console.
    New-UploadAndModifySDDCManagerBackup -rootUserPassword $rootUserPassword -vcfUserPassword $vcfUserPassword -backupFilePath $backupFilePath -encryptionPassword $encryptionPassword -extractedSDDCDataFile $extractedSDDCDataFile -vCenterFqdn $tempvCenterFqdn -vCenterAdmin $tempvCenterAdmin -vCenterAdminPassword $tempvCenterAdminPassword

Restore SDDC Manager from a File-Based Backup

You restore the file-based backup on the newly-deployed SDDC Manager appliance.

Manual Procedure

  1. Open an SSH connection to the SDDC Manager appliance as
    vcf
    user.
  2. Change to the root user.
    su -
  3. Make a backup copy of the restore_status.json file and open it for editing.
    cd /opt/vmware/sddc-support/backup
    cp restore_status.json restore_status.json.bak
    vi restore_status.json
  4. Delete the "PostRestoreNfsRefresh" task in both places where it appears.
  5. Save the file.
  6. Obtain the authentication token from the SDDC Manager appliance by running the following command so that you can perform the restore process.
    TOKEN=$(curl https://<sddc_manager_fqdn>/v1/tokens -k -X POST -H "Content-Type: application/json" -d '{ "username": "admin@local", "password": "<admin@local_password>" }' | awk -F "\"" '{ print $4}')
  7. Run the following command to verify the token.
    echo $TOKEN
  8. Run the following command to start the restore process.
    The command output contains the ID of the restore task.
    RESTOREID=$(curl https://<sddc_manager_fqdn>/v1/restores/tasks -k -X POST \
      -H "Content-Type: application/json" -H "Authorization: Bearer $TOKEN" \
      -d '{ "elements" : [ { "resourceType" : "SDDC_MANAGER" } ], "backupFile" : "<backup_file>", "encryption" : { "passphrase" : "<encryption_password>" } }' \
      | json_pp | jq '.id' | cut -d '"' -f 2)
  9. Monitor the restore task by using the following command until the status becomes Successful.
    curl https://<sddc_manager_fqdn>/v1/restores/tasks/$RESTOREID -k -X GET -H "Content-Type: application/json" -H "Authorization: Bearer $TOKEN" | json_pp

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $extractedSDDCDataFile = "F:\backup\extracted-sddc-data.json"
    $backupFilePath = "F:\backup\vcf-backup-sfo-vcf01-sfo-rainpole-io-2023-11-21-10-42-38.tar.gz"
    $vcfUserPassword = "VMw@re1!"
    $localUserPassword = "VMw@re1!VMw@re1!"
    $rootUserPassword = "VMw@re1!"
  3. Perform the configuration by running the command in the PowerShell console.
    Invoke-SDDCManagerRestore -extractedSDDCDataFile $extractedSDDCDataFile -backupFilePath $backupFilePath -vcfUserPassword $vcfUserPassword -localUserPassword $localUserPassword -rootUserPassword $rootUserPassword

Recover the Management vSphere Cluster

Perform the following tasks to recover the management domain vSphere cluster.

Export the Cluster Settings from the Restored vCenter Server

Before you can recover the vSphere clusters of the restored vCenter Server, you must first export their settings so that they can be reapplied to the recovered clusters.

Prerequisites

Verify that vCenter Server has been restored and powered on.

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    The values in this example are for the management domain. Replace with the values for the specific workload domain you are recovering.
    $restoredVcenterFqdn = "sfo-m01-vc01.sfo.rainpole.io"
    $restoredVcenterAdmin = "administrator@vsphere.local"
    $restoredVcenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
  3. Perform the configuration by running the commands in the PowerShell console.
    Connect-VIServer -server $restoredVcenterFqdn -user $restoredVcenterAdmin -password $restoredVcenterAdminPassword
    Backup-ClusterVMOverrides -clusterName $clusterName
    Backup-ClusterVMLocations -clusterName $clusterName
    Backup-ClusterDRSGroupsAndRules -clusterName $clusterName
    Backup-ClusterVMTags -clusterName $clusterName
    Disconnect-VIServer * -confirm:$false
  4. Save the JSON file output for later use.
  5. Repeat for all clusters in the vCenter Server. A loop that covers every cluster in one pass is shown in the sketch after this procedure.
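To export the settings of every cluster without re-running the commands by hand, you can wrap the export cmdlets in a loop. This is a minimal sketch, assuming the Backup-* cmdlets shown above are available in the session:

    # Export the cluster settings for every cluster in the restored vCenter Server.
    Connect-VIServer -server $restoredVcenterFqdn -user $restoredVcenterAdmin -password $restoredVcenterAdminPassword
    foreach ($cluster in Get-Cluster) {
        Backup-ClusterVMOverrides -clusterName $cluster.Name
        Backup-ClusterVMLocations -clusterName $cluster.Name
        Backup-ClusterDRSGroupsAndRules -clusterName $cluster.Name
        Backup-ClusterVMTags -clusterName $cluster.Name
    }
    Disconnect-VIServer * -confirm:$false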

Remove Non-Responsive ESXi Hosts from the Inventory

Before you can repair a failed vSphere cluster, you must first remove the non-responsive hosts from the cluster.

Prerequisites

Ensure you have exported all cluster settings before proceeding.

UI Procedure

  1. Log in to the NSX Manager for the workload domain and navigate to System > Fabric > Hosts > Clusters.
    In NSX 3.x, the relevant navigation path is System > Fabric > Nodes > Host Transport Nodes, where you choose the vCenter Server from the Managed by drop-down menu.
  2. Select the check box next to the relevant vSphere cluster and click Remove NSX.
  3. Deselect the check box next to the relevant vSphere cluster.
  4. Expand the cluster and wait for all hosts in the cluster to go into an Orphaned state.
  5. Select the check box that selects all hosts in the cluster without selecting the cluster object itself and click Remove NSX.
  6. Select the Force option and submit. Wait until all hosts show as unconfigured.
  7. Log in to the vCenter Server with the non-responsive hosts and navigate to the cluster.
  8. Select the cluster, and in the right pane, navigate to the Hosts tab.
  9. Select the check box for each non-responsive host, right-click the selected hosts, and select Remove from Inventory.
    If the cluster uses vSphere Lifecycle Manager images, wait for about a minute to allow the background tasks in NSX to complete the removal of the NSX solution from the relevant cluster before proceeding to the next step.
  10. Log in to the NSX Manager for the workload domain and navigate back to System > Fabric > Hosts > Clusters.
    In NSX 3.x, the relevant navigation path is System > Fabric > Nodes > Host Transport Nodes, where you choose the vCenter Server from the Managed by drop-down menu.
  11. Select the check box next to the relevant vSphere cluster and click Configure NSX.
  12. Select the relevant Transport Node Profile and click Submit.

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    The values in this example are for the management domain. Replace with the values for the specific workload domain you are recovering.
    $restoredVcenterFqdn = "sfo-m01-vc01.sfo.rainpole.io"
    $restoredVcenterAdmin = "administrator@vsphere.local"
    $restoredVcenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    $restoredNsxManagerFqdn = "sfo-m01-nsx01.sfo.rainpole.io"
    $restoredNsxManagerAdmin = "admin"
    $restoredNsxManagerAdminPassword = "VMw@re1!VMw@re1!"
    $restoredNsxManagerRootPassword = "VMw@re1!VMw@re1!"
  3. Perform the configuration by running the command in the PowerShell console.
    Remove-NonResponsiveHosts -vCenterFQDN $restoredVcenterFqdn -vCenterAdmin $restoredVcenterAdmin -vCenterAdminPassword $restoredVcenterAdminPassword -clusterName $clusterName -NsxManagerFQDN $restoredNsxManagerFqdn -NsxManagerAdmin $restoredNsxManagerAdmin -NsxManagerAdminPassword $restoredNsxManagerAdminPassword -NsxManagerRootPassword $restoredNsxManagerRootPassword

Configure vSAN to Ignore Cluster Member Updates

To permit moving the vSAN cluster from the temporary vCenter Server to the restored vCenter Server, you must configure vSAN to ignore cluster member updates.

Manual Procedure

  1. Connect to each ESXi host by using SSH and log in as the root user.
  2. Run the following command:
    esxcli system settings advanced set --int-value=1 --option=/VSAN/IgnoreClusterMemberListUpdates
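To confirm the setting on every host without opening individual SSH sessions, you can check it centrally with PowerCLI; a minimal sketch, assuming VMware PowerCLI is installed and the hosts are still managed by the temporary vCenter Server:

    # Report the current value of the vSAN advanced setting for each host in the cluster.
    Connect-VIServer -Server "sfo-m01-vc02.sfo.rainpole.io" -User "administrator@vsphere.local" -Password "VMw@re1!"
    Get-Cluster -Name "sfo-m01-cl01" | Get-VMHost |
        Get-AdvancedSetting -Name "VSAN.IgnoreClusterMemberListUpdates" |
        Select-Object Entity, Value
    Disconnect-VIServer -Server * -Confirm:$false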

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $tempVcenterFqdn = "sfo-m01-vc02.sfo.rainpole.io"
    $tempVcenterAdmin = "administrator@vsphere.local"
    $tempVcenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    $setting = "enable"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
  3. Perform the configuration by running the command in the PowerShell console.
    Set-ClusterHostsvSanIgnoreClusterMemberList -vCenterFQDN $tempVcenterFqdn -vCenterAdmin $tempVcenterAdmin -vCenterAdminPassword $tempVcenterAdminPassword -clusterName $clusterName -setting $setting -extractedSDDCDataFile $extractedSDDCDataFile

Migrate Host Networking from vSphere Distributed Switch to vSphere Standard Switch

Before you can move the new vSAN cluster to the restored vCenter Server, you must first disconnect the hosts from the temporary vSphere Distributed Switch by using a temporary vSphere Standard Switch.

UI Procedure

  1. Disconnect a physical vmnic for each host from the vSphere Distributed Switch on the temporary vCenter Server.
    1. Log in to the temporary vCenter Server at
      https://<temporary_vcenter_server_fqdn>/ui
      by using the vSphere Client.
    2. In the
      Networking
      inventory, right-click the distributed switch and select
      Add and Manage Hosts
      .
    3. Select the
      Manage host networking
      task.
    4. On the
      Member hosts
      tab, select all hosts and click
      Next
      .
    5. On the
      Manage physical adapters
      page, select one vmnic, for example vmnic1, and then click
      Unassign adapter
      .
    6. In the
      Confirm Unassign Adapter
      dialog box, select
      Apply this operation to all other hosts
      and then click
      Unassign
      .
    7. Click
      Next
      .
    8. Click
      Next
      ,
      Next
      and
      Finish
      .
  2. Create a standard switch on each ESXi host.
    1. In the Hosts and clusters inventory, select the first ESXi host, and on the Configure tab, select Networking > Virtual switches.
    2. Click Add Standard Virtual Switch.
    3. Set the vSwitch name to vSwitch0.
    4. Ensure that the MTU is set to 9000.
    5. Repeat these steps for each ESXi host.
  3. Connect ESXi hosts to vSphere Standard Switch.
    1. On the
      Virtual switches
      page for the host, click
      Add Networking
      .
    2. In the
      Add Networking
      wizard, select
      Physical Network Adapter
      and click
      Next
      .
    3. On the
      Select target device
      page, select
      vSwitch0
      and click
      Next
      .
    4. On the
      Add physical network adapter
      page, select an unassigned physical adapter, for example vmnic1, from the adapter list, and move it under
      Active adapters
      , and click
      Next
      .
    5. Review the information on the Ready to complete page and click
      Finish
      .
  4. Create a temporary management port group on the temporary vSphere standard switch.
    1. On the Virtual switches page for the host, expand vSwitch0 and click Add Networking.
    2. In the
      Add Networking
      wizard, select
      Virtual Machine Port Group for a Standard Switch
      and click
      Next
      .
    3. On the Select target device page, the standard switch defaults to vSwitch0; click Next.
    4. On the
      Connection settings
      page, change the
      Network label
      to
      temp_mgmt
      , update the VLAN ID with the correct VLAN ID for the management VLAN, and click Next.
    5. Review the information on the
      Ready to complete
      page and then click
      Finish
      .

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $tempVcenterFqdn = "sfo-m01-vc02.sfo.rainpole.io"
    $tempVcenterAdmin = "administrator@vsphere.local"
    $tempvCenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    $vmnic = "vmnic1"
    $mtu = "9000"
  3. Perform the configuration by running the command in the PowerShell console.
    Move-ClusterHostNetworkingTovSS -vCenterFqdn $tempVcenterFqdn -vCenterAdmin $tempVcenterAdmin -vCenterAdminPassword $tempvCenterAdminPassword -clusterName $clusterName -extractedSDDCDataFile $extractedSDDCDataFile -vmnic $vmnic -mtu $mtu
    This PowerShell cmdlet is idempotent and can be re-run if you encounter issues with connectivity to the management interfaces of the ESXi hosts while performing this operation.

Prepare to Migrate the New vSphere Cluster to the Restored vCenter Server

Before you can migrate the new vSphere cluster to the restored vCenter Server, you must first prepare the cluster in the temporary vCenter Server.

UI Procedure

  1. Log in to the temporary vCenter Server at
    https://<temporary_vcenter_server_fqdn>/ui
    by using the vSphere Client.
  2. Set vSphere DRS to manual on the target cluster.
    1. In the
      Hosts and clusters
      inventory, select the target cluster.
    2. On the Configure tab, select vSphere DRS and click Edit.
    3. Change the DRS automation level to Manual and click Save.
  3. Migrate all VMs to the first ESXi host in the cluster. For a scripted alternative, see the sketch after the PowerShell procedure.
    1. Right-click on each VM and select
      Migrate
      .
    2. Select the first host in the cluster as a target and click
      Migrate
      .

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $tempVcenterFqdn = "sfo-m01-vc02.sfo.rainpole.io"
    $tempVcenterAdmin = "administrator@vsphere.local"
    $tempvCenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
  3. Set vSphere DRS to manual by running the commands in the PowerShell console.
    Connect-VIServer $tempVcenterFqdn -user $tempVcenterAdmin -password $tempvCenterAdminPassword
    Set-Cluster -cluster $clusterName -DrsAutomationLevel "Manual" -confirm:$false
    Disconnect-VIServer -Server $global:DefaultVIServers -Force -Confirm:$false
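The VM migration in step 3 of the UI procedure can also be scripted; a minimal PowerCLI sketch, assuming every VM in the cluster can run on the first ESXi host:

    # Migrate every VM in the cluster to its first ESXi host.
    Connect-VIServer $tempVcenterFqdn -user $tempVcenterAdmin -password $tempvCenterAdminPassword
    $firstHost = Get-Cluster -Name $clusterName | Get-VMHost | Sort-Object -Property Name | Select-Object -First 1
    Get-Cluster -Name $clusterName | Get-VM | Move-VM -Destination $firstHost
    Disconnect-VIServer -Server $global:DefaultVIServers -Force -Confirm:$false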

Move All Management VMs to a Temporary Port Group on the vSphere Standard Switch

Before you can move the new vSAN cluster to the restored vCenter Server, you must first move all management VMs to a temporary port group on the temporary vSphere Standard Switch.

UI Procedure

  1. Log in to the temporary vCenter Server.
  2. In the
    VMs and templates
    inventory, right-click the VM for the temporary vCenter Server and select
    Edit Settings
    .
  3. Move network adapter 1 to the
    temp_mgmt
    port group on the standard switch, and click
    OK
    .
  4. Repeat for all VMs in the temporary vCenter Server.

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $tempVcenterFqdn = "sfo-m01-vc02.sfo.rainpole.io"
    $tempVcenterAdmin = "administrator@vsphere.local"
    $tempvCenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
  3. Perform the configuration by running the command in the PowerShell console.
    Move-MgmtVmsToTempPg -vCenterFQDN $tempVcenterFqdn -vCenterAdmin $tempVcenterAdmin -vCenterAdminPassword $tempvCenterAdminPassword -clusterName $clusterName

Remove the ESXi Hosts from the vSphere Distributed Switch on the Temporary vCenter Server

Before you can move the new vSAN cluster to the restored vCenter Server, you must first remove all ESXi hosts from the temporary vSphere Distributed Switch.

UI Procedure

  1. Disconnect the remaining physical adapters from the vSphere Distributed Switch.
    1. Log in to the temporary vCenter Server at
      https://<temporary_vcenter_server_fqdn>/ui
      by using the vSphere Client.
    2. In the
      Networking
      inventory, right-click the distributed switch and select
      Add and Manage Hosts
      .
    3. Select the
      Manage host networking
      task.
    4. On the
      Member hosts
      tab, select all hosts and click
      Next
      .
    5. On the
      Manage physical adapters
      page, select the remaining physical adapters, for example vmnic0, and then select
      Unassign adapter
      .
    6. In the
      Confirm Unassign Adapter
      dialog box, select
      Apply this operation to all other hosts
      and then click
      Unassign
      .
      You are returned to the Manage physical adapters screen. Note that vmnic0 is marked as Unassigned.
    7. On the Manage physical adapters page, click Next.
    8. Click
      Next
      ,
      Next
      , and
      Finish
      .
  2. Remove the ESXi hosts from the vSphere Distributed Switch on the temporary vCenter Server.
    1. Connect to the temporary vCenter UI and select Networking.
    2. In the
      Networking
      inventory, right-click the distributed switch and select
      Add and Manage Hosts
      .
    3. Select the
      Remove hosts
      task.
    4. On the
      Member hosts
      tab, select all hosts and click
      Next
      .
    5. Review the information on the
      Ready to complete
      page and click
      Finish
      .

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $tempVcenterFqdn = "sfo-m01-vc02.sfo.rainpole.io"
    $tempVcenterAdmin = "administrator@vsphere.local"
    $tempvCenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    $vdsName = "sfo-m01-cl01-vds01"
  3. Perform the configuration by running the command in the PowerShell console.
    Remove-ClusterHostsFromVds -vCenterFQDN $tempVcenterFqdn -vCenterAdmin $tempVcenterAdmin -vCenterAdminPassword $tempvCenterAdminPassword -clusterName $clusterName -vdsName $vdsName

Migrate ESXi Hosts from the Temporary vCenter Server to the Restored vCenter Server

You migrate the newly deployed vSAN cluster to the restored vCenter Server.

UI Procedure

  1. Log in to the restored vCenter Server for the management domain by using the vSphere Client.
  2. Right-click the restored cluster and click
    Add Hosts
    .
  3. In the
    Add hosts
    wizard, enter the ESXi host names for the management cluster including the user name and password, and click
    Next
    .
    A security alert dialog box appears because vCenter Server is not able to verify the certificate thumbprint.
  4. In the security alert dialog box, select all hosts and click
    OK
    .
  5. Review the information on the
    Host summary
    page and click
    Next
    .
  6. Review the information on the
    Ready to complete
    page and click
    Finish
    .

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $tempvCenterFqdn = "sfo-m01-vc02.sfo.rainpole.io"
    $tempvCenterAdmin = "administrator@vsphere.local"
    $tempvCenterAdminPassword = "VMw@re1!"
    $restoredvCenterFQDN = "sfo-m01-vc01.sfo.rainpole.io"
    $restoredvCenterAdmin = "administrator@vsphere.local"
    $restoredvCenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
  3. Perform the configuration by running the command in the PowerShell console.
    Move-ClusterHostsToRestoredVcenter -tempvCenterFqdn $tempvCenterFqdn -tempvCenterAdmin $tempvCenterAdmin -tempvCenterAdminPassword $tempvCenterAdminPassword -restoredvCenterFQDN $restoredvCenterFQDN -restoredvCenterAdmin $restoredvCenterAdmin -restoredvCenterAdminPassword $restoredvCenterAdminPassword -clusterName $clusterName -extractedSDDCDataFile $extractedSDDCDataFile

Migrate the ESXi Hosts and VMkernel Adapters to the vSphere Distributed Switch on the Restored vCenter Server

You connect the new cluster hosts to the vSphere Distributed Switch and migrate the VMkernel adapters.

UI Procedure

  1. Log in to the restored vCenter Server by using the vSphere Client.
  2. Add the hosts to the vSphere Distributed Switch.
    1. In
      Networking
      inventory, right-click the vSphere Distributed Switch and select
      Add and Manage Hosts
      .
    2. Select
      Add Hosts
      and click
      Next
      .
    3. On the Select hosts page, select all the ESXi hosts, and click Next.
    4. On the
      Manage physical adapters
      page, select a free physical adapter, for example vmnic0, and click
      Assign Uplink
      .
    5. Select uplink1 and click
      Next
      .
    6. On the
      Manage VMkernel adapters
      page, update the following VMkernel adapters to assign them to the appropriate port group on the new distributed switch.
      VMkernel migration by domain type:
      | Management Domain | VI Workload Domain |
      | --- | --- |
      | vmk0 – Management Network port group | vmk0 – Management Network port group |
      | vmk1 – vMotion Network port group | N/A (not yet created) |
      | vmk2 – vSAN Network port group | N/A (not yet created) |
    7. To assign the VMkernel adapters, select the adapter and under actions for the corresponding port group, click
      Assign
      .
    8. Click
      Next
      .
    9. On the
      Migrate VM networking
      step, click
      Next
      .
    10. Review the information on the
      Ready to complete
      page and click
      Finish
      .
  3. If this is a management domain cluster, migrate the management VMs to the original management port group.
    1. Right-click the temporary management port group and select
      Migrate VMs to Another Network
      .
    2. For destination network, select the management port group on the vSphere Distributed Switch, for example sfo-m01-vc01-vds01-management, and click
      Next
      .
    3. On the
      Select VMs to migrate
      page, select all management VMs and click
      Next
      .
    4. On the
      Ready to complete
      page, click
      Finish
      .
  4. Remove the temporary standard switch on each ESXi Host.
    1. Select the first ESXi host and, on the Configure tab, select Networking > Virtual Switches.
    2. Expand
      vSwitch0
      and click the horizontal ellipsis.
    3. Click
      Remove
      and click
      Yes
      .
  5. Add additional host uplinks to the vSphere Distributed Switch.
    1. Right-click the distributed switch and select
      Add and Manage Hosts
      .
    2. Select
      Manage Host Networking
      and click
      Next
      .
    3. On the
      Select hosts
      step, select all the ESXi hosts and click
      Next
      .
    4. On the
      Manage physical adapters
      step, select the required free physical adapter(s), for example vmnic1, and from
      Assign Uplink
      select the desired uplinks to corresponding physical adapters, and click
      Next
      .
    5. Click
      Next
      and click
      Next
      .
    6. Review the information on the
      Ready to complete
      page and click
      Finish
      .
If you are running NSX 4.1.2 or later, the NSX installation on each host in the vSphere cluster should self-heal. Monitor the self-healing process until it completes in the NSX Manager UI at System > Fabric > Hosts > Clusters before proceeding.

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $restoredvCenterFQDN = "sfo-m01-vc01.sfo.rainpole.io"
    $restoredvCenterAdmin = "administrator@vsphere.local"
    $restoredvCenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
  3. Perform the configuration by running the command in the PowerShell console.
    New-RebuiltVdsConfiguration -vCenterFQDN $restoredVcenterFqdn -vCenterAdmin $restoredVcenterAdmin -vCenterAdminPassword $restoredVcenterAdminPassword -clusterName $clusterName -extractedSDDCDataFile $extractedSDDCDataFile
At this point, the NSX installation on each host in the vSphere cluster should self-heal. Monitor the self-healing process until it completes in the NSX Manager UI at System > Fabric > Hosts > Clusters before proceeding. It might take several minutes for the process to initiate. If you see an error that the hosts are not part of the distributed switch, the self-healing process has simply not started yet.

Configure vSAN Back to Honour Cluster Member Updates

After the vSAN cluster has been moved to the restored vCenter Server, you must configure vSAN to honour cluster member updates again.

UI Procedure

  1. Connect to each ESXi host by using SSH and log in as the root user.
  2. Run the following command.
    esxcli system settings advanced set --int-value=0 --option=/VSAN/IgnoreClusterMemberListUpdates

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $restoredVcenterFqdn = "sfo-m01-vc01.sfo.rainpole.io"
    $restoredVcenterAdmin = "administrator@vsphere.local"
    $restoredVcenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    $setting = "disable"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
  3. Perform the configuration by running the command in the PowerShell console.
    Set-ClusterHostsvSanIgnoreClusterMemberList -vCenterFQDN $restoredVcenterFqdn -vCenterAdmin $restoredVcenterAdmin -vCenterAdminPassword $restoredVcenterAdminPassword -clusterName $clusterName -setting $setting -extractedSDDCDataFile $extractedSDDCDataFile

Correct vCenter Authoritative Health on the Restored vCenter Server

After the cluster is moved to the recovered vCenter Server, correct vCenter Server authoritative health.
  1. In the restored vCenter Server, in the Hosts and Clusters inventory of the vSphere Client, navigate to the default cluster of the management domain.
  2. On the Monitor tab, click vSAN > Skyline Health.
  3. Follow the remediation path according to your vSphere version.
     | vSphere 7.0 | vSphere 8.0 |
     | --- | --- |
     | Select the health check for vCenter state is authoritative to open a sub panel. In the sub panel, select Update ESXi Configuration. | Find the vCenter state is authoritative tile and confirm that its status is green. If it is not, click Troubleshoot and then Remediate Inconsistent Configuration. |
  4. In the confirmation dialog that appears, click
    OK
    .
  5. After the update completes, verify that the vSAN Health is all green.
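To run the same health check from the command line, you can use PowerCLI. This is a sketch only, assuming the VMware.VimAutomation.Storage module is installed and using placeholder connection values; output formatting varies between PowerCLI versions.

  # Connect to the restored vCenter Server (placeholder values).
  Connect-VIServer -Server "sfo-m01-vc01.sfo.rainpole.io" -User "administrator@vsphere.local" -Password "VMw@re1!"
  # Run the vSAN health test against the recovered cluster and review the results.
  Test-VsanClusterHealth -Cluster (Get-Cluster -Name "sfo-m01-cl01")
  Disconnect-VIServer * -Confirm:$false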

Refresh the NFS Mount on the ESXi Hosts

For VMware Cloud Foundation versions earlier than 4.5.0, you remount the bundle repository NFS share.

Procedure

  1. Connect to each ESXi host in the cluster using SSH and run the following command.
    esxcli storage nfs list
    esxcli storage nfs remove --volume-name=lcm-bundle-repo
    esxcli storage nfs add \
      --host=<sddc_manager_ip> \
      --share=/nfs/vmware/vcf/nfs-mount \
      --volume-name=lcm-bundle-repo
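As an alternative to connecting to each host over SSH, the same remount can be scripted across the cluster with PowerCLI. This is a sketch, using placeholder values and calling the same esxcli namespace as the command above through Get-EsxCli v2.

  Connect-VIServer -Server "sfo-m01-vc01.sfo.rainpole.io" -User "administrator@vsphere.local" -Password "VMw@re1!"
  foreach ($vmHost in Get-Cluster -Name "sfo-m01-cl01" | Get-VMHost) {
      $esxcli = Get-EsxCli -VMHost $vmHost -V2
      # Remove the stale mount if present, then re-add it against the restored SDDC Manager.
      if ($esxcli.storage.nfs.list.Invoke() | Where-Object { $_.VolumeName -eq "lcm-bundle-repo" }) {
          $esxcli.storage.nfs.remove.Invoke(@{ volumename = "lcm-bundle-repo" })
      }
      $esxcli.storage.nfs.add.Invoke(@{ host = "<sddc_manager_ip>"; share = "/nfs/vmware/vcf/nfs-mount"; volumename = "lcm-bundle-repo" })
  }
  Disconnect-VIServer * -Confirm:$false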

Update SSH Host Keys

For each ESXi host that you rebuilt as part of recovering the management cluster or a VI cluster, you must update the SSH host keys in the
known_hosts
file.
  1. Enable SSH on all hosts in the cluster by running the following PowerShell commands.
    $restoredVcenterFqdn = "sfo-m01-vc01.sfo.rainpole.io"
    $restoredVcenterAdmin = "administrator@vsphere.local"
    $restoredVcenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    Connect-VIServer -server $restoredVcenterFqdn -user $restoredVcenterAdmin -password $restoredVcenterAdminPassword
    Get-Cluster -name $clusterName | Get-VMHost | Get-VMHostService | Where-Object {$_.label -eq "SSH"} | Start-VMHostService
    Disconnect-VIServer -Server $global:DefaultVIServers -Force -Confirm:$false
  2. Refresh SDDC Manager SSH Keys.
    Replace x_x_x in the following commands with the relevant numbering for the version you downloaded.
    1. Copy the
      recovery_tools_python_x_x_x.zip
      file to the
      /tmp
      directory on the SDDC Manager virtual appliance by using a secure file copy utility as the vcf user.
    2. SSH to the SDDC Manager VM as the
      vcf
      user
    3. Switch to root and extract the
      recovery_tools_python_x_x_x.zip
      .
      su -
      cd /tmp
      unzip recovery_tools_python_x_x_x.zip
      cd /tmp/recovery_tools_python_x_x_x
    4. Run the following command to update all new SSH keys without the need to accept each individual key.
      "yes" | python refreshsshkeys.pyc
    5. Alternatively, run the following command to update all new SSH keys with the need to accept each individual key.
      python refreshsshkeys.pyc
  3. Run the following command to update host attributes without the need to accept each individual key.
    python refreshhostattributes.pyc --domain=<domain-name>

Recover the Management Domain NSX Manager Cluster

Perform the following tasks to recover the NSX Manager Cluster for the management domain.

Deactivate the NSX Manager Cluster

If you are using a version of NSX that is earlier than NSX 4.x, after you restore the first node of the NSX Manager cluster, you must deactivate the cluster.
  1. In a Web browser, log in to the management domain vCenter Server
    https://<vcenter_server_fqdn>/ui
    by using the vSphere Client.
  2. Click the VM of the operational NSX Manager node in the cluster, click
    Launch Web Console
    , and log in by using
    administrator
    credentials.
  3. Run the command to deactivate the cluster.
    deactivate cluster
  4. In the
    Are you sure you want to remove all other nodes from this cluster? (yes/no)
    prompt, enter
    yes
    .

Redeploy a Failed NSX Manager Node

You deploy a new NSX Manager instance by using the configuration of the failed node.

UI Procedure

  1. In a Web browser, log in to the management domain vCenter Server by using the vSphere Client.
  2. In the
    Hosts and clusters
    inventory, right-click the management cluster and select
    Deploy OVF Template
    .
  3. On the
    Select an OVF template
    page, select
    Local file
    , click
    Upload files
    , navigate to the location of the NSX Manager OVA file, click
    Open
    , and click
    Next
    .
  4. On the
    Select a name and folder
    page, enter the VM name and click
    Next
    .
  5. On the
    Select a compute resource
    page, select the cluster and click
    Next
    .
  6. On the
    Review details
    page, click
    Next
    .
  7. On the
    Configuration
    page, for the management domain, select
    Medium
    , and for VI workload domains, select
    Large
    , unless you changed these defaults during deployment, and click
    Next
    .
  8. On the
    Select storage
    page, select the management vSAN datastore, and click
    Next
    .
  9. On the
    Select networks
    page, from the
    Destination network
    drop-down menu, select management distributed port group, and click
    Next
    .
  10. On the
    Customize template
    page, enter these values and click
    Next
    .
    Setting
    Value
    System root user password
    <failed_nsx_cluster_node_root_password>
    CLI admin user password
    <failed_nsx_cluster_node_admin_password>
    CLI audit user password
    <failed_nsx_cluster_node_audit_password>
    Hostname
    <failed_nsx_cluster_node_fqdn>
    Default IPv4 gateway
    <failed_nsx_cluster_node_gw>
    Management network IPv4 address
    <failed_nsx_cluster_node_ip>
    Management network netmask
    <failed_nsx_cluster_node_mask>
    DNS server list
    <dns_server_list>
    NTP server list
    <ntp_server_list>
    Enable SSH
    Selected
    Allow root SSH logins
    Deselected
  11. On the
    Ready to complete
    page, review the deployment details and click
    Finish
    .
  12. Repeat for the remaining failed node.

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    The values in this example are for the management domain. Replace with the values for the specific workload domain you are recovering.
    $tempvCenterFqdn = "sfo-m01-vc02.sfo.rainpole.io" $tempvCenterAdmin = "administrator@vsphere.local" $tempvCenterAdminPassword = "VMw@re1!" $extractedSDDCDataFile = ".\extracted-sddc-data.json" $workloadDomain = "sfo-m01" $restoredNsxManagerDeploymentSize = "medium" $nsxManagerOvaFile = "F:\OVA\nsx-unified-appliance-3.2.2.1.0.21487565.ova"
  3. Perform the operation by running the command in the PowerShell console.
    New-NSXManagerOvaDeployment -vCenterFqdn $tempvCenterFqdn -vCenterAdmin $tempvCenterAdmin -vCenterAdminPassword $tempvCenterAdminPassword -extractedSDDCDataFile $extractedSDDCDataFile -workloadDomain $workloadDomain -restoredNsxManagerDeploymentSize $restoredNsxManagerDeploymentSize -nsxManagerOvaFile $nsxManagerOvaFile
  4. Repeat for the remaining failed node.

Join NSX Manager Nodes to the NSX Manager Cluster

You retrieve the ID and API thumbprint of the NSX Manager cluster, and use them to join the newly-deployed NSX Manager instance to the cluster.

UI Procedure

  1. In a Web browser, log in to the management domain vCenter Server by using the vSphere Client.
  2. In the
    VMs and templates
    inventory, click the VM of an operational NSX Manager node in the cluster, click
    Launch web console
    , and log in by using
    administrator
    credentials.
  3. Retrieve the ID of the NSX Manager cluster.
    1. Run the command to view the cluster ID.
      get cluster config | find Id:
    2. Write down the cluster ID.
  4. Retrieve the API thumbprint of the NSX Manager API certificate.
    1. Run the command to view the certificate API thumbprint.
      get certificate api thumbprint
    2. Write down the certificate API thumbprint.
  5. Close the VM Web console.
  6. In the vSphere Client, click the VM of the newly deployed NSX Manager node, click
    Launch Web console
    , and log in by using
    administrator
    credentials.
  7. Run the command to join the new NSX Manager node to the cluster.
    join <new_node_ip> cluster-id <cluster_id> thumbprint <api_thumbprint> username admin
  8. Repeat for the remaining failed node.

PowerShell Procedure

  1. Start Windows PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    The values in this example are for the management domain. Replace with the values for the specific workload domain you are recovering.
    $workloadDomain = "sfo-m01"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
  3. Perform the operation by running the command in the PowerShell console.
    Add-AdditionalNSXManagers -workloadDomain $workloadDomain -extractedSDDCDataFile $extractedSDDCDataFile

Restore the SSL Certificate of NSX Manager Node

If the version of NSX in your environment is earlier than NSX 4, then after you add the new NSX Manager node to the cluster and validate the cluster status, you must restore the CA-signed SSL certificate of the node.
To view the certificate of the failed NSX Manager cluster node, you log in to the NSX Manager for the domain.
  1. In a Web browser, log in to NSX Manager for the management domain by using the user interface.
  2. On the main navigation bar, click
    System
    .
  3. In the left pane, under
    Settings
    , click
    Certificates
    .
  4. Locate and copy the ID of the certificate that is issued by CA to the node that you are restoring.
  5. Run the command to install the CA-signed certificate on the new NSX Manager node. A PowerShell equivalent appears after this procedure.
    curl -H 'Accept: application/json' -H 'Content-Type: application/json' \
      --insecure -u 'admin:<NSX_admin_password>' -X POST \
      'https://<NSX_host_node>/api/v1/node/services/http?action=apply_certificate&certificate_id=<certificate_id>'
  6. Repeat for the remaining restored node.
If assigning the certificate fails because the certificate revocation list (CRL) verification fails, see VMware Knowledge Base article 78794. If you disable the CRL checking to assign the certificate, after assigning the certificate, you must re-enable the CRL checking.
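The certificate can also be applied with PowerShell instead of curl. A minimal sketch, assuming PowerShell 7 or later; it calls the same node HTTP service endpoint as the curl command above, with the same placeholder values.

  $nsxNodeFqdn = "<NSX_host_node>"
  $certificateId = "<certificate_id>"
  $credential = Get-Credential -UserName "admin" -Message "NSX admin password"
  # Apply the CA-signed certificate to the node's HTTP service.
  Invoke-RestMethod -Method Post -Uri "https://$nsxNodeFqdn/api/v1/node/services/http?action=apply_certificate&certificate_id=$certificateId" -Authentication Basic -Credential $credential -SkipCertificateCheck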

Restart an NSX Manager Node

If the version of NSX in your environment is earlier than NSX 4, then after assigning the certificate, you must restart the new NSX Manager node.
  1. In a Web browser, log in to the management domain vCenter Server by using the vSphere Client.
  2. In the
    Hosts and clusters
    inventory, right-click each restored NSX Manager VM that you updated the certificate on and select
    Guest OS
    Restart
    .

Recover the Management Domain NSX Edge Cluster

Perform the following tasks to recover the management domain NSX Edge Cluster.

Redeploy an NSX Edge Cluster

To recover an NSX edge cluster, use the NSX API to redeploy the edge nodes.

UI Procedure

  1. Verify that the NSX Edge node is disconnected from NSX Manager by running the following API call in Postman.
    GET /<NSX-Manager-IPaddress>/api/v1/transport-nodes/<edgenode_id>/state
    "node_deployment_state": {"state": "MPA_Disconnected"}
  2. Retrieve the edge node configuration by running the following API call and copy the output payload of this API.
    GET /<NSX-Manager-IPaddress>/api/v1/transport-nodes/<edgenode_id>
    "resource_type": "EdgeNode", "id": "9f34c0ea-4aac-4b7f-a02c-62f306f96649", "display_name": "Edge_TN2", "description": "EN", "external_id": "9f34c0ea-4aac-4b7f-a02c-62f306f96649", "ip_addresses": [ "10.170.94.240" ], "_create_user": "admin", "_create_time": 1600106319056, "_last_modified_user": "admin", "_last_modified_time": 1600106907312, "_system_owned": false, "_protection": "NOT_PROTECTED", "_revision": 2 }, "is_overridden": false, "failure_domain_id": "4fc1e3b0-1cd4-4339-86c8-f76baddbaafb", "resource_type": "TransportNode", "id": "9f34c0ea-4aac-4b7f-a02c-62f306f96649", "display_name": "Edge_TN2", "_create_user": "admin", "_create_time": 1600106319399, "_last_modified_user": "admin", "_last_modified_time": 1600106907401, "_system_owned": false, "_protection": "NOT_PROTECTED", "_revision": 1 }
  3. Redeploy the edge node using the following API call, passing the JSON data retrieved in Step 2 as the body. A scripted sketch of this flow appears after this procedure.
    You do not need to pass any passwords in the JSON file.
    POST /<NSX-Manager-IPaddress>/api/v1/transport-nodes/<edgenode_id>?action=redeploy
  4. Repeat steps 1-3 for the remaining edge cluster nodes.
  5. In NSX Manager, monitor the Configuration Status of the new NSX Edge nodes, until they show
    Success
    .
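The Postman flow above can also be scripted. The following is a sketch of the raw API sequence, assuming PowerShell 7 or later and the placeholder values from the API calls above; the PowerShell procedure that follows remains the documented automated path.

  $nsxManager = "<NSX-Manager-IPaddress>"
  $edgeNodeId = "<edgenode_id>"
  $credential = Get-Credential -UserName "admin" -Message "NSX admin password"
  # Steps 1-2: confirm the node is disconnected, then capture its current configuration.
  $state = Invoke-RestMethod -Uri "https://$nsxManager/api/v1/transport-nodes/$edgeNodeId/state" -Authentication Basic -Credential $credential -SkipCertificateCheck
  $config = Invoke-RestMethod -Uri "https://$nsxManager/api/v1/transport-nodes/$edgeNodeId" -Authentication Basic -Credential $credential -SkipCertificateCheck
  # Step 3: redeploy, passing the retrieved configuration back as the request body.
  if ($state.node_deployment_state.state -eq "MPA_Disconnected") {
      Invoke-RestMethod -Method Post -Uri "https://$nsxManager/api/v1/transport-nodes/${edgeNodeId}?action=redeploy" -ContentType "application/json" -Body ($config | ConvertTo-Json -Depth 20) -Authentication Basic -Credential $credential -SkipCertificateCheck
  }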

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    The values in this example are for the management domain. Replace with the values for the specific workload domain you are recovering.
    $restoredNsxManagerFqdn = "sfo-m01-nsx01.sfo.rainpole.io"
    $restoredNsxManagerAdmin = "admin"
    $restoredNsxManagerAdminPassword = "VMw@re1!VMw@re1!"
    $restoredVcenterFqdn = "sfo-m01-vc01.sfo.rainpole.io"
    $restoredvCenterAdmin = "administrator@vsphere.local"
    $restoredvCenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
  3. To recover the edge nodes, perform the configuration by running the command in the PowerShell console.
    Invoke-NSXEdgeClusterRecovery -nsxManagerFqdn $restoredNsxManagerFqdn -nsxManagerAdmin $restoredNsxManagerAdmin -nsxManagerAdminPassword $restoredNsxManagerAdminPassword -vCenterFQDN $restoredVcenterFqdn -vCenterAdmin $restoredvCenterAdmin -vCenterAdminPassword $restoredvCenterAdminPassword -clusterName $clusterName -extractedSDDCDataFile $extractedSDDCDataFile
If you encounter an error
Edge redeploy is blocked as an active alarm is found for Edge VM present in NSX Inventory but missing in vCenter
, navigate to the
Alarms
section in the relevant NSX Manager UI, select all relevant alarms whose
Event Type
is
Edge VM Present In NSX Inventory Not Present In vCenter
and from the vertical ellipsis menu, select
Resolve
. Then, retry the redeploy operation.

Verify the State of the NSX Edge Cluster Nodes

After completing all NSX Edge node redeployments, you must verify the state of the NSX Edge cluster nodes.
  1. In a Web browser, log in to NSX Manager for the domain by using the user interface.
  2. On the main navigation bar, click
    System
    .
  3. In the left pane, under
    Configuration
    , click
    Fabric
    Nodes
    .
  4. Click the
    Edge transport nodes
    tab.
  5. Verify all edge transport nodes show these values.
    Setting
    Value
    Configuration state
    Success
    Node status
    Up
    Tunnels
    Up
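You can perform the same verification through the NSX API. A sketch, assuming PowerShell 7 or later and placeholder values; verify the node_types query parameter against the NSX API reference for your version.

  $nsxManagerFqdn = "sfo-m01-nsx01.sfo.rainpole.io"
  $credential = Get-Credential -UserName "admin" -Message "NSX admin password"
  # List the edge transport nodes and report the realized state of each node.
  $edges = (Invoke-RestMethod -Uri "https://$nsxManagerFqdn/api/v1/transport-nodes?node_types=EdgeNode" -Authentication Basic -Credential $credential -SkipCertificateCheck).results
  foreach ($edge in $edges) {
      $state = Invoke-RestMethod -Uri "https://$nsxManagerFqdn/api/v1/transport-nodes/$($edge.id)/state" -Authentication Basic -Credential $credential -SkipCertificateCheck
      "{0}: {1}" -f $edge.display_name, $state.state
  }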

Post-Recovery Tasks

Perform the following tasks after the workload domain is recovered.

Synchronize the SDDC Manager Service Accounts on the ESXi Host

To ensure that the ESXi host service accounts are in sync with the SDDC Manager inventory, you must manually set a new password on each ESXi host and perform a password remediation operation in SDDC Manager.

UI Procedure

  1. If the service account does not exist, create a new service account on the ESXi host.
    1. Log into the first ESXi host in the cluster using the host client as
      root
      .
    2. Navigate to
      Manage
      Security & users
      Users
      , and click
      Add user
      .
    3. Enter the SDDC Manager service user name and password, and click
      Add
      .
      The service account user name format is svc-vcf-
      esxi_hostname
      .
  2. If the service account exists, set a new password on the ESXi host.
    1. Log in to the first ESXi host in the cluster using the host client as the root user.
    2. In the host client, navigate to
      Manage
      Security & users
      Users
      .
    3. Select the SDDC Manager service account svc-vcf-
      esxi_hostname
      and click
      Edit user
      .
    4. Set a new password and click
      Save
      .
  3. Perform password remediation in SDDC Manager.
    1. Log in to SDDC Manager.
    2. Navigate to
      Security
      Password Management
      ESXi
      .
    3. Locate the service account from Step 1, click the vertical ellipsis, and click
      Remediate
      .
    4. Enter the password used in Step 1 and click
      Remediate
      .
      Wait for the password remediation to complete.
  4. Repeat these steps for all hosts in the workload domain.

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $restoredVcenterFqdn = "sfo-m01-vc01.sfo.rainpole.io"
    $restoredVcenterAdmin = "administrator@vsphere.local"
    $restoredVcenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    $svcAccountPassword = "VMw@re123!"
    $sddcManagerFqdn = "sfo-vcf01.sfo.rainpole.io"
    $sddcManagerAdmin = "administrator@vsphere.local"
    $sddcManagerAdminPassword = "VMw@re1!"
  3. Perform the configuration by running the command in the PowerShell console.
    Resolve-PhysicalHostServiceAccounts -vCenterFQDN $restoredVcenterFqdn -vCenterAdmin $restoredVcenterAdmin -vCenterAdminPassword $restoredVcenterAdminPassword -clusterName $clusterName -svcAccountPassword $svcAccountPassword -sddcManagerFqdn $sddcManagerFqdn -sddcManagerAdmin $sddcManagerAdmin -sddcManagerAdminPassword $sddcManagerAdminPassword

Update the Backup Configuration in SDDC Manager

If the SFTP backup target for SDDC Manager and NSX has changed, you must update the SSH key for the backup configuration in SDDC Manager.
  1. Log into the SDDC Manager UI.
  2. Navigate to
    Administration
    Backup
  3. On the
    Site Settings
    tab, click
    Edit
    .
  4. Enter the new backup target details and click
    Save
    .
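To confirm the change programmatically, you can read the backup configuration back from the SDDC Manager API. A sketch only; the /v1/tokens and /v1/system/backup-configuration endpoints are taken from the VCF public API, so verify them against the API reference for your version.

  $sddcManagerFqdn = "sfo-vcf01.sfo.rainpole.io"
  $tokenBody = @{ username = "administrator@vsphere.local"; password = "VMw@re1!" } | ConvertTo-Json
  # Request a bearer token, then read the current backup configuration.
  $token = (Invoke-RestMethod -Method Post -Uri "https://$sddcManagerFqdn/v1/tokens" -ContentType "application/json" -Body $tokenBody -SkipCertificateCheck).accessToken
  Invoke-RestMethod -Uri "https://$sddcManagerFqdn/v1/system/backup-configuration" -Headers @{ Authorization = "Bearer $token" } -SkipCertificateCheck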

Update the Backup Configuration for vCenter Server

If the backup target for vCenter Server has changed, you must update the backup configuration for each vCenter Server instance.
  1. Log into the appliance management interface of vCenter Server at
    https://<vcenter_fqdn>:5480
    .
  2. Navigate to
    Backup
    and click
    Edit
    .
  3. Enter the new backup target details and click
    Save
    .

Restore Cluster Settings to Recovered vSphere Cluster

After all workloads are recovered on a workload domain vSphere cluster, you restore the cluster settings for virtual machine overrides, locations, and vSphere DRS rules and groups.
  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    The values in this example are for the management domain. Replace with the values for the specific workload domain you are recovering.
    $restoredVcenterFqdn = "sfo-m01-vc01.sfo.rainpole.io"
    $restoredVcenterAdmin = "administrator@vsphere.local"
    $restoredVcenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    $clusterVMOverridesJsonFile = ".\sfo-m01-cl01-vmOverrides.json"
    $clusterVMLocationsJsonFile = ".\sfo-m01-cl01-vmLocations.json"
    $clusterDRSConfigurationJsonFile = ".\sfo-m01-cl01-drsConfiguration.json"
    $clusterVMTagsJsonFile = ".\sfo-m01-cl01-vmTags.json"
  3. Perform the configuration by running the command in the PowerShell console.
    Connect-VIServer -server $restoredVcenterFqdn -user $restoredVcenterAdmin -password $restoredVcenterAdminPassword
    Restore-ClusterVMOverrides -clusterName $clusterName -jsonFile $clusterVMOverridesJsonFile
    Restore-ClusterVMLocations -clusterName $clusterName -jsonFile $clusterVMLocationsJsonFile
    Restore-ClusterDRSGroupsAndRules -clusterName $clusterName -jsonFile $clusterDRSConfigurationJsonFile
    Restore-ClusterVMTags -clusterName $clusterName -jsonFile $clusterVMTagsJsonFile
    Disconnect-VIServer * -confirm:$false
  4. Repeat the procedure for all clusters in the vCenter Server instance.

VI Workload Domain Recovery

This section describes the tasks required to perform a recovery of the VMware Cloud Foundation VI workload domain.

Deploy a vCenter Server Appliance from an OVA File

To restore a vCenter Server instance, you first deploy a new appliance from an OVA file on the temporary vCenter Server, deployed during partial bring-up. You then use this appliance to restore the management domain or VI workload domain.

UI Procedure

  1. Mount the vCenter Server ISO in Windows Explorer.
  2. Log in to the temporary vCenter Server, deployed during the partial bring-up operation, and navigate to the default cluster.
  3. Right-click the cluster and select
    Deploy OVF Template
    .
  4. Navigate to the
    vcsa
    folder on the mounted ISO, select the vCenter Server OVA file, and click
    Next
    .
  5. Enter the virtual machine name and click
    Next
    .
  6. On the
    Compute resource
    page, select the cluster and click
    Next
    .
  7. On the
    Review details
    page, if warnings appear, click
    Ignore
    and click
    Next
    .
  8. On the
    License agreements
    page, click
    I accept all license agreements
    and click
    Next
    .
  9. On the
    Configuration
    page, select the correct size for the vCenter Server, and click
    Next
    .
  10. On the
    Select storage
    page, select the vSAN datastore, and click
    Next
    .
  11. On the
    Select networks
    page, select the management network, and click
    Next
    .
  12. On the
    Customize template
    page enter the following information, and click
    Next
    .
    Section
    Setting
    Value
    Networking Configuration
    Host Network IP Address Family
    ipv4
    Host Network Mode
    static
    Host Network IP Address
    <vcenter_server_ip_address>
    Host Network Prefix
    <management_network_prefix>
    Host Network Default Gateway
    <management_network_gateway>
    Host Network DNS Servers
    <dns_server_comma_separated_list>
    Host Network Identity
    <vcenter_server_fqdn>
    SSO Configuration
    N/A
    N/A
    System Configuration
    Root Password
    <vcenter_server_root_password>
    Upgrade Configuration
    N/A
    N/A
    Miscellaneous
    N/A
    N/A
    Networking Properties
    Domain Name
    <dns_domain_name>
    Domain Search Path
    <domain_search_path_comma_separated_list>
  13. On the
    Ready to complete
    page, review the details and click
    Finish
    .

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $tempVcenterFqdn = "sfo-m01-vc02.sfo.rainpole.io"
    $tempvCenterAdmin = "administrator@vsphere.local"
    $tempvCenterAdminPassword = "VMw@re1!"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
    $workloadDomain = "sfo-m01"
    $restoredvCenterDeploymentSize = "small"
    $vCenterOvaFile = "F:\OVA\VMware-vCenter-Server-Appliance-7.0.3.01400-21477706_OVF10.ova"
  3. Perform the configuration by running the command in the PowerShell console.
    New-vCenterOvaDeployment -vCenterFqdn $tempVcenterFqdn -vCenterAdmin $tempvCenterAdmin -vCenterAdminPassword $tempvCenterAdminPassword -extractedSDDCDataFile $extractedSDDCDataFile -workloadDomain $workloadDomain -restoredvCenterDeploymentSize $restoredvCenterDeploymentSize -vCenterOvaFile $vCenterOvaFile

Deploy the First NSX Manager Cluster Node from an OVA File

When the three nodes in an NSX Manager cluster are in a failed state, you begin the restore process by restoring the first cluster node from an OVA file.
The first NSX Manager node is defined as the node that took the backup of the NSX Manager cluster. To identify this node, examine the name of the folder created during backup, as it will contain the node IP address as part of the folder name.

UI Procedure

  1. In a Web browser, log in to the management domain vCenter Server at
    https://<vcenter_server_fqdn>/ui
    by using the vSphere Client.
  2. Right-click the default cluster of the management domain and select
    Deploy OVF Template
    .
  3. On the
    Select an OVF template
    page, select
    Local file
    , click
    Upload files
    , navigate to the location of the NSX Manager OVA file, click
    Open
    , and click
    Next
    .
  4. On the
    Select a name and folder
    page, enter the VM name and click
    Next
    .
  5. On the
    Select a compute resource
    page, select the cluster and click
    Next
    .
  6. On the
    Review details
    page, click
    Next
    .
  7. On the
    Configuration
    page, select the appropriate size and click
    Next
    .
  8. For the management domain select
    Medium
    and for workload domains select
    Large
    unless you changed these defaults during deployment.
  9. On the
    Select storage
    page, select the management vSAN datastore, and click
    Next
    .
  10. On the
    Select networks
    page, from the
    Destination network
    drop-down menu, select management distributed port group, and click
    Next
    .
  11. On the
    Customize template
    page, enter these values and click
    Next
    .
    Setting
    Value
    System root user password
    <first_nsx_cluster_node_root_password>
    CLI admin user password
    <first_nsx_cluster_node_admin_password>
    You must enter a password. Do not leave the default value.
    CLI audit user password
    <first_nsx_cluster_node_audit_password>
    You must enter a password. Do not leave the default value.
    Hostname
    <first_nsx_cluster_node_fqdn>
    Default IPv4 gateway
    <first_nsx_cluster_node_gw>
    Management network IPv4 address
    <first_nsx_cluster_node_ip>
    Management network netmask
    <first_nsx_cluster_node_mask>
    DNS server list
    <dns_server_list>
    NTP server list
    <ntp_server_list>
    Enable SSH
    Selected
    Allow root SSH logins
    Deselected
  12. On the
    Ready to complete
    page, review the deployment details and click Finish.

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    The values in this example are for the management domain. Replace with the values for the specific workload domain you are recovering.
    $tempvCenterFqdn = "sfo-m01-vc02.sfo.rainpole.io"
    $tempvCenterAdmin = "administrator@vsphere.local"
    $tempvCenterAdminPassword = "VMw@re1!"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
    $workloadDomain = "sfo-m01"
    $restoredNsxManagerDeploymentSize = "medium"
    $nsxManagerOvaFile = "F:\OVA\nsx-unified-appliance-3.2.2.1.0.21487565.ova"
  3. Perform the operation by running the command in the PowerShell console.
    New-NSXManagerOvaDeployment -vCenterFqdn $tempvCenterFqdn -vCenterAdmin $tempvCenterAdmin -vCenterAdminPassword $tempvCenterAdminPassword -extractedSDDCDataFile $extractedSDDCDataFile -workloadDomain $workloadDomain -restoredNsxManagerDeploymentSize $restoredNsxManagerDeploymentSize -nsxManagerOvaFile $nsxManagerOvaFile

Restore NSX Manager from a File-Based Backup

You restore the file-based backup of the first NSX Manager cluster node to the newly-deployed NSX Manager instance.

UI Procedure

  1. In a Web browser, log in to the newly-deployed first NSX Manager cluster node by using the user interface as an
    Enterprise Administrator
    .
  2. On the main navigation bar, click
    System
    .
  3. In the left navigation pane, under
    Lifecycle management
    , click
    Backup & Restore
    .
  4. In the
    NSX configuration
    pane, under
    SFTP server
    , click
    Edit
    .
  5. In the
    Backup configuration
    dialog box, enter the details for your backup server, and click
    Save
    .
  6. Under
    Backup history
    , select the source backup, and click
    Restore
    .
  7. During the restore, when prompted, reject adding NSX Manager nodes by clicking
    I understand
    and
    Resume
    .
  8. If a
    Confirm CM/VC Connectivity
    dialog box appears, click
    I understand the message mentioned above and wish to proceed
    , and click
    Resume
    .
  9. If a fabric node discovery time out dialog box appears, click
    Resume
    to continue the restore.

PowerShell Procedure

  1. Start Windows PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
    $workloadDomain = "sfo-m01"
    $sftpServer = "10.50.5.66"
    $sftpUser = "svc-vcf-bkup"
    $sftpPassword = "VMw@re1!"
    $sftpServerBackupPath = "/media/backups"
    $backupPassphrase = "VMw@re1!VMw@re1!"
  3. Perform the configuration by running the command in the PowerShell console.
    Invoke-NSXManagerRestore -extractedSDDCDataFile $extractedSDDCDataFile -workloadDomain $workloadDomain -sftpServer $sftpServer -sftpUser $sftpUser -sftpPassword $sftpPassword -sftpServerBackupPath $sftpServerBackupPath -backupPassphrase $backupPassphrase

Verify the State of the First NSX Manager Cluster Node

After you restore the first NSX Manager cluster node, you verify the services state from the Web console of the restored node VM.
  1. In a Web browser, log in to the temporary vCenter Server at
    https://<temp_management_vcenter_server_fqdn>/ui
    by using the vSphere Client.
  2. Click the VM name of the newly-deployed first NSX Manager cluster node, click
    Launch Web Console
    , and log in by using administrator credentials.
  3. Run the command to view the cluster status.
    get cluster status
    The services on the single-node NSX Manager cluster appear as
    UP
    .
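The same check is available through the NSX API. A minimal sketch, assuming PowerShell 7 or later and placeholder values.

  $nsxNodeFqdn = "<first_nsx_cluster_node_fqdn>"
  $credential = Get-Credential -UserName "admin" -Message "NSX admin password"
  # Query the overall cluster status of the restored node; expect STABLE once services are up.
  $status = Invoke-RestMethod -Uri "https://$nsxNodeFqdn/api/v1/cluster/status" -Authentication Basic -Credential $credential -SkipCertificateCheck
  $status.mgmt_cluster_status.status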

Restore vCenter Server from a File-Based Backup

Restore the vCenter Server for the workload domain from a file-based backup on the appliance you deployed from an OVA file.

UI Procedure

  1. Log in to the appliance management interface (VAMI) of the vCenter Server at
    https://<vcenter_server_fqdn>:5480
    as
    root
    .
  2. Select
    Restore
    .
  3. Provide the backup server address, protocol and the path to the backup file, and click
    Next
    .
  4. Provide the SSO administrator user name and password, and click
    Next
    .
  5. On the Ready to complete page, review the details and click
    Finish
    .

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $tempvCenterFqdn = "sfo-m01-vc02.sfo.rainpole.io"
    $tempvCenterAdmin = "administrator@vsphere.local"
    $tempvCenterAdminPassword = "VMw@re1!"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
    $workloadDomain = "sfo-m01"
    $vCenterBackupPath = "10.221.78.133/F$/backup/vCenter/sn_sfo-m01-vc01.sfo.rainpole.io/M_8.0.1.00100_20231121-104120_"
    $locationtype = "SMB"
    $locationUser = "Administrator"
    $locationPassword = "VMw@re123!"
  3. Perform the configuration by running the command in the PowerShell console.
    Invoke-vCenterRestore -vCenterFqdn $tempvCenterFqdn -vCenterAdmin $tempvCenterAdmin -vCenterAdminPassword $tempvCenterAdminPassword -extractedSDDCDataFile $extractedSDDCDataFile -workloadDomain $workloadDomain -vCenterBackupPath $vCenterBackupPath -locationtype $locationtype -locationUser $locationUser -locationPassword $locationPassword

Recover the VI Workload Domain vSphere Cluster

Perform the following tasks to recover the VI workload domain vSphere cluster.

Export the Cluster Settings from the Restored vCenter Server

Before you can restore vSphere clusters of the restored vCenter Server, you must first export the settings so they can be reapplied to the restored cluster.

Prerequisites

  • vCenter Server has been restored and powered on.

Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    The values in this example are for the management domain. Replace with the values for the specific workload domain you are recovering.
    $restoredVcenterFqdn = "sfo-m01-vc01.sfo.rainpole.io"
    $restoredVcenterAdmin = "administrator@vsphere.local"
    $restoredVcenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
  3. Perform the configuration by running the command in the PowerShell console.
    Connect-VIServer -server $restoredVcenterFqdn -user $restoredVcenterAdmin -password $restoredVcenterAdminPassword
    Backup-ClusterVMOverrides -clusterName $clusterName
    Backup-ClusterVMLocations -clusterName $clusterName
    Backup-ClusterDRSGroupsAndRules -clusterName $clusterName
    Backup-ClusterVMTags -clusterName $clusterName
    Disconnect-VIServer * -confirm:$false
  4. Save the JSON file output for later use.
  5. Repeat for all clusters in the vCenter Server.

Remove Non-Responsive ESXi Hosts from the Inventory

Before you can repair a failed vSphere cluster, you must first remove the non-responsive hosts from the cluster.

Prerequisites

Ensure you have exported all cluster settings before proceeding.

UI Procedure

  1. Log into the NSX Manager for the workload domain and navigate to
    System
    Fabric
    Hosts
    Clusters
    .
    In NSX 3.x, the relevant navigation path is
    System
    Fabric
    Nodes
    Host Transport Nodes
    and choose the vCenter from
    Managed by
  2. Select the check box next to the relevant vSphere cluster and click
    Remove NSX
    .
  3. Deselect the check box next to the relevant vSphere cluster.
  4. Expand the cluster and wait for all hosts in the cluster to go into an
    Orphaned
    state.
  5. Select the check box that selects all hosts in the cluster without selecting the cluster object itself and select
    Remove NSX
    .
  6. Select the
    Force
    option and submit.
    Wait until all hosts show as unconfigured.
  7. Log in to the vCenter Server with the non-responsive hosts and navigate to the cluster.
  8. Select the cluster, and in the right pane, navigate to the
    Hosts
    tab.
  9. Select the check box for each non-responsive host, right-click the selected hosts and select
    Remove from Inventory
    .
    If the cluster uses vSphere Lifecycle Manager images, wait for about a minute to allow the background tasks in NSX to complete the removal of the NSX solution from the relevant cluster before proceeding to the next step.
  10. Log in to the NSX Manager for the workload domain and navigate back to
    System
    Fabric
    Hosts
    Clusters
    .
    In NSX 3.x, the relevant navigation path is
    System
    Fabric
    Nodes
    Host Transport Nodes
    and choose the vCenter from
    Managed by
  11. Select the check box next to the relevant vSphere cluster and click
    Configure NSX
    .
  12. Select the relevant Transport Node Profile and click Submit.

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    The values in this example are for the management domain. Replace with the values for the specific workload domain you are recovering.
    $restoredVcenterFqdn = "sfo-m01-vc01.sfo.rainpole.io"
    $restoredVcenterAdmin = "administrator@vsphere.local"
    $restoredVcenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    $restoredNsxManagerFqdn = "sfo-m01-nsx01.sfo.rainpole.io"
    $restoredNsxManagerAdmin = "admin"
    $restoredNsxManagerAdminPassword = "VMw@re1!VMw@re1!"
    $restoredNsxManagerRootPassword = "VMw@re1!VMw@re1!"
  3. Perform the configuration by running the command in the PowerShell console.
    Remove-NonResponsiveHosts -vCenterFQDN $restoredVcenterFqdn -vCenterAdmin $restoredVcenterAdmin -vCenterAdminPassword $restoredVcenterAdminPassword -clusterName $clusterName -NsxManagerFQDN $restoredNsxManagerFqdn -NsxManagerAdmin $restoredNsxManagerAdmin -NsxManagerAdminPassword $restoredNsxManagerAdminPassword -NsxManagerRootPassword $restoredNsxManagerRootPassword

Add New Hosts to the Cluster in the Restored vCenter Server

You add new hosts to the cluster in the restored vCenter Server for the VI workload domain.

UI Procedure

  1. Log into the restored vCenter Server by using the vSphere Client.
  2. In the
    Hosts and clusters
    inventory, navigate to the cluster.
  3. Right-click the cluster and select
    Add Hosts
    .
  4. Enter the FQDN, user name, and password for each host to be added, and click
    Next
    .
  5. When prompted to accept the SSL certificate for the new hosts, click
    Accept
    .
  6. Click
    Finish
    .

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $restoredVcenterFqdn = "sfo-w01-vc01.sfo.rainpole.io"
    $restoredVcenterAdmin = "administrator@vsphere.local"
    $restoredVcenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-w01-cl01"
    $sddcManagerFqdn = "sfo-vcf01.sfo.rainpole.io"
    $sddcManagerAdmin = "administrator@vsphere.local"
    $sddcManagerAdminPassword = "VMw@re1!"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
  3. Perform the configuration by running the command in the PowerShell console.
    Add-HostsToCluster -vCenterFQDN $restoredVcenterFqdn -vCenterAdmin $restoredVcenterAdmin -vCenterAdminPassword $restoredVcenterAdminPassword -clusterName $clusterName -sddcManagerFqdn $sddcManagerFqdn -sddcManagerAdmin $sddcManagerAdmin -sddcManagerAdminPassword $sddcManagerAdminPassword -extractedSDDCDataFile $extractedSDDCDataFile

Migrate the ESXi Hosts and VMkernel Adapters to the vSphere Distributed Switch on the Restored vCenter Server

You connect the new cluster hosts to the vSphere Distributed Switch and migrate the VMkernel adapters.

UI Procedure

  1. Log in to the restored vCenter Server by using the vSphere Client.
  2. Add the hosts to the vSphere Distributed Switch.
    1. In
      Networking
      inventory, right-click the vSphere Distributed Switch and select
      Add and Manage Hosts
      .
    2. Select
      Add Hosts
      and click
      Next
      .
    3. On the
      Select hosts
      page, select all the ESXi hosts, and click
      Next
      .
    4. On the
      Manage physical adapters
      page, select a free physical adapter, for example vmnic0, and click
      Assign Uplink
      .
    5. Select uplink1, and click
      Next
      .
    6. On the
      Manage VMkernel adapters
      page, update the following VMkernel adapters to assign them to the appropriate port group on the new distributed switch.
      VMkernel Migration by Domain Type
      Management domain: vmk0 – Management Network port group, vmk1 – vMotion Network port group, vmk2 – vSAN Network port group.
      VI workload domain: vmk0 – Management Network port group; vmk1 and vmk2 are not yet created at this point.
    7. To assign the VMkernel adapters, select the adapter and under actions for the corresponding port group, click
      Assign
      .
    8. Click
      Next
      .
    9. On the
      Migrate VM networking
      step, click
      Next
      .
    10. Review the information on the
      Ready to complete
      page and click
      Finish
      .
  3. If this is a management domain cluster, migrate the management VMs to the original management port group.
    1. Right-click the temporary management port group and select
      Migrate VMs to Another Network
      .
    2. For destination network, select the management port group on the vSphere Distributed Switch, for example sfo-m01-vc01-vds01-management, and click
      Next
      .
    3. On the
      Select VMs to migrate
      page, select all management VMs and click
      Next
      .
    4. On the
      Ready to complete
      page, click
      Finish
      .
  4. Remove the temporary standard switch on each ESXi Host.
    1. Select the first ESXi host and, on the Configure tab, select
      Networking
      Virtual Switches
      .
    2. Expand
      vSwitch0
      and click the horizontal ellipsis.
    3. Click
      Remove
      and click
      Yes
      .
  5. Add additional host uplinks to the vSphere Distributed Switch.
    1. Right-click the distributed switch and select
      Add and Manage Hosts
      .
    2. Select
      Manage Host Networking
      and click
      Next
      .
    3. On the
      Select hosts
      step, select all the ESXi hosts and click
      Next
      .
    4. On the
      Manage physical adapters
      step, select the required free physical adapter(s), for example vmnic1, and from
      Assign Uplink
      select the desired uplink for each corresponding physical adapter, and click
      Next
      .
    5. Click
      Next
      and click
      Next
      .
    6. Review the information on the
      Ready to complete
      page and click
      Finish
      .
If you are running NSX 4.1.2 or later, the NSX installation on each host in the vSphere cluster should self-heal. Monitor the self-healing process until complete in the NSX Manager UI at
System
Fabric
Hosts
Clusters
before proceeding.

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $restoredvCenterFQDN = "sfo-m01-vc01.sfo.rainpole.io"
    $restoredvCenterAdmin = "administrator@vsphere.local"
    $restoredvCenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
  3. Perform the configuration by running the command in the PowerShell console.
    New-RebuiltVdsConfiguration -vCenterFQDN $restoredVcenterFqdn -vCenterAdmin $restoredVcenterAdmin -vCenterAdminPassword $restoredVcenterAdminPassword -clusterName $clusterName -extractedSDDCDataFile $extractedSDDCDataFile
At this point, the NSX installation on each host in the vSphere cluster should self-heal. Monitor the self-healing process until complete in the NSX Manager UI at
System
Fabric
Hosts
Clusters
before proceeding. It might take several minutes for the process to initiate. If the hosts report an error that they are not part of the distributed switch, this means that the self-healing process has not yet started.

Add VMkernel Adapters to the ESXi Hosts

You add vSphere vMotion and vSAN VMkernel adapters to the new ESXi hosts.

UI Procedure

  1. Log in to the restored vCenter Server by using the vSphere Client.
  2. In the
    Hosts and clusters
    inventory, select the first ESXi host and on the Configure tab, select
    Networking
    VMkernel adapters
    .
  3. Click
    Add networking
    .
  4. Select
    VMkernel Network Adapter
    and click
    Next
    .
  5. Select
    Select an existing network
    , select the port group for the VMkernel type, and click
    Next
    .
  6. On the
    Port properties
    page, configure the following, leaving the default values for all other settings, and click
    Next
    .
     For the vMotion VMkernel, set TCP/IP stack to vMotion (Available services: N/A).
     For the vSAN VMkernel, leave TCP/IP stack as Default and select vSAN under Available services.
  7. On the
    IPv4 settings
    page, select
    Use static IPv4 settings
    and enter the IP details for each VMkernel adapter from your system documentation, and click
    Next
    .
  8. On the
    Ready to complete
    page, click
    Finish
    .
  9. Repeat the procedure for each ESXi host in the cluster.

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $restoredvCenterFQDN = "sfo-w01-vc01.sfo.rainpole.io"
    $restoredvCenterAdmin = "administrator@vsphere.local"
    $restoredvCenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-w01-cl01"
    $sddcManagerFqdn = "sfo-vcf01.sfo.rainpole.io"
    $sddcManagerAdmin = "administrator@vsphere.local"
    $sddcManagerAdminPassword = "VMw@re1!"
  3. Perform the configuration by running the command in the PowerShell console.
    Add-VMKernelsToHost -vCenterFQDN $restoredvCenterFQDN -vCenterAdmin $restoredvCenterAdmin -vCenterAdminPassword $restoredvCenterAdminPassword -clusterName $clusterName -sddcManagerFqdn $sddcManagerFqdn -sddcManagerAdmin $sddcManagerAdmin -sddcManagerAdminPassword $sddcManagerAdminPassword

Recreate a vSAN Datastore

To recreate the vSAN datastore, you claim cache and capacity disks to create disk groups.

UI Procedure

  1. Connect to the restored vCenter Server by using the vSphere Client.
  2. In the
    Hosts and clusters
    inventory, navigate to the cluster and click
    Configure
    .
  3. Under
    vSAN
    , click
    Disk Management
    , and click
    Claim Unused Disks
    .
  4. Select the relevant disks for each tier and click
    Claim
    .
    The process to claim the disks and create the vSAN datastore might take some time. Wait for it to complete.
  5. After the disk groups are created, right-click the datastore and select
    Rename
    .
  6. Enter the original datastore name and click
    OK
    .

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    The values in this example are for the management domain. Replace with the values for the specific workload domain you are recovering.
    $restoredVcenterFqdn = "sfo-m01-vc01.sfo.rainpole.io"
    $restoredvCenterAdmin = "administrator@vsphere.local"
    $restoredvCenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
  3. Perform the configuration by running the command in the PowerShell console.
    New-RebuiltVsanDatastore -vCenterFqdn $restoredVcenterFqdn -vCenterAdmin $restoredVcenterAdmin -vCenterAdminPassword $restoredVcenterAdminPassword -clusterName $clusterName -extractedSDDCDataFile $extractedSDDCDataFile

Resolve vSphere HA on Cluster

After recreating a vSAN datastore, you disable and re-enable vSphere HA on the cluster so that vSphere Cluster Services (vCLS) can self-heal.

UI Procedure

  1. Connect to the restored vCenter Server by using the vSphere Client.
  2. In the
    Hosts and clusters
    inventory, navigate to the cluster and click
    Configure
    .
  3. Under
    Services
    , click
    vSphere Availability
    , and under
    vSphere HA is Turned ON
    click the
    Edit
    button.
  4. Switch off the
    vSphere HA
    toggle button and click
    OK
    .
  5. Under
    Services
    , click
    vSphere Availability
    , and under
    vSphere HA is Turned ON
    click the
    Edit
    button.
  6. Switch on the
    vSphere HA
    toggle button and click
    OK
    .

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    The values in this example are for the management domain. Replace with the values for the specific workload domain you are recovering.
    $restoredVcenterFqdn = "sfo-m01-vc01.sfo.rainpole.io"
    $restoredvCenterAdmin = "administrator@vsphere.local"
    $restoredvCenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
  3. Perform the configuration by running the commands in the PowerShell console.
    Connect-VIServer -server $restoredVcenterFqdn -user $restoredVcenterAdmin -password $restoredVcenterAdminPassword | Out-Null
    Get-Cluster $clusterName | Set-Cluster -HAEnabled $false -confirm:$false | Out-Null
    Get-Cluster $clusterName | Set-Cluster -HAEnabled $true -confirm:$false | Out-Null
    Disconnect-VIServer * -confirm:$false

Apply Licensing to Cluster Hosts

The hosts in the recovered cluster will be running evaluation licenses and should have permanent licenses assigned.

UI Procedure

  1. Connect to the restored vCenter Server and license the hosts.
  2. In the vCenter Server UI, navigate to
    Home
    Administration
    Licenses
    Assets
    Hosts
    .
  3. Select all hosts that are part of the recovered cluster and click
    Assign License
    .
  4. Choose a valid existing license and click
    OK
    .
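Licenses can also be assigned in bulk with PowerCLI. A sketch, assuming a valid license key and placeholder connection values.

  Connect-VIServer -Server "sfo-w01-vc01.sfo.rainpole.io" -User "administrator@vsphere.local" -Password "VMw@re1!"
  # Assign the license key to every host in the recovered cluster.
  Get-Cluster -Name "sfo-w01-cl01" | Get-VMHost | Set-VMHost -LicenseKey "XXXXX-XXXXX-XXXXX-XXXXX-XXXXX"
  Disconnect-VIServer * -Confirm:$false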

Recover the VI Workload Domain NSX Manager Cluster

Perform the following tasks to recover the VI workload domain NSX Manager Cluster.

Deactivate the NSX Manager Cluster

If you are using a version of NSX that is earlier than NSX 4.x, after you restore the first node of the NSX Manager cluster, you must deactivate the cluster.
  1. In a Web browser, log in to the management domain vCenter Server
    https://<vcenter_server_fqdn>/ui
    by using the vSphere Client.
  2. Click the VM of the operational NSX Manager node in the cluster, click
    Launch Web Console
    , and log in by using
    administrator
    credentials.
  3. Run the command to deactivate the cluster.
    deactivate cluster
  4. In the
    Are you sure you want to remove all other nodes from this cluster? (yes/no)
    prompt, enter
    yes
    .

Redeploy a Failed NSX Manager Node

You deploy a new NSX Manager instance by using the configuration of the failed node.

UI Procedure

  1. In a Web browser, log in to the management domain vCenter Server by using the vSphere Client.
  2. In the
    Hosts and clusters
    inventory, right-click the management cluster and select
    Deploy OVF Template
    .
  3. On the
    Select an OVF template
    page, select
    Local file
    , click
    Upload files
    , navigate to the location of the NSX Manager OVA file, click
    Open
    , and click
    Next
    .
  4. On the
    Select a name and folder
    page, enter the VM name and click
    Next
    .
  5. On the
    Select a compute resource
    page, select the cluster and click
    Next
    .
  6. On the
    Review details
    page, click
    Next
    .
  7. On the
    Configuration
    page, for the management domain, select
    Medium
    , and for VI workload domains, select
    Large
    , unless you changed these defaults during deployment, and click
    Next
    .
  8. On the
    Select storage
    page, select the management vSAN datastore, and click
    Next
    .
  9. On the
    Select networks
    page, from the
    Destination network
    drop-down menu, select management distributed port group, and click
    Next
    .
  10. On the
    Customize template
    page, enter these values and click
    Next
    .
    Setting
    Value
    System root user password
    <failed_nsx_cluster_node_root_password>
    CLI admin user password
    <failed_nsx_cluster_node_admin_password>
    CLI audit user password
    <failed_nsx_cluster_node_audit_password>
    Hostname
    <failed_nsx_cluster_node_fqdn>
    Default IPv4 gateway
    <failed_nsx_cluster_node_gw>
    Management network IPv4 address
    <failed_nsx_cluster_node_ip>
    Management network netmask
    <failed_nsx_cluster_node_mask>
    DNS server list
    <dns_server_list>
    NTP server list
    <ntp_server_list>
    Enable SSH
    Selected
    Allow root SSH logins
    Deselected
  11. On the
    Ready to complete
    page, review the deployment details and click
    Finish
    .
  12. Repeat for the remaining failed node.

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    The values in this example are for the management domain. Replace with the values for the specific workload domain you are recovering.
    $tempvCenterFqdn = "sfo-m01-vc02.sfo.rainpole.io"
    $tempvCenterAdmin = "administrator@vsphere.local"
    $tempvCenterAdminPassword = "VMw@re1!"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
    $workloadDomain = "sfo-m01"
    $restoredNsxManagerDeploymentSize = "medium"
    $nsxManagerOvaFile = "F:\OVA\nsx-unified-appliance-3.2.2.1.0.21487565.ova"
  3. Perform the operation by running the command in the PowerShell console.
    New-NSXManagerOvaDeployment -vCenterFqdn $tempvCenterFqdn -vCenterAdmin $tempvCenterAdmin -vCenterAdminPassword $tempvCenterAdminPassword -extractedSDDCDataFile $extractedSDDCDataFile -workloadDomain $workloadDomain -restoredNsxManagerDeploymentSize $restoredNsxManagerDeploymentSize -nsxManagerOvaFile $nsxManagerOvaFile
  4. Repeat for the remaining failed node.

Join NSX Manager Nodes to the NSX Manager Cluster

You retrieve the ID and API thumbprint of the NSX Manager cluster, and use them to join the newly-deployed NSX Manager instance to the cluster.

UI Procedure

  1. In a Web browser, log in to the management domain vCenter Server by using the vSphere Client.
  2. In the
    VMs and templates
    inventory, click the VM of an operational NSX Manager node in the cluster, click
    Launch web console
    , and log in by using
    administrator
    credentials.
  3. Retrieve the ID of the NSX Manager cluster.
    1. Run the command to view the cluster ID.
      get cluster config | find Id:
    2. Write down the cluster ID.
  4. Retrieve the API thumbprint of the NSX Manager API certificate.
    1. Run the command to view the certificate API thumbprint.
      get certificate api thumbprint
    2. Write down the certificate API thumbprint.
  5. Close the VM Web console.
  6. In the vSphere Client, click the VM of the newly deployed NSX Manager node, click
    Launch Web console
    , and log in by using
    administrator
    credentials.
  7. Run the command to join the new NSX Manager node to the cluster.
    join <new_node_ip> cluster-id <cluster_id> thumbprint <api_thumbprint> username admin
  8. Repeat for the remaining failed node.

PowerShell Procedure

  1. Start Windows PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    The values in this example are for the management domain. Replace with the values for the specific workload domain you are recovering.
    $workloadDomain = "sfo-m01"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
  3. Perform the operation by running the command in the PowerShell console.
    Add-AdditionalNSXManagers -workloadDomain $workloadDomain -extractedSDDCDataFile $extractedSDDCDataFile

Restore the SSL Certificate of NSX Manager Node

If the version of NSX in your environment is earlier than NSX 4, then after you add the new NSX Manager node to the cluster and validate the cluster status, you must restore the CA-signed SSL certificate of the node.
To view the certificate of the failed NSX Manager cluster node, you log in to the NSX Manager for the domain.
  1. In a Web browser, log in to NSX Manager for the management domain by using the user interface.
  2. On the main navigation bar, click
    System
    .
  3. In the left pane, under
    Settings
    , click
    Certificates
    .
  4. Locate and copy the ID of the certificate that is issued by CA to the node that you are restoring.
  5. Run the command to install the CA-signed certificate on the new NSX Manager node.
    curl -H 'Accept: application/json' -H 'Content-Type: application/json' \
      --insecure -u 'admin:<NSX_admin_password>' -X POST \
      'https://<NSX_host_node>/api/v1/node/services/http?action=apply_certificate&certificate_id=<certificate_id>'
  6. Repeat for the remaining restored node.
If assigning the certificate fails because the certificate revocation list (CRL) verification fails, see VMware Knowledge Base article 78794. If you disable the CRL checking to assign the certificate, after assigning the certificate, you must re-enable the CRL checking.

Restart an NSX Manager Node

If the version of NSX in your environment is earlier than NSX 4, then after assigning the certificate, you must restart the new NSX Manager node.
  1. In a Web browser, log in to the management domain vCenter Server by using the vSphere Client.
  2. In the
    Hosts and clusters
    inventory, right-click each restored NSX Manager VM that you updated the certificate on and select
    Guest OS
    Restart
    .

Recover the VI Workload Domain NSX Edge Cluster

Perform the following tasks to recover the VI workload domain NSX Edge Cluster.

Redeploy an NSX Edge Cluster

To recover an NSX edge cluster, use the NSX API to redeploy the edge nodes.

UI Procedure

  1. Verify that the NSX Edge node is disconnected from NSX Manager by running the following API call in Postman.
    GET /<NSX-Manager-IPaddress>/api/v1/transport-nodes/<edgenode_id>/state
    "node_deployment_state": {"state": "MPA_Disconnected"}
  2. Retrieve the edge node configuration by running the following API call and copy the output payload of this API.
    GET /<NSX-Manager-IPaddress>/api/v1/transport-nodes/<edgenode_id>
    "resource_type": "EdgeNode", "id": "9f34c0ea-4aac-4b7f-a02c-62f306f96649", "display_name": "Edge_TN2", "description": "EN", "external_id": "9f34c0ea-4aac-4b7f-a02c-62f306f96649", "ip_addresses": [ "10.170.94.240" ], "_create_user": "admin", "_create_time": 1600106319056, "_last_modified_user": "admin", "_last_modified_time": 1600106907312, "_system_owned": false, "_protection": "NOT_PROTECTED", "_revision": 2 }, "is_overridden": false, "failure_domain_id": "4fc1e3b0-1cd4-4339-86c8-f76baddbaafb", "resource_type": "TransportNode", "id": "9f34c0ea-4aac-4b7f-a02c-62f306f96649", "display_name": "Edge_TN2", "_create_user": "admin", "_create_time": 1600106319399, "_last_modified_user": "admin", "_last_modified_time": 1600106907401, "_system_owned": false, "_protection": "NOT_PROTECTED", "_revision": 1 }
  3. Redeploy the edge node using the following API call, passing the JSON data retrieved in Step 2 as the body.
    You do not need to pass any passwords in the JSON file.
    POST /<NSX-Manager-IPaddress>/api/v1/transport-nodes/<edgenode_id>?action=redeploy
  4. Repeat steps 1-3 for the remaining edge cluster nodes.
  5. In NSX Manager, monitor the Configuration Status of the new NSX Edge nodes, until they show
    Success
    .

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    The values in this example are for the management domain. Replace with the values for the specific workload domain you are recovering.
    $restoredNsxManagerFqdn = "sfo-m01-nsx01.sfo.rainpole.io"
    $restoredNsxManagerAdmin = "admin"
    $restoredNsxManagerAdminPassword = "VMw@re1!VMw@re1!"
    $restoredVcenterFqdn = "sfo-m01-vc01.sfo.rainpole.io"
    $restoredvCenterAdmin = "administrator@vsphere.local"
    $restoredvCenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    $extractedSDDCDataFile = ".\extracted-sddc-data.json"
  3. To recover the edge nodes, perform the configuration by running the command in the PowerShell console.
    Invoke-NSXEdgeClusterRecovery -nsxManagerFqdn $restoredNsxManagerFqdn -nsxManagerAdmin $restoredNsxManagerAdmin -nsxManagerAdminPassword $restoredNsxManagerAdminPassword -vCenterFQDN $restoredVcenterFqdn -vCenterAdmin $restoredvCenterAdmin -vCenterAdminPassword $restoredvCenterAdminPassword -clusterName $clusterName -extractedSDDCDataFile $extractedSDDCDataFile
If you encounter the error Edge redeploy is blocked as an active alarm is found for Edge VM present in NSX Inventory but missing in vCenter, navigate to the Alarms section in the relevant NSX Manager UI, select all relevant alarms whose Event Type is Edge VM Present In NSX Inventory Not Present In vCenter, and from the vertical ellipsis menu, select Resolve. Then, retry the redeploy operation.
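The alarms can also be resolved programmatically. The following is a hypothetical PowerShell 7 sketch that assumes the NSX /api/v1/alarms endpoint and its action=RESOLVE parameter behave as in recent NSX versions; the event type identifier is a placeholder that you must look up in your NSX Manager.
  # Sketch for resolving the blocking alarms before retrying the redeploy.
  $nsx       = "<NSX-Manager-IPaddress>"
  $eventType = "<event_type_for_Edge_VM_Present_In_NSX_Inventory_Not_Present_In_vCenter>"
  $cred      = Get-Credential -UserName "admin" -Message "NSX admin password"
  # List alarms and keep the open ones with the blocking event type.
  $alarms   = Invoke-RestMethod -Method Get -SkipCertificateCheck -Authentication Basic -Credential $cred `
      -Uri "https://$nsx/api/v1/alarms"
  $blocking = $alarms.results | Where-Object { $_.status -eq "OPEN" -and $_.event_type -eq $eventType }
  # Resolve each matching alarm, then retry the redeploy operation.
  foreach ($alarm in $blocking) {
      Invoke-RestMethod -Method Post -SkipCertificateCheck -Authentication Basic -Credential $cred `
          -Uri "https://$nsx/api/v1/alarms/$($alarm.id)?action=RESOLVE"
  }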

Verify the State of the NSX Edge Cluster Nodes

After completing all NSX Edge node redeployments, you must verify the state of the NSX Edge cluster nodes.
  1. In a Web browser, log in to NSX Manager for the domain by using the user interface.
  2. On the main navigation bar, click
    System
    .
  3. In the left pane, under
    Configuration
    , click
    Fabric
    Nodes
    .
  4. Click the
    Edge transport nodes
    tab.
  5. Verify that all edge transport nodes show these values.
    • Configuration state: Success
    • Node status: Up
    • Tunnels: Up
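As an optional spot check, you can query the same information through the NSX API. The following is a minimal PowerShell 7 sketch, assuming the transport node status endpoint returns a top-level status field as in recent NSX versions; the address and node ID are placeholders.
  # Query the consolidated status of one edge transport node.
  $nsx        = "<NSX-Manager-IPaddress>"
  $edgeNodeId = "<edgenode_id>"
  $cred       = Get-Credential -UserName "admin" -Message "NSX admin password"
  $status = Invoke-RestMethod -Method Get -SkipCertificateCheck -Authentication Basic -Credential $cred `
      -Uri "https://$nsx/api/v1/transport-nodes/$edgeNodeId/status"
  $status.status        # expect UP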

Post-Recovery Tasks

Perform the following tasks after the workload domain is recovered.

Synchronize the SDDC Manager Service Accounts on the ESXi Host

To ensure that the ESXi host service accounts are in sync with the SDDC Manager inventory, you must manually set a new password on each ESXi host and perform a password remediation operation in SDDC Manager.

UI Procedure

  1. If the service account does not exist, create a new service account on the ESXi host.
    1. Log in to the first ESXi host in the cluster by using the host client as
      root
      .
    2. Navigate to
      Manage
      Security & users
      Users
      , and click
      Add user
      .
    3. Enter the SDDC Manager service user name and password, and click
      Add
      .
      The service account user name format is svc-vcf-
      esxi_hostname
      .
  2. If the service account exists, set a new password on the ESXi host.
    1. Log in to the first ESXi host in the cluster by using the host client as the root user.
    2. Navigate to
      Manage
      Security & users
      Users
      .
    3. Select the SDDC Manager service account svc-vcf-
      esxi_hostname
      and click
      Edit user
      .
    4. Set a new password and click
      Save
      .
  3. Perform password remediation in SDDC Manager.
    1. Log in to SDDC Manager.
    2. Navigate to
      Security
      Password Management
      ESXi
      .
    3. Locate the service account from Step 1 or Step 2, click the vertical ellipsis, and click
      Remediate
      .
    4. Enter the password that you set in Step 1 or Step 2 and click
      Remediate
      .
      Wait for the password remediation to complete.
  4. Repeat these steps for all hosts in the workload domain.

PowerShell Procedure

  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    $restoredVcenterFqdn = "sfo-m01-vc01.sfo.rainpole.io"
    $restoredVcenterAdmin = "administrator@vsphere.local"
    $restoredVcenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    $svcAccountPassword = "VMw@re123!"
    $sddcManagerFqdn = "sfo-vcf01.sfo.rainpole.io"
    $sddcManagerAdmin = "administrator@vsphere.local"
    $sddcManagerAdminPassword = "VMw@re1!"
  3. Perform the configuration by running the command in the PowerShell console.
    Resolve-PhysicalHostServiceAccounts -vCenterFQDN $restoredVcenterFqdn -vCenterAdmin $restoredVcenterAdmin -vCenterAdminPassword $restoredVcenterAdminPassword -clusterName $clusterName -svcAccountPassword $svcAccountPassword -sddcManagerFqdn $sddcManagerFqdn -sddcManagerAdmin $sddcManagerAdmin -sddcManagerAdminPassword $sddcManagerAdminPassword

Update the Backup Configuration in SDDC Manager

If the SFTP backup target for SDDC Manager and NSX has changed, you must update the SSH key for the backup configuration in SDDC Manager.
  1. Log in to the SDDC Manager user interface.
  2. Navigate to
    Administration
    Backup
    .
  3. On the
    Site Settings
    tab, click
    Edit
    .
  4. Enter the new backup target details and click
    Save
    .
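To confirm the change from a script, you can read the configuration back through the PowerVCF community module. The following is a minimal sketch, assuming the module's Request-VCFToken and Get-VCFBackupConfiguration cmdlets; the SDDC Manager values are the examples used elsewhere in this guide.
  # Review the backup configuration that SDDC Manager currently holds.
  Import-Module PowerVCF
  Request-VCFToken -fqdn "sfo-vcf01.sfo.rainpole.io" -username "administrator@vsphere.local" -password "VMw@re1!"
  Get-VCFBackupConfiguration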

Update the Backup Configuration for vCenter Server

If the backup target for vCenter Server has changed, you must update the backup configuration for each vCenter Server instance.
  1. Log in to the appliance management interface of vCenter Server at
    https://<vcenter_fqdn>:5480
    .
  2. Navigate to
    Backup
    and click
    Edit
    .
  3. Enter the new backup target details and click
    Save
    .
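You can verify the resulting schedule through the vCenter Server REST API. The following is a minimal PowerShell 7 sketch, assuming the /api/session and /api/appliance/recovery/backup/schedules endpoints available in recent vCenter Server versions; the FQDN is a placeholder.
  # Create an API session and list the configured backup schedules.
  $vc    = "<vcenter_fqdn>"
  $cred  = Get-Credential -UserName "administrator@vsphere.local"
  $token = Invoke-RestMethod -Method Post -SkipCertificateCheck -Authentication Basic -Credential $cred `
      -Uri "https://$vc/api/session"
  # The response includes the backup location and recurrence details.
  Invoke-RestMethod -Method Get -SkipCertificateCheck -Uri "https://$vc/api/appliance/recovery/backup/schedules" `
      -Headers @{ 'vmware-api-session-id' = $token }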

Restore Cluster Settings to Recovered vSphere Cluster

After all workloads are recovered on a workload domain vSphere cluster, you restore the cluster settings for virtual machine overrides, VM locations, vSphere DRS rules and groups, and VM tags.
  1. Start PowerShell.
  2. Replace the values in the sample code with your values and run the commands in the PowerShell console.
    The values in this example are for the management domain. Replace with the values for the specific workload domain you are recovering.
    $restoredVcenterFqdn = "sfo-m01-vc01.sfo.rainpole.io"
    $restoredVcenterAdmin = "administrator@vsphere.local"
    $restoredVcenterAdminPassword = "VMw@re1!"
    $clusterName = "sfo-m01-cl01"
    $clusterVMOverridesJsonFile = ".\sfo-m01-cl01-vmOverrides.json"
    $clusterVMLocationsJsonFile = ".\sfo-m01-cl01-vmLocations.json"
    $clusterDRSConfigurationJsonFile = ".\sfo-m01-cl01-drsConfiguration.json"
    $clusterVMTagsJsonFile = ".\sfo-m01-cl01-vmTags.json"
  3. Perform the configuration by running the command in the PowerShell console.
    Connect-VIServer -server $restoredVcenterFqdn -user $restoredVcenterAdmin -password $restoredVcenterAdminPassword
    Restore-ClusterVMOverrides -clusterName $clusterName -jsonFile $clusterVMOverridesJsonFile
    Restore-ClusterVMLocations -clusterName $clusterName -jsonFile $clusterVMLocationsJsonFile
    Restore-ClusterDRSGroupsAndRules -clusterName $clusterName -jsonFile $clusterDRSConfigurationJsonFile
    Restore-ClusterVMTags -clusterName $clusterName -jsonFile $clusterVMTagsJsonFile
    Disconnect-VIServer * -confirm:$false
  4. Repeat the procedure for all clusters in the vCenter Server instance.
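After the restore, you can spot-check the recovered DRS rules with PowerCLI. The following is a minimal sketch that reuses the variables from Step 2; Get-DrsRule is a standard PowerCLI cmdlet, and the property selection is illustrative.
  # List the DRS rules on the cluster to confirm they were restored.
  Connect-VIServer -server $restoredVcenterFqdn -user $restoredVcenterAdmin -password $restoredVcenterAdminPassword
  Get-DrsRule -Cluster $clusterName | Select-Object Name, Enabled, Type
  Disconnect-VIServer * -confirm:$false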