One of the most downloaded scripts on my blog over the past 2 years has been the Rubrik Recovery Plan, which orchestrates the failover, testing, and ordering of VM live mounts from Rubrik. Want to recover 100 VMs from scratch to VMware Tools running in less than 30 minutes? If so, you’ve come to the right place, so read on…
With the recent announcement of Rubrik App Flows (a SaaS-based DR product for orchestrating failover/failback from VMware to AWS), this script becomes even more valuable, as it focuses solely on the most common use case across the Rubrik customer base: VMware to VMware.
Demand drives innovation, so I recently spent some time adding some seriously valuable features. So many, in fact, that I’ve jumped the version from v3 to v5! This also brings it in line with the CDM version it is compatible with (CDM 5.x+), because it uses the new capabilities for recovering VM tags and keeping MAC addresses. It also adds user-acceptance testing, auto-testing and new email templates.
To get started download it from here:
Unzip to C:\RubrikRecoveryPlanv5\ or your desired location. As mentioned above, Rubrik CDM v5 is required. At the start of each script, change $RubrikCluster, $LogDirectory, $EmailServer, $EmailTo and $EmailFrom to match your environment. On first run you’ll be prompted for credentials, which are saved encrypted in an XML file for subsequent runs.
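To make the setup step concrete, here is a minimal sketch of what the settings block and the prompt-once credential caching could look like. The values are placeholders, and Get-SavedCredential is a hypothetical helper of mine for illustration; the scripts in the zip may implement this differently. It relies on Export-Clixml, which on Windows encrypts the password so that only the same user on the same machine can read it back.

```powershell
# Placeholder values: replace with the details of your own environment.
$RubrikCluster = "rubrik.example.local"
$LogDirectory  = "C:\RubrikRecoveryPlanv5\Logs"
$EmailServer   = "smtp.example.local"
$EmailTo       = "dr-team@example.com"
$EmailFrom     = "rubrik-dr@example.com"

# Hypothetical helper (not taken from the scripts): return the credential
# saved at $Path, or prompt once and save it there for subsequent runs.
function Get-SavedCredential {
    param(
        [Parameter(Mandatory)] [string] $Path
    )
    if (Test-Path -Path $Path) {
        # A credential file already exists, so no prompt is needed.
        return Import-Clixml -Path $Path
    }
    $credential = Get-Credential -Message "Enter your Rubrik credentials"
    # On Windows the password is stored encrypted for the current user only.
    $credential | Export-Clixml -Path $Path
    return $credential
}
```

You would call it with a file path of your choosing, for example `Get-SavedCredential -Path "C:\RubrikRecoveryPlanv5\RubrikCred.xml"` (an assumed file name). Delete the file to be prompted again.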
You can also configure $LogoFile to customize the email reports with your own company logo. Here’s the result of me recovering 100 Windows 2016 VMs using a 4-node r344 Rubrik cluster to my 3 ESXi host HyperConverged Home Lab 2.0:
Success per VM is defined as the VM powering on and VMware Tools starting (if installed). This is verified by checking the progress of the live mounts once all have been requested. Pretty impressive, huh? Here are the CSV data file outputs proving the recovery (times are in UTC):
Within the above zip you’ll find the following files:
- Sample CSV recovery plan, or generate your own using the ExportVMList.ps1 tool below. For HostSelection (ESXi), use the hostname as shown in vCenter, or RANDOM to auto-select.
- Runs the recovery plan CSV, verifies the result, sends an email, removes the live mounts.
- Runs the recovery plan CSV, verifies VM recovery, prompts for user acceptance/notes, sends an email, removes the live mounts.
- Runs the recovery plan CSV, verifies VM recovery, sends an email.
- Removes the live mounts created by the VMs in the recovery plan CSV.
- Generates a recovery plan CSV with sample default values containing every protected Rubrik VM.
- Generates a CSV containing each VM NIC, its connection state and port group, amongst other useful info, using vSphere 6.5+ REST APIs. Edit the State and PortGroup fields, then run the Import script to make the desired changes. Can be used standalone as a bulk VM NIC editor or in conjunction with the -ManualTest.ps1 for verifying vSphere port reconfiguration (by default Rubrik uses the port group name of the protected VM, so if the names match between sites this isn’t needed).
- Runs through each VM NIC in the CSV generated by the export script using vSphere 6.5+ REST APIs. It connects the NIC if State is set to CONNECTED and the NIC isn’t connected already, and it changes the port group to the PortGroup name given. Can be used standalone as a bulk VM NIC editor or, as above, with the -ManualTest.ps1.
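The NIC import logic described in the last two items can be sketched roughly as follows. This is my illustration and not code lifted from the scripts: Get-NicChangePlan is a hypothetical helper, and VMId, NicId, BackingType and NetworkId are assumed column names (only State and PortGroup are named above). It only works out which vSphere REST calls a row would need; the paths follow the vSphere 6.5 `/rest/vcenter/vm/{vm}/hardware/ethernet/{nic}` layout as I understand it, so check them against your vCenter API explorer before relying on them.

```powershell
# Hypothetical helper: decide which REST calls one edited CSV row needs.
# $Current is the NIC as vCenter reports it now; $Desired is the CSV row.
function Get-NicChangePlan {
    param(
        [Parameter(Mandatory)] $Current,
        [Parameter(Mandatory)] $Desired
    )
    $base = "/rest/vcenter/vm/$($Desired.VMId)/hardware/ethernet/$($Desired.NicId)"

    # Re-point the NIC only when the port group name actually differs.
    if ($Desired.PortGroup -ne $Current.PortGroup) {
        [pscustomobject]@{
            Method = "PATCH"
            Path   = $base
            Body   = @{ spec = @{ backing = @{
                        type    = $Desired.BackingType   # e.g. STANDARD_PORTGROUP
                        network = $Desired.NetworkId     # vCenter network identifier
                     } } }
        }
    }

    # Connect only when the CSV asks for it and the NIC is not connected yet.
    if ($Desired.State -eq "CONNECTED" -and $Current.State -ne "CONNECTED") {
        [pscustomobject]@{ Method = "POST"; Path = "$base/connect"; Body = $null }
    }
}
```

Separating the decision from the HTTP calls means you can review the plan for every row before anything is changed in vCenter. A row that already matches produces no steps at all.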
Here’s a complete list of all the other new features I added:
- Support for recovering vSphere tags and keeping MAC addresses, matching the Rubrik CDM v5.x interface configuration options (configured by CSV)
- Enhanced real-time progress reporting of recovery plan actions
- Snazzy new email template with configurable logo (recommended 150 by X pixels) with color-coded rows for failed VMs/tasks
- Raw test result data attached to each email and exported to CSV
- Configurable to only email on failures (useful for auto-testing)
- VM names contain hyperlink to corresponding protected VM object in Rubrik UI
- Rubrik cluster name contains hyperlink to Rubrik cluster being used
- Start and end times for plan, VMs and each action
- RTO benchmarking per VM and for entire recovery plan
- VM failure messages showing reason if a live mount fails in details column
- Detailed recovery plan log showing each action as part of the live mount
- Load balancing summary of live mounts per ESXi host and Rubrik node
- Manual-testing with user acceptance and note entry
- Auto-testing end-to-end with mount and removal of VMs
- Support for mounting the same source VM multiple times (auto-increments the name with a random ID, allows you to easily test load)
- Enhanced unmount process now waits for each unmount to finish before proceeding to the next, to avoid overloading the Rubrik API or vCenter with cleanup tasks (as cleanup success is more important than speed)
If you have any feedback, questions or issues, reach out to me via the Drift chat. If you found this useful, please like and share. Happy scripting, @JoshuaStenhouse
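The one-at-a-time unmount behavior in the last item boils down to a request, poll, then proceed loop. Here is a minimal sketch of that pattern under stated assumptions: the function names are mine, the two script blocks stand in for whatever API calls you use to request an unmount and to check its status, and the SUCCEEDED/FAILED status strings are assumed values rather than something confirmed by the scripts.

```powershell
# Poll one request until it finishes or the timeout passes.
function Wait-ForRequest {
    param(
        [Parameter(Mandatory)] [scriptblock] $GetStatus,  # takes a request ID, returns a status string
        [Parameter(Mandatory)] [string] $RequestId,
        [int] $PollSeconds = 5,
        [int] $TimeoutSeconds = 600
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ($true) {
        $status = & $GetStatus $RequestId
        if ($status -in @("SUCCEEDED", "FAILED")) { return $status }   # assumed final states
        if ((Get-Date) -ge $deadline) { return "TIMEOUT" }
        Start-Sleep -Seconds $PollSeconds
    }
}

# Unmount one VM at a time: request it, wait for the result, then move on.
function Remove-MountsSequentially {
    param(
        [Parameter(Mandatory)] [string[]] $MountIds,
        [Parameter(Mandatory)] [scriptblock] $RequestUnmount,  # takes a mount ID, returns a request ID
        [Parameter(Mandatory)] [scriptblock] $GetStatus,
        [int] $PollSeconds = 5
    )
    foreach ($mountId in $MountIds) {
        $requestId = & $RequestUnmount $mountId
        $result = Wait-ForRequest -GetStatus $GetStatus -RequestId $requestId -PollSeconds $PollSeconds
        [pscustomobject]@{ MountId = $mountId; Result = $result }
    }
}
```

Because each unmount is only requested after the previous one has finished, there is never more than one cleanup task in flight. That is slower than firing them all at once, which is the trade-off described above.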