Virtually Sober

If there is free booze and Virtualization; I'm there!

Scripting a Rubrik Recovery Plan using REST APIs & PowerShell

Following hot on the heels of my first post, “Introduction to Rubrik REST APIs using PowerShell & Swagger”, I’d now like to show you how to easily automate the recovery and boot ordering of VMs as a Recovery Plan.

In the Rubrik HTML5 interface, you can recover any VM in just a few clicks, with the VMs running on a whopping 30,000 IOPS per brik (Rubrik appliance), giving you a sub-1-minute RTO. However, at scale, clicking on each VM to recover it becomes tedious and hard to manage, and it always requires human interaction. This is where using PowerShell to interact with REST APIs makes your life easier, by automating the entire process for anything from 1 to 10,000+ VMs. Use cases include:

  • Disaster recovery and failover testing of VMs replicated between Rubrik clusters
  • Recovery from production storage outages in a controlled manner
  • Bring multi-VM applications online in a working state with pre-configured time delays between VMs
  • Automatically create temporary dev/test VMs on any frequency required
  • Interactive user-driven recovery with warning prompts or fully automated end to end
  • Pre/post recovery scripting along with VM name customization

The two Rubrik operations we will be using in the script are “LiveMount” and “InstantRecover”. LiveMount should typically be used for testing the recovery of VMs, which should not be attached to production port groups or have any networking at all. The primary use case of InstantRecover is to connect the recovered VMs directly to production port groups, with the original VM (if still in the inventory) automatically powered off, deprecated and renamed by Rubrik. If you want InstantRecover behavior without the existing VM being deprecated, you can use a LiveMount with the right combination of parameters to achieve this.

To start, you’re going to need a list of VMs to recover. For this I’m using a simple CSV with each VM listed in the order it is to be booted/recovered with the following fields (mandatory fields in bold):

  • VMName (assumes unique VM names)
  • Action (LiveMount or InstantRecover)
  • DisableNetwork (TRUE or FALSE)
  • RemoveNetworkDevices (TRUE or FALSE)
  • PowerOn (TRUE or FALSE)
  • RunScriptsinLiveMount (TRUE or FALSE)
  • PreFailoverScript (leave empty or specify script)
  • PostFailoverScriptDelay (0-x seconds)
  • PostFailoverScript (leave empty or specify script)
  • NextVMFailoverDelay (0-x seconds, leave as 0 for no delay between VM boot requests)
  • PreFailoverUserPrompt (leave empty for no prompt, or enter custom text for user prompt)
  • PostFailoverUserPrompt (leave empty for no prompt, or enter custom text for user prompt)
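
As a rough sketch of how a plan with these fields might be loaded, the snippet below parses a small inline sample. The sample rows, the subset of columns shown, and the `ConvertTo-PlanBoolean` helper are all hypothetical and not taken from the downloadable script. The point it illustrates is that every CSV value arrives as a string, so TRUE/FALSE and the delay values need converting before use:

```powershell
# Hypothetical sample plan; in practice you would use Import-Csv against your own file
$PlanCSV = @"
VMName,Action,DisableNetwork,RemoveNetworkDevices,PowerOn,NextVMFailoverDelay
DemoDC1,LiveMount,TRUE,FALSE,TRUE,30
DemoApp1,InstantRecover,FALSE,FALSE,TRUE,0
"@
# ConvertFrom-Csv parses text the same way Import-Csv parses a file; row order is preserved
$RecoveryPlan = $PlanCSV | ConvertFrom-Csv

# Hypothetical helper: CSV fields are always strings, so map TRUE/FALSE text to a real boolean
function ConvertTo-PlanBoolean {
    param([string]$Value)
    return ($Value.Trim() -eq "TRUE")   # -eq is case-insensitive for strings
}

ForEach ($VM in $RecoveryPlan)
{
    $DisableNetwork = ConvertTo-PlanBoolean $VM.DisableNetwork
    $Delay = [int]$VM.NextVMFailoverDelay
    Write-Host "VM:$($VM.VMName) Action:$($VM.Action) DisableNetwork:$DisableNetwork NextVMFailoverDelay:$Delay"
}
```

Because the rows are processed in file order, the order of the lines in the CSV is what defines the boot order.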

[Image: RubrikRecoveryPlan]

Don’t worry about creating this by hand as you can download a ready-made example at the end of this post. Once you’ve authenticated with the REST API using the example from my first post, the key commands you need are:

# Getting list of VMs
$VMListURL = $baseURL+"vmware/vm?limit=5000"
Try
{
    $VMListJSON = Invoke-RestMethod -Uri $VMListURL -TimeoutSec 100 -Headers $RubrikSessionHeader -ContentType $TypeJSON
    $VMList = $VMListJSON.data
}
Catch
{
    Write-Host $_.Exception.ToString()
    $error[0] | Format-List -Force
}
# Getting VM ID
$VMID = $VMList | Where-Object {($_.name -eq $VMName)} | Select-Object -ExpandProperty id
# Getting VM snapshot ID
$VMSnapshotURL = $baseURL+"vmware/vm/"+$VMID+"/snapshot"
Try
{
    $VMSnapshotJSON = Invoke-RestMethod -Uri $VMSnapshotURL -TimeoutSec 100 -Headers $RubrikSessionHeader -ContentType $TypeJSON
    $VMSnapshot = $VMSnapshotJSON.data
}
Catch
{
    Write-Host $_.Exception.ToString()
    $error[0] | Format-List -Force
}
# Selecting most recent VM snapshot to use for recovery operation
$VMSnapshotID = $VMSnapshot | Sort-Object -Descending date | Select-Object -ExpandProperty id -First 1
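
One thing the snippet above does not cover is a VM with no snapshots at all, in which case the snapshot ID is empty and the later POST would fail. A minimal sketch of a guard is shown below. The `Get-LatestSnapshotId` helper and the sample data are hypothetical, and the sketch assumes each snapshot object exposes `id` and `date` properties, as the snippet above implies:

```powershell
# Hypothetical helper: returns the id of the newest snapshot, or $null if none were supplied
function Get-LatestSnapshotId {
    param([object[]]$Snapshots)
    if (-not $Snapshots) { return $null }
    # Sort on a parsed date rather than the raw string so ordering does not depend on text format
    $Snapshots |
        Sort-Object -Property { [datetime]$_.date } -Descending |
        Select-Object -ExpandProperty id -First 1
}

# Made-up sample data standing in for $VMSnapshotJSON.data
$SampleSnapshots = @(
    [pscustomobject]@{ id = "snap-a"; date = "2017-07-20T01:00:00Z" }
    [pscustomobject]@{ id = "snap-b"; date = "2017-07-21T01:00:00Z" }
    [pscustomobject]@{ id = "snap-c"; date = "2017-07-19T01:00:00Z" }
)
$LatestID = Get-LatestSnapshotId -Snapshots $SampleSnapshots
If ($null -eq $LatestID) { Write-Host "No snapshots found, skipping VM" }
Else { Write-Host "Using snapshot:$LatestID" }
```

In a recovery plan loop, a `$null` result would be the cue to log the VM and move on to the next row rather than attempt the recovery.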

Now that we have the VM ID and snapshot ID, you can perform either operation. Let’s start by looking at how to perform a LiveMount:

# Performing Live Mount by first specifying JSON parameters and URL required
$VMLMJSON =
"{
  ""vmName"": ""$VMName - LiveMount"",
  ""disableNetwork"": true,
  ""removeNetworkDevices"": false,
  ""powerOn"": true
}"
$VMLiveMountURL = $baseURL+"vmware/vm/snapshot/"+$VMSnapshotID+"/mount"
# POST to REST API URL with VM JSON
Try
{
    Write-Host "Starting LiveMount for VM:$VMName"
    $VMLiveMountPOST = Invoke-RestMethod -Method Post -Uri $VMLiveMountURL -Body $VMLMJSON -TimeoutSec 100 -Headers $RubrikSessionHeader -ContentType $TypeJSON
}
Catch
{
    Write-Host $_.Exception.ToString()
    $error[0] | Format-List -Force
}
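
The hand-written JSON string above works, but a VM name containing a quote or backslash would break it. As an alternative sketch, the same four fields can be built as a hashtable and serialized with `ConvertTo-Json`, which handles the escaping. The `New-LiveMountBody` function name is hypothetical; the field names are the ones used in the body above:

```powershell
# Hypothetical helper: builds the LiveMount request body from plain values
function New-LiveMountBody {
    param(
        [string]$VMName,
        [bool]$DisableNetwork = $true,
        [bool]$RemoveNetworkDevices = $false,
        [bool]$PowerOn = $true
    )
    # Same four fields as the hand-written JSON above; [ordered] keeps them in this order
    $Body = [ordered]@{
        vmName               = "$VMName - LiveMount"
        disableNetwork       = $DisableNetwork
        removeNetworkDevices = $RemoveNetworkDevices
        powerOn              = $PowerOn
    }
    return ($Body | ConvertTo-Json)
}

$VMLMJSON = New-LiveMountBody -VMName "DemoApp1" -DisableNetwork $true
Write-Host $VMLMJSON
```

The resulting `$VMLMJSON` string could then be passed to `-Body` exactly as in the POST above, with the boolean values coming from the CSV row instead of being hard-coded.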

Here you can see how to take the same information to perform an InstantRecover operation:

# Performing Instant Recovery by first specifying JSON parameters and URL required
$VMIRJSON =
"{
  ""vmName"": ""$VMName"",
  ""removeNetworkDevices"": false
}"
$VMInstantRecoverURL = $baseURL+"vmware/vm/snapshot/"+$VMSnapshotID+"/instant_recover"
# POST to REST API URL with VM JSON
# Warning: connects the VM to the production network, shuts down the original VM if it exists and renames it as "Deprecated VMName Date Time"
Try
{
    Write-Host "Starting InstantRecover for VM:$VMName"
    $VMInstantRecoverPOST = Invoke-RestMethod -Method Post -Uri $VMInstantRecoverURL -Body $VMIRJSON -TimeoutSec 100 -Headers $RubrikSessionHeader -ContentType $TypeJSON
}
Catch
{
    Write-Host $_.Exception.ToString()
    $error[0] | Format-List -Force
}
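
Since the two operations differ only in the last part of the URL and the JSON body, the Action column of the CSV can drive which endpoint is called. A minimal sketch of that mapping follows. The `Get-RecoveryURL` helper, the base URL and the snapshot ID shown are placeholders for illustration; only the two URL suffixes come from the snippets above:

```powershell
# Hypothetical helper: maps the CSV Action value to the endpoint used in the snippets above
function Get-RecoveryURL {
    param([string]$BaseURL, [string]$SnapshotID, [string]$Action)
    switch ($Action) {
        "LiveMount"      { return $BaseURL + "vmware/vm/snapshot/" + $SnapshotID + "/mount" }
        "InstantRecover" { return $BaseURL + "vmware/vm/snapshot/" + $SnapshotID + "/instant_recover" }
        # Fail loudly on a typo in the CSV rather than silently doing nothing
        default          { throw "Unknown Action '$Action' (expected LiveMount or InstantRecover)" }
    }
}

# Placeholder values for illustration only
$ExampleURL = Get-RecoveryURL -BaseURL "https://rubrik.example.local/api/v1/" -SnapshotID "snapshot-id-here" -Action "LiveMount"
Write-Host $ExampleURL
```

Throwing on an unrecognized Action is a deliberate choice here: with InstantRecover able to power off production VMs, it seems safer to stop on bad input than to guess.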

If we then take these commands and wrap them up in a simple script that can prompt the user, run separate scripts, and wait for configured time delays, you have a very powerful recovery plan! To hit the ground running you can download my example here:

RubrikRecoveryPlanv1.zip
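
To give a feel for how the pieces fit together, here is a simplified sketch of such a wrapper loop. It is not the contents of the download. `Invoke-RecoveryPlan` and its `$RecoverAction` script block parameter are hypothetical, with the script block standing in for the LiveMount/InstantRecover REST calls so the loop can be tried without a Rubrik cluster. It also ignores the RunScriptsinLiveMount field for brevity:

```powershell
# Hypothetical orchestration sketch; $RecoverAction replaces the real REST calls
function Invoke-RecoveryPlan {
    param([object[]]$Plan, [scriptblock]$RecoverAction)
    ForEach ($VM in $Plan)
    {
        # Optional prompt before the VM is recovered (empty or missing field = no prompt)
        If ($VM.PreFailoverUserPrompt) { Read-Host $VM.PreFailoverUserPrompt | Out-Null }
        # Optional pre-failover script
        If ($VM.PreFailoverScript) { & $VM.PreFailoverScript }
        # Recover the VM
        & $RecoverAction $VM
        # Optional post-failover script, run after its own delay
        If ($VM.PostFailoverScript)
        {
            Start-Sleep -Seconds ([int]$VM.PostFailoverScriptDelay)
            & $VM.PostFailoverScript
        }
        # Optional prompt after the VM is recovered
        If ($VM.PostFailoverUserPrompt) { Read-Host $VM.PostFailoverUserPrompt | Out-Null }
        # Wait before requesting the next VM (0 = no delay)
        Start-Sleep -Seconds ([int]$VM.NextVMFailoverDelay)
    }
}

# Made-up two-VM plan with no prompts or scripts
$SamplePlan = @(
    [pscustomobject]@{ VMName = "DemoDC1";  Action = "LiveMount"; NextVMFailoverDelay = "1" }
    [pscustomobject]@{ VMName = "DemoApp1"; Action = "LiveMount"; NextVMFailoverDelay = "0" }
)
$Started = New-Object System.Collections.ArrayList
Invoke-RecoveryPlan -Plan $SamplePlan -RecoverAction {
    param($VM)
    Write-Host "Would start $($VM.Action) for VM:$($VM.VMName)"
    [void]$Started.Add($VM.VMName)
}
```

Passing the recovery step in as a script block also makes it easy to dry-run a plan file and check the ordering and delays before pointing it at real VMs.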

To run this fully automated with no user interaction, simply remove the prompts for user credentials at the start, along with PreFailoverUserPrompt and PostFailoverUserPrompt, and you’re good to go! If you found this script useful, please like and share. Happy scripting,

Joshua

2 responses to “Scripting a Rubrik Recovery Plan using REST APIs & PowerShell”

  1. Mutahir July 21, 2017 at 8:30 am

    Hi Virtually Sober, thank you for sharing great tutorials with the community.

    Can Rubrik offer near-zero RPO? Does it do journal-based replication? I will be grateful if you can share your expert insight on this in a blog post or in a comment reply. Thank you

    best regards
    Ali

    • Joshua Stenhouse July 21, 2017 at 8:53 am

      Hey Ali!

      Having worked at both Zerto and now Rubrik I can definitely answer your question. Rubrik offers best case RPOs of hourly for VMs using VADP/VM snaps and every 15 minutes for SQL databases using transaction log backups. The backups can then be replicated asynchronously as part of the backup policy (SLA Domain) to get the data offsite to another Rubrik cluster (physical or in AWS/Azure). You can recover to multiple points in time, but these are limited to the frequency on which you can take the backups. Once you have the backup offsite you can then leverage scripts such as the example I provide to orchestrate the recovery operation. I haven’t extended it to SQL databases yet but I’m thinking of adding it in.

      So Rubrik definitely isn’t near 0 RPO or journal based replication, in my opinion. If you need that then you most definitely should look at Zerto. Nobody does near 0 RPOs and journal based point in time better! For me it comes down to the use case. If the primary use case is backup with a copy of the backup offsite for DR then I’d use Rubrik. If the primary use case is replication and orchestration then I’d use Zerto. If I need both use cases then I’d use both technologies together! They complement each other very well in my testing: self-managing backups and self-managing replication. (At Zerto I really didn’t emphasize enough how important it is that you don’t schedule Zerto replication; you set the priority/QoS and it manages itself.) At Rubrik this is a key tenet of the platform that is explained on every demo because it’s such a shift change from classic backup scheduling, just as Zerto is a shift change from replication scheduling.

      Any further questions let me know. Thanks,

      Joshua
