Wednesday, July 16, 2014

Bulk Importing EC2 Instances

I have been testing a a preview of a new PowerShell command, Import-EC2Instance, that will be added to the AWS PowerShell API next week.  The new command allows you to import a VM from VMware or Hyper-V.  I covered this in my book, but at the time the functionality was not available in PowerShell and I had to use the Java API.

While the new command will upload and convert your VM, you can also do the upload and convert independently.  This left me wondering if I could use the AWS Import/Export Service to ship a an external drive full of VMDK files and skip the upload process.  After some testing, it turns out you can.  Depending on the number of VMs you plan to migrate and the speed of your internet connection, this may be a great alternative.

Let me clarify that I am speaking of two similarly named services here.  EC2 Import is used to convert a VMDK (or VHD) into an EC2 Instance.  AWS Import allows you to ship large amounts of data using removable media.

Normally, the EC2 Import process works like this.  First, the PowerShell module breaks up the VMDK into 10MB chucks and uploads it to an S3 bucket.  Next, it generates a manifest file that describes how to put the pieces back together, and uploads that to S3.  Then, it calls the ec2-import-instance REST API passing a reference to the manifest.  Finally, the import service uses the Manifest to reassemble the VMDK file and convert it into an EC2 instance.

The large file is broken into chunks to make the upload easier and allow it recover from a connection error (retrying a part rather than the entire file.)  With the AWS Import/Service there is no need to break up the file.  Note that S3 supports objects up to 5TB and EC2 volumes can only be 1TB.  So there is no reason not to upload the VMDK as a single file.

So, all we need to do is create the manifest file and call the E2 Import API passing a reference to the manifest file.  If you have ever looked at one of these manifest files, they can look really daunting.  But, with only a single part, it's actually really simple.  Note that all of the URLs are pre-signed so the Import Service can access your VMDK file without granting IAM permissions to the import service.

$Bucket = 'MyBucket'
$VolumeKey = 'WIN2012/Volume.vmdk'
$VolumeSize = 20

#We need to know how big the file is to create the Manifest
$FileSize = (Get-S3Object -BucketName $Bucket -Key $VolumeKey).Size
$ByteRangeEnd = $FileSize - 1 #Byte Range is zero based

#Let's create a few pre-signed URLs for the import engine
$ManifestKey = "$VolumeKey.Manifest.xml"
$SelfDestructURL = Get-S3PreSignedURL -BucketName $Bucket -Key $ManifestKey -Expires (Get-Date).AddDays(7) -Protocol HTTPS -Verb GET
$HeadURL = Get-S3PreSignedURL -BucketName $Bucket -Key $VolumeKey -Expires (Get-Date).AddDays(7) -Protocol HTTPS -Verb HEAD
$GetURL = Get-S3PreSignedURL -BucketName $Bucket -Key $VolumeKey -Expires (Get-Date).AddDays(7) -Protocol HTTPS -Verb GET
$DeleteURL = Get-S3PreSignedURL -BucketName $Bucket -Key $VolumeKey -Expires (Get-Date).AddDays(7) -Protocol HTTPS -Verb DELETE

#The URLs need to be HTML encoded to write into the XML document
Add-Type -AssemblyName System.Web
$SelfDestructURL = [System.Web.HttpUtility]::HtmlEncode($SelfDestructURL)
$HeadURL = [System.Web.HttpUtility]::HtmlEncode($HeadURL)
$GetURL = [System.Web.HttpUtility]::HtmlEncode($GetURL)
$DeleteURL = [System.Web.HttpUtility]::HtmlEncode($DeleteURL)

#Create a Manifest file with one enourmous part
$Manifest = @"
<manifest>
    <version>2010-11-15</version>
    <file-format>VMDK</file-format>
    <importer>
        <name>ec2-upload-disk-image</name>
        <version>1.0.0</version>
        <release>2010-11-15</release>
    </importer>
    <self-destruct-url>$SelfDestructURL</self-destruct-url>
    <import>
        <size>$FileSize</size>
        <volume-size>20</volume-size>
        <parts count="1">
            <part index="0">
                <byte-range end="$ByteRangeEnd" start="0">
                <key>$VolumeKey</key>
                <head-url>$HeadURL</head-url>
                <get-url>$GetURL</get-url>
                <delete-url>$DeleteURL</delete-url>
            </byte-range></part>
        </parts>
    </import>
</manifest>
"@

#Upload the manifest file to S3
Write-S3Object -BucketName $Bucket -Content $Manifest -Key $ManifestKey

#Kick off an import task using the new manifest file
$Task = Import-EC2Instance -BucketName $Bucket -ManifestFileKey $ManifestKey -InstanceType m1.large -Architecture x86_64 -Platform Windows

Obviously there is room for improvement here.  You could import directly to a VPC, support Linux instances, or use the Import-Ec2Volume command to import additional (non-boot) volumes.  Hopefully this is good starting point.

Note that prerequisites for the EC2 Import still apply.  For example, you must convert the VMDK files to an OVF before shipping.