Agile DevOps - People and Process, then Automation!

Use cloud and DevOps to build infrastructure through rapid, iterative agile development with collaborative open source tools:
-Terraform or CloudFormation for easy orchestration.
-Chef cookbooks, Puppet modules or Ansible playbooks to build servers.
-Test Kitchen to deploy and test them.
-Packer to build server images or containers.
-EC2Dream, a graphical user interface that provides a 'single pane of glass' for agile DevOps, primarily on cloud servers.
-Amazon AWS, Azure, Google Compute Engine, IBM, OpenStack, local and hosted servers.

Snapshots


Snapshots are a way of backing up an Elastic Block Store (EBS) volume. An EBS volume can also be created from a snapshot, so this is a way of restoring or cloning a volume. Snapshots are stored on S3, although they are not currently accessible via a bucket, so you cannot make a backup copy of a snapshot (for example, an off-site backup).
Snapshots are not useful for backing up volatile data that needs consistency (e.g. SQL databases). In this case you need to either:
  - Use a backup utility of the product that can do a backup in a consistent fashion.
  - Have a slave copy that can be frozen to do a backup.
Once a backup is taken it can be either:
  - Copied to S3
  - If the backup is on an EBS then a snapshot could be taken of the backup.
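For the frozen-slave approach, the important detail is that the freeze is always undone, even if the snapshot step fails. A minimal sketch of that flow in Ruby, where freeze, take_snapshot and unfreeze are hypothetical stand-ins for your database's lock command and the EC2 snapshot call:

```ruby
# Freeze the data source, snapshot it, and guarantee the unfreeze runs
# even if the snapshot raises. The three callables are hypothetical
# hooks; in a real setup you would wire them to e.g. FLUSH TABLES WITH
# READ LOCK / UNLOCK TABLES and ec2.create_snapshot.
def consistent_snapshot(freeze, take_snapshot, unfreeze)
  freeze.call
  begin
    take_snapshot.call
  ensure
    unfreeze.call   # always runs, keeping the slave usable
  end
end

log = []
consistent_snapshot(
  lambda { log << :frozen },
  lambda { log << :snapshot },
  lambda { log << :unfrozen }
)
puts log.inspect  # prints [:frozen, :snapshot, :unfrozen]
```

The ensure block is what makes this safe: a failed snapshot still leaves the slave unlocked.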


aws_helper - helper code to do snapshot backups, send confirmation emails and clean up old EBS disks.

Ruby scripts can be used to manage snapshots.

First, a script to create a snapshot of a volume. The script gets the volume id from the instance user data, or you can hard-code the volume. It can be run regularly via cron on Linux, or similar on Windows (e.g. the AT command).

#!/usr/bin/ruby
# script to create a snapshot

require 'rubygems'
require 'right_aws'
require 'net/http'
require 'date'

AMAZON_PUBLIC_KEY = '<public key>'
AMAZON_PRIVATE_KEY = '<private key>'
url_user_data = 'http://169.254.169.254/latest/user-data'
# switch between the next 2 lines to either use user data or
# a hard-coded volume
ec2_vol = Net::HTTP.get_response(URI.parse(url_user_data)).body
# ec2_vol = 'vol-xxxxx'

print "Create Snapshot ", ec2_vol, " ", DateTime.now, "\n"
# remove the :endpoint_url parameter if not in the EU region
ec2 = RightAws::Ec2.new(AMAZON_PUBLIC_KEY, AMAZON_PRIVATE_KEY,
                        :endpoint_url => 'https://eu-west-1.ec2.amazonaws.com/')

snapshot = ec2.create_snapshot(ec2_vol)
print "result ", snapshot, "\n"
print "Snapshot Complete ", ec2_vol, " ", DateTime.now, "\n"



Snapshots - How they work
When you take a series of snapshots of a volume, each unique piece of data is stored once, regardless of how many snapshots reference it (so each snapshot effectively stores only the differences from the previous one).
For example:
a. Say you have a 3GB volume that you put 1GB on and take a snapshot, S1.
b. Then you put an additional 1GB on it (so that you're using 2GB) and take another snapshot, S2.
c. Internally Amazon is storing 2GB (less after compression) of total unique data.
d. If you deleted S1, it will still be storing 2GB, because all that data was still needed by S2.
e. If instead you deleted S2, the amount of data stored goes down to 1GB, the amount that was needed for S1.
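The accounting above can be sketched in plain Ruby, modelling each snapshot as the set of unique data blocks it references (a simplification of Amazon's block-level bookkeeping; the GB figures match the example):

```ruby
require 'set'

# Each snapshot references the set of 1GB data blocks it needs.
s1 = Set[:block_a]            # S1 needs the first 1GB
s2 = Set[:block_a, :block_b]  # S2 needs that 1GB plus another 1GB

# Total storage is the union of blocks needed by all undeleted snapshots.
def stored_gb(snapshots)
  snapshots.reduce(Set.new) { |all, s| all | s }.size
end

puts stored_gb([s1, s2])  # prints 2 - 2GB of unique data
puts stored_gb([s2])      # prints 2 - deleting S1 frees nothing, S2 needs both blocks
puts stored_gb([s1])      # prints 1 - deleting S2 instead drops storage to 1GB
```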

Charges for Snapshots
Charges aren't tracked per snapshot. Amazon looks at the total amount of unique data for all the snapshots you have.
So if you have a volume, V1, take a snapshot of it, S1, create a new volume, V2, from the snapshot and then create a snapshot, S2, of V2, you will not be charged twice for the data shared between S1 and S2.
Charging is for the total amount of unique data needed by all your undeleted snapshots.

Delete Snapshots

The following script can also be run to delete snapshots once they reach a certain age. I tested restoring from a snapshot after deleting an older one, so the snapshots are not dependent on one another.

#!/usr/bin/ruby

require 'rubygems'
require 'right_aws'
require 'net/http'
require 'date'

AMAZON_PUBLIC_KEY = '<public key>'
AMAZON_PRIVATE_KEY = '<private key>'
url_user_data = 'http://169.254.169.254/latest/user-data'
# switch between the next 2 lines to either use user data or
# a hard-coded volume
ec2_vol = Net::HTTP.get_response(URI.parse(url_user_data)).body
# ec2_vol = 'vol-xxxxx'
# remove the :endpoint_url parameter if not in the EU region
ec2 = RightAws::Ec2.new(AMAZON_PUBLIC_KEY, AMAZON_PRIVATE_KEY,
                        :endpoint_url => 'https://eu-west-1.ec2.amazonaws.com/')
# delete snapshots more than 30 days old
age = 30
print "delete snapshots for vol ", ec2_vol, "\n"
sa = ec2.describe_snapshots()
sa.each do |s|
  aws_volume_id = s[:aws_volume_id]
  if aws_volume_id == ec2_vol then
    days_old = Integer((Time.now - s[:aws_started_at]) / (60*60*24))
    if days_old > age then
      ec2.delete_snapshot(s[:aws_id])
      print "snapshot ",s[:aws_id]," ",days_old," days old for vol ",s[:aws_volume_id]," deleted \n"
    else
      print "snapshot ",s[:aws_id]," ",days_old," days old for vol ",s[:aws_volume_id],"\n"
    end
  end
end
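The age-based selection logic of the script can be factored into a pure function, which makes it easy to test without touching AWS. The snapshot hashes below are fabricated examples mimicking the :aws_id / :aws_volume_id / :aws_started_at keys the script reads from describe_snapshots:

```ruby
# Return the ids of snapshots of the given volume older than age_days.
# snapshots is an array of hashes shaped like the describe_snapshots
# output used above; now is injectable so the logic is testable.
def snapshots_to_delete(snapshots, volume_id, age_days, now = Time.now)
  snapshots.select do |s|
    s[:aws_volume_id] == volume_id &&
      Integer((now - s[:aws_started_at]) / (60 * 60 * 24)) > age_days
  end.map { |s| s[:aws_id] }
end

now = Time.now
snaps = [
  { :aws_id => 'snap-old',   :aws_volume_id => 'vol-1', :aws_started_at => now - 40 * 86400 },
  { :aws_id => 'snap-new',   :aws_volume_id => 'vol-1', :aws_started_at => now - 5 * 86400 },
  { :aws_id => 'snap-other', :aws_volume_id => 'vol-2', :aws_started_at => now - 90 * 86400 }
]
puts snapshots_to_delete(snaps, 'vol-1', 30).inspect  # prints ["snap-old"]
```

Only snap-old qualifies: snap-new is too recent and snap-other belongs to a different volume.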


EC2 snapshot performance
A volume created from a snapshot will be available for use immediately, but will be populated lazily from S3 in the background. However, if your application is hitting regions of the volume that are not yet pulled from S3, then performance can be impacted waiting for the data from S3 (it will be fetched immediately, but the path to S3 is slower than the path to the EBS volume).
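One common mitigation is to "pre-warm" the volume by reading every block once, so the lazy fetch from S3 happens up front rather than on first application access. A minimal sketch; it is demonstrated here on an ordinary file, but on an instance you would point it at the block device (e.g. /dev/xvdf, which requires root):

```ruby
# Read a device or file end-to-end in 1MB chunks, forcing every
# block to be fetched; returns the number of bytes read.
def prewarm(path, chunk = 1024 * 1024)
  bytes = 0
  File.open(path, 'rb') do |f|
    while (buf = f.read(chunk))
      bytes += buf.bytesize
    end
  end
  bytes
end

# prewarm('/dev/xvdf')   # on a real instance, as root
```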

Sharing a Snapshot
By default only your account can access a snapshot; however, you can share it with a list of EC2 accounts, or share it publicly, via the Modify Snapshot Attribute function. This is useful for allowing data to be accessed by multiple accounts, and for providing a public pre-built snapshot image for an EBS volume that could complement a public Amazon Machine Image.

In Linux, to see disk I/O performance use:
  iostat -x

2 comments:

Anonymous said...

perfect, this should be part of their plugins.

Mike Heffner said...

I decided to clean this up and add some configurable options...see

http://blog.fesnel.com/2010/04/cleanup-old-ebs-snapshots.html
