AWS EC2

Introduction

The purpose of this page is to go into some detail about the provisioning and use of AWS EC2 (Elastic Compute Cloud)
virtual servers. What is an EC2 instance, really? It is simply a computer available for you to use on the cloud. You pay for it
by the hour, typically a few cents per hour. A really powerful machine can cost as much as $15 per hour,
so we suggest you do not start with one of those. Also please be aware that you can Stop an EC2 instance: you stop paying the
hourly rate without losing its contents, although its attached EBS storage is still billed. To get it back just Start the instance again.
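
Stopping and starting can also be done from the command line. The following is a minimal sketch, assuming the AWS CLI is installed and configured with your credentials; the instance ID and the function names are hypothetical placeholders, not something defined elsewhere on this page.

```shell
# Hypothetical instance ID; substitute the one shown in your Instances table.
INSTANCE_ID="i-0123456789abcdef0"

# Stop the instance: the hourly compute charge ends, the attached EBS
# volumes persist (and continue to be billed as storage).
stop_instance() {
    aws ec2 stop-instances --instance-ids "$1"
}

# Start it again later; the EBS contents are as you left them.
start_instance() {
    aws ec2 start-instances --instance-ids "$1"
}

# Print the current state of the instance, for example running or stopped.
instance_state() {
    aws ec2 describe-instances --instance-ids "$1" \
        --query 'Reservations[0].Instances[0].State.Name' --output text
}
```

For example, `stop_instance "$INSTANCE_ID"` at the end of the day and `start_instance "$INSTANCE_ID"` the next morning.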

Warnings

*** There are three really important things to understand before you start using EC2 instances:

Never place your access keys in a public repository such as GitHub.

Never allocate EC2 resources and leave them running idle. This will cost you money. Learn to shut them down. Even stopped EC2 instances with attached EBS volumes will cost you money!

Learn to tag your resources to help cost management.
***

Informal introduction

An EC2 (Elastic Compute Cloud) instance is a computer; or you may prefer the term ‘virtual server’. It runs the Linux or Windows
operating system just like a box under your desk might. However, if you need to increase your computing capacity or expand
your storage, there are many, many more such EC2 instances available at a moment’s notice.

Please read our Cloud Core drop-down for more on how cloud computing may be suitable for your project.

Details and terminology

Amazon Web Services (AWS) EC2 instances come with a host of features and terminology. Briefly:

AMIs

Elastic Block Store (EBS)

Snapshots of EBS volumes

EC2 Resources

Elastic File System (EFS)

Elastic IPs

Access keys and ‘Key Pairs’

S3 storage and access

Setting up an EC2 Instance

The ‘compute’ part of the cloud begins with Virtual Machines called Elastic Compute Cloud instances: two Cs, hence EC2.
To get started with EC2 instances you log on to the AWS console, click on the EC2 icon and follow the buttons to create a
new instance, which takes a few minutes. Then you can log in to that machine and use it like any other computer. It has an
operating system that you choose, some amount of computing power that you choose, some amount of RAM that you
choose, and by default very little disk space.

A short digression on logging in

Before going any further let’s log on to the instance. Note its IP address in the Instances table of the AWS console. Let’s
suppose this is 123.124.125.126 and that you chose the AWS Linux operating system. You have also downloaded the key pair file,
which we will call kp.pem. If you are on Windows you use PuTTYGen to convert this to kp.ppk, but let’s assume you are
on a Linux machine or a Mac, which has the ssh command built in. (You can also install the bash shell on Windows.)

Go to the file location of your .pem keypair file and type

% chmod 400 kp.pem
% ssh ec2-user@123.124.125.126 -i kp.pem

It may ask whether you wish to continue connecting. Answer ‘yes’. If everything has gone according to plan you should
see a little banner and a command prompt. This is the command prompt on your EC2 instance on AWS. Welcome to the cloud.
Depending on your instance type you may have more than one attached disk volume; but at this point we will back up
our narrative to try and be a bit more methodical.

To continue with setting up your EC2 instance

This is the big Aha: you also want to attach some disk space in the process. The generic term for disk space is a disk volume. You can
attach these volumes (more than one is fine) in the spin-up process or you can attach them later. This is Elastic Block Store (EBS). A single
EBS volume can be up to 16 TiB for the common volume types.

Let’s assume EBS is where you’ll keep your data. To safely make a back-up copy of an EBS volume you can periodically take snapshots of it.
A snapshot bundles up everything in the volume and puts it (like a big zip file) in S3 storage. That storage is completely separate from the
EC2 instance and its attached EBS volumes. You can also bundle up an image of the entire EC2 instance into storage. This is a snapshot combined
with some additional instructions on setting up the instance; together these comprise an AMI (Amazon Machine Image). So there are two levels
of granularity in backing up your work: Snapshots and AMIs.

Now how much does this cost? The prices here are approximate and change over time. EBS costs about ten cents per GB-month. S3 storage costs
about three cents per GB-month; so if you have a lot of data, putting your EC2 instance in storage by means of an AMI, or putting your disk
volume in storage by means of a snapshot, is a considerable cost savings. The cost of the EC2 instance itself (not counting the EBS) varies
with the machine you choose. The smallest machine (t2.micro) is free under the free tier. A basic, low-power machine will cost about three cents
per hour. A fairly powerful machine will cost perhaps 40 cents per hour. You can also rent super-powerful machines for several dollars per hour.
Since that adds up, it is important to know that you can turn these machines off (without losing your data) so that you are not paying for
them when you are not using them.
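
To make the arithmetic concrete, here is a small sketch that uses only the approximate prices quoted above (ten and three cents per GB-month, 40 cents per hour). The function names are hypothetical, and real bills depend on current pricing, region and how snapshots are compressed.

```shell
# Prices in cents per GB-month, taken from the paragraph above.
EBS_CENTS_PER_GB=10
S3_CENTS_PER_GB=3

# Print the monthly cost of keeping SIZE_GB on a live EBS volume
# versus keeping the same amount in snapshot/AMI storage.
monthly_storage_cost() {
    size_gb=$1
    ebs_cents=$((size_gb * EBS_CENTS_PER_GB))
    s3_cents=$((size_gb * S3_CENTS_PER_GB))
    printf 'EBS: $%d.%02d  storage: $%d.%02d\n' \
        $((ebs_cents / 100)) $((ebs_cents % 100)) \
        $((s3_cents / 100)) $((s3_cents % 100))
}

# Print the cost of running an instance: cents per hour times hours.
monthly_compute_cost() {
    cents=$(($1 * $2))
    printf '$%d.%02d\n' $((cents / 100)) $((cents % 100))
}

monthly_storage_cost 500     # prints: EBS: $50.00  storage: $15.00
monthly_compute_cost 40 720  # left running all month (720 hours): $288.00
monthly_compute_cost 40 176  # 8 hours a day for 22 working days: $70.40
```

The last two lines show why stopping an idle machine matters: the same 40-cent machine costs roughly a quarter as much when it only runs during working hours.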

What this page is / isn’t

Now this document is in preparation; so it does not cover everything in this topic. For example it does not cover how to
save money by using the Spot Market and it does not cover turning EC2 instances on and off and setting Alarms to inform you
when they are possibly doing something expensive. We’ll build all of this into these documentation pages; but the main point
of this page at the moment is Resources: The short list of items that you tend to build and associate with EC2 instances.
This includes AMIs, Snapshots, Key Pairs, and Elastic IPs.

Tagging

Tagging helps to manage and organize your AWS resources. This page
contains the long spiel on tagging strategies. A few things to note: tagging will help with cost allocation. A good practice is to always
set the following tags when spinning up a new EC2 instance:

name: Name of your resource (e.g. cloudmavencompute)

owner: Owner of the resource or who provisioned the resource

project: For billing purposes or which project the resource belongs to

confidentiality: For data security reasons if necessary
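
The four tags above can be applied from the command line as well as from the console. This is a sketch assuming a configured AWS CLI; the function name and the example values are hypothetical, and the tag values are assumed to contain no spaces or commas. Tag keys are case-sensitive, and the console fills its Name column from a key spelled `Name`, so you may prefer that capitalization.

```shell
# Apply the suggested tags to one resource (instance, volume, snapshot...).
# Arguments: resource ID, name, owner, project, confidentiality.
tag_resource() {
    aws ec2 create-tags --resources "$1" --tags \
        "Key=name,Value=$2" \
        "Key=owner,Value=$3" \
        "Key=project,Value=$4" \
        "Key=confidentiality,Value=$5"
}
```

For example: `tag_resource i-0123456789abcdef0 cloudmavencompute alice demo public`, where every argument is a placeholder.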

Making an AMI

We take an EC2 instance as a given. (We have not yet covered the “Encrypted” option; let’s just flag that with kilroy.) Notice that I
can create an AMI quite easily using the menu. Here is the configuration page:

Resource Overview

Returning to the EC2 Resource summary: Let’s take a look at what’s what here term by term:

In this region I’m not (apparently) running any instances.

Elastic IP

An Elastic IP is a convenient persistent IP address associated with my account (and with a particular machine unless that machine gets blown away).
It is said to be ‘publicly routable’: it is visible/findable on the internet.

AWS will hand us some number of these on demand for very little money. If we do not request one, however, the default is a temporary public IP
address that is only good as long as the instance is up and running. Shut it down and that IP evaporates.

An Elastic IP is persistent on my account and could even be detached and re-attached to something else. This is like having a
piece of public plumbing at my disposal that I can connect to a resource I want to share, for example. This IP stuff is distinct from a DNS
entry. The latter is a person-friendly string associated with an IP address (free or maybe for a nominal fee).

Dedicated Host

A Dedicated Host is a physical computer which permits only me (my AWS account) to be connected. It does not thread in other virtual operating systems or
other accounts. This is an important concept in HIPAA compliance.

Snapshot

A Snapshot is always drawn from, i.e. is an image of, some Elastic Block Store (EBS) volume. An 8GB Snapshot is quite likely the root
volume of an AWS Linux EC2 instance. However if you attach more EBS – like say 2TB – to that instance you can make a separate larger Snapshot.

AMI

An AMI is a Snapshot together with some instructions for standing up an EC2 instance. Hence: When you create an AMI part of that process is a Snapshot
or Snapshots of all the EBS volumes associated with that instance.
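
The relationship between an AMI and its Snapshots can be seen from the command line. This sketch assumes a configured AWS CLI; the function names and IDs are hypothetical. Note that by default creating an image reboots the instance so that the file system is captured in a consistent state.

```shell
# Create an AMI from an instance and print the new image ID.
# Arguments: instance ID, a name for the image.
make_ami() {
    aws ec2 create-image --instance-id "$1" --name "$2" \
        --description "Backup of $1" \
        --query 'ImageId' --output text
}

# List the Snapshot IDs that back an AMI, one per EBS volume in the image.
ami_snapshots() {
    aws ec2 describe-images --image-ids "$1" \
        --query 'Images[0].BlockDeviceMappings[].Ebs.SnapshotId' \
        --output text
}
```

For example, `ami=$(make_ami i-0123456789abcdef0 my-backup)` followed by `ami_snapshots "$ami"` should list one Snapshot per attached EBS volume once the image is ready.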

Snapshot to AMI Conversion in Linux

You can “upgrade” a Snapshot to an AMI in Linux but it is more complicated for Windows. And this is a little bit vague: Do
we mean that the snapshot is of the Root volume or an attached volume? The rebuilt AMI – in the case where the original AMI
instructions are unavailable – will be built out rather generically. This is one of these digressions that can drive one
crazy so let’s leave it there for now.

Understanding a Snapshot listing

When you look at your Snapshot listing:

CreateImage is a facility for building an AMI. Fine.

…is an instance ID.

…is an AMI ID.

Snapshot Archaeology

Suppose you have an old Snapshot and you are not sure what is on it; if it holds nothing of value you might want to
delete it. Can you look at it first? Even if the associated AMI is gone you can always create a Volume from the
Snapshot, Attach the Volume to some other EC2 instance in the same Availability Zone, mount the
Attached Volume and peruse the file system in the usual way. Maybe your important data is still there!

Create the Volume from the Snapshot

I will choose Magnetic because it is a bit cheaper… this is just an example. If I wish to have speed I would stay with SSD.

The Volume is created very quickly.

Here it is, now listed in the Volume table:

PRO TIP: When you create a Volume from a Snapshot make sure it is in the same Availability Zone (AZ)
as the Instance that you intend to attach it to. Failing to do so is a waste of time: You can’t attach
a Volume in Oregon to an EC2 in Virginia. In fact you can’t attach a Volume in Virginia 1-a to an EC2
in Virginia 1-b. Virginia is a Region, 1-a is the Availability Zone.

That’s where my restored Volume is. Oh look, I have a machine there also, though it is Stopped. Let us Start it.

Starting finds a host, creates a VM and points it to the associated storage. All of this takes a few minutes, but you
can be impatient and see if you can Attach the restored Snapshot Volume whenever you like.
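
The console steps above (create a Volume from the Snapshot in the right Availability Zone, then Attach it) can be sketched with the AWS CLI. This assumes a configured CLI; the function names, IDs and the device name `/dev/sdf` are hypothetical placeholders.

```shell
# Print the Availability Zone of an instance, for example us-east-1a.
instance_az() {
    aws ec2 describe-instances --instance-ids "$1" \
        --query 'Reservations[0].Instances[0].Placement.AvailabilityZone' \
        --output text
}

# Create a Magnetic ("standard") Volume from a Snapshot in the given AZ
# and print the new volume ID.
volume_from_snapshot() {
    aws ec2 create-volume --snapshot-id "$1" --availability-zone "$2" \
        --volume-type standard --query 'VolumeId' --output text
}

# Attach the Volume to the instance under a device name of your choosing.
attach_volume() {
    aws ec2 attach-volume --volume-id "$1" --instance-id "$2" --device "$3"
}

# Typical sequence (all IDs are placeholders):
#   az=$(instance_az i-0123456789abcdef0)
#   vol=$(volume_from_snapshot snap-0123456789abcdef0 "$az")
#   attach_volume "$vol" i-0123456789abcdef0 /dev/sdf
```

Looking up the instance's Availability Zone first avoids the mismatch described in the tip above.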

Mounting the Attached Volume

To check if there is an attached volume, log on (ssh) to your EC2 instance. More information is here

First see if it is attached with:

>> sudo fdisk -l
Disk /dev/xvdc: 10.7 GB, 10737418240 bytes

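Then format the volume, but only if it is new and empty. Formatting erases whatever is on the volume, so skip this step for a Volume restored from a Snapshot whose contents you want to inspect: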

>> sudo mkfs -t ext4 /dev/xvdc

Create the directory where it will be mounted:

>> sudo mkdir /data

Mount it:

>> sudo mount /dev/xvdc /data

Check that it has been mounted correctly with:

>> mount -l
/dev/xvdc on /data type ext4 (rw)

Next, you will want to edit the fstab configuration file to have it mount automatically on reboot. Get the disk UUID. For example, for /dev/xvdb,

Note that the UUID for my /data drive is the same as the UUID we obtained from the sudo file -s /dev/xvdb command above!
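
The UUID lookup and the fstab entry can be sketched as follows. This is an assumption-laden example rather than the original commands from this page: it assumes the device is /dev/xvdb, the mount point is /data and the file system is ext4, and the helper name `fstab_line` is hypothetical.

```shell
# Build an /etc/fstab entry from a file system UUID and a mount point.
# The nofail option lets the instance boot even if the volume is missing.
fstab_line() {
    printf 'UUID=%s  %s  ext4  defaults,nofail  0  2\n' "$1" "$2"
}

# On the instance, the steps would look like this:
#   uuid=$(sudo blkid -s UUID -o value /dev/xvdb)   # read the UUID
#   sudo cp /etc/fstab /etc/fstab.orig              # keep a backup copy
#   fstab_line "$uuid" /data | sudo tee -a /etc/fstab
```

Referring to the volume by UUID rather than by device name keeps the entry valid if the device name changes after a reboot.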

Check if there are errors in your /etc/fstab file:

sudo mount -a

That’s it. Make sure your read and write permissions are set accordingly using chmod.

What is the Snapshot ID?

Here is a screen capture of some volumes:

Each is attached to an EC2 instance. Notice these have a snapshot ID. This means that these volumes were created from a
snapshot, for example from an AMI = Snapshot + instructions. It does not mean that this snapshot still exists; or that
if it did exist it would reflect what is actually in the restored volume. That volume probably has changed. So to
explore these volumes: Again we would ssh into the EC2 instance associated and go look at the file system.

In the case above, by the way, as noted earlier: These are all tell-tale Linux OS root volumes because the default on AWS Linux EC2 instances is 8GB.

*** Pro Tip: Returning to the snapshot table comment field (‘Created by…’ in our example above): This can
have anything in there (User defined) when the snapshot is of an EBS without this AMI association business.
***

This concludes the overview of Snapshots and AMIs and the archaeological process of figuring out what is preserved
on an artifact Snapshot.

Key Pairs

In the Resource summary table there is an entry for Key Pairs. Let’s cover what these are next. A Key Pair is both a public and
a private key; and we will be primarily discussing the use of the private key file to authenticate into an AWS EC2 instance
using the secure shell (ssh) protocol.

Start up an EC2 instance. You need an initial way of getting in via ssh. Rather than use a password let’s use Key Pair authentication.
I get the private Key file; and it contains JUST a private key: A long string of characters. Let’s not publish this on GitHub.

When I authenticate using PuTTY I set it to use this private key. PuTTY uses the ‘.ppk’ version of the key but AWS only gives out a
‘.pem’ version of that file. No problem: There is a separate application called PuTTYGen that does the conversion. So procedurally:

Generate the EC2 instance

From the AWS Console: Get the key file associated.

Install both PuTTY and PuTTYGen (both will install in a PuTTY package)

Run PuTTYGen and convert the key file from .pem to .ppk format

Run PuTTY and use the .ppk file to log in to the instance.

To ssh to this EC2 instance I will need to know what username to enter. This can vary from one EC2 instance type (OS) to another.
For example on an AWS-styled Linux machine the user name is ec2-user. On an AWS EC2 Ubuntu instance the username is ubuntu.

Now I have logged in to the machine using ssh. I can sudo anything I want. Success.

How do I log in in the future? How do others log in? Three options:

I can use the key that I have and/or give that key to someone else.

I can generate a new key on that machine and share that key. This has nothing to do with AWS. I could do
it with a script for example, using Linux commands.

I can enable logging in by username and password.

Notice that ssh is a secure (encrypted) tunnel. The private key itself never travels over the network; it is only used to prove
your identity to the instance.

This security level is maintained as a separate effort by ssh / PuTTY.
(PuTTY is the application and ssh is the cryptographic network protocol.)
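
The second option, generating a new key on the machine itself, might look like the sketch below. It assumes OpenSSH's `ssh-keygen` is available on the instance (it is on common Linux images); the function name, file name and comment are hypothetical.

```shell
# Create a new key pair and authorize it for the current user on this machine.
# Arguments: path for the new private key file, a comment labelling the key.
add_login_key() {
    keyfile=$1
    comment=$2
    # -t ed25519 selects the key type; -N "" means no passphrase.
    ssh-keygen -q -t ed25519 -N "" -C "$comment" -f "$keyfile" || return 1
    mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
    # The PUBLIC half goes into authorized_keys on this machine;
    # the private half ($keyfile) goes to the person who will log in.
    cat "$keyfile.pub" >> "$HOME/.ssh/authorized_keys"
    chmod 600 "$HOME/.ssh/authorized_keys"
}
```

After `add_login_key ./collaborator_key "collaborator"`, the file `collaborator_key` is what you hand over (securely), and the collaborator logs in with `ssh -i collaborator_key` as the same user. A design alternative that avoids sending a private key anywhere is to have the collaborator generate the pair and send you only the public key to append.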

Ssh, PuTTY, scp and WinSCP

Now that we have identified PuTTY as the ssh-using application let’s go a bit further. Ssh is also a Linux command for logging into another
machine; so in a sense PuTTY is the Windows equivalent of the Linux ssh. Similarly there is a secure copy program in Linux called scp. The Windows analog is WinSCP.

More on Keys

Keys are actually generated in pairs: The public and private key pair are associated; and the public key can be openly shared. For more on this see
this.

On AWS I can assign only one key pair to an instance when I launch it. The public key is injected into the instance when it launches the first time; and
I can start multiple instances using that same key pair. If I create an AMI and use that to generate many EC2 instances: Again just one private key provides
access to all of them. One key can map to many EC2 instances in the context of AWS.

Keys Versus S3 Access Sharing

Let’s take a moment to contrast Key-based access to an EC2 instance with the process of sharing files using S3 buckets. The latter is done
using IAM permissions, specifically using a Bucket Policy.

Sharing between AWS accounts is straightforward. If my friend has an AWS account then I just set that up in the S3 bucket policy by referring to his account
number. So he has to send me that.
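
A bucket policy for this account-to-account case can be sketched as below. It assumes a configured AWS CLI and read-only sharing; the function names, the `Sid` label and the file name `policy.json` are hypothetical, and the bucket name and account number are placeholders.

```shell
# Print a bucket policy that lets another AWS account list and read a bucket.
# Arguments: bucket name, the other account's 12-digit account number.
bucket_read_policy() {
    cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ShareWithFriend",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::$2:root" },
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": ["arn:aws:s3:::$1", "arn:aws:s3:::$1/*"]
    }
  ]
}
EOF
}

# Write the policy to a file and attach it to the bucket.
share_bucket() {
    bucket_read_policy "$1" "$2" > policy.json
    aws s3api put-bucket-policy --bucket "$1" --policy file://policy.json
}
```

Running `bucket_read_policy my-bucket 111122223333` on its own prints the JSON so it can be reviewed before `share_bucket` applies it.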

Sharing with non-AWS-account holders is also easy and there are several options. If my collaborator has no AWS account
I have three broad categories of approach: IAM User, Web Server and Signed URL.

IAM User method: I get an Access Key and a Secret Key. This is not the same thing as a Key Pair: the Access Key and Secret Key
authenticate requests to AWS services such as S3, whereas a Key Pair authenticates an ssh login to an EC2 instance.
I receive these two keys for example when I create a User. They reside in a single credential file.
I click “Download Credentials” and there it is in ASCII. The file is in CSV format and includes a user name, an
access key and a secret key (all strings).

There are three ways of getting to the S3 bucket now for that person.

Third party tool: Cloudberry, Cyberduck, etcetera

AWS command line interface (CLI)

An API call

Notice this does NOT involve the Web Console.

They can only use the AWS Web Console if I generate a password for them using IAM.

Put a web server / web app in front of the S3 bucket. This pushes the problem down a level, so to speak.

Generate a signed URL

Gives access to one object during the time-to-live associated with that URL.

This can be done on the CLI or in one of those applications (Cloudberry etc)

Look for the button that says ‘Generate signed URL’
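
On the CLI a signed URL comes from the `aws s3 presign` command. A minimal sketch, assuming a configured AWS CLI; the function name, bucket and object key are hypothetical.

```shell
# Print a signed URL for one object.
# Arguments: bucket, object key, time-to-live in seconds (default 3600).
signed_url() {
    aws s3 presign "s3://$1/$2" --expires-in "${3:-3600}"
}
```

For example, `signed_url my-bucket data/results.csv 600` prints a URL that gives access to that one object for ten minutes; anyone holding the URL can fetch the object until it expires.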

So now we have covered Key Pairs and differentiated Key Pair use from S3 access.

Mounting an EBS to your Instance

FTP Setup and EC2 Instance

Bonus: Bind your EC2 instance to an Elastic IP so you can reuse the same public IP even if your instance changes!

Sign up for a free AWS account here.
Usage and free tier information available
here.

This recipe from AWS is straightforward. Print it out and don’t skip a
step. An EC2 (elastic compute cloud) is your virtual computing environment i.e. your virtual machine. This video by Microwave Sam expands on the EC2 setup.

You can set up an FTP server on your VM. Search stackoverflow for “setting up ftp server on amazon cloud”.

Test by connecting an FTP client to your FTP server. You may need to change directory permissions on the VM.

Elastic IPs

Stopping and re-starting EC2 instances can change the IP address. Overcome this using an Elastic IP. You get a limited number
of these with your account; and you can request more. See the AWS user guide (kilroy link needed).

Once you’ve associated the Elastic IP with a running instance, you can ssh into the VM with the Elastic IP, still using
the private key of the key pair generated for the instance. Don’t forget to update vsftpd.conf with your new Elastic IP
address, which is now your public address.
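
Allocating an Elastic IP and associating it with an instance can also be scripted. This sketch assumes a configured AWS CLI; the function names and IDs are hypothetical.

```shell
# Allocate a new Elastic IP on the account and print its allocation ID.
allocate_eip() {
    aws ec2 allocate-address --domain vpc \
        --query 'AllocationId' --output text
}

# Associate an allocated Elastic IP with an instance.
# Arguments: allocation ID, instance ID.
associate_eip() {
    aws ec2 associate-address --allocation-id "$1" --instance-id "$2"
}

# Typical sequence (the instance ID is a placeholder):
#   alloc=$(allocate_eip)
#   associate_eip "$alloc" i-0123456789abcdef0
```

Remember the warning at the top of this page: an Elastic IP left allocated but unused is another resource to clean up.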

Microsoft Azure lets you choose your own public DNS hostname, which reduces the need for this workaround.

DNS Hostnames

If you have your own registered domain, you can set your A-Record to point to the Elastic IP address of your
instance. That gets rid of the unsightly public DNS that AWS assigns to you. Here’s the example for
cloudmaven.org (our domain registrar is Namecheap.com):

An A-record points the hostname (here “compute”) to the AWS instance Elastic IP (here “52.41.144.22”).
I can then ssh into my compute instance using ec2-user@compute.cloudmaven.org. Similarly, you can set up
an A-record called ftp that points to the Elastic IP of your ftp server instance to allow ftp
access at, say, ftp.cloudmaven.org.