Data synchronised at a block level asynchronously to S3 and then saved as EBS snapshots

From 1GB to 16TB

Cached Volumes

Primary storage is S3

Most of the storage space is in the cloud and little is on-premises

Frequently accessed data is cached on-premise

Tape Gateway (VTL)

Backup and archiving solution based on Virtual Tapes

Uses existing tape backup infrastructure

Examples: NetBackup, Backup Exec, Veeam

Snowball

It used to be called AWS Import/Export

Petabyte-scale data transport solution

Secure appliance (256-bit encryption)

Use of Trusted Platform Module (TPM)

Amazon performs software erasure of the Snowball appliance

It can

Import to S3

Export from S3

It is necessary to enter an on-line generated unlock code onto the appliance

Types

Regular Snowball

The size of a briefcase

Snowball Edge

“A little AWS on-premise”: It can run Lambda functions

100TB with on-board storage and compute

Use to move data in and out of AWS

It supports standard storage interfaces

Snowmobile

It is a 45-foot-long shipping container pulled by a semi-trailer truck

For petabytes worth of data

Can transfer up to 100PB

The Snowball client software works similarly to the AWS CLI tool. Data must be copied into “buckets” that will then end up in the proper cloud bucket when Amazon gets the appliance back:

./snowball cp hello.txt s3://my_bucket

S3 Transfer Acceleration

It uses the CloudFront Edge Network to accelerate uploads to S3

A different URL is used rather than the regular S3 bucket one

Upload is to local edge node

Amazon then transfers from the edge node to the actual S3 bucket

A CloudFront-powered URL is created such as rato-accelerate.s3-accelerate.amazonaws.com
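As a minimal sketch of how the accelerated hostname relates to the bucket name, the helper below builds it by string formatting. The function name accelerate_endpoint is hypothetical; the hostname pattern is the one shown in the example above. Bucket names containing periods are rejected here because Transfer Acceleration does not support them.

```python
def accelerate_endpoint(bucket: str) -> str:
    """Return the Transfer Acceleration hostname for a bucket name (illustrative helper)."""
    if "." in bucket:
        # Transfer Acceleration needs a DNS-compliant bucket name without periods.
        raise ValueError("bucket names containing periods cannot use Transfer Acceleration")
    return f"{bucket}.s3-accelerate.amazonaws.com"


print(accelerate_endpoint("rato-accelerate"))
# rato-accelerate.s3-accelerate.amazonaws.com
```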

EC2

Amazon Elastic Compute Cloud (Amazon EC2)

Pricing Options

On-Demand

By the second for Linux and by the hour for Windows

Flexible for unpredictable workloads

Reserved

Discount on the hourly charge with a 1-3 year commitment

Steady state/predictable usage

Sub-types

Standard RIs (Up to 75% off)

Convertible RIs (Up to 54% off)

May allow changing the machines’ properties provided that the new reservation is of equal or greater value

Scheduled RIs (applicable to a time window)

Spot

A bidding scheme for buying discounted compute

For applications that are only feasible at very low compute prices

If Amazon terminates the instance, the customer will not be charged for the partial hour of usage in which the termination took place. However, if the customer terminates the instance, the charge will apply for the entire hour.
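The partial-hour rule above can be sketched as a small calculation. This only illustrates the hourly rule as stated in these notes; the function name billed_hours and its parameters are hypothetical, and it does not model actual Spot prices.

```python
import math


def billed_hours(minutes_run: int, terminated_by_amazon: bool) -> int:
    """Hours charged for a Spot instance under the hourly rule described in the notes."""
    if terminated_by_amazon:
        # The partial hour in which Amazon terminated the instance is free.
        return minutes_run // 60
    # If the customer terminates, the partial hour is charged in full.
    return math.ceil(minutes_run / 60)


print(billed_hours(150, terminated_by_amazon=True))   # 2
print(billed_hours(150, terminated_by_amazon=False))  # 3
```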

Ability to burst up to 3000 IOPS for extended periods of time for volumes at 334 GiB and above

IO1 (Provisioned IOPS SSD)

For I/O intensive applications such as large relational or NoSQL databases

Useful if more than 10,000 IOPS is required

Up to 20,000 IOPS may be provisioned per volume

ST1 (Throughput Optimized HDD, Magnetic)

Big data, data warehouses, log processing

Cannot be a boot volume

SC1 (Cold HDD, Magnetic)

Lowest cost storage for infrequently accessed workloads

File Server

Cannot be a boot volume

Standard (Magnetic)

Lowest cost per GB for a bootable drive

Ideal for infrequently accessed data

Volumes exist on EBS

Volumes and EC2 need to be in the same availability zone

1 EBS volume:1 EC2 instance

It is preferable to create roles for EC2 instances to access other resources (such as S3) rather than relying on the access key and secret. Such a role may be assigned after an instance has been created

Detaching rules (!)

If it is a root volume, it can’t be detached without stopping the instance first

If it is a non-root volume, it may be detached

RAID and EBS

RAID stands for Redundant Array of Independent Disks

RAID 0 (Striped)

RAID 1 (Mirrored, Redundancy)

RAID 5 (Good for reads, bad for writes) - Not recommended by AWS

RAID 10 - Striped & Mirrored, Good Redundancy, Good Performance
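To make the trade-offs between the RAID levels concrete, here is a rough sketch of the usable capacity each level yields from a set of identical volumes. The function name and the volume sizes are illustrative assumptions; the formulas are the standard ones for each level.

```python
def usable_capacity_gib(level: int, volumes: int, size_gib: int) -> int:
    """Usable capacity of an array of identical volumes for a given RAID level."""
    if volumes < 2:
        raise ValueError("a RAID array needs at least two volumes")
    if level == 0:
        return volumes * size_gib          # striping: all space usable, no redundancy
    if level == 1:
        return size_gib                    # mirroring: every volume holds the same data
    if level == 5:
        if volumes < 3:
            raise ValueError("RAID 5 needs at least three volumes")
        return (volumes - 1) * size_gib    # one volume's worth of space goes to parity
    if level == 10:
        if volumes < 4 or volumes % 2:
            raise ValueError("RAID 10 needs an even number of volumes, at least four")
        return (volumes // 2) * size_gib   # striped mirrors: half the raw space
    raise ValueError(f"unsupported RAID level: {level}")


for level in (0, 1, 5, 10):
    print(level, usable_capacity_gib(level, volumes=4, size_gib=100))
```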

Snapshots

Snapshots exist on S3 (but there are no publicly accessible buckets)

Snapshots are point in time copies of Volumes

Snapshots are incremental (only deltas stored in S3)

AMI images can be created out of snapshots

Snapshots are the objects that can travel from region to region

Snapshots of encrypted volumes are encrypted automatically

Volumes from encrypted snapshots are encrypted automatically

Only unencrypted snapshots may be shared (to other AWS accounts or made public)

Amazon recommends stopping an instance before taking a snapshot of its root volume

On the CLI: aws ec2 create-snapshot

EBS vs Instance Store

EBS Volumes are created from an EBS snapshot. EBS is essentially network attached storage

An Instance Store volume is created from a template stored in Amazon S3

EBS can be preserved upon instance termination unlike Instance Store

AMIs

They are regional

Load Balancing

3 Types of Load Balancers

Application Load Balancers (Layer 7)

Network Load Balancers (Layer 4)

Classic Load Balancers (ELB)

504 error: the gateway has timed out

X-Forwarded-For Header: It is a mechanism for the load balancer to identify the original requestor’s IP address.
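A minimal sketch of reading the header on the application side, assuming the usual comma-separated format in which the leftmost entry is the original client and later entries are intermediate proxies. The function name is hypothetical, and since a client can send its own X-Forwarded-For value, the result should not be treated as trustworthy on its own.

```python
def original_client_ip(x_forwarded_for: str) -> str:
    """Return the leftmost address in an X-Forwarded-For header value."""
    # Format assumed: "client, proxy1, proxy2"
    return x_forwarded_for.split(",")[0].strip()


print(original_client_ip("203.0.113.7, 10.0.1.25"))  # 203.0.113.7
```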

CloudWatch

Features

Dashboards allow custom visualisation

Alarms allow creating notifications when particular thresholds are hit

Events allow reacting to changes in the state of AWS resources

Logs help aggregate logs; this requires an agent to be installed.

EC2 Metrics (Out of the Box) (!)

Disk

Network

CPU

Monitoring Types

Standard = 5 minutes

Detailed = 1 minute (extra price)

CloudTrail is for auditing (e.g. user john created an S3 bucket) rather than monitoring and it is not the same as CloudWatch

CloudTrail

AWS CLI

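$ aws configure    # prompts for access key, secret key, region and output format
$ cd ~/.aws        # the credentials and config files are stored here
$ ls -la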

Elastic File System (EFS)

It is a storage service for EC2

Storage capacity is elastic

It doesn’t need to be pre-provisioned (e.g. like EBS volumes)

Supports NFS (NFSv4)

Pay per use

It can be mounted by multiple EC2 instances

Data is stored across multiple AZs within a region

Read After Write Consistency

Lambda

General Points

Event-driven with multiple trigger sources including HTTP

Maximum duration is 5 minutes

Triggers

API Gateway

AWS IoT

Alexa Skills

Alexa Smart Home

CloudFront

CloudWatch Events

CloudWatch Logs

CodeCommit

Cognito Sync Trigger

DynamoDB

Kinesis

S3

SNS

Languages

C#

Java

Node

Python

Route 53 & DNS

The name comes from the DNS port, which is 53

An apex record is one at the root of a DNS zone. They are also known as naked domains

There is a limit of 50 domains, which can be raised by contacting AWS support

Aurora

Since there are 3 availability zones, this means that there are 6 copies in total (!)

2 copies of data may be lost without affecting write availability

3 copies may be lost without affecting read availability
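The two availability rules above follow from simple quorum arithmetic over the 6 copies. The sketch below derives the quorum sizes from the figures in these notes (6 - 2 = 4 copies needed for writes, 6 - 3 = 3 for reads); the constant and function names are illustrative.

```python
COPIES = 6          # 2 copies in each of 3 availability zones
WRITE_QUORUM = 4    # derived from the notes: losing 2 copies still allows writes
READ_QUORUM = 3     # derived from the notes: losing 3 copies still allows reads


def availability(copies_lost: int) -> dict:
    """Report whether reads and writes remain available after losing some copies."""
    remaining = COPIES - copies_lost
    return {"write": remaining >= WRITE_QUORUM, "read": remaining >= READ_QUORUM}


for lost in range(5):
    print(lost, availability(lost))
```

Losing a whole availability zone removes 2 copies, which under these rules leaves both reads and writes available.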

Aurora Replica Types (2)

Aurora Replicas (15)

MySQL Read Replicas (currently 5)

Microsoft SQL Server

Storage is fixed and cannot be increased (!)

Amazon Virtual Private Cloud (VPC)

Amazon VPC is a capability that allows provisioning a logically isolated section of the AWS cloud, with its own network, so that resources can be secured and grouped into trust areas.

5 VPCs are allowed in each region by default

Hardware Virtual Private Network (VPNs) may be created between a corporate datacentre and a VPC so that AWS becomes an extension to the corporate data centre.

1 Subnet = 1 Availability Zone

VPC Components

A connection method:

Internet Gateway

Virtual Private Gateway

A router

Route Table

Network ACL

Private and Public subnet(s)

Security Group

Resources secured using the Security Group

General Capabilities

Launching instances into a specific subnet

Assign custom IP address ranges in each subnet

Configure route tables between subnets

Attach an Internet gateway to a VPC

Establish network access control lists (ACLs) across subnets

Peering

VPCs can be interconnected using a direct network route

VPCs can be peered with other AWS accounts as well as with other VPCs within the same account

Default VPC

All subnets in default VPC have a route out to the internet

Each EC2 instance has both a public and private IP address

No transitive peering

An Internet Gateway can only be attached to one VPC at a time.

Security Groups exist at the VPC Level

Subnets are associated with only one Network ACL

Subnets within a VPC can communicate with each other by default across availability zones (!)
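The "no transitive peering" rule can be illustrated with a toy model: traffic can flow between two VPCs only if they are peered directly, never by hopping through a third VPC. The VPC names and the can_route function are hypothetical.

```python
def can_route(peerings: set, src: str, dst: str) -> bool:
    """True only if src and dst are the same VPC or share a direct peering connection."""
    return src == dst or frozenset((src, dst)) in peerings


# A is peered with B, and B is peered with C.
peerings = {frozenset(("A", "B")), frozenset(("B", "C"))}

print(can_route(peerings, "A", "B"))  # True
print(can_route(peerings, "A", "C"))  # False: A cannot reach C through B
```

For A to reach C in this model, a separate A-C peering connection has to be added.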

ELBs and VPCs

ELBs can only operate on public subnets

Public subnets must have an Internet Gateway attached to them

At least two subnets must be specified

Subnet Ranges

CIDR Prefix     First IP       Last IP            Total
10/8            10.0.0.0       10.255.255.255     16,777,216
172.16/12       172.16.0.0     172.31.255.255     1,048,576
192.168/16      192.168.0.0    192.168.255.255    65,536

http://cidr.xyz/
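The private ranges can also be checked locally with Python's standard ipaddress module. This is a small sketch; the summarise helper is a name made up for illustration. The first address it reports is the base (network) address of each range, which is why the totals are exact powers of two.

```python
import ipaddress


def summarise(cidr: str) -> tuple:
    """Return (first address, last address, total addresses) for a CIDR block."""
    net = ipaddress.ip_network(cidr)
    return str(net[0]), str(net[-1]), net.num_addresses


for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"):
    first, last, total = summarise(cidr)
    print(f"{cidr:<16} {first:<12} {last:<16} {total:,}")
```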

Creating a new VPC

Creating a new VPC results in the automatic creation of:

Default Route Table

Default Network ACL

Default Security Group

Unavailable IPs

For example, in a subnet with CIDR block 10.0.0.0/24, the following five IP addresses are reserved:

10.0.0.0: Network address.

10.0.0.1: Reserved by AWS for the VPC router.

10.0.0.2: Reserved by AWS. The IP address of the DNS server is always the base of the VPC network range plus two; however, we also reserve the base of each subnet range plus two. For VPCs with multiple CIDR blocks, the IP address of the DNS server is located in the primary CIDR. For more information, see Amazon DNS Server.

10.0.0.3: Reserved by AWS for future use.

10.0.0.255: Network broadcast address. We do not support broadcast in a VPC, therefore we reserve this address.
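The same five addresses can be derived for any subnet: the first four addresses and the last one. A minimal sketch with the standard ipaddress module follows; the helper names are illustrative.

```python
import ipaddress


def reserved_addresses(subnet_cidr: str) -> list:
    """Return the five addresses AWS reserves in a subnet: the first four and the last."""
    net = ipaddress.ip_network(subnet_cidr)
    return [str(net[i]) for i in (0, 1, 2, 3, -1)]


def usable_address_count(subnet_cidr: str) -> int:
    """Addresses left for resources once the five reserved ones are subtracted."""
    return ipaddress.ip_network(subnet_cidr).num_addresses - 5


print(reserved_addresses("10.0.0.0/24"))
# ['10.0.0.0', '10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.255']
print(usable_address_count("10.0.0.0/24"))  # 251
```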

NAT Instances

NAT Instances are AMI virtual machines that work as a NAT router.

On EC2/Network Settings, the Source/Destination check should be disabled in order for the NAT instance to work

NAT instances must be deployed in a public subnet

The instance size affects performance

Autoscaling Groups are necessary for high availability in multiple subnets

They are restricted by a security group

Route tables must be updated

NAT Gateway

A NAT gateway is a cloud native managed service rather than a user-managed EC2 instance

It scales automatically up to 10 Gbps

No need to patch

Not associated with security groups

It gets an IP address automatically

Route tables must be updated

They must be deployed in multiple AZs for high availability

NO need to disable source/destination

Network ACL (NACL)

Since NACL is stateless, both inbound and outbound rules must be created for regular TCP services like HTTP

Default NACLs allow all outbound and inbound traffic

New custom NACLs have all inbound and outbound traffic denied by default

Amazon recommends rule numbers to be in increments of 100

There is a *:1 relationship between subnets and NACLs

If a NACL isn’t specified, a subnet will be associated with the default NACL

NACL rules are evaluated in order, starting with the lowest-numbered rule

Because of protocols using ephemeral ports such as FTP, a rule allowing traffic for ports 1024-65535 is typically defined as an outbound rule.

NACLs allow blocking IP addresses, unlike Security Groups
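The ordered evaluation of NACL rules can be sketched as a first-match loop: rules are checked from the lowest rule number upwards, the first matching rule decides, and traffic that matches nothing is denied. This toy model matches on port only (real rules also consider protocol and CIDR range); the Rule type and evaluate function are hypothetical names.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    number: int      # rule number; lower numbers are evaluated first
    port_from: int
    port_to: int
    allow: bool


def evaluate(rules: list, port: int) -> bool:
    """Return True if traffic on the port is allowed by the first matching rule."""
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.port_from <= port <= rule.port_to:
            return rule.allow
    return False  # nothing matched: implicit deny


rules = [
    Rule(200, 0, 65535, False),  # deny everything else
    Rule(100, 80, 80, True),     # allow HTTP; evaluated first despite list order
]
print(evaluate(rules, 80))  # True
print(evaluate(rules, 22))  # False
```

Numbering rules in increments of 100 leaves room to slot a new rule, say 150, between existing ones without renumbering.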

VPC Flow Logs

It allows capturing information about IP traffic going to and from network interfaces in a VPC using Amazon CloudWatch.

They can be created at three levels:

VPC

Subnet

Network Interface Level

General

Flow logs can only be enabled for VPCs under one’s account. This is important for “peered” VPCs

Flow logs can’t be tagged

Once created the configuration can’t be changed. They must be deleted and created again.

Not all IP traffic is monitored:

Traffic to the Amazon DNS Server (rather than a user-provided one)

Traffic generated by Windows instances for license activation

Traffic for metadata access to 169.254.169.254

DHCP traffic

Traffic to the reserved IP address for the default VPC router

Endpoints

Two types

An interface endpoint is an Elastic Network Interface (ENI) that serves as an entry point for traffic destined to a service

A gateway endpoint serves as a target for a route in one’s route table for traffic destined for the service

Internet Gateway

Only 1 Internet Gateway can be attached to a VPC

Application Services

Simple Queue Service (SQS)

Oldest Amazon Service

Decouples producers from consumers

It is pull rather than push service

Messages can be up to 256 KB in size

Messages may be kept in the queue from 1 minute to 14 days

Default retention period is 4 days

Visibility Timeout: the amount of time a message stays invisible to other readers after a reader client has picked it up

Default timeout: 30 seconds (may be increased)

Maximum timeout: 12 hours

There is a long polling mechanism which allows waiting until a message arrives in the queue.
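The visibility timeout is easiest to see in a toy in-memory model. The sketch below is not the SQS API: ToyQueue and its methods are hypothetical, and time is passed in explicitly so the behaviour is deterministic. A received message is hidden for the timeout period and is delivered again if it has not been deleted by then.

```python
class ToyQueue:
    """In-memory illustration of pull-based delivery with a visibility timeout."""

    def __init__(self, visibility_timeout: float = 30.0):
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # message id -> [body, time at which it becomes visible again]
        self._next_id = 0

    def send(self, body: str) -> int:
        message_id = self._next_id
        self._next_id += 1
        self._messages[message_id] = [body, 0.0]
        return message_id

    def receive(self, now: float):
        """Return (id, body) of the first visible message and hide it, or None."""
        for message_id, entry in self._messages.items():
            body, visible_at = entry
            if now >= visible_at:
                entry[1] = now + self.visibility_timeout
                return message_id, body
        return None

    def delete(self, message_id: int) -> None:
        """Consumers delete a message once it has been processed successfully."""
        self._messages.pop(message_id, None)


queue = ToyQueue(visibility_timeout=30)
queue.send("job-1")
print(queue.receive(now=0))    # (0, 'job-1')
print(queue.receive(now=10))   # None: still invisible to other readers
print(queue.receive(now=30))   # (0, 'job-1'): redelivered because it was never deleted
queue.delete(0)
print(queue.receive(now=100))  # None
```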

Simple Workflow Service (SWF)

It is a solution for workflows that involve human or human-like interaction

It ensures that tasks are only assigned once

It keeps track of the application state without requiring a user-provided application for this purpose