Amazon Elastic Inference pricing

With Amazon Elastic Inference, you pay only for the accelerator hours you use. There are no upfront costs or minimum fees. There is no charge for the AWS-optimized versions of the TensorFlow and Apache MXNet deep learning frameworks. There is also no additional charge for AWS PrivateLink VPC endpoints to Amazon Elastic Inference, as long as you have at least one accelerator-attached instance running in an Availability Zone where a VPC endpoint is provisioned.

Amazon Elastic Inference accelerator pricing for Amazon EC2

The following tables show Amazon Elastic Inference pricing with Amazon EC2 instances. For Amazon Elastic Inference pricing with Amazon SageMaker instances, see the Model Deployment section of the Amazon SageMaker pricing page.


US East

US East (N. Virginia) Region

| Accelerator type | Throughput (FP-32 TFLOPS) | Throughput (FP-16 TFLOPS) | Memory | Pricing |
| --- | --- | --- | --- | --- |
| eia1.medium | 1 | 8 | 1 GB | $0.130 per hour |
| eia1.large | 2 | 16 | 2 GB | $0.260 per hour |
| eia1.xlarge | 4 | 32 | 4 GB | $0.520 per hour |

US East (Ohio) Region

| Accelerator type | Throughput (FP-32 TFLOPS) | Throughput (FP-16 TFLOPS) | Memory | Pricing |
| --- | --- | --- | --- | --- |
| eia1.medium | 1 | 8 | 1 GB | $0.130 per hour |
| eia1.large | 2 | 16 | 2 GB | $0.260 per hour |
| eia1.xlarge | 4 | 32 | 4 GB | $0.520 per hour |

US West

US West (Oregon) Region

| Accelerator type | Throughput (FP-32 TFLOPS) | Throughput (FP-16 TFLOPS) | Memory | Pricing |
| --- | --- | --- | --- | --- |
| eia1.medium | 1 | 8 | 1 GB | $0.130 per hour |
| eia1.large | 2 | 16 | 2 GB | $0.260 per hour |
| eia1.xlarge | 4 | 32 | 4 GB | $0.520 per hour |

EU

EU (Ireland) Region

| Accelerator type | Throughput (FP-32 TFLOPS) | Throughput (FP-16 TFLOPS) | Memory | Pricing |
| --- | --- | --- | --- | --- |
| eia1.medium | 1 | 8 | 1 GB | $0.140 per hour |
| eia1.large | 2 | 16 | 2 GB | $0.280 per hour |
| eia1.xlarge | 4 | 32 | 4 GB | $0.560 per hour |

Asia Pacific

Asia Pacific (Tokyo) Region

| Accelerator type | Throughput (FP-32 TFLOPS) | Throughput (FP-16 TFLOPS) | Memory | Pricing |
| --- | --- | --- | --- | --- |
| eia1.medium | 1 | 8 | 1 GB | $0.220 per hour |
| eia1.large | 2 | 16 | 2 GB | $0.450 per hour |
| eia1.xlarge | 4 | 32 | 4 GB | $0.890 per hour |

Asia Pacific (Seoul) Region

| Accelerator type | Throughput (FP-32 TFLOPS) | Throughput (FP-16 TFLOPS) | Memory | Pricing |
| --- | --- | --- | --- | --- |
| eia1.medium | 1 | 8 | 1 GB | $0.210 per hour |
| eia1.large | 2 | 16 | 2 GB | $0.430 per hour |
| eia1.xlarge | 4 | 32 | 4 GB | $0.850 per hour |
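Taken together, the six tables above amount to a per-Region accelerator price list. The sketch below transcribes those rates into a simple lookup; the Region-name-to-code mapping (e.g. us-east-1 for US East (N. Virginia)) follows standard AWS naming and is not stated on this page:

```python
# Amazon Elastic Inference accelerator rates (USD per hour),
# transcribed from the Region tables above. Region codes are
# assumed from standard AWS naming conventions.
PRICES = {
    "us-east-1":      {"eia1.medium": 0.130, "eia1.large": 0.260, "eia1.xlarge": 0.520},
    "us-east-2":      {"eia1.medium": 0.130, "eia1.large": 0.260, "eia1.xlarge": 0.520},
    "us-west-2":      {"eia1.medium": 0.130, "eia1.large": 0.260, "eia1.xlarge": 0.520},
    "eu-west-1":      {"eia1.medium": 0.140, "eia1.large": 0.280, "eia1.xlarge": 0.560},
    "ap-northeast-1": {"eia1.medium": 0.220, "eia1.large": 0.450, "eia1.xlarge": 0.890},
    "ap-northeast-2": {"eia1.medium": 0.210, "eia1.large": 0.430, "eia1.xlarge": 0.850},
}

def accelerator_cost(region: str, accelerator: str, hours: float) -> float:
    """Metered cost of one accelerator for the given number of hours."""
    return PRICES[region][accelerator] * hours

# One full day of eia1.xlarge in EU (Ireland):
print(f"${accelerator_cost('eu-west-1', 'eia1.xlarge', 24):.2f}")
```

Because billing is purely metered, total accelerator cost is just rate times attached hours; there is no reserved or upfront dimension to model.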

Pricing example 1

Let’s say you are running a streaming video analytics application. To run deep learning inference on a single video stream in this application, you can choose an Amazon EC2 c5.xlarge instance configured with an Amazon Elastic Inference eia1.medium accelerator. Your hourly cost to run this deep learning model in the US East (N. Virginia) Region is the c5.xlarge On-Demand instance rate plus $0.130 per hour for the eia1.medium accelerator.
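As a sketch of that arithmetic: the eia1.medium rate comes from the US East (N. Virginia) table above, while the c5.xlarge On-Demand rate used here is a placeholder assumption (check the current Amazon EC2 pricing page for the actual figure):

```python
# Pricing example 1: one c5.xlarge instance with one eia1.medium
# accelerator attached, in US East (N. Virginia).
C5_XLARGE_PER_HOUR = 0.17     # ASSUMED c5.xlarge On-Demand rate (USD/hour)
EIA1_MEDIUM_PER_HOUR = 0.130  # from the US East (N. Virginia) table above

# Total hourly cost is simply the instance rate plus the accelerator rate.
hourly_total = C5_XLARGE_PER_HOUR + EIA1_MEDIUM_PER_HOUR
print(f"${hourly_total:.3f} per hour")
```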

Pricing example 2

Let’s say you are running a web application that analyzes images uploaded by your end users in real time. To use deep learning inference in this application, you can choose an Amazon EC2 c5.large instance configured with an Amazon Elastic Inference eia1.medium accelerator, and scale this instance capacity with Amazon EC2 Auto Scaling to meet demand. Your hourly cost for this combination in the US East (N. Virginia) Region is the c5.large On-Demand instance rate plus $0.130 per hour for the eia1.medium accelerator, multiplied by the number of instances Auto Scaling is running.
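The same arithmetic, scaled by fleet size: the eia1.medium rate comes from the table above, while the c5.large On-Demand rate is again a placeholder assumption rather than a figure from this page:

```python
# Pricing example 2: an Auto Scaling group of c5.large instances,
# each with an eia1.medium accelerator, in US East (N. Virginia).
C5_LARGE_PER_HOUR = 0.085     # ASSUMED c5.large On-Demand rate (USD/hour)
EIA1_MEDIUM_PER_HOUR = 0.130  # from the US East (N. Virginia) table above

def fleet_hourly_cost(instance_count: int) -> float:
    """Hourly cost scales linearly with the Auto Scaling fleet size."""
    return instance_count * (C5_LARGE_PER_HOUR + EIA1_MEDIUM_PER_HOUR)

# e.g. a fleet that has scaled out to 4 instances:
print(f"${fleet_hourly_cost(4):.3f} per hour for 4 instances")
```

Because each accelerator is billed per attached hour, scaling the fleet in or out changes the accelerator charge in lockstep with the instance charge.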