IMPORTANT: Running Imhotep on AWS will incur costs to your AWS account. We recommend shutting down your Imhotep cluster by deleting the CloudFormation stack when it is not in use. At a minimum, for a small amount of data, you will be running two r3.large instances and one r3.xlarge instance. Additional costs will be incurred for data upload and instance provisioning. For larger data sizes, you might need to increase the AWS resources and therefore costs. Click here for information about AWS costs.

Deleting the stack does not delete data you have uploaded to your S3 buckets. When you are ready for a new session, recreate the stack and point to the existing S3 buckets.

Setup

Use AWS CloudFormation to create a stack on AWS.

Select CloudFormation and Create Stack.

From the Select Template page, enter the name of your new stack.

From this same page, in Template, select Specify an Amazon S3 template URL and enter this URL:

Any traffic coming from outside of this range won't be able to access Imhotep. You can change the IP address range later by deleting and recreating your stack. To allow access only from your own IP address, use myIPAddress/32. A /0 suffix matches every address and does not restrict access.

KeyName

Name of an existing EC2 key pair to enable SSH access to the cluster. Use the key pair you created as a prerequisite to this procedure. EC2 key pairs are specific to a region, so create the stack in the same region as the key pair.

LoginId

Your user ID for logging into Imhotep.

LoginPassword

Your password for logging into Imhotep.

NumImhotepInstances

Number of Imhotep instances in the cluster that service queries. The default value is 2. Increase this number for greater scalability.

SSHLocation

IP address range for SSH access to the cluster. The range must be a valid IP CIDR range of the form x.x.x.x/x

Any SSH traffic coming from outside of this range won't be able to access Imhotep. You can change the IP address range later by deleting and recreating your stack.
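Before you enter a CIDR range, it can help to confirm which addresses it actually admits. The following is a minimal sketch that uses Python's standard ipaddress module. The function name is_allowed and the sample addresses (taken from the reserved documentation ranges) are illustrative assumptions, not part of Imhotep.

```python
import ipaddress


def is_allowed(client_ip: str, cidr_range: str) -> bool:
    """Return True if client_ip falls inside cidr_range (form x.x.x.x/x)."""
    # strict=False accepts ranges whose host bits are set, such as 203.0.113.7/24.
    network = ipaddress.ip_network(cidr_range, strict=False)
    return ipaddress.ip_address(client_ip) in network


# A /32 range admits exactly one address.
print(is_allowed("203.0.113.7", "203.0.113.7/32"))   # True
print(is_allowed("203.0.113.8", "203.0.113.7/32"))   # False
# A /0 range admits every address.
print(is_allowed("198.51.100.20", "0.0.0.0/0"))      # True
```

A narrower range (a larger number after the slash) limits access to fewer addresses.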

Click Next through the remaining options of the stack setup until you see a Review page with the options you defined.

Allow the template to create IAM resources: from the Review page, scroll down to the Capabilities section and select the acknowledgment.

Click Create.

The process might take several minutes. When the setup is successful, URLs are available on the Outputs tab for Imhotep TSV Uploader and the IQL web client. Allow several minutes for the services to become available.

TSV Uploader allows you to upload your data to Imhotep.

The IQL web client allows you to query the Imhotep cluster using IQL queries. Learn about IQL.
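If you script stack creation instead of using the console, the stack parameters described in Setup can be collected in one place. The sketch below only assembles the values into the list of ParameterKey/ParameterValue pairs that the CloudFormation CreateStack API accepts; it does not call AWS. The function name build_stack_parameters and the sample values are hypothetical, and the template might define parameters beyond the five shown here.

```python
def build_stack_parameters(key_name, login_id, login_password,
                           ssh_location, num_imhotep_instances=2):
    """Assemble stack parameters as ParameterKey/ParameterValue pairs."""
    values = {
        "KeyName": key_name,                            # existing EC2 key pair
        "LoginId": login_id,                            # Imhotep login ID
        "LoginPassword": login_password,                # Imhotep password
        "NumImhotepInstances": num_imhotep_instances,   # default is 2
        "SSHLocation": ssh_location,                    # CIDR range for SSH access
    }
    # CloudFormation expects every parameter value as a string.
    return [
        {"ParameterKey": key, "ParameterValue": str(value)}
        for key, value in values.items()
    ]


# Hypothetical sample values, for illustration only.
params = build_stack_parameters(
    key_name="my-key-pair",
    login_id="analyst",
    login_password="change-me",
    ssh_location="203.0.113.7/32",
)
for entry in params:
    print(entry["ParameterKey"], "=", entry["ParameterValue"])
```

Keeping the values in one function makes it easier to recreate the stack with the same settings for a new session.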

Imhotep TSV Uploader

Use TSV Uploader to make your data available in Imhotep. TSV Uploader converts your data files into datasets that Imhotep can use and moves the datasets to the correct location so that Imhotep can access them.

Logging into TSV Uploader

Open a browser and navigate to the Imhotep TSV Uploader URL provided when you created the stack on AWS.

Bypass the SSL warning to reach the login screen.

Enter your login ID and password that you defined during setup.

Creating a Dataset

Log into TSV Uploader.

Scroll to the bottom of the list of available datasets and enter a name for your new dataset in the text entry box. The dataset name must be at least two characters long and can contain only lowercase a-z and digits.

Click + to create the dataset.

The name of your new dataset appears in the list. When you first add the dataset, it is empty until you upload a data file. A dataset is not created on Imhotep until you upload a data file and a shard is created.
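The naming rule above (at least two characters, only lowercase a-z and digits) can be checked before you type a name into TSV Uploader. This is a minimal sketch of that rule as stated here; the function name is_valid_dataset_name is an illustrative assumption, and TSV Uploader might enforce additional limits that this guide does not describe.

```python
import re

# At least two characters, each a lowercase letter a-z or a digit.
DATASET_NAME = re.compile(r"[a-z0-9]{2,}")


def is_valid_dataset_name(name: str) -> bool:
    """Return True if name satisfies the dataset naming rule."""
    return DATASET_NAME.fullmatch(name) is not None


print(is_valid_dataset_name("nasa"))       # True
print(is_valid_dataset_name("logs2015"))   # True
print(is_valid_dataset_name("a"))          # False: too short
print(is_valid_dataset_name("WebLogs"))    # False: uppercase letters
print(is_valid_dataset_name("web_logs"))   # False: underscore
```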

Uploading a Data File

To test your stack, consider uploading the sample time-series dataset in nasa_19950801.tsv. For more information about this dataset, click here.

Log into TSV Uploader and click the dataset name.

In the search field near the top of the page, click Upload TSV and browse to the TSV file that contains your data. Repeat this step to upload additional data files to your dataset. To upload multiple files at one time, with the dataset name selected, drag and drop the files to the TSV Uploader window.

Refresh the page to show the status of the upload.

When the process completes successfully, indexed shows as the status of the file. Allow a minute or two for your dataset to be available in the IQL web client.

If the process fails, failed shows as the status. Errors are written to a .error.log file, which you can download to your computer.

To upload files directly to your S3 build bucket, place the files in the iupload/tsvtoindex/datasetName/ directory. As they are processed, they are moved to iupload/indexedtsv/datasetName/. You can also view the files in TSV Uploader.
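If you script direct uploads, the object key for each file follows from the two directories named above. The sketch below only builds the keys; it does not contact S3, and the function names upload_key and indexed_key are illustrative assumptions. Use the resulting key with whichever S3 client you prefer and with the name of your own build bucket.

```python
from pathlib import PurePosixPath

UPLOAD_PREFIX = "iupload/tsvtoindex"    # files waiting to be processed
INDEXED_PREFIX = "iupload/indexedtsv"   # files moved here as they are processed


def upload_key(dataset_name: str, local_path: str) -> str:
    """Return the S3 key under which to place a TSV file for a dataset."""
    file_name = PurePosixPath(local_path).name
    return f"{UPLOAD_PREFIX}/{dataset_name}/{file_name}"


def indexed_key(dataset_name: str, local_path: str) -> str:
    """Return the S3 key where the file is expected after processing."""
    file_name = PurePosixPath(local_path).name
    return f"{INDEXED_PREFIX}/{dataset_name}/{file_name}"


print(upload_key("nasa", "/tmp/nasa_19950801.tsv"))
# iupload/tsvtoindex/nasa/nasa_19950801.tsv
print(indexed_key("nasa", "/tmp/nasa_19950801.tsv"))
# iupload/indexedtsv/nasa/nasa_19950801.tsv
```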

NOTE: If you upload a TSV file to the wrong dataset, you must manually remove the shard that contains the dataset from Imhotep. Learn how.

To download a data file to your computer, select datasetName>dataFileName and click the download button in Operations.

Deleting Files from TSV Uploader

To delete a data file, select datasetName>dataFileName and click the trash can.

To delete a dataset, select datasetName and click the trash can.

NOTE: Deleting a data file or dataset from TSV Uploader does not delete the dataset from Imhotep. TSV Uploader shows the list of data files for two weeks after a file’s upload date.

IQL Web Client

Use the IQL web client to query the Imhotep cluster using IQL.

Logging into the client:

Open a browser and navigate to the IQL URL provided when you created the stack on AWS.

Bypass the SSL warning to reach the login screen.

Enter your login ID and password.

Follow these general steps to construct an IQL query:

Formulate your question.

Select your dataset and the date range.

Enter the query.

Select how you want to group your data. Groups show as rows in tabular data.

Choose one or multiple metrics for your data. Metrics show as columns in tabular data.
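The choices made in the steps above (dataset, date range, optional filter, grouping, and metrics) can also be viewed as the parts of a single text query. The sketch below assembles them in a from / where / group by / select order. Treat that clause order as an assumption to confirm against the IQL reference. The helper build_iql_query, the dataset name nasa, and the field name host are hypothetical.

```python
def build_iql_query(dataset, start, end, metrics, group_by=(), where=""):
    """Assemble query parts into one string (assumed IQL clause order)."""
    parts = [f"from {dataset} {start} {end}"]        # dataset and date range
    if where:
        parts.append(f"where {where}")               # optional filter
    if group_by:
        parts.append("group by " + ", ".join(group_by))   # groups become rows
    parts.append("select " + ", ".join(metrics))     # metrics become columns
    return " ".join(parts)


# Hypothetical example: request counts per host for one day.
query = build_iql_query(
    dataset="nasa",
    start="1995-08-01",
    end="1995-08-02",
    metrics=["count()"],
    group_by=["host"],
)
print(query)
# from nasa 1995-08-01 1995-08-02 group by host select count()
```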