Deploy the model that you trained in Create and
Run a Training Job (Amazon SageMaker Python SDK) by calling the deploy method
of the sagemaker.estimator.Estimator object. This is the same object
that you used to train the model. When you call the deploy method,
specify the number and type of ML instances that you want to use to host the
endpoint.
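
A minimal sketch of this call is shown below; here estimator stands for the fitted sagemaker.estimator.Estimator object from the training step, and the instance count and type are illustrative choices:

# Deploy the trained model to a real-time endpoint.
predictor = estimator.deploy(
    initial_instance_count=1,       # number of ML instances that host the endpoint
    instance_type="ml.m5.xlarge",   # instance type, chosen here for illustration
)

The call returns a predictor object that you can use to send inference requests to the new endpoint.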

Deploying a model using the AWS SDK for Python (Boto3) is a three-step process:

Create a model in Amazon SageMaker – Send a CreateModel request to provide information such as
the location of the S3 bucket that contains your model artifacts and the
registry path of the image that contains inference code.
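
A minimal sketch of this request with Boto3 is shown below; the model name, image URI, artifact location, and execution role are all placeholder values:

import boto3

sm_client = boto3.client("sagemaker")

sm_client.create_model(
    ModelName="my-model",  # placeholder model name
    PrimaryContainer={
        # Registry path of the image that contains the inference code (placeholder).
        "Image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-image:latest",
        # S3 location of the model artifacts (placeholder).
        "ModelDataUrl": "s3://my-bucket/model/model.tar.gz",
    },
    ExecutionRoleArn="arn:aws:iam::123456789012:role/MySageMakerRole",  # placeholder role
)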

Create an endpoint configuration – Send a CreateEndpointConfig request to provide the
resource configuration for hosting. This includes the type and number of ML
compute instances to launch to deploy the model.
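
A sketch of this request, reusing the sm_client from the previous step; the names are placeholders:

sm_client.create_endpoint_config(
    EndpointConfigName="my-endpoint-config",  # placeholder configuration name
    ProductionVariants=[
        {
            "VariantName": "AllTraffic",     # this variant receives all traffic
            "ModelName": "my-model",         # model created in the previous step
            "InstanceType": "ml.m5.xlarge",  # ML compute instance type (illustrative)
            "InitialInstanceCount": 1,       # number of instances to launch
        }
    ],
)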

Create an endpoint – Send a CreateEndpoint request. Amazon SageMaker
launches the ML compute instances, deploys the model, and returns the
endpoint. Applications can then send inference requests to this endpoint.
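
A sketch of this request, again with placeholder names:

sm_client.create_endpoint(
    EndpointName="my-endpoint",               # placeholder endpoint name
    EndpointConfigName="my-endpoint-config",  # configuration created in the previous step
)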

The following code calls describe_endpoint in a while loop until the
endpoint either fails or is in service, printing the endpoint status on
each check. When the status changes to InService, the endpoint is ready
to serve inference requests.
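
A minimal sketch of such a polling loop, assuming the sm_client and endpoint name from the previous steps:

import time

status = sm_client.describe_endpoint(EndpointName="my-endpoint")["EndpointStatus"]
while status not in ("InService", "Failed"):
    time.sleep(30)  # wait between status checks
    status = sm_client.describe_endpoint(EndpointName="my-endpoint")["EndpointStatus"]
    print(f"Endpoint status: {status}")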