Welcome!

Ecommerce Observability

Real-World Observability

In this workshop, we'll inspect a broken e-commerce app and add Datadog to collect metrics, traces, and logs from the applications that make it up.

Congratulations!

You've completed the scenario!

Adding Observability with Distributed Tracing

As we've seen, distributed tracing is just one of many tools to help increase your visibility into the complex systems you deploy and manage every day.

It's one facet of a many-faceted approach to reliably diagnosing production systems and catching problems before your customers notice.

I hope this lesson gives you some ideas for how you can add custom traces to your company's systems, and start building observability systems that fit your particular business and legacy software problems.

It's an exciting time to be in software development. Let's build systems we can all be proud of.

Step 1

Spinning Up Our Legacy E-Commerce Shop

Our legacy shop uses Ruby on Rails and Spree.

We use docker-compose to bring it up. There's a prebuilt Rails Docker container image, along with a few Python / Flask microservices that handle the coupon codes and the ads that run on the store.

In this workshop, we're going to spin up and instrument our application to see where bottlenecks exist.

We'll focus on thinking through what observability might make sense in a real application, and see how setting up observability works in practice.

Our application should already be cloned from GitHub in this scenario. If we change into its directory, we should be able to start it with the following:

For Ruby on Rails, we need to first add the ddtrace Gem to our Gemfile. Take a look at store-frontend/Gemfile in the Katacoda file explorer, and notice we've added the Gem so we can start shipping traces.

Because we also plan to consume logs from Rails and correlate them with traces, we've added logging-rails and lograge as well. Both are covered in the Ruby trace and log correlation part of the documentation.
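Correlation boils down to stamping every log line with the ID of the active trace, so that logs and traces can be joined later. The Rails app relies on the gems for this. The sketch below only illustrates the idea in Python (the language of the shop's microservices) with the standard library. The `current_trace_id` variable, the logger name, the trace ID value, and the log format are all hypothetical and are not part of ddtrace or lograge.

```python
import io
import logging
from contextvars import ContextVar

# Hypothetical holder for the active trace ID; a real tracer manages this itself.
current_trace_id = ContextVar("current_trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Attach the current trace ID to every log record passing through."""

    def filter(self, record):
        record.trace_id = current_trace_id.get()
        return True  # never drop the record, only annotate it


# Log to an in-memory stream so the result is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("%(levelname)s trace_id=%(trace_id)s %(message)s")
)
handler.addFilter(TraceIdFilter())

logger = logging.getLogger("store-demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

current_trace_id.set("abc123")  # made-up ID standing in for a real trace
logger.info("rendering homepage")

output = stream.getvalue()
print(output, end="")
```

Once each log line carries a trace ID, a log search for that ID leads straight to the matching trace, and the reverse.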

Once these are both added to the list of our application's requirements, we must then add a datadog.rb to the list of initializers.

Once the application is receiving traffic with our observability in place, we can take a look at the issues we've come across since the new team rolled out their first microservice, the advertisements-service.

Before we began instrumenting with Datadog, there had been reports that the new advertisements-service broke the website. With the new deployment on staging, the frontend team has blamed the advertisements-service team, and the advertisements-service team has blamed the ops team.

Now that we've got Datadog and APM instrumented in our code, let's see what's really been breaking our application.

The first place we can check is the Service Map, to get an idea of our current infrastructure and microservice dependencies.

There, we can see that our frontend calls two microservices: a discounts-service and an advertisements-service.

If we click in to view our Service Overview in Datadog, we can see that our API itself isn't throwing any errors. The errors must be happening on the frontend.

So let's take a look at the frontend service, and see if we can find the spot where things are breaking.

If we look into the service, we can see that it's been laid out by views. There's at least one view that seems to only give errors. Let's click into that view and see if a trace from that view can tell us what's going on.

It seems the problem happens in a template. Let's get rid of that part of the template so we can get the site back up and running while figuring out what happened.

Open store-frontend/app/views/spree/layouts/spree_application.html.erb and delete the line under <div class="container">. It should begin with a <br /> and end with a </center>.

The banner ads were meant to be put under store-frontend/app/views/spree/products/show.html.erb and store-frontend/app/views/spree/home/index.html.erb.

For the index.html.erb, under <div data-hook="homepage_products"> add the code:

With that, our project should be up and running. Let's see if there's anything else going on.

Step 6

Spotting Bottlenecks with the Service List

With the Service List, we can see at a glance which endpoints are running slower than the rest.

If we look at the Frontend Service, we can see there are two endpoints in particular that are substantially slower than the rest.

Both the HomeController#index and the ProductController#show endpoints are showing much higher latency. If we click in and view a trace, we'll see that downstream microservices are taking up a substantial portion of our load time.

Use the span list to see where that time is going. We can then look at each of the downstream services to see where things may be going wrong.
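As a rough illustration of what reading the span list amounts to, the sketch below ranks services by the total time their spans took within one trace. The span dictionaries, field names, and durations are made up for illustration. They are not Datadog's trace format or measurements from this application.

```python
from collections import defaultdict

# Hypothetical spans from a single trace; every value here is illustrative.
spans = [
    {"service": "store-frontend", "name": "render homepage", "duration_ms": 120.0},
    {"service": "advertisements-service", "name": "GET /ads", "duration_ms": 2500.0},
    {"service": "discounts-service", "name": "GET /discount", "duration_ms": 2600.0},
    {"service": "store-frontend", "name": "database query", "duration_ms": 30.0},
]


def time_by_service(spans):
    """Sum span durations per service, slowest service first.

    Assumes the spans do not overlap; nested spans would be double counted.
    """
    totals = defaultdict(float)
    for span in spans:
        totals[span["service"]] += span["duration_ms"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = time_by_service(spans)
for service, total_ms in ranking:
    print(f"{service}: {total_ms:.0f} ms")
```

The services at the top of the ranking are the ones worth opening first.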

It seems two microservices in particular are being called for the homepage: the advertisements-service and the discounts-service, both of which appear in our docker-compose.yml. Each is taking over 2.5 seconds per request. Let's look at the code behind their specific URLs to see if something is amiss.

Looking at the code, it appears we've accidentally left a line in from testing what happens if latency goes up.

Try spotting the line and removing the code to see if you can bring the latency down again for the application.
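For orientation, a leftover like this is often a single sleep call inside a request handler. The sketch below is an assumption about what such a line might look like, not the workshop's actual code: `get_ads` is a hypothetical stand-in for an endpoint, and the delay is a parameter only so the demo runs quickly.

```python
import time


def get_ads(delay_seconds=2.5):
    """Hypothetical handler standing in for an endpoint in one of the services."""
    time.sleep(delay_seconds)  # leftover from latency testing: the kind of line to remove
    return {"ads": []}


def timed(func, *args):
    """Call func once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# Use a short delay here so the comparison finishes quickly.
_, slow = timed(get_ads, 0.2)
_, fast = timed(get_ads, 0.0)
print(f"with sleep: {slow:.2f}s, without: {fast:.2f}s")
```

After removing the real line and restarting the services, the same endpoints in the Service List should show their latency dropping.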
