At Intouch Insight, our logging infrastructure is our holy
grail. The engineering team relies on it every day, so we need to keep
it up to snuff. I was lucky enough to be able to update our ELK cluster
this week to 5.6 - a huge upgrade from our previous stack running ES 2.3
and Kibana 4.

The Problem

The documentation surrounding the usage of the dead letter queue mostly
revolves around re-processing rejected events. I wasn’t
particularly interested in that use case; I just wanted to be able to
easily see when events were rejected.

The Solution

filter {
  # First, we must capture the entire event, and write it to a new
  # field; we'll call that field `failed_message`
  ruby {
    code => "event.set('failed_message', event.to_json())"
  }

  # Next, we prune every field off the event except for the one we've
  # just created. Note that this does not prune event metadata.
  prune {
    whitelist_names => [ "^failed_message$" ]
  }

  # Next, convert the metadata timestamp to one we can parse with a
  # date filter. Before conversion, this field is a Logstash::Timestamp.
  # http://www.rubydoc.info/gems/logstash-core/LogStash/Timestamp
  ruby {
    code => "event.set('timestamp', event.get('[@metadata][dead_letter_queue][entry_time]').toString())"
  }

  # Apply the date filter.
  date {
    match => [ "timestamp", "ISO8601" ]
  }

  # Pull useful information out of the event metadata provided by the dead
  # letter queue, and add it to the new event.
  mutate {
    add_field => {
      "message"     => "%{[@metadata][dead_letter_queue][reason]}"
      "plugin_id"   => "%{[@metadata][dead_letter_queue][plugin_id]}"
      "plugin_type" => "%{[@metadata][dead_letter_queue][plugin_type]}"
    }
  }
}
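For context, this filter lives in a pipeline that reads from the dead letter queue itself. A minimal input sketch, assuming a package install's default data path (adjust the path to match path.dead_letter_queue in your logstash.yml):

input {
  dead_letter_queue {
    # Default location for a package install; yours may differ.
    path => "/var/lib/logstash/dead_letter_queue"
    commit_offsets => true
  }
}

The pruned failed_message events can then be shipped to a dedicated index with your usual elasticsearch output.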

Today I learned an interesting lesson. If you’re using AWS Aurora, don’t set your alarms (CPU, memory, etc.) on
an individual instance. The reason? Aurora instances, by their very nature, are somewhat transient:

If you have an issue with a given primary, Aurora may promote another instance to your WRITER role.

If you need to perform an in-place update, you can sometimes do so by creating a new set of instances, and promoting those manually.

In either case, you’ll either lose your alarms completely (if you forget to re-create them), or the alarms
you set up for your primary instance will no longer be watching your primary.

It turns out that the metrics Aurora publishes also include DBClusterIdentifier and Role dimensions.

If you choose that dimension group, then you can view metrics and set alarms based on the WRITER or READER roles,
for whichever cluster you like. This is much more foolproof than setting alarms per instance.
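For illustration, a cluster-level CPU alarm using the AWS SDK for PHP might look roughly like this; the region, cluster name, and thresholds are placeholders rather than values from our setup:

<?php

require 'vendor/autoload.php';

use Aws\CloudWatch\CloudWatchClient;

$cloudWatch = new CloudWatchClient([
    'region'  => 'us-east-1',
    'version' => '2010-08-01',
]);

// Alarm on whichever instance currently holds the WRITER role, rather than
// on a specific (and potentially transient) instance.
$cloudWatch->putMetricAlarm([
    'AlarmName'          => 'aurora-writer-cpu-high',
    'Namespace'          => 'AWS/RDS',
    'MetricName'         => 'CPUUtilization',
    'Dimensions'         => [
        ['Name' => 'DBClusterIdentifier', 'Value' => 'my-aurora-cluster'],
        ['Name' => 'Role',                'Value' => 'WRITER'],
    ],
    'Statistic'          => 'Average',
    'Period'             => 300,
    'EvaluationPeriods'  => 3,
    'Threshold'          => 80,
    'ComparisonOperator' => 'GreaterThanThreshold',
]);

The same dimensions work from the CloudWatch console if you’d rather click than code.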

Laravel is a great framework; it’s easy to use and extend, and makes liberal
use of Interfaces so that you can write your own implementations and provide
them to the IoC Container.

Unfortunately, the downside to this is navigability. Often, your IDE only knows, from a type hint
or a docblock, that a variable holds a particular Interface rather than a concrete class,
and that can make navigation difficult.

Allow me to introduce you to Go to Implementation (CMD-OPT-B) in PHPStorm -
just one additional key press away from the most commonly used Go to Declaration (CMD-B).
It’ll let you jump to a specific implementation of whatever abstract method
you’ve got a reference to.
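As a contrived example (the service class below is made up; the contract is Laravel’s cache Repository), Go to Declaration on get() opens the interface, while Go to Implementation lets you jump to the concrete store you’re actually running:

<?php

use Illuminate\Contracts\Cache\Repository as Cache;

class ReportService
{
    private $cache;

    public function __construct(Cache $cache)
    {
        // Type hinted against an interface, so the IDE only "knows" about
        // the contract, not whatever the IoC Container actually binds.
        $this->cache = $cache;
    }

    public function recent()
    {
        // CMD-B on get() lands on the Repository contract; CMD-OPT-B lists
        // the concrete implementations so you can pick the one in use.
        return $this->cache->get('recent-reports', []);
    }
}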

We all have those moments when Queue jobs fail. Sometimes it’s a bad deploy,
other times it’s an upstream service that’s taken a poop. Sometimes we need to
retry failed jobs, but can’t just artisan queue:retry all, because maybe we
haven’t done a cleanup of failed jobs lately.
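One way to be more selective, sketched below under the assumption of the default failed_jobs table (and not necessarily how the rest of this post tackles it), is to grab the IDs of recently failed jobs and retry just those:

<?php

use Carbon\Carbon;
use Illuminate\Support\Facades\Artisan;
use Illuminate\Support\Facades\DB;

// Retry only the jobs that failed within the last hour, rather than every
// stale entry still sitting in failed_jobs.
$ids = DB::table('failed_jobs')
    ->where('failed_at', '>=', Carbon::now()->subHour())
    ->pluck('id');

foreach ($ids as $id) {
    Artisan::call('queue:retry', ['id' => [$id]]);
}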

We run Elasticsearch in production, fronted by an API which abstracts away complex queries and
presents a consistent interface to the APIs that consume the data. It came to my attention
recently that we had no visibility in NewRelic into external transaction time going to ES.

In a nutshell, the problem turned out to be that the elasticsearch-php SDK uses RingPHP
as its transport, which NewRelic doesn’t instrument.
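Until the agent can see those calls, one workaround (not necessarily the route this post ends up taking) is to time the request yourself and report it through the agent’s custom metric API; the metric name here is just a convention I made up:

<?php

// $client is an elasticsearch-php client built elsewhere; $params is an
// ordinary search body.
$start = microtime(true);

$response = $client->search($params);

if (extension_loaded('newrelic')) {
    // Timing metrics are reported in milliseconds. Custom metrics won't show
    // up as external calls, but they can be charted and alerted on.
    newrelic_custom_metric('Custom/Elasticsearch/search', (microtime(true) - $start) * 1000);
}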

We’ve occasionally seen poor performance on the AWS EC2 Metadata API when using IAM roles at Intouch, which got
me thinking: why does the aws-php-sdk need to hit the EC2 Metadata API during every request? Well, it turns out, it’s
simple. If you don’t explicitly give the SDK a cache interface, it won’t use one!
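To give it one, you can wrap the default credential provider in a cache, roughly following the SDK documentation; the APCu-backed cache and the S3 client here are just examples:

<?php

require 'vendor/autoload.php';

use Aws\Credentials\CredentialProvider;
use Aws\DoctrineCacheAdapter;
use Aws\S3\S3Client;
use Doctrine\Common\Cache\ApcuCache;

// The default provider chain includes the instance-profile provider that
// calls the EC2 Metadata API; caching it means that call only happens when
// the temporary credentials are close to expiring.
$provider = CredentialProvider::defaultProvider();
$provider = CredentialProvider::cache(
    $provider,
    new DoctrineCacheAdapter(new ApcuCache())
);

$s3 = new S3Client([
    'region'      => 'us-east-1',
    'version'     => 'latest',
    'credentials' => $provider,
]);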

We use Laravel for all of our APIs at Intouch Insight, so when AWS Batch was released, I started
wondering about backing our Laravel Queues with AWS Batch. This seemed like the perfect opportunity to give back,
since I’m sure others are looking at Batch with the same interest that I am. A few evenings of playing around, and
here we are.

Specifies the maximum number of concurrent unauthenticated
connections to the SSH daemon. Additional connections will be
dropped until authentication succeeds or the LoginGraceTime
expires for a connection. The default is 10.

I’ve been using the ELK stack for over three years. It’s a tool we use daily at work, so it’s little surprise that
when In-Touch Insight Systems went down the AWS Lambda road for one of our newest projects, I wasn’t happy
with the default CloudWatch Logs UI.

The Problem

Initial setup of OpsWorks instances takes between 15 and 25 minutes, depending on the complexity of your Chef recipes.

Why?

The OpsWorks startup process injects a sequence of updates and package installations via the instance userdata before
setup can run. To make matters worse, the default Ubuntu 14.04 AMI provided by AWS (at the time of writing) is over six
months old! YMMV, but I saw an 11-minute speedup in “time to running_setup” alone by introducing a simple
custom AMI.