OSCON 2017 and InVision Engineering in Open Source

Our Platform Team traveled together to Austin, Texas for this year’s O’Reilly
Open Source Convention (OSCON 2017). It was a great opportunity to see what
kind of innovation is happening in open source, connect with companies whose
products we use, and generally catch up on technology trends. InVision builds
much of its software stack on open source projects: Docker, Kubernetes, Linux,
Golang, NodeJS, React and many others. We therefore benefit a great deal from
staying connected with the community. We also want to contribute back as we
build services and libraries that the community as a whole can reuse. Below are
summaries of some of the great talks we saw at OSCON 2017, followed by
summaries of our current open source offerings. Come check out our projects,
contribute and use them!

Open Source AI at AWS and Apache MXNet

Summary by Tatsuro Alpert, Senior Software Engineer, Core Services

Adrian Cockcroft talked about AWS’s open source library for AI algorithms. AWS
offers easy access to these AI engines as well as EC2 instances with specs that
are ideal for running them, such as the P2 with multiple GPUs and lots of RAM.
It also offers access to the services that back some of Amazon’s products, such
as Alexa. You can use their APIs to take advantage of conversation engines as
well as face and image recognition engines. Cockcroft concluded with a demo of a
self-learning, self-driving toy car. The car had an onboard Raspberry Pi and
camera and ran an MXNet model to control it. The data processing and generation
of the model were done in EC2 on one of the powerful machines.

Scaling massive, real-time data pipelines with Go

Summary by Tatsuro Alpert, Senior Software Engineer, Core Services

Jean de Klerk spoke about data pipelines written in Go. He began with a
comparison of several network protocols and their strengths and weaknesses for
transferring large amounts of data: HTTP, UDP, gRPC unary, websocket streaming,
and gRPC streaming. While the streaming mechanisms were by far the fastest,
they are more difficult to implement. He then went on to discuss queueing data
between producers and consumers, comparing several models in Go using arrays,
channels, and ring buffers. He concluded that any method that required mutexes
did not perform well. Atomics, on the other hand, perform well but are very
complex to implement. Channels are the easiest to implement, but they cannot
fully decouple the producer from the consumer, because a full channel will
eventually block the producer. If your use case requires this decoupling, ring
buffers are the best solution.
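
To make the decoupling point concrete, here is a minimal sketch of a ring buffer that overwrites its oldest entry when full, so a producer never waits on a slow consumer. It is not code from the talk: the Ring type and its methods are hypothetical names, the capacity is assumed to be greater than zero, and synchronization is deliberately left out. A concurrent version would need atomics or similar coordination, which is where the complexity mentioned above comes in.

```go
package main

import "fmt"

// Ring is a fixed-capacity buffer that overwrites its oldest entry when full.
// It is not safe for concurrent use; synchronization is omitted for clarity.
type Ring struct {
	buf   []int
	head  int // index of the oldest element
	count int // number of stored elements
}

// NewRing allocates a ring with the given capacity (assumed to be > 0).
func NewRing(capacity int) *Ring {
	return &Ring{buf: make([]int, capacity)}
}

// Push never blocks: when the buffer is full, the oldest value is dropped.
func (r *Ring) Push(v int) {
	tail := (r.head + r.count) % len(r.buf)
	r.buf[tail] = v
	if r.count == len(r.buf) {
		// Full: the write landed on the oldest slot, so move head past it.
		r.head = (r.head + 1) % len(r.buf)
	} else {
		r.count++
	}
}

// Pop returns the oldest value, or false when the buffer is empty.
func (r *Ring) Pop() (int, bool) {
	if r.count == 0 {
		return 0, false
	}
	v := r.buf[r.head]
	r.head = (r.head + 1) % len(r.buf)
	r.count--
	return v, true
}

func main() {
	r := NewRing(3)
	for i := 1; i <= 5; i++ {
		r.Push(i) // the producer keeps going even though nothing is consumed
	}
	for v, ok := r.Pop(); ok; v, ok = r.Pop() {
		fmt.Println(v) // prints 3, 4, 5: the two oldest values were overwritten
	}
}
```

The trade-off is that data can be lost when the consumer falls behind, whereas a buffered channel preserves every value at the cost of eventually stalling the producer.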

Monitoring at scale at Salesforce

Summary by Adam Frank, Engineering Manager, SRE

Salesforce is a giant HBase shop, so it’s always really interesting to see how
they run something like that at scale. This talk was about Argus, the tool
Salesforce uses for gathering millions of metrics, which, naturally, uses HBase
on the backend. The data structure it uses in HBase is defined by OpenTSDB,
which I believe makes it compatible with the various OpenTSDB tools for data
collection, like tcollector. It uses Kafka to queue ingest, which makes it
extremely flexible and not as susceptible to the sort of back pressure you see
in other solutions, especially third-party hosted solutions. The talk presented
a great world where we can collect as many metrics as we want with minimal
sampling, but it is of somewhat limited use to us, because anyone who’s run an
HBase cluster before knows you only do so if you have to :) All the same, it
clearly demonstrated the value of treating your internal metrics with the same
big data analytics you’d consider for customer data.

Using NGINX as an effective and highly available content cache

Summary by Adam Frank, Engineering Manager, SRE

This ended up being the most technical NGINX-related talk at the conference,
which surprised me since NGINX is so integral to most ingress tiers in the
container and VM space. Although the talk was specifically about content
caching, which didn’t seem particularly sexy, it ended up being a very
informative talk (which ended with the speaker giving everyone developer
licenses to NGINX Plus, which was nice). One thing that was covered was using
dynamic variables in NGINX.

For example, you can set up a variable in the NGINX config and map a value to it
based on a regex match. In the speaker’s example, the variable $dynamic was set
to either google or not_google depending on the request’s user agent. The
speaker also covered the split_clients directive, which assigns a value to a
variable based on percentages.

For example:

split_clients $request_uri $variable {
    50% "var1";
    50% "var2";
}
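
To illustrate the idea behind percentage-based splitting, here is a rough Go sketch of the general technique: hash the input string, then compare where the hash falls in its range against a percentage cut-off. This is an illustration only, not NGINX's implementation. The NGINX documentation describes split_clients as using MurmurHash2, while this sketch swaps in FNV-1a from the Go standard library, so its bucket assignments will not match NGINX's. The bucket function and its parameters are hypothetical names.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bucket maps key to "var1" or "var2" by hashing it and comparing the hash's
// position in the 32-bit range against a percentage cut-off.
// FNV-1a is an assumption made for this sketch, not what NGINX uses.
func bucket(key string, percentVar1 uint64) string {
	h := fnv.New32a()
	h.Write([]byte(key))
	// Scale the cut-off to the hash range: percentVar1 percent of 2^32.
	threshold := percentVar1 * (1 << 32) / 100
	if uint64(h.Sum32()) < threshold {
		return "var1"
	}
	return "var2"
}

func main() {
	// The same URI always lands in the same bucket, because the hash is
	// deterministic. That is what makes the split stable across requests.
	for _, uri := range []string{"/", "/pricing", "/blog/oscon-2017"} {
		fmt.Println(uri, "->", bucket(uri, 50))
	}
}
```

Because the choice depends only on the hashed string, picking a different input variable (a client address instead of the request URI, for instance) changes what stays "sticky" to a bucket.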

The caching-specific configuration he discussed is too much to describe here,
but in general he covered different types of hashes, improved logging, a couple
of fairly clever ways of doing HA, and how to properly configure disk caching
without your disks becoming the bottleneck.

How and why we’re opening our code at Octopus Deploy

Damian Brady gave a talk on how Octopus Deploy decided to open source their deployment tooling,
and how they determined how much of the tooling to open source. Since I work on the deployment tools
at InVision and we are also in the process of open sourcing some of our internal projects, I was
especially interested in this talk.

Some of the talk revolved around the business case for releasing a core part of the company’s
revenue stream as open source, which is not directly applicable to my current team’s case. What I
did enjoy was the discussion of what makes a tool truly useful as a public open source tool rather
than just an exercise in the process. Part of this comes down to a few questions: Is the tool
general enough to be useful? Is it ready to be released? Is there a user base for the project?
Is there leadership to help steer it in the future?

Evolutionary architectures

This was a great talk on architecture patterns that support the evolution of your application stack.
As part of this, Neal Ford discussed the idea of fitness functions. These fitness functions
rate or validate certain characteristics of your application architecture, such as security
or performance. They can then be run as part of your continuous deployment pipeline to validate
the “fitness” of your application and confirm that your changes have not degraded any important
aspect of it.
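
As a rough illustration of what a performance fitness function might look like in a pipeline, here is a small Go sketch. It is not from the talk: FitnessFunc and latencyBudget are hypothetical names, and the latency samples and the 100 ms budget are made-up values standing in for measurements a real pipeline would collect.

```go
package main

import (
	"fmt"
	"os"
	"sort"
)

// FitnessFunc checks one architectural characteristic and returns an error
// when the measurement falls outside the acceptable range.
type FitnessFunc func() error

// latencyBudget returns a fitness function that fails when the 95th
// percentile of samplesMs (latencies in milliseconds) exceeds budgetMs.
func latencyBudget(samplesMs []float64, budgetMs float64) FitnessFunc {
	return func() error {
		if len(samplesMs) == 0 {
			return fmt.Errorf("no latency samples to evaluate")
		}
		sorted := append([]float64(nil), samplesMs...) // copy before sorting
		sort.Float64s(sorted)
		rank := (len(sorted)*95 + 99) / 100 // nearest-rank: ceil(0.95 * n)
		if p95 := sorted[rank-1]; p95 > budgetMs {
			return fmt.Errorf("p95 latency %.1fms exceeds budget %.1fms", p95, budgetMs)
		}
		return nil
	}
}

func main() {
	// Hypothetical measurements, e.g. gathered from a load test stage.
	check := latencyBudget([]float64{12, 15, 11, 240, 14}, 100)
	if err := check(); err != nil {
		fmt.Println("fitness check failed:", err)
		os.Exit(1) // a non-zero exit fails the pipeline stage
	}
	fmt.Println("fitness check passed")
}
```

The same shape works for other characteristics: a security fitness function could, for example, return an error when a dependency scan reports findings above an agreed severity.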

In describing an architecture that is flexible and can support evolution, Neal showed an architecture diagram
using microservices, which allow for scalability and isolation of concerns, and an API gateway to isolate
changes and support a more flexible design. Since InVision is actively implementing this sort of design, it
was good validation that we are on the right path.

Open Source at InVision

InVision has only just begun to contribute some of our in-house projects back to
the community. We have more planned for the future, but for now four projects
are out on GitHub and waiting for your contributions, input and use. Below is a
quick summary of each project and how we use it.

Kit

Kit is a full system for pushing Kubernetes deployments out based on a Docker
pipeline. Kit has multiple pieces that help keep your Kubernetes deployments
simple and manageable. We run many Kubernetes clusters here at InVision and we
use Kit to manage the interactions. Our continuous integration pipeline allows
us to do a great many deployments. We still take action on individual
deployments (this is not continuous deployment); however, those actions are all
automated through our own internal chat-bot. Kit helps us push these deployments
to MANY clusters without hassle. Feel free to take a gander at Kit and give it a
try!

Kit-Overwatch

Kit-Overwatch is a simple service whose sole responsibility is to watch the
Kubernetes event stream and push notifications to other services. This can be
really useful for getting the stream into something your engineering staff can
work with. Currently, we only have a few notifiers built, but this could easily
be expanded based on your needs. We run this in a Docker container and it helps
us get the Kubernetes event stream into Slack and DataDog. There’s also a stdout
logging feature for testing. Enjoy!

Rye

So, everyone needs middleware. That’s the truth. However, in Golang, middleware
is one of those things you could do eight different ways and it wouldn’t matter,
they would all work. Rye is our answer to middleware. We built a very simple
middleware library to give us some out-of-the-box functionality such as StatsD
integration and timing on our middleware methods. Additionally, we built out
some base middlewares to go with the library including request logging, CORS,
JWT verification, and Golang 1.7 context support. Rye has turned out to be very
useful for us and is being used by multiple teams here at InVision. Feel free to
give it a whirl!
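
For readers new to the pattern, here is a minimal sketch of middleware chaining using only the Go standard library. It is not Rye's API: Middleware, chain, record and demo are hypothetical names chosen for illustration, and the "logging" and "auth" steps only record that they ran.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Middleware wraps a handler with extra behavior. This is a plain net/http
// pattern for illustration, not Rye's API.
type Middleware func(http.Handler) http.Handler

// chain applies middlewares so that the first one listed runs outermost.
func chain(h http.Handler, mws ...Middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// record returns a middleware that appends name to trace before calling next.
func record(name string, trace *[]string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*trace = append(*trace, name)
			next.ServeHTTP(w, r)
		})
	}
}

// demo serves one in-memory request and returns the order in which the
// middlewares and the final handler ran.
func demo() []string {
	var trace []string
	final := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		trace = append(trace, "handler")
		fmt.Fprint(w, "ok")
	})
	h := chain(final, record("logging", &trace), record("auth", &trace))
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	return trace
}

func main() {
	fmt.Println(demo()) // prints [logging auth handler]
}
```

A library in this space typically adds value around that core loop, such as timing each step or reporting metrics, which is the kind of out-of-the-box functionality described above.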

Conjungo

So! Have you ever had a situation where you had two instances of the same struct
and needed to merge them together? Well, we did. Basically, imagine that you
have a PATCH endpoint that takes in your struct in your service, but you need to
merge that with the value in your Mongo database. In that case, you can use
Conjungo to merge the structs together. Conjungo allows you to control much of
the process of merging and supports many use cases. This library was put
together by our Senior Software Engineer, Tatsuro Alpert, to solve a problem in
a service catalog service that we built in-house. It’s turned out to be a very
useful library for us! Check it out, use it and enjoy it. We’d love to know how
we can make it better!
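
To show the kind of problem being solved, here is a simplified sketch of merging a PATCH payload into a stored struct with the standard reflect package. It does not use Conjungo and does not reflect its API: mergeNonZero and the Service type are hypothetical, and the rule applied here (copy every non-zero field from the patch) is just one possible merge policy.

```go
package main

import (
	"fmt"
	"reflect"
)

// Service is a hypothetical record, standing in for a stored document.
type Service struct {
	Name  string
	Owner string
	Port  int
}

// mergeNonZero copies every non-zero exported field of src into dst.
// Both arguments must be non-nil pointers to the same struct type.
// This is a simplified stand-in for illustration, not Conjungo's API.
func mergeNonZero(dst, src interface{}) error {
	d, s := reflect.ValueOf(dst), reflect.ValueOf(src)
	if d.Kind() != reflect.Ptr || s.Kind() != reflect.Ptr || d.IsNil() || s.IsNil() {
		return fmt.Errorf("mergeNonZero: want two non-nil pointers")
	}
	d, s = d.Elem(), s.Elem()
	if d.Kind() != reflect.Struct || d.Type() != s.Type() {
		return fmt.Errorf("mergeNonZero: want pointers to the same struct type")
	}
	for i := 0; i < s.NumField(); i++ {
		// Skip unexported fields (not settable) and fields the patch left unset.
		if f := s.Field(i); d.Field(i).CanSet() && !f.IsZero() {
			d.Field(i).Set(f)
		}
	}
	return nil
}

func main() {
	stored := Service{Name: "catalog", Owner: "platform", Port: 8080}
	patch := Service{Owner: "core-services"} // only Owner was sent in the PATCH
	if err := mergeNonZero(&stored, &patch); err != nil {
		fmt.Println("merge failed:", err)
		return
	}
	fmt.Printf("%+v\n", stored) // {Name:catalog Owner:core-services Port:8080}
}
```

One limitation of this naive policy is that a patch cannot reset a field to its zero value, since a zero is indistinguishable from "not provided". Having control over cases like that is a reason to reach for a dedicated merge library.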

Coming Soon

We have other open source projects in the hopper here at InVision. One of the
upcoming projects is Chronos, a very tiny Golang library for managing the
scheduling, logging and reporting of recurring tasks. Look for it in the near
future! Many other projects are growing here at InVision as we continue to
improve our stack, so keep an eye on us: we will tweet out new projects as they
become available.

The last word…

InVision is striving to build a platform we can be proud of, and open source has
been a big part of that. As an engineering practice, we not only rely on open
source to build that platform, but have also started to contribute our internal
efforts back. We’d love your input and help as we grow this effort. We welcome
contributions, GitHub issues, pull requests and feedback. Additionally, come
join us at InVision and help us contribute more open source efforts to the
community! We can all be better together!

By Cale Hoopes

Cale Hoopes is a Senior Software Engineer, Core Services on the Platform Team at InVision.
