Programming and Music

I pushed a new version of Withenv and thought it might be helpful
to discuss some new features.

No More Shells

To make things easy for myself, I used shell=True when calling
commands within Withenv. I’ve removed that and now parse the
command strings myself in order to avoid the shell. Replacements
should still work as expected.

This removes some security risk by making replacements explicit and
avoiding the shell. Shells can leak environment information,
and I’d like people to feel comfortable passing secrets via
Withenv.

Pipes

Withenv allows dynamic environment variables to be injected by
calling commands and scripts. This was always a little kludgy to me. I
had hoped I could do something like /bin/bash my.sh and inspect the
environment afterwards in order to reuse local shell scripts people
write to manage environments.

Instead, when working on a Go version of
Withenv, I realized I could just
load up JSON or YAML from a command. There are tons of commands that
will output JSON, so this seemed like a reasonable plan.

With that in mind, you often want to use a piped command. For example,
let’s say I’m trying to get a secret token. I might do something like
this:
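
(A hypothetical sketch only: the secret endpoint and JSON path are made
up, and I’m assuming the -s script flag, which shows up in the examples
later in this post, accepts a quoted pipeline.)

$ we -s 'curl -s https://secrets.example.com/token | jq .' env | grep TOKEN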

The Future

There are two things I’ve been working on. The first is feature parity
with the Go version. This has been for my own practice and to provide
a lighter-weight solution that might be more palatable to a larger
audience. The second is implementing some config file templates.

I wrote a version of this in Go, but I’d like to consider one in
Python that uses a popular template language. The idea is you can
provide a template and Withenv will use the environment as the
context and write a file before starting the process. You could then
configure the file to be deleted when the process exits, or even after
some specified time where the upstream process should have already
loaded it into memory.

This adds a lot of complexity, so we’ll see if it makes it in. I
originally wrote this as a separate tool, which might still be the best
course of action. I’d also like to support the same config language in
the Go version, but, for now at least, I’m hoping I don’t need to
write a jinja2 parser in Go.

I’ve found myself increasingly frustrated with configuration when
programming. It is not a difficult problem, but it gets complicated
quickly, even though the intent is usually simplicity.

Let’s consider a simple command line client that starts a server and
needs some info such as a database connection, some file paths,
etc. On the surface it seems pretty simple:

$ myservice --config service.conf

But, an operator might want to override values in the config via
command line flags.

$ myservice --config service.conf --port 9001

This is all well and good, but what if you need to override something
more sensitive that you don’t want showing up in the process table,
such as the database connection URL? Instead, the operator uses an
environment variable.
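
For example, something like this, where DATABASE_URL is a hypothetical
variable name the app would have to read:

$ DATABASE_URL=postgres://user:secret@db.internal/app myservice --config service.conf --port 9001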

By the way, we haven’t talked about things like handling stdin so
you can use pipes with your app.

Now, in your code you have to support all of these input sources and
make the configuration available to the code. This typically happens
as a singleton of sorts that you import all over the place in your
code and tests. This code needs to deal with command line arguments,
reading config files, providing overrides in flags and the
environment, etc.

There are tons of frameworks to make these problems easier, but they
get pretty complicated. What’s more, if you’ve already made some
design decisions about how you want to pass around configuration in
your application, there is a really good chance the library you chose
won’t work with that model and you’ll need to adjust for it. If you
have a lot of code, this can be a pretty difficult refactoring,
especially if you realize the framework has a bug or doesn’t actually
do what it says it does.

Is there a better way?

Maybe? Lately, I’ve been focusing on only using environment variables
for configuration. This solves some issues in the code. I don’t have
to think about parsing command line flags or dealing with override
operations. I also don’t have to create some sort of global singleton
because languages provide access to the environment directly. This
doesn’t answer things like type conversion, but generally, it is much
simpler.
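
As a minimal sketch of what that looks like in Python (the variable
names here are made up), the code just reads the environment wherever
it needs a value:

import os

# Environment-only configuration: no flag parsing, no config file layer,
# and no global settings object. The variable names are hypothetical.
DB_URI = os.environ['DB_URI']                       # required; fails fast if missing
PORT = int(os.environ.get('PORT', '8000'))          # type conversion handled inline
DEBUG = os.environ.get('DEBUG', 'false').lower() == 'true'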

The reason it is simpler is withenv. Using we, I get config files
via YAML and JSON, layer overrides from the environment as needed,
and codify that layering on the filesystem. I even have a good
way to load dynamic values to avoid storing secrets on the file
system. The downside is that we becomes a dependency, so this isn’t
a reasonable solution for a command line tool you distribute
broadly. But if you’re running network services and driving an
operational code base, depending on we is a great way to reduce
friction between different systems and have an extremely cross
platform means of communicating configuration to your applications.
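
As a rough sketch of an invocation (-s runs a script whose output is
loaded into the environment; the -e flag for layering YAML/JSON files
is an assumption on my part, so check the withenv docs):

$ we -e defaults.yml -e prod.yml -s get-secrets.sh myservice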

The larger philosophy at play here is the division between
declaring configuration and communicating that configuration to the
application. These two aspects are often tightly coupled, resulting in
a non-trivial amount of configuration code spread throughout an
application’s code base. It also leaks into the operational code as
configuration files need to be written and rewritten, adding an
unnecessary layer of abstraction in order to meet the needs of the
application.

Environment variables can be tricky though. While it is non-trivial to
inspect the environment of a running process, it may be easy for an
attacker to replace the application that is meant to run and capture
sensitive data. If you start processes via shell, that can also lend
itself to executing dangerous code. With that said, there are many
tools to aid in keeping a production filesystem secure such that these
sorts of attacks can be avoided for the most part, leaving only rare
attack vectors available.

But my app doesn’t use environment variables

The one gotcha with a system such as this is that not all apps use
environment variables. I plan to deal with this in withenv by
allowing a config file to be written from a template before starting
the app. Until then, the best I can offer is to write something
similar yourself.
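
As a sketch of “something similar” (the paths, template name, and app
command here are all hypothetical; it just uses Python’s string.Template):

import os
import string
import subprocess
import sys

# Render the config from the environment, then run the real app.
# app.conf.tmpl uses $VAR style placeholders, e.g. uri = $DBURI
template = string.Template(open('app.conf.tmpl').read())
with open('/etc/app.conf', 'w') as f:
    f.write(template.substitute(os.environ))
sys.exit(subprocess.call(['app', '--config', '/etc/app.conf']))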

Conclusion

I’ve discussed this topic with some folks and have seen a somewhat
wide spectrum of positive and negative comments. For the folks who have
tried withenv and seen how to extend it, it becomes a critical
tool. This makes me believe that it is a good method for constructing
an environment and providing it to a process. But, you never know
until you try, so please give it a go the next time you find yourself
exporting environment variables for some code and see if withenv
improves anything for you.

A while back I wrote a tool called withenv to help manage environment
variables. My biggest complaint with systems that depend on
environment variables is that it is really easy for the variables to
be sticky. For example, if you use docker-machine or docker swarm, you need to be very careful
that your environment variables are configured correctly or else you could be
using the wrong connection information.

Another change for me recently has been using Go regularly. I’ve found the language to be easy
to learn (coming from Python), fast, and a lot of fun for the types of
problems I’ve been solving at work.

So, with some existing tooling around that I’d wanted to improve upon
and with some new ideas in tow, I started writing Bach.

The basic idea is that Bach helps you orchestrate your process
environment. While it is a clever name, I don’t know that it really
makes clear what the bach tools are meant to do, so I’ll try to clarify
why I think these tools can be helpful.

Say we have a web application that we want to run on 3 nodes with a
database and load balancer. The application is pretty old, meaning it
was written with the intent to deploy the app on traditional bare
metal servers. Our goal is to take our app and run it in the cloud on
some containers without making any major changes to the application.

When you start looking at containers and the cloud, it can be very
limiting to consider a world where you can’t just ssh into the box,
make a configuration change and be done. Even when you use containers,
making the assumption the target platform will provide a way to
utilize volumes is a bit tricky. These limitations can be difficult to
work around without changing the code. For example, changing code that
previously wrote to the file system to instead write to some service like
Amazon S3 is non-trivial and introduces a pretty big paradigm shift to
code and operations.

Bach is meant to help provide tooling to make these sorts of
transitions easier such that the operations code base doesn’t dictate
the developer experience, and vice versa. As a secondary goal, the
bach tools should be easy to run and verify independently of each
other, while working together in unison.

Going back to our example, let’s think about how we’d configure our
application to run in our container environment. Let’s assume that we
can’t simply mount a config directory for our application and we need
to pass environment variables for configuration to a container. Our
app used to run with the following command.

$ app --config /etc/app.conf

Unfortunately, that won’t work now. Here is where the Bach tools can
be helpful.

First off, our application needs a database endpoint. We’ll use
withenv (we) to find the URL and inject it in our environment before
starting our app. Let’s assume we use some DNS to expose what services
live at what endpoints. Here is a little script we might write to get
our database URL.

#!/bin/bash
# The environment will provide a SERVICES environment variable that
# is the IP of DNS server that we use for service discovery. The
# ENV_NAME is the name of the environment (test, prod, dev,
# branch-foo, etc...)
DBIP=`dig @$SERVICES +short db.$ENV_NAME.service.list`
echo "{\"DBURI\": \"$DBIP\"}"

Now that we have the script we can use we to inject that into our
environment before running our app.

$ we -s find-db.sh env | grep DBURI

Now that we know we are able to get our DBURI into our environment,
we still need to add it to our application’s configuration. For that
we’ll use toconfig. We use a simple template to write the config
before running our app.
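
Something along these lines (a sketch only; I’m guessing at toconfig’s
flags here, so the source is the reference), where app.conf.tmpl is a
template that fills in DBURI from the environment:

$ we -s find-db.sh toconfig -t app.conf.tmpl -o /etc/app.conf app --config /etc/app.conf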

Now, when we switch our command in our container to run the above
command, we get to run our app in the new environment without any code
changes while still capitalizing on new features the environment
provides.

If that command is a bit too long, we can copy the arguments to a YAML
file and run it with the bach command.

At the moment these are the only released apps that come with
Bach. With that said, I have other tools to help with different tasks:

present: This runs a script before a command starts and after it exits. The
idea was to automate service discovery mechanisms by letting the
app join some cluster on start and leave when the process exits.

cluster: This provides some minimal clustering functionality. When
you start an app, it will create a cluster if none exists or join
one if it is provided. You can then query any member of the cluster
to get the cluster state and easily pass that result into the
environment via we and a script (e.g. we -s ‘cluster nodes
192.168.1.14’).

At the moment, the withenv docs
should be correct for Bach’s we command. I’m still working on
getting documentation together for toconfig and the other tools, so
the source is your best bet for reference.

Lately I’ve had the opportunity to spend some time with Go. I’ve been
interested in the language for a while now, but I’ve only done a
couple of toy projects. The best way to get a feel for a language is to
spend some consecutive time writing a real program that others
will be using. Having a real project makes it possible to get a really
good picture of what software development is like using a language.

First Impressions

When I first started going beyond toy projects, the most difficult
aspect was the package (or module in Python terms) system. In Python,
you have an abstraction of modules that uses the filesystem, but
requires extra tooling to include other code. Other languages, like
Ruby (if I remember it correctly!) can use a more direct include type
of module system where you include the code you want. PHP is a great
example of this simple module pattern where you truly just include
other code as if it were written within your own. Go does something
different where a package is really a directory and the import
statement includes everything in that directory that has the same
package declaration. I’m sure there is more to it, but this crude
understanding has been enough to be dangerous.

One aspect that drew me to Go was the simplified deployment. Go
seems to be focused on producing extremely portable binaries that can
be used on a wide array of systems. The result is that a Go binary
built on one Linux machine can typically be copied to any other Linux
machine without any issues. While I’m sure the same is possible
with C/C++ or any other compiled language, it has been an early
feature to produce all inclusive binaries that don’t have a dependency
on anything on the target machine. As I’m coming from Python, I’m not
an expert in these sorts of things, so this is really just my
impression, validated by my limited experience.

Finally, Go is reasonably fast without being incredibly complex. I
don’t have to manage memory, but I do get to play with
pointers. Concurrency is a core feature of the language that helps
implement the fun features of Python like generators. Goroutines do
what I always wished Python did: take care of work whether it is
async I/O or CPU bound. Lastly, Go doesn’t have a warm-up period like
you’d see in a JVM language, which makes it suitable for small scripts.

But What About Python?

I still enjoy Python, but the enjoyment comes from feeling fluent in
the language more than enjoying any set of features. The thing I like
most about programming in Python is that I feel very comfortable
banging out code and knowing that it is reasonably well written with
tests and can function well within the larger Python ecosystem. This
fluency in Python is not something I’m willing to drop due to some
hype in another language. But there are some frustrating aspects of
programming in Python that make me want to try something new.

The first is the packaging landscape. I don’t mean to suggest that pip
is terrible or wheels are a huge mistake. Instead, it is the more
general pattern of shipping source code to an environment in order to
run the code. There are TONS of great tools to make this
manageable and containers only make it even easier, so again, I’m not
condemning the future of Python deployments. But I am tired of it.

In Python, I need to make a ton of decisions (some that have been
chosen for me) such as whether or not to use the system packages or a
virtualenv. I have to consider how to test a release before releasing
it. I have to establish how I can be sure that what I tested is the
same as what I’m releasing. All these questions can become subtly
complex over time and it gets tiring.

The second aspect of Python that is frustrating is the
performance. Generally, it is fast enough, except when it isn’t. It is
when things need to be faster that Python becomes painful. If you are
CPU bound then get ready to scale out those processes! If I/O is where
you are lacking, there are a plethora of async I/O libraries to help
access it efficiently. But, dealing with both CPU and I/O bound
issues is complicated, especially when you are using libraries that
may not be compatible with the async I/O framework you are using. It
is definitely possible to write fast Python code, but there is a lot
of work to do, none of which makes deployment easier, hence it is a
little tiring.

When all else fails you can write a C module, use Cython or try
another interpreter. Again, all interesting ideas, but the cost shows
up in complexity and in the deployment.

There is no easy fix to these problems and it isn’t as fun as it used
to be to think about solving them.

So Why Go Then?

The question, then, is why Go is a contender to unseat Python, my most
fluent language. The simple answer is that I’m tired of the
complexity. The nice thing about Go is that it takes care of some
essential aspects I don’t care to delve into, specifically, memory
management and concurrency. Go is also fast enough that I should not
have to be terribly concerned about adopting new programming paradigms
in order to work around some critical bottleneck. While I’m confident
that I could make Python work for almost anything, at some point it feels
like it has become a hammer and I’d like to start doing other things
than hitting nails.

Now, even though I’m learning Go and have been excited about the
possibilities, it doesn’t change the fact that most of my day to day
work is in Python. Maybe that will change at some point, but I don’t
see it changing anytime soon, which is totally fine by me.

I’m sure there are issues with Go that I haven’t run into yet. Static
typing has been challenging to use after having massive freedom in
Python. What is appealing is that less than perfect Go code has been
good enough in quality for production uses. That is exciting because
it means that while I learn more about the language and community, it
doesn’t preclude me from being productive and learning new things.

Microservices are often touted as a critical design pattern, but
really they are just a tactic for managing certain vectors of complexity
by decoupling the code from operations. For example, if you have a
large suite of APIs, each driven by a different microservice, you can
iterate and release different APIs as needed, reducing the complexity
of releasing a single monolithic API service. In this use case, the
microservices allow more concurrent work between teams by decoupling
the code and the release process.

Another use case for microservices would be to decouple resource
contention. Let’s assume you have a database that is used by a few
different apps. You could decouple this by using separate services
that each manage their own data plane, removing the contention that
exists between services using the same database.

From a design standpoint, it is non-trivial. The first example of a
huge API suite can be implemented primarily through load balancing
rules without much issue. The database example is more difficult
because there will need to be a middleman ensuring data is replicated
properly. Just like normalization of databases, the normalization that
occurs in microservices can be costly as it requires more work
(expense) when those services need consistency.

Another expense is designing and maintaining APIs between the
services. Refactoring becomes more complicated because you have to
consider there is old code still handling messages in addition to the
new code. Before, the interactions were isolated to the code, but when
using microservices the APIs will need some level of backwards
compatibility.

The one assumption with microservices that makes them operationally
feasible is automation. Artifacts should be built and tested reliably
from your CI pipeline. Deployments need to be driven by a
configuration management system (e.g. Chef, Ansible, etc.). Monitoring
needs to be in place and communicated to everyone doing releases. The
reason all this automation is so critical is because without it, you
can’t possibly keep track of everything. Imagine having 10 services
each scaled out to 10 “nodes” and you begin to see why it is difficult
to manage this sort of system without a lot of automation.

Even though it is expensive, it might be worth it. Incidents are one
area where microservices can be valuable. Microservices provide a
smaller space to search for root causes. Rather than having to examine
the breadth of a single large codebase, you can (hopefully) review
a single service in semi-isolation and determine the problem. Assuming
your microservices are designed properly and even though the number of
services is large, the hope is that the problem areas will be limited
to small services and can be more easily debugged and fixed.

Microservices require a balance. Using microservices is a tactic, not
a design. Lots of small processes don’t make anything easier, but
lots of small processes that divide operational concerns and decouple
systems can be really helpful. Like anything else, it is best to
measure and understand the reasoning for breaking up code into
microservices.

At work we have a support rotation where
we’re responsible for handling ticket escalations. Typically, it is a
somewhat rare event that requires the team to get involved, thanks to
the excellent and knowledgeable support folks. But, when there is an
issue, it can be a rough road to finding the answers you need.

The biggest challenge is simply being out of practice. We have a few
different APIs to use, both internal and external, that use different
authentication mechanisms. Using something like cURL definitely works, but gets rather difficult
over time. It doesn’t help that I’m not a bash expert that can
bend and shape the world to my will. There are other tools, like
httpie, that I’d like to spend
some more time with. Unfortunately, I never seem to remember it in
the heat of the moment. Some coworkers delved into this idea for a bit
and wrote some tools, but my
understanding is that it was still very difficult to get around the
verbosity in a generic enough way for the approach to really pay off.

Looking at things from a different perspective, what if you had a
shell of sorts? Specifically, it doesn’t take much to configure
something like ipython with some builtins for
reading auth from known files and including some helpful functions to
make things easier. You could also progressively improve the support
over time. For example, I can imagine writing small helpers to follow
links, deal with pagination, or find data in a specific
document. Lastly, I can also imagine it would be beneficial to store
some state in between sessions in order to bounce back and forth
between your HTTP shell and some other shell.
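
As a sketch of the kind of setup I mean (the token path, helper name,
and everything else here are made up), an ipython profile startup file
could preload an authenticated requests session and a small helper:

import os

import requests

# A hypothetical startup file for an ipython profile: read a token from a
# known location and expose a pre-authenticated session plus a JSON helper.
with open(os.path.expanduser('~/.config/support/token')) as f:
    TOKEN = f.read().strip()

session = requests.Session()
session.headers['Authorization'] = 'Bearer %s' % TOKEN

def get(url, **params):
    """GET a URL and return the parsed JSON body."""
    resp = session.get(url, params=params)
    resp.raise_for_status()
    return resp.json()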

Seeing as this doesn’t seem too revolutionary, I’m curious if others
have investigated this already. I’m also curious how others balance a
generic command line interface, API specific tooling and reusability
over time, without adding a million flags and more to learn than just
using cURL!

There is always a question of whether to use a hosted service or
manage a service yourself. It is a tough question because the answer
changes over time depending on your business needs. A startup might be
totally fine using github and slack, but the size of google means
rolling your own solution. The arguments regarding hosted services
revolve around security, and more specifically, the sensitive data you
make available by using these services.

There are many that would argue that owning a service is more
secure. While it is true that you may send fewer bits to a third
party, it really says nothing of security. A hosted service such as
github or slack is
already targeted by hackers. While I’m sure there are vulnerabilities,
popular hosted services have been vetted by huge amounts of usage. It
is in the provider’s interest to provide a secure and reliable service,
constantly improving infrastructure and security over time. Running a
service at scale shakes loose quite a few bugs that contain difficult
to find attack vectors.

Even if a service is reasonably secure, there is still a risk of
trusting your data to another company. Unless you have clients that
specifically disallow it, I’d argue that avoiding hosted services is not
worth the cost. Successful hosted services generally have a community of
supporters that have done the work of integrating with the
platform. That makes hooking up your bug tracker with your build
system, chat, and monitoring trivial. Sorting out all the bits to
make this work in an internal environment means writing, debugging and
maintaining code alongside operating each dependent system. That is
far from impossible, but it is certainly expensive when you need
developers and operators working on more pressing issues. The irony
here is that by avoiding the hosted service, you’ve essentially made
local development more difficult and reduced the ability of your
development pipeline to improve the code.

To put this in financial terms, let’s say you have a team of 5 people
and let’s say the average salary is $100k, or roughly $48 / hour. If each
person spends 10 hours a week on the CI/CD system and operating tools
like chat, it would cost $480 per person per week, or ~$25k per person
per year. That doesn’t
seem too bad, but that doesn’t include the extra cost of an effective
build system catching bugs and the initial development time to get
these systems up and running and talking to each other. You might need
to get hardware within the network, setup firewalls, configure secure
routes via VPNs to allow remote developers to use the system. At this
point you might have included the time of another 15 people and spent
at least 3+ months of your team’s time getting the initial system up
and running, noting that it is all code you’ll need to maintain. Also
note that this says nothing about the problems that come up.

The fact is, it is really expensive to design, build and operate a
suite of services simply to avoid having some bits on another person’s
computer. It seems better to focus on making your team more productive
by providing helpful tools they don’t have to manage and prepare
mitigation plans for how to recover from a security breach or service
failure. Obviously, there are dangers, but mitigating them is less
work than rebuilding a service along with its integrations from
scratch.

I’m curious what others have experienced when choosing an external
service over a DIY solution. Did you feel the DIY solution was full
featured? Did you get burned choosing a hosted solution? Let me know!

I’m thankful Rackspace sent me to Tokyo for
the OpenStack Summit. Besides the
experience of visiting Tokyo, the conference and design summit proved
to be a great experience. Going into the summit, I had become a little
frustrated with the OpenStack development
ecosystem. There seemed to be a lot of decisions that separated it
from more mainstream Python development with little actual reason for
diverting from the norm. Fortunately, the summit contextualized
the oddities of developing OpenStack code, which made things more
palatable as a developer familiar with the larger Python ecosystem.

One thing that hit me at the summit was that OpenStack is really just
5 years old. This may not seem like a big deal, but when you consider
the number of projects, amount of code, infrastructure and
contributors that have made all this happen it is really
impressive. It is a huge community of paid developers that have
managed to get amazing amounts of work done making OpenStack a
functioning and reasonably robust platform. So, while OpenStack is an
impressive feat of software development, it still has a ton of rough
edges, which should be expected!

As a developer, it can be easy to come to a project, see how things
are different and feel as though the code and/or project is of poor
quality. While this is a bad habit, we are all human and fear things
we don’t understand! The best way to combat this silliness is to try
and educate those coming to a new project on what can be
improved. In addition to helping recognize ways to improve a project,
it helps new developers feel confident when they hit rough edges to
dig deeper and fix the problems.

With that in mind, here are some rough edges of OpenStack that
developers can expect to run into, and hopefully, we can fix!

OpenStack Doesn’t Mean Stable

You’ll quickly find that “OpenStack” as a suite of projects is
HUGE. Each of these projects is at a different stage of
stability. For some, documentation may be lacking but the code is well
tested and reliable, while other projects may have docs and nothing
else. It is critical to keep in mind when developing for an OpenStack
project that the other OpenStack requirements, and there will be TONS,
may not be entirely stable.

What this means is that it is OK to dig deep when trying to figure out
why something doesn’t work as expected. Don’t be afraid to checkout
the latest version of the library and dive into the source to see what
is going on. Add some temporary print / log messages to get some
visibility into what’s happening. Write a small test script to get
some context. All these tactics you’d use with your own internal
libraries are the same you should use with ANY OpenStack project.

This is not to say that OpenStack libraries aren’t stable, but you can’t
assume that just because something has gotten the label “oslo” or
“OpenStack”, it has been tested and considered working. The
inclusion of libraries or applications into OpenStack, from a
development standpoint, has more to do with answering the question
“Does the community need this?” Inclusion means that the community has
identified a need, not a fully fleshed out solution.

Not Invented Here

OpenStack is an interesting project because the essence of what it
provides is an interface to existing software. Nova, for example, provides
an interface to start VMs. Nova doesn’t actually do the work to act as
a hypervisor, instead, it leaves that to pluggable backends that work
with things like Xen or KVM. Right off the bat, this
should make it clear that when a project is labeled as the OpenStack
solution for X, it most likely means it provides an interface for
managing some other software that implements the actual functionality.

Designate and
Octavia are two
examples of this. Designate manages DNS servers. You get a REST
interface that can update different DNS servers like bind or
PowerDNS (or both!). Designate handles things like multi-tenancy,
validation and storing data in order to provide a reliable
system. Octavia does a similar task, but specifically for haproxy.

It doesn’t stop there though. OpenStack aims to be as flexible as
possible in order to cater to the needs / preferences of operators. If
one organization prefers Postgres over
MySQL that should be supported because an
operator will need to manage that database. The result is that many
libraries tend to provide the same sort of wrapping. Tooz and oslo.messaging, for example,
provide access to distributed locking and queues
respectively. Abstractions are created to consistently support
different backends, so projects can not only provide flexibility for
core functionality, but also the services that support the
application.

In the cases where there really was a decision to reimplement some
library, it is often due to an incompatibility with another
library. A good example of this is oslo.messaging. It supports
building apps with an async or background worker pattern, much like
celery. This makes one wonder, why not just use celery? My understanding is that celery has
been tried in many different projects and it wasn’t a good fit within
the community at large.

By the way, my vague answer of “it wasn’t a good fit” is
intentional. There are so many projects in OpenStack that oftentimes
these questions are brought up again and again. A new project is
started and the developers try out other libs, like celery, because it
is a good solution that is well supported. Sometimes the technical
challenges of integrating with other services are a problem, while
other times, the dependencies of the library aren’t compatible with
something in OpenStack, making it impossible to resolve
dependencies. I’m sure there are cases where someone just doesn’t like
the library in question. No matter what the reasons are, OpenStack has
a huge plane of software that makes it hard for new libraries to be a
“good fit”, so sometimes it is easier to rewrite something
specifically for OpenStack.

Dependencies

OpenStack is committed to providing software that can be
deployed via distro packages. In other words, OpenStack wants to make
yum install openstack or apt-get install openstack work. It is a
noble goal, especially for a suite of applications written in python,
moving at radically different rates of change.

You see, distro package managers have different priorities than most
Python authors may have when it comes to packaging. A distro is
something of an editor, ensuring that all the pieces for all the use
cases work reliably. This is how Red Hat provides general “linux”
support, by knowing that all the pieces work together. Python REST
services (like OpenStack), on the other hand, typically assume that
the person running it uses some best practices such as a separate
virtualenv for each application. This design pattern means that at the
application level, the dependencies are isolated from the rest of the
system.

Even though the vast majority of OpenStack operators don’t rely on
system packages in a way that requires all projects use the same
versions, it is an implementation detail OpenStack has adopted. As
a developer, you have to be ready to deal with this limitation, and
more importantly, the impact it has on your ability to introduce new
code. I believe that this restriction is most likely to blame for the
Not Invented Here nature of much OpenStack tooling, which leads to
reimplementations that are not very stable.

Why Develop for OpenStack?

If OpenStack has such a rough development experience, why should you
commit to learning it and developing on OpenStack software?

You’ll remember, I began all this with a recognition that OpenStack
is only 5 years old. Things will continue to change, and I believe,
improve. Many of the rough edges of OpenStack have been caused by
growing pains. There is a crazy amount of code happening and it takes
time and effort to improve development patterns. Even though it can be
rough at first to develop in OpenStack, it gets better.

Another reason to develop OpenStack code is that it is exciting
work. OpenStack includes everything from low level, high performance
systems to distributed CPU intensive tasks to containers and
micro-services. If you enjoy scaling backend applications, OpenStack is
a great place to be. The community is huge with loads of great
people. OpenStack also makes for a very healthy career path.

No project is perfect, and OpenStack is no different. Fortunately,
even though there are rough edges, OpenStack is a great project to
write code for. If you are new to OpenStack development and need a hand,
please don’t hesitate to reach out!

I’ve recently made an effort to stop using local virtual
machines. This has not been by choice, but rather because OS X has
become extremely unstable as of late with VirtualBox and seems to show
similar behavior with VMWare. Rather than trying to find a version of
VirtualBox that is more stable, I’m making an effort to develop on
cloud servers instead.

First off, to aid in the transition, I’ve started using Emacs in a
terminal exclusively. While I miss some aspects of GUI Emacs, such as
viewing PDFs and images, it generally hasn’t been a huge change. I’ve
had to do some fiddling as well with my $TERM in order to make sure
Emacs picks up a value that provides a readable color setting.
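
For example, something along these lines in a shell profile (the right
value depends on your terminal emulator):

export TERM=xterm-256color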

Another thing I started doing was getting more familiar with byobu and tmux. As Emacs
does most of my window management for me, my use is relatively
limited. That said, it is nice to keep my actual terminal (iTerm2)
tabs to a minimum and use consistent key bindings. It also makes
keeping an IRC bouncer less of a requirement because my client is up
all the time.

The one thing I haven’t done yet is to provision a new dev machine
automatically. The dev machine I’m on now has been updated
manually. I started using a Vagrantfile to configure a local VM that
would get everything configured, but as is clear by my opening
paragraph, frequent crashes made that less than ideal. I’m hoping to
containerize some of the processes I run so that a Vagrantfile that
spins up a cloud server can stay reasonably simple.

What makes all this possible is Emacs. It runs well in a terminal and
makes switching between local and remote development reasonably
painless. The biggest pain is the integrations with my desktop, aka my
local web browser. When developing locally, I can automatically open
links with key bindings. While I’m sure I could figure something out
with iTerm2 to make this happen, I’m going to avoid wasting my time
and just click the link.

If you don’t use Emacs, I can’t recommend tmux enough for “window”
management. I can see how someone using vim could become very
proficient with minimal setup.

My MacBook Pro crashed with a gray screen 4 times yesterday and it gave
me time to think about what sort of environment to expect when
developing.

The first thing is that you can forget about developing
locally. Running OS X or Windows means your operating system is
nothing more than an inconvenient integration point that lets you use
Office and video conferencing software. Even if you use Linux, you’ll
still have some level of indirection as you separate your dev
environment from your OS. At the very least, it will be language
specific like virtualenv in Python. At most you’ll be running
VirtualBox / Vagrant with Docker falling somewhere in between.

Seeing as you can’t really develop locally, that means you probably
don’t have decent integration into an IDE. While I can already hear
the Java folks about to tell me about remote debugging, let me define
“decent”. Decent integration into an IDE means running tests and code
quickly. So, even if you can step through remote code, it is going
to be slow. The same goes for developing for iOS or Android. You have
a VM in the mix and it is going to be slow. When developing server
software, you’re probably running a Vagrant instance and sharing a
folder. Again, this gets slow and you break most slick debugging bits
your editor / IDE might have provided.

So, when given the choice, I imagine most developers choose speed over
integration. You can generally get something “good enough” working
with a terminal and maybe some grep to iterate quickly on code. That
means you work in a shell or deal with copying code over to some
machine. In both cases, it’s kludgey to say the least.

For example, in my case, I’ve started ssh’ing into a server and
working there in Emacs. Fortunately, Emacs is reasonably
feature-full in a terminal. That said, there are still integration
issues. The key bindings I’ve used to integrate with non-code have
been lost. Copy and paste becomes tricky when you have Emacs or tmux
open on a server with split screens. Hopefully, your terminal is
smart enough to help with finding links and passing through
mouse events, but that can be a painful process to configure.

OK. I’m venting. It could be worse.

That said, I don’t see a major shift anytime soon. I’ve gone ahead and
tried to change my expectations. I’ll need to shell into servers to
develop code. It is important to do a better job learning bash and
developing a decent shell workflow. Configuring tmux / screen / byobu
is a good investment. Part of me can appreciate the lean and mean text
interface 24/7, but at the same time, I do hope that we eventually
will expect a richer environment for development than a terminal.