Category: Engineering

At Pedago, we follow the GitHub Flow model of software development. Changes to our app are made in feature branches, which are discussed, tested, code reviewed, and merged into master before deploying to staging and production. This approach has become pretty common, and in most cases does a good job of balancing our desire to ship quickly with the need to control code quality.

But, what happens when a bug inevitably creeps in, and you need to determine when it was introduced? This article describes how to apply git bisect in the presence of numerous feature branches to quickly detect when things went awry in your codebase.

Enter Git Bisect

git bisect is tool for automatically finding where in your source history a bug was introduced. It saves you the pain of manually checking out each revision yourself and keeping a scratchpad for which ones were good and bad.

Here’s how you get started:

# start up git bisect with a bad and good revision
git bisect start BAD_REVISION GOOD_REVISION

At this point, git is going to start checking out each revision and asking you if the commit is good or bad. You tell git this information by typing git bisect good or git bisect bad. Git then uses binary search (bisecting the history) to quickly find the errant commit.

You can also further automate things by giving git a script to execute against each revision with git bisect run. This allows git to take over the entire debugging process, flagging revisions as good or bad based on the exit code of the script. More on this below!

Example

Imagine you go back to work from a vacation and discover that the Rails specs are running much more slowly than you remember before you left. You know that the tests were fast at revision 75369f4a4c026772242368d870872562a3b693cb, your last commit before leaving the office.

Being a master of git, you reach for git bisect. You type:

git bisect start master 75369f4a4c026772242368d870872562a3b693cb

…and then for each revision git bisect gives you, you manually run rake spec with a stopwatch. If it’s too slow, you type git bisect bad, and if it’s fast you type git bisect good.

That’s kind of monotonous, though, and didn’t we mention something about automating things with a script above? Let’s do that.

Let’s say you save this script to /tmp/timeit.sh. You could use that instead of your stopwatch and keep manually marking commits as good and bad, but let’s go further and have git bisect do the marking for us:

git bisect run /tmp/timeit.sh

Now we’re talking! After waiting for a bit, git tells us that the errant revision is:

OK, so that sounds good. But wait, that’s a commit that only affected javascript unit tests! How could that have caused a problem with the Ruby specs?

Damn You, Feature Branches

The problem is that git bisect is not confining itself to only the merge commits in master. When it narrows down the point in time when things got slow, it isn’t taking into account the fact that most revisions are confined to our feature branches and should be ignored when searching the history of changes to master.

What we really want is to only test the commits that were done directly in master, such as feature branch merges, and the various one-off changes we commit directly from time to time.

git rev-list

Here’s a new strategy: using some git rev-list magic, we’ll find the commits that only exist in feature branches and preemptively instruct git bisect to skip them:

Hooray! We’ve found our offending commit: Merged in update_rails_4_2 (pull request #903). That makes sense—we upgraded RSpec and made a bunch of testing-related changes in that branch.

Furthermore, we see a list of skipped commits that git bisect didn’t test. This also makes sense—those commits are all within the update_rails_4_2 branch.

Conclusion

With a bit of git magic and some scripting, we’ve completely automated what could have been a very tedious exercise. Furthermore, thanks to the judicious use of git rev-list and git bisect skip, we’ve been able to cajole git into giving an answer that takes our branching strategy into account. Happy hacking!

We had a rails app. We used factories in our tests, and it took ten minutes to run them all. That was too slow. (spoiler alert: by the end of this blog post, they will run in one minute.)

We suspected that we could speed up the test run time by using fixtures instead, but worried that fixtures would be much more difficult to maintain than our factories.

As it happens, we are not the first developers to deal with the issue that factories are slow and fixtures are hard to maintain. I cannot explain the issue any better than the following folks do, so I’ll just give you some quotes:

“In a large system, calling one factory may silently create many associated records, which accumulates to make the whole test suite slow …”

“Maintaining fixtures of more complex records can be tedious. I recall working on an app where there was a record with dozens of attributes. Whenever a column would be added or changed in the schema, all fixtures needed to be changed by hand. Of course I only recalled this after a few test failures.”

“Factories can be used to create database records anywhere in your test suite. This makes them pretty flexible and allows you to keep your test data local to your tests. The drawback is that it makes them almost impossible to speed up in any significant way.”

In our case, 99% of our tests were using identical records. For example, we were calling FactoryGirl.create(:user) hundreds of times, and every time, it was creating the exact same user. That seemed silly. It was great to use the factory, because it ensured that the user would always be up-to-date with the current state of our code and our database, but there was no reason for us to use it over and over in one test run.

So we wrote the gem fixturies to solve the problem this way: Each time we run tests, just once at the beginning, we execute a bunch of factories to create many records in the database. The fixturies gem then dumps the state of the database to fixtures files, and our tests run blazingly fast using those fixtures.

We saw a 10x improvement in run times, from ten minutes down to one. We still use factories here and there in our tests when we need a record with specific attributes or when we want to clear out a whole table and see how something behaves with a certain set of records in the database. But in the vast majority of cases, the general records set up in that single run at the beginning are good enough.

If you are using factories in your tests to re-create the same records over and over again, and your tests are running too slowly, give fixturies a try and let us know how it goes. It only took us about half a day to refactor 700 tests to use fixturies instead of traditional factories, so there is a good chance it will be worth your time.

Here at Pedago, we take a hard look at the performance of our applications so that our users don’t have to experience any troublesome hiccups (or “jank”) that might otherwise sour a sweet learning experience.

While “performance” can cover a wide array of metrics, we tend to be extremely critical of browser overhead (script execution, rendering layout, and painting). While others have covered optimization of these metrics in great detail, we came across an unlikely jank-vector that we thought was worth mentioning.

When analyzing CSS performance in relation to browser lifecycle, there are a few notorious styles (eg: border-radius, box-shadow, transform, backface-visibility, etc) that tend to slow down frame rate. Some of these are obvious as they dramatically influence the rendering process or add additional calculations for stylistic ooomph. One might be extremely likely to overlook the rather mundane text-transform while focusing on that list, though.

We had several elements each containing a finite number of additional elements that all were performing CSS-dictated uppercasing on their text content. Now, this might not be a significantly intensive operation in itself, but combined with some excessively spastic scrolling, it degraded the user experience somewhat significantly. After we updated the content to be rendered in uppercase without the need for CSS text transformation, the improvement was obvious.

Here’s how things looked on a common mobile platform, prior to the change (FPS is the key metric, with 60FPS as an ideal target):

As you can see, we were barely hitting the 30FPS threshold and often even missing that window. Here’s what we observed after we removed the relevant text-transform styles:

As you can see, we’re now closer to consistently hitting that golden 60FPS benchmark! Granted, we were probably abusing a CSS style that was intended for narrower application, and the DOM of this particular page meant that there were a lot of them, so your mileage may certainly vary. However, this might help serve others in the war against jank!

I’ve been using Angular every day for over a year, but have always been too intimidated by this error message—and the crazy list of information that comes along with it—to really dig into it and find out how to use it to my advantage.

Building a new product at Pedago, I see this error happen from time to time in production (possibly related to a user using an old browser), but never when developing locally. So I have the error message from our error logs, but I can’t reproduce it or debug it by making changes in the code.

After researching the issue, here’s what I found out on my own.

After the colon there are two brackets. (…’atchers fired in the last 5 iterations: [[{“msg’) The first bracket is the beginning of a json block. Copy from the first bracket to the end of the error and find a way to pretty-print that json. (Use your favorite code editor or an online json formatter.)

Now you have a pretty-printed array with 5 entries in it. Each entry represents an iteration in the digest cycle, i.e. one pass through all of the active watchers in your app, looking for changes. Angular will repeat iterations until it does one in which no watcher has a changed value, or until it hits 10 iterations, at which point it will error. That’s what happened in this case.

There were 10 iterations before the error, and 5 are included in the error message. Presumably that means there are 5 more iterations that happened earlier than what is included in the error message. The first entry in the error message is the 6th iteration, and the last entry in the message is the 10th iteration.

The entry for each iteration is also an array. In this case it is an array of objects, and each object represents a watcher whose value changed during this iteration. Each object will give you the text or the function that defines the watcher, the old value for the watcher before this iteration and the new value after this iteration.

Read it from top to bottom like a story, adding commentary based on what you know about your app. In my case, I was able to see how the changes in each iteration caused new watchers to be created, requiring yet another iteration. “In the 6th iteration, this watcher changed, causing this new stuff to be rendered on the page, creating new watchers which were assigned values in the 7th iteration, and then …” There was no infinite loop or anything. In fact, if Angular had been willing to do just 1 or 2 more iterations, it would have finished.

Geographically distributed teams can work. How? In my experience, there are five key principles that make all the difference.

The other day, I realized that I have worked on geographically split software teams for the last decade. I didn’t set out intending to do so. When I got my first job out of college as a software engineer, I thought being in the office was non-negotiable. If you ask company recruiters, most say in no uncertain terms they are not looking for remote employees.

But the reality on the ground is that many software companies end up letting most full-time engineers work flexible hours, from anywhere. Yet few companies acknowledge or advertise this explicitly. You can’t get this opportunity by asking for it in your interview, but after you are in the door, the opportunity arises. There are a few ways this happens.

Software companies with heavy on-call burdens, like Amazon for example, end up with an unspoken but widespread assumption that people can work from anywhere because they are working all the time. Since most engineers are contractually obligated to be on-call, and emergency issues arise at all hours of the day and night, these engineers start working remotely out of necessity. Soon, whole teams realize there is no barrier to working where and when they are most productive. On any given day during my tenure at Amazon, maybe half of my teammates weren’t in the office when I was.

I’ve worked for several startups with surprisingly similar team environments. At one, people preferred to work at nearby coffeehouses with outdoor porches that made it easier to work while smoking. At another, unreliable wireless bandwidth in the office drove people to work from home out of necessity. At another, almost every person employed there lived in a different time zone, because this startup needed a very specific set of skills that happened to be scarce at the time.

Later I went to work for Rosetta Stone, which had three offices when I started, and grew to six offices when I left . Most of the teams I worked on ended up having people in at least three office locations, and as many time zones. Only one of the four bosses I had during those years worked in the same office as me. Not only were the teams geographically split, but once a team is split across two or more time zones, for all practical purposes, this team is also working flex time hours.

These teams all worked well together. No matter where and when we worked, everyone still was accountable for meetings and deadlines. I never felt particularly disconnected from my team, and I had rapport with and support from my bosses.

Today I work for Pedago.com, a startup company that is geographically split and has been from the very beginning. It is the most productive team I have ever worked with.

Geographically distributed teams can work. How? In my experience, there are five key principles that make all the difference.

1. Shared chat room where all team communication, questions, updates, and banter happen

I’ve used Skype, IRC, and HipChat for this — there are many viable options. In a geographically split team, showing up to the chat room is the same as showing up to work. Conversely, not showing up to chat is like not showing up to the office, with the same professional and social side effects. Knowing everyone will be in one place means you always know where you can ask questions if you have them. And if you are ever away and need to catch up on team discussion, there’s a nice transcript of all the debates, conversations, and decisions the team made while you were away that you can easily read to catch up.

2. Shared “online” hours

These are the hours everyone is guaranteed to be available in the shared chat room. No matter how scattered the team is, there will always be some set of hours that work for everyone. Everyone is assumed to be available during these hours, and if any person can’t make it or needs to leave to run errands, they are expected to notify everyone.

3. Short daily video check-in meetings

If your team uses Scrum as its project management strategy, these are just your daily standup meetings. It makes a huge difference to see peoples’ faces once a day, if only to remember that they are human beings in addition to being demanding product owners or cantankerous engineers. The visual feedback you get from face to face conversations helps facilitate complex discussions, amplifies positive feedback, and the subconscious pressure to announce a socially acceptable level of progress helps hold everyone accountable.

4. Teammates need to be aggressive about helping and asking for help.

However tempting, no one should ever spin their wheels when stuck. Teams should declare that ad hoc pair programming, calls, and hangouts are a part of their team’s working style, and developers should be aggressive about picking up the phone or screen-sharing whenever they get stuck or need to ask a question.

5. Everyone on the team needs to buy into these rules.

No exceptions. If even one person refuses to get in the shared team chat-room or doesn’t feel like showing up to the video check-in meetings every day, these strategies won’t work for the rest of the team. Everyone on the team has to buy in, and in particular, managers and team leads need to lead by example, being disciplined about leading discussions and disseminating information only in the team’s shared forums.

But how do you know people are working if you can’t see them?

Simple answer. The same way you know they’re working when you can see them: you talk to them about requirements, check in on progress, look at what they deliver. If you are a technical manager, you monitor deployments or skim the latest check-ins. Software management really isn’t very different with distributed versus in-house teams. The secret to success is still the same: clear direction, well-defined, prioritized requirements, and carefully managed execution.

Local development environment

We currently have a single git-managed project containing our Rails server at the top level and our Angular project in a subdirectory of vendor. Bower components are checked in to our repo to speed up builds and deploys. The contents of our gruntfile and the organization of our asset pipeline are described here.

We can start up our server via grunt server (which we have configured to shell out to rails server) or directly with rails server for Ruby debugging.

Even though both the client and server apps are checked into the same project and share an asset pipeline, we restrict our Angular code to only communicate to the backend Rails server over APIs. This enforces a clean separation between client and server.

Bamboo build project

When Angular and Rails code is checked in to master, our Bamboo build process runs. We always push through master to production, à la the GitHub flow process. The build process comprises two stages:

Stage 1: Create Artifacts:

Rails: bundle install and freeze gems.

Angular: npm install, grunt build. No bower install is needed because we check-in our bower_components. The grunt build step compiles, concatenates, and minifies code and assets. It also takes the unusual step of cache-busting the asset filenames and rewriting any references in view files to point to the new filenames.

The resulting artifact is saved in Bamboo and passed to Stage 2.

Stage 2: Run Tests:

Rails: run rspec model and controller tests, and then cucumber integration tests. It was a bit tricky to get headless cucumber tests running on Bamboo’s default Amazon AMI; see details in our previous blog post.

Angular: grunt test.

If the artifact creation succeeds, and the tests run on that artifact all pass, Bamboo triggers its associated deploy project. Otherwise, our team receives notifications in Hipchat of the failure.

Bamboo deploy project

After every successful build, Bamboo is configured to automatically deploy the latest build to our staging environment.

The Bamboo deployment project runs the following tasks to kick off an Elastic Beanstalk deployment:

Write out an aws_credentials file to the build machine. We don’t store any credentials on our custom AMIs. Instead, we keep them in Bamboo as configuration variables and write them out to the build machine at deploy time.

Run Amazon’s AWSDevTools-RepositorySetup.sh script to add aws.push to the set of available git tasks on the build machine.

Kick off the deployment to our Elastic Beanstalk staging environment with a call to git aws.push from the build machine’s project root directory.

Since our project is configured to use Elastic Beanstalk, the remaining deployment-related configuration (like which Elastic Beanstalk project and stage to push the update to) is checked in to the .elasticbeanstalk and .ebextensions directories in our project and made available to the git aws.push command. If there is interest in sharing the contents of these config files, please let us know on Twitter.

Elastic Beanstalk staging environment

After the staging deployment has been kicked off by Bamboo, we can head over to our EB console at https://console.aws.amazon.com/elasticbeanstalk and monitor the deployment while it completes. The git aws.push command from the previous step is doing the majority of the work behind the scenes. For staging, we use Amazon’s default Rails template, and “Environment type: Single instance.” Amazon’s default Rails template manages Rails processes on each server box with a passenger + nginx proxy.

When we first decided to go to a grunt-based asset pipeline, we worried this might impact the way we deployed our servers. In fact, it does not. Our git code bundle containing our Rails app, Angular front-end, and shared assets is deployed to Elastic Beanstalk via git aws.push, exactly as it was prior to our grunt-based asset pipeline switch.

We then do smoke testing on our staging environment.

Elastic Beanstalk production environment

After we have determined the staging release is ready to go to production, we promote the current code bundle from staging to production simply by loading up the EB console for the production stage of our project, clicking “Upload and Deploy” from the Dashboard, clicking “All Versions” in the popup, then selecting the git version currently deployed to staging.

Like any good startup, we try to leverage off-the-shelf tools to save time in our development process. Sounds simple enough, but the devil is in the details, and sometimes a custom solution is worth the effort. In this post, I’ll describe how and why we replaced the Rails asset pipeline with a Grunt-based system.

In the Beginning…

Early on, we embraced AngularJS as the foundation of our core application. We started prototyping using the Yeoman project and never looked back. If you’ve never used this project before, I highly recommend checking it out. It will save you time and tedium in setting up a development ecosystem. We fell in love with the Bower and Grunt utilities as a way to manage project dependencies and build pipelines, and we found the array of active development on the various supporting toolsets impressive. We were knee deep in NodeJS land at this point.

After we stubbed out a good portion of the UI on mock data, we had to start looking towards building out an API that could take us into further iteration. Ruby on Rails was proven and familiar, and we knew how to carve out a reliable backend in no time flat. Additionally, we wanted to take advantage of some proven RubyGems to handle common tasks for which the NodeJS web ecosystem hadn’t fully established itself. Some of these tasks include handling view responsibility, and as such relied on Sprockets for asset compilation.

At this point, we had an AngularJS project, built and managed with Grunt, contained within a Rails project, built and managed with Rake and Sprockets.

Trouble in Paradise

We quickly found ourselves hitting a wall trying to manage these two paradigms. As have severalothers.

Our hybrid Grunt + Sprockets asset pipeline included multiple build processes and methods of shuffling assets. The more we tried to get these two jealous lovers to play nice, the more they fought. Ultimately the final straw came down to minification-induced runtime errors and the lack of sourcemap compilation support in Sprockets (while somewhat supported in an on-going feature branch, sourcemaps hadn’t made it into master and required dependency changes we weren’t ready to make quite yet).

At this point it became apparent that we were wasting precious cycles dealing with things outside our core competency, and that we needed to unify these pipelines once and for all.

Unification

Our solution: say goodbye to Sprockets! We have completely disabled the traditional Rails asset pipeline, and now rely on GruntJS for all things assets-related. The deciding factors for us were the community activity and the flexibility the project provided. Here’s a Gist of our (slightly sanitized) Gruntfile.js powering the whole pipeline.

How we currently work:

We don’t use the Rails asset helpers…at all. We use vanilla HTML for our views as much as possible. Attempts to use the Rails asset helpers ended up being overly complex and ultimately felt like trying to work a square peg into a round hole.

Grunt build and watch tasks handle the the pipeline actively and passively. In development, we use the wrapper task grunt server to launch Rails along with our watches. Source and styles are compiled and published directly to Rails as they are saved. Likewise, unit tests are run continually with output to console and OSX reporters.

LiveReload refreshes the browser or injects CSS whenever published assets are updated or otherwise modified.

We no longer require our Rails servers to perform any sort of asset compilation at launch, as they’re now built by CI with the command grunt build prior to deployment. Nothing structural in our build deployment process has changed (in our case, using Bamboo to deploy to Elastic Beanstalk).

With the above, we are now constantly testing using the assets that actually make it into a production environment, with sourcemap support to handle browser debugging sessions. Upon deployment, Rails instances do not need to pre-process static assets, reducing warm-up time.

Ultimately, the modular nature of the Grunt task system ensures we have a huge array of tools to work with, and as such, we’ve been able to incorporate all the nice little things that Sprockets does for us (including cache-busting, and gzip compression) and the things it doesn’t (sourcemaps).

DIY

Feel free to steal our Gruntfile.js if you’re looking to adopt this system. We’ve also cobbled together a list of Grunt tasks that we’ve found helpful:

grunt-contrib-uglify – handles all JS concatenation, minification, and obfuscation. Despite adhering to AngularJS minification rules, we’ve found issues with the mangle parameter and must disable that flag when handling Angular code. Uglify2JS is also providing our sourcemaps.

grunt-contrib-compass – we only author SCSS and rely on Compass to handle everything concerning our styles, including compilation and minification as well as spritesheet and sourcemaps generation.