08 May 2014

When I work on projects with slow test suites, my workflow often ends up looking sort of like this: I make some changes on a branch, run the tests that seem likely to be relevant locally, and then push the branch off to tddium (or whatever) to see if any of the other tests fail unexpectedly.

I like each of my commits to be clean and green in isolation, to make digging through project history easier in the future. To that end, when I make a bunch of commits before running all my tests, I often end up fixing bugs and using interactive rebase to merge the fixes into earlier commits before merging my branch in. (So long as I’m the only person working on that branch, at least!)

It’s a reasonable enough process - stage my changes for the fix, commit them as a WIP, then open up interactive rebase to amend them as a fixup to the right commit from earlier. That’s so many steps, though! And I’m so lazy! I really just want to be able to run something like git shamend SHA_FOR_EARLIER_COMMIT instead. (Or better yet, git-smend or git-sm for short!)

So, I wrote git-shamend to solve this problem for me. The full script is available here - just copy it to /usr/local/bin (or wherever you prefer to keep such things) and you’ll be able to use it with git shamend SHA_TO_AMEND.

First, we get the SHA for the reference you pass in. This avoids problems that could come up if you pass in something like HEAD^, whose meaning changes whenever you add a new commit (as we do later on).

SHA_TO_AMEND=$(git rev-parse "$@")
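To make the pinning idea concrete, here's a minimal sketch that isn't part of git-shamend itself. The helper name `pin_ref` is hypothetical, and the `--verify --quiet` flags and `^{commit}` suffix are my own additions, assuming you want exactly one argument and want it to name a commit.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not taken from the git-shamend script: resolve a
# possibly relative ref (HEAD^, a branch name, a tag) to a fixed commit SHA.
pin_ref() {
  # --verify insists on exactly one valid object name, --quiet suppresses the
  # error message on failure, and ^{commit} peels tags down to their commit.
  git rev-parse --verify --quiet "$1^{commit}"
}

# Usage (inside a repository):
#   pinned=$(pin_ref 'HEAD^')
#   git commit --allow-empty -m "another commit"
#   # HEAD^ now names a different commit, but $pinned still names the old one.
```

Once the SHA is captured, adding commits on top can't change what it refers to, which is the whole point of resolving it up front.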

If the reference you pass in is a commit that’s in your current branch…

if git merge-base --is-ancestor $SHA_TO_AMEND HEAD

…then git-shamend commits your staged changes, marked as a fixup (which is an amendment that retains the original commit message) to that earlier commit…

git commit --fixup $SHA_TO_AMEND

…and if you have any remaining unstaged changes…

git diff-index --quiet HEAD
NOTHING_TO_STASH=$?

…stashes them so they don’t interfere with the upcoming rebase…

git stash

…and runs an interactive rebase automatically to get that fixup amended properly to the earlier commit you specified.

So from git’s perspective, it runs an interactive rebase and opens up an editor for you to move around commits, which ‘you’ close successfully. Great, that’s all git-rebase needs from, er, ‘you’ - assuming there are no conflicts, it can handle the rest on its own!

This works because the --autosquash flag tells git-rebase to put fixup commits in the right place for you before opening up the editor, so there’s really nothing you need to do to get things sorted out right.
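The exact rebase invocation isn't quoted in this post, so here's a sketch of one way to get the "editor that closes itself" behavior. Using the `GIT_SEQUENCE_EDITOR` environment variable with `true` is my assumption (the real script may do it differently), and `autosquash_onto` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Sketch only: fold pending fixup commits into an earlier commit without
# any editor interaction. Not copied from the actual git-shamend script.
autosquash_onto() {
  local sha_to_amend=$1
  # `true` exits 0 without touching the todo list, so git sees an editor
  # that was opened and closed successfully. --autosquash has already moved
  # the fixup commit next to its target by the time the "editor" runs.
  # Caveat: "<sha>^" does not exist when <sha> is the root commit.
  GIT_SEQUENCE_EDITOR=true git rebase --interactive --autosquash "${sha_to_amend}^"
}

# Usage (inside a repository, with changes already staged):
#   git commit --fixup "$SHA_TO_AMEND"
#   autosquash_onto "$SHA_TO_AMEND"
```

Rebasing onto the parent of the target commit keeps the target itself in the todo list, so the fixup has something to be squashed into.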

From your perspective, git just kinda does its thing without bothering you. Gotta love that.

But what if something does go wrong? If the rebase exits unsuccessfully (that is, with a non-zero exit code)…

if [ $? -ne 0 ]

…then git-shamend aborts the rebase and resets that fixup commit it created earlier, to clean up after itself. (And echoes a warning, natch. All that stuff is in the actual git-shamend script.)

git rebase --abort
git reset --soft HEAD^

And at the very end, if you did have any unstaged changes that were stashed earlier…

if [ $NOTHING_TO_STASH -ne 0 ]

…they’re popped from the stash, to return your working directory to its pre-SHAmend!ing state.
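The stash-then-restore pair can also be read as a single wrapper. This is only a sketch: `with_clean_worktree` is a hypothetical name, and the real script inlines the logic with the `NOTHING_TO_STASH` variable rather than using a function like this.

```bash
#!/usr/bin/env bash
# Sketch only: run a command with unstaged changes temporarily stashed,
# then restore them. Hypothetical helper, not from the git-shamend script.
with_clean_worktree() {
  local dirty=0 status=0
  # diff-index --quiet exits 0 when tracked files match HEAD, 1 when they differ.
  git diff-index --quiet HEAD -- || dirty=1
  if [ "$dirty" -eq 1 ]; then
    git stash push --quiet || return
  fi
  # Run the wrapped command and remember its exit status.
  "$@" || status=$?
  if [ "$dirty" -eq 1 ]; then
    git stash pop --quiet || return
  fi
  return "$status"
}

# Usage (inside a repository):
#   with_clean_worktree git rebase --interactive --autosquash "$SHA_TO_AMEND^"
```

Note that a non-zero exit from `git diff-index --quiet` means there IS something to stash, which is why the script's later check pops the stash when `NOTHING_TO_STASH` is not 0.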

02 May 2014

I’ve been working on search stuff lately, and we needed some wordlists to help test search results that match only because they sound similar to the query, and not because they’re spelled similarly.

Turns out we couldn’t find a pre-existing wordlist of homophones (words that sound the same but are spelled differently) that are dramatically different in spelling. And our QA team especially wanted some examples of people’s names that meet those criteria.

So, sure, I figured that’d be fun and quick to throw together for them!

It’s a lot like finding anagrams - the basic structure was a dict (a hash map, for the non-Python folks reading this) keyed by the phonetic encoding of each word. Each key pointed to a nested dict, which included an array of words which phonetically matched the key and a bool indicating whether it fit my criteria or not. In the end, all matching words were spit into stdout as a list of comma-separated homophones.

I determined whether words were spelled differently enough by checking whether a small enough percentage of their trigrams were the same. (I also had a minimum length set, so I’d be sure to have enough trigrams per word to be worth checking for match percentage.)
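Here's a rough sketch of that trigram check. It isn't the script I actually used: the function names are made up for illustration, the overlap measure (shared trigrams over all distinct trigrams) is one plausible choice, and the 25% threshold and 5-character minimum are placeholder assumptions.

```bash
#!/usr/bin/env bash
# Sketch only: decide whether two words are "spelled differently enough"
# by comparing their sets of trigrams. Names and thresholds are hypothetical.

trigrams() {
  # Print each distinct three-character window of the word, one per line.
  printf '%s\n' "$1" |
    awk '{ for (i = 1; i + 2 <= length($0); i++) print substr($0, i, 3) }' |
    sort -u
}

shared_trigram_percent() {
  local all total shared
  # After sorting the two sets together, a trigram present in both words
  # appears exactly twice, which is what `uniq -d` picks out.
  all=$( { trigrams "$1"; trigrams "$2"; } | sort )
  total=$(printf '%s\n' "$all" | sort -u | grep -c . || true)
  shared=$(printf '%s\n' "$all" | uniq -d | grep -c . || true)
  if [ "$total" -eq 0 ]; then
    echo 0
  else
    echo $(( 100 * shared / total ))
  fi
}

spelled_differently() {
  local max_percent=${3:-25} min_length=${4:-5}
  # Words that are too short don't have enough trigrams to compare.
  [ "${#1}" -ge "$min_length" ] && [ "${#2}" -ge "$min_length" ] || return 1
  [ "$(shared_trigram_percent "$1" "$2")" -le "$max_percent" ]
}

# Usage:
#   shared_trigram_percent knight night    # 3 of 4 distinct trigrams shared
#   spelled_differently phillip filip && echo "different enough"
```

The phonetic grouping would happen before this step; this sketch only covers the filter applied to words that already share a phonetic key.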

(It was kinda neat to find something that felt more like an interview puzzle than anything else, but was actually useful for my day job. Oh hey, look, those skills are occasionally actually useful! Now you don’t have to feel weird about all the time you spent learning how to solve these sorts of puzzles!)

Sweet and simple and fun! Here’s my script and a few of the wordlists I created with it, since I figure other people may also find this sort of thing useful when testing search implementations. (FYI, if you’re using something other than a metaphone/doublemetaphone soundalike algorithm and trigrams for misspellings, you may want to make some adjustments.)

24 Apr 2014

This week, I’m grateful that my coworkers know to come grab me if something seriously weird is going on, because it fills me with so much glee! I mean, WHAT.

minimal repro:
On Suffolk (one of our machines), open tmux, open vim, open new terminal tab.
Vim gets “lililililililill” inserted in current file, and beeps a lot
If the file already has content, it prepends i and appends ll to ~10 lines, and sometimes capitalizes something

WTF WTF WTF THANK YOU

I’m going to skim over some of the details so that this remains a blog post and not an endless excited ramble, but! This is approximately how figuring this nonsense out went!

Initial poking around

When does the problem happen? When you open a new bash tab or window, or enter any command in any bash session.

“lililililililill” looks very suspicious. Is that a macro or something hiding in one of the vim registers? Use :reg to check the contents of the vim registers - nope, nothing fishy in there!

Is there anything funky in our tmux config? ~/.tmux.conf doesn’t exist, and a quick googling around didn’t turn up anything on any other standard sorts of tmux config files. Fair enough, put that aside for the moment.

Does not happen on any of the handful of other machines that were checked.

Does not happen in vim in tmux when ssh’d into another machine.

(One exception to that last one - a coworker said he was able to replicate it when ssh’d into a remote coworker’s machine. But when we tried to replicate that, it didn’t happen. An isolated datum, potentially relevant, but highly suspect. To this day, I’m pretty convinced that folks got mixed up and it never really happened in the first place - happens to the best of us, and it doesn’t fit with any of the other evidence.)

Cool, we got the lay of the land. So! What changed recently?

Ah, this machine was newly reimaged. Maybe we have new broken or incompatible versions of some things?

We were told that the tmux version should be frozen as part of our install script. Do you believe everything you’re told? No? Good! You guessed it, we had totally different versions of vim, tmux, bash, and OS X on this machine than on other machines which do not exhibit the same problem.

Around this time I started up a Google doc to keep track of everything we were trying, because once things start to look complicated I know I won’t be able to remember everything I’ve tried. Especially when multiple people are involved! And it’s a horrible waste of time to repeat experiments out of forgetfulness, or even worse, lose potentially relevant data. I won’t bore you with a full list of versions and reinstallation steps, but boy do I have all the details in my notes.

Point being, we downgraded bash, tmux, and vim to match the versions working on other machines, but the problem remained.

At this point, I was sadly told that the machine was just going to get a bunch of stuff reinstalled and I shouldn’t spend any more time poking at it. Sadness! But okay, fair enough, it was getting in people’s way and the show must go on.

But wait! Things don’t magically solve themselves after all!

Imagine my delight when I came in the next morning and heard that the reinstalling stuff hadn’t fixed the problem! I’d been super bummed the day before to have my mystery stolen away from me, so this was very exciting! I hung out with a coworker for a bit to give him pointers on how to look into Elasticsearch bugs, then ran off to the biggest mystery of the week.

Aha! We noticed that we have a tmux-related vim plugin in our vim config - tmux-config. Bonus points for anyone who feels like stopping here to look at that and guess how this story ends. ^___^

I didn’t have much time to play with it in the moment, but the very best thing happened - we were able to replicate it on any machine recently reimaged with our new workstation setup script! This meant I was able to get the bug onto my laptop! AW YEAH.

Commenting out that line causes the problem to go away. Running tmux send-keys -t %0 ^\\ ^n F19 WriteAll manually in another bash window causes the bug to manifest regardless. Perfect! What is this thing trying to do, and what is it actually doing?

Aw, hell, my MacBook doesn’t even have an F19 key! Yeargh. Fine, whatever, I went and installed KeyRemap4MacBook so I could remap fn-fn to F19 to test stuff with.

Result: No beeping or case toggling.

Why does F19 cause beeping/case toggling in vim inside tmux but not in vim outside tmux?

Am I super confident that my mapping worked properly? I mean, I tested it with EventViewer, but how realistic is that? Does tmux send-keys somehow send something different than what my mapping thinks I’m sending now?

How else can I test that F19 is what it claims to be?

I did some googling around, and learned that you can actually check how keystrokes are encoded in bash by opening up your terminal, hitting control-v, then hitting a key.

Whoa, neat, that seems useful! I checked encodings to see if I could find a difference, and oho, that jumped out at me!

Inside tmux, F19 is encoded as ^[[33~

(in our bash outside tmux, it’s ^[[18;2~ instead, dunno why)
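If you'd rather not rely on control-v, `cat -v` prints the same caret notation for raw bytes. This is just a sketch for turning an escape sequence into the form shown above; `show_keycode` is a made-up helper name, not something from our config.

```bash
#!/usr/bin/env bash
# Sketch only: print a raw byte sequence in caret notation, the same way
# control-v displays it at the prompt. Hypothetical helper name.
show_keycode() {
  printf '%s' "$1" | cat -v
}

esc=$(printf '\033')   # the escape byte, displayed as ^[

show_keycode "${esc}[33~"; echo     # prints ^[[33~
show_keycode "${esc}[18;2~"; echo   # prints ^[[18;2~
```

Seeing the sequence spelled out this way makes it easier to spot which trailing characters could be read as ordinary vim commands.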

HOLD ON. Look at that more closely: inside tmux, F19’s encoding ends in ‘3~’, which is exactly the command in vim that you’d expect to toggle case for 3 characters - COINCIDENCE? I THINK NOT.

With that option set, the conditional in autowrite.vim is satisfied, and (when vim is restarted after that option being set and all the vim plugins are sourced) t_F9 (which is secretly F19) is mapped to [33~.

(4) This means tmux wasn’t set to handle xterm-style function keys (such as F19). This isn’t super-clear, to be fair. The clear way to set tmux to receive xterm function keys properly would be with “setw -g xterm-keys on”

(5) Vim checks $TERM to see if function keys are available. See the tmux FAQ. If they’re not, the character codes sent by the function keys are interpreted literally.

(6) We actually have vim set to interpret the higher function keys explicitly in autowrite.vim:35 - if $TERM is “screen-256color” (which happens explicitly in that tmux.conf we weren’t using) then t_F9 (which is F19) is set to ^[[33~

(7) Why? Because (as I verified with control-v) inside tmux, F19 is encoded as ^[[33~

(8) Since we never explicitly set it otherwise, $TERM inside tmux was set to “screen” - which means that the condition in our autowrite.vim:35 was never met, and thus t_F9 was never set to ^[[33~ in vim.

(9) Because t_F9 was never mapped properly in our vim config, when that preexec function ran and bash sent “^\\ ^n F19 WriteAll” to tmux via tmux send-keys, vim escaped into normal mode because of ^\\ ^n and then interpreted the rest literally as ^[[33~WriteAll.

(10) And because the literal string ^[[33~WriteAll wasn’t mapped in vim (only <F19>WriteAll was!), each character was interpreted as a separate vim command, not part of a single mapping as intended.

^[[33~WriteAll, interpreted as a series of vim commands

^[ is escape

[3 doesn’t do anything (as far as I can tell)

3~ toggles case for the next three characters

W takes you to the start of the next WORD

ri replaces the character under the cursor with an i

te takes you to just before the next e

A takes you to the end of the line and puts you into insert mode, and then

ll is inserted at the end of the line

Long story short, the fix was:

ln -s ~/.vim/bundle/tmux-config/tmux.conf ~/.tmux.conf

Process-related takeaways

Absence of evidence IS evidence of absence - we noticed pretty early on that there was no ~/.tmux.conf, then moved on, figuring that okay, guess there isn’t anything weird in the config. Next time, if a likely place to look turns out to be missing, I want to think sooner of checking whether the analogous config file exists on working machines, so there’s something to compare against.

Verify ALL assumptions sooner (or at least the easy-to-check ones) - I noticed that t_F9 thing way earlier and skimmed past it, assuming that surely t_F9 referred to F9. That’s an assumption that would’ve been super quick to verify! Gotta verify assumptions as they’re made, especially ones that are quick and easy to check out.

It was so fantastic to just get to chat about science and problem-solving and trying to get better about putting our egos aside and really evaluating the evidence before us with such a great group of people.

It started like this…

DAVID:
I bought a microscope yesterday. And there was a splotch on it and I couldn’t figure out what it is and I did the scientific method trying to figure out where in the microscope the splotch was coming from. Turns out, I was seeing a reflection of my optic nerve.

JAMES: Nice.

[Chuckles]

JOSH: Yeah, you can look in the microscope a really long time and you won’t find that.

DAVID: Yeah.

DANIELLE: So, when you gaze into the microscope, the microscope gazes back into you.

[Laughter]

DAVID: Also gazes back to me, yeah.

JOSH: [inaudible] Are you saying that what you see inside Dave’s eyes is the abyss?

DAVID: Yes.

DANIELLE: Yeah, yeah.

JAMES: I just want to know how he proved that hypothesis false. Did he gouge one of his eyes out?

[Laughter]

DAVID: Actually, and this is the part that I was very, very proud of, I finally switched eyes. And the splotch moved and changed shape.

So brilliant!

And this was my favorite quote of mine from the episode:

“Look, the goal is to prove that I’m wrong. That means I win. I’ve proved that I was stupid about something so I can move on to being stupid about something more interesting.”