As much of a pleasure as it was to see Erlang in the “languages with professional experience” list, I don’t really feel that it should be higher than the functional/lazy level. Some of the log(n) or level-3 items seemed more like level 1 in complexity or impressiveness, and others were silly (version control, try git = log(n)). Definitely seems dated and pretty overtly biased.

I am a compsci graduate, and I just get mad at these lists. You should never, ever feel inadequate for not knowing most of this stuff, because it’s rarely relevant, and when it is relevant, simply knowing of all these things is not the right way to go about it anyway.

Here’s the thing: CS is a huge field. The things that are relevant to one person may be totally irrelevant to someone else, and that’s fine. E.g., if you get a ton of experience with embedded work, then you probably know tons about ARM assembler and boot loaders, but you may have no idea how to write a web page. If you do web development for a living, you probably know tons about the ins and outs of a pile of different browsers, but might well have no idea what assembly language even is. But both of these people are practicing CS, and neither one is “better” than the other, or should make the other feel inadequate. They’re just different skill sets.

And that said, even in this author’s niche, I don’t agree with what he’s written down. E.g., take the data structures he’s listing. I do occasionally find myself in situations where I have to implement a custom low-level data structure, for whatever reason. (“Whatever reason” is usually, though not always, that I need to be able to search through a very large, structured amount of data in a specific way, and I really would like to fit all of that data in RAM.)

I do what I sincerely hope everyone else does at this point:

First, try just writing the algorithm normally using hash tables, arrays, and whatever else. Check my hypothesis that this isn’t enough. The older I get, the more I’m magically able to make things work at this phase, both because I get better at writing efficient code, and because computers get faster.

Next, try the exact same approach, but in the lowest-level language I can use that will not make me hate myself. If I’m the only one on the project, that usually means Rust, and historically meant Free Pascal or Chicken Scheme; if I’m working with normal people, I’ll grab C++14 and a sane STL.

If that level still fails, or if I was already at that level, I will finally grab a data structures book, look through the descriptions to find a few things that sound close to what I need, and try some medium-sized benchmarks. Then I proceed based on what, in real-world usage for my use case, looks like the right mix of big-O performance and actual performance, factoring in whether the operations can be SIMD-optimized, what the memory access patterns look like, and everything else.

Note that at no point do I go, “Aha, yes, splay trees are perfect for this!” or anything like that. And people who do usually write code that has a great theoretical big-O notation, but horrible real-world performance, because e.g. it’ll miss the cache constantly, which can make my O(n^2) algorithm run better than your O(log(n)) one for real-world data.
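The point about constant factors beating big-O can be made with a toy benchmark. This is just an illustrative sketch, not the commenter’s actual workload: for a small, contiguous list, an O(n) linear scan can be competitive with (or beat) an O(log n) binary search, because the scan is a tight loop over adjacent memory while the search branches unpredictably.

```python
import bisect
import timeit

data = sorted(range(64))  # small, contiguous, cache-friendly list
target = 41

def linear(xs, t):
    # O(n), but a tight scan over contiguous memory
    for i, x in enumerate(xs):
        if x == t:
            return i
    return -1

def binary(xs, t):
    # O(log n), but with more branching per step
    i = bisect.bisect_left(xs, t)
    return i if i < len(xs) and xs[i] == t else -1

assert linear(data, target) == binary(data, target)

# Compare wall-clock time rather than trusting big-O alone;
# the winner depends on n, the hardware, and the access pattern.
print("linear:", timeit.timeit(lambda: linear(data, target), number=100_000))
print("binary:", timeit.timeit(lambda: binary(data, target), number=100_000))
```

Which one wins flips as `n` grows, which is exactly why the commenter benchmarks instead of reasoning from asymptotics alone.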

This is the biggest problem imo, and a common problem in interviews: people often don’t realize that they’re selecting for someone with a particular cultural background rather than for anything to do with overall ability, or even general experience. If you really do need someone with pre-existing experience in a specific part of CS, that’s fine: ask questions specifically from that area. But it should be on purpose, rather than using things that you think are proxies for “is knowledgeable about CS in general”, but which are often pretty non-randomly chosen.

An example: Despite writing down a pretty long list of things a programmer should know, this post doesn’t mention one that a lot of companies use as an entry-level screener: at least basic familiarity with floating-point semantics and numerical stability. Are the things this post mentions more important than understanding floating point? Maybe, maybe not, mostly just different.
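For anyone unfamiliar with the floating-point screener being alluded to, these are the kinds of surprises such questions usually probe (a quick sketch, not any specific company’s question):

```python
# 0.1 and 0.2 have no exact binary representation, so their sum
# is not exactly 0.3.
a = 0.1 + 0.2
print(a == 0.3)             # False
print(abs(a - 0.3) < 1e-9)  # True: compare with a tolerance instead

# Numerical stability: adding a small number to a huge one loses it
# entirely, because doubles only carry ~16 significant decimal digits.
big, small = 1e16, 1.0
print((big + small) - big)  # 0.0: `small` was absorbed
```

Summation order and cancellation like this are why naive accumulation loops can drift badly, and why tricks like sorted or compensated summation exist.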

Excellent point about the implementation process. I follow a similar process, but if I encounter issues at step 1, as step 2 I always re-verify the requirements by asking, “this is going to take X additional time to implement; should we change the requirements instead?” For example, we could retain only the last 12 months’ worth of data in the database instead of storing it indefinitely.

Often the requirements or the impact of implementation time aren’t considered with enough precision before implementation (perhaps because implementation has to start for the issue to become apparent).

It says “knowledge of”, not “eagerness to implement”. You should be able to say “ah, yes, splay trees are perfect for this!”, then Google for a reliable looking splay tree implementation, and use it.

For example, I recently did something that used a priority queue, because it was “perfect for this”. Not knowing that a priority queue is even a thing would have made figuring out my algorithm a lot harder. But I didn’t implement the priority queue, I got it from a library.
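The “got it from a library” approach the commenter describes looks like this in Python, where the stdlib `heapq` module provides a binary min-heap (the tasks here are made-up placeholders):

```python
import heapq

# Always process the smallest-priority pending item next,
# without ever implementing the heap ourselves.
tasks = [(3, "low"), (1, "urgent"), (2, "normal")]
heapq.heapify(tasks)               # O(n) build
heapq.heappush(tasks, (0, "now"))  # O(log n) insert

order = [heapq.heappop(tasks)[1] for _ in range(len(tasks))]
print(order)  # ['now', 'urgent', 'normal', 'low']
```

Knowing that a priority queue exists is what shapes the algorithm; the implementation itself is somebody else’s well-tested code.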

The lower-level alternative is “I use hash tables and arrays for everything because my language has them built in and I have no knowledge of anything else.”

Step one sounds about right, but if performance is unacceptable at that level the next step should be to understand why.

If I’m using a bad algorithm, or my implementation is bad, most of the time there’s more bang for the buck in implementing a better algorithm in the higher-level language than in reimplementing a crap algorithm in a lower-level language. And if the new algorithm alone doesn’t do it, it’s still better to implement the new, better algorithm in a low-level language than the original crap one.

It’s also more interesting (IMO) to implement a cool algorithm than it is to rewrite code in C or C++.

I should have made it clear that “understanding why” is required to go to step two. Rest assured I do that; good point to call it out explicitly.

But I disagree with you on the second bit. From a maintainability perspective, all other things being equal, I would rather give someone a simple, easy-to-understand design in a fast language than an insane algorithm in a high-level language, if I have to pick one or the other. You’re correct that it’s more interesting to do a cool algorithm in a high-level language, but that’s because it’s novel and different, which in turn means that whoever comes after me is going to have to work a lot harder to understand why it works if they have to change things. Conversely, a boring algorithm in a low-level language should be pretty straightforward to understand.

I’m painting in very broad strokes here, and there are obviously times to do a weird algorithm in a high-level language instead (e.g., if you are doing a generic structure that you expect to get very high reuse—maybe a bloom filter at a web company, for example). But I stick by my existing order as the preferred one otherwise.
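Since the bloom filter comes up as the canonical high-reuse structure, here is a minimal illustrative sketch of one (sizes and hash scheme are arbitrary choices for the example; real implementations derive the bit count and hash count from the expected item count and target false-positive rate):

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: set membership with false positives but
    no false negatives, in a fixed number of bits."""

    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = bytearray(m // 8)

    def _positions(self, item):
        # Derive k bit positions by hashing the item with k salts.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        # All k bits set => "probably present"; any bit clear => definitely absent.
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))

bf = BloomFilter()
bf.add("seen-url")
print("seen-url" in bf)   # True: added items are always found
print("other-url" in bf)  # almost certainly False (small false-positive chance)
```

The appeal for high-reuse infrastructure is the trade: a few kilobits answer membership queries that would otherwise need the full data set in memory.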

Yeah, I don’t really use that question as a trivia HA HA YOU DON’T KNOW THIS THING question, mostly I’m just interested in people’s answers to it. I’ve never had anyone actually answer it correctly, but I can respect if someone actually answers “I have NO idea” vs. trying to dream up some bullshit answer.

My answer would be, “copies commits from one part of the DAG to another, using the standard merge algorithms, and updates the explicitly-listed refs to point to the new copies of the commits.” Do I pass your test?

Coming from hg, I was surprised that by default git will happily duplicate commits if you happen to rebase something that is referenced from more commits than the ones you explicitly name. In Mercurial, rebasing must also move all of the commits that descend from the ones you name, and you must pass --keep in order to keep the original commits in case you really want a copy. The hg graft command is more appropriate for copying over an explicit subset of commits without having to do something about all descendants.

Also, git’s rebase functionality that lets you edit commits as you copy them is called histedit in hg. A pet peeve of mine is that because the “edit” option in git is provided via git rebase -i, then all rewriting operations are called “rebase” in git-land, even when the stack of edited commits is not changing what it’s based on. A rebase that does not change the base? Ugh.

It takes one commit and changes its parent. (As a technical detail, this entails rewriting the commit and all its descendants, and of course it can have a huge impact on the topology of your history. Also, they apparently decided that as long as you’re rewriting commits, they might as well let you change them, too, so rebase does that.) It’s much harder to precisely describe what cherry-pick does, even though I suspect more people have an accurate intuitive impression. =)

A lot of people use git without really understanding how it works under the hood. It’s actually fairly impressive how effectively one can work like this, and I don’t think a person needs to understand it to work with it—but if you have a bunch of developers collaborating on one repository, you very much want to have somebody at the organization who can unfuck things if needed.

I find that fundamentally flawed. None of these are “big picture” skills. No architecture, almost no network or distributed-systems knowledge. No knowledge of “models” of any kind. No team skills, no project skills. The ability to evaluate software, one of the most important skills of a senior programmer, is missing.

This is a skill matrix for people who never go beyond the line of code they are currently working on. (Which is great, if that’s really what you want.)

If we’re being generous, it ends up being a heuristic for “how good are you at learning”, which is a really important skill (maybe the most important one), but if that’s what you really care about, it’s better to evaluate that directly.