
eldavojohn writes "The startup company Ohloh has a database listing 70,000 developers working on 11,000 open source projects. Their aim is to 'rank' open source developers, which raises some interesting questions about exactly how useful this tracking company is. Questions like, 'Is there an accurate way beyond word of mouth to measure the importance and skill of a developer?' I found it slightly alarming that, to this site, the number of commits (with input from the number of kudos) tells how good a developer you are."

I don't know how representative it is, or if it might improve over time, but I looked myself up.

I found mentions in 5 projects -- _except_ they're all just versions of the 2.6 kernel source carrying the same contribution I made for an obscure cx88 TV card variant. In practice, I'm sure I'm hardly alone in having contributions (mostly in small ways, but sometimes very considerably) to over 100 projects over the years. I guess I have to go through and add some of those projects.

I'm sure I'm hardly alone in having contributions (mostly in small ways, but sometimes very considerably) to over 100 projects over the years.

I also don't think you're alone in finding that metrics fail to measure good programmers. My boss constantly asks me for lines-of-code counts from developers. No matter how many times I tell him otherwise, that is not a measure of success or of how good a coder you are.

I tried to think of metrics to relay up the chain (a special thank you to the stat-scm goal in Maven), but I came up with some pretty lame ones:

Code to comment ratio is desired at 1:1 (at least in the commercial world)

A desired size for each class/method/function/procedure/module should be defined and rated against

# of Unit tests

As you can see, these are the ones I found could be gathered automatically, and even these have exceptions. Anything else I think of either takes too much time to gather or is subjective. This is tough. I would like to default to peer review, but oftentimes I find teammates voicing their personal hatred for an individual or taking personal qualities into account when ranking a developer. Real-life example: teammate A is from MIT and teammate B thinks everyone from MIT is a god. Unfortunately, teammate A hasn't done anything but criticize everyone's code without any constructive comments to make it better.
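For what it's worth, the automatable ones can be sketched in a few lines. This is a toy illustration only, assuming Python-style "#" comments and tests named "test_*"; it is not how stat-scm actually works, and real tools need per-language parsers.

```python
# Toy sketch of the automatable metrics listed above: comment ratio and
# unit-test count. Assumes Python-style "#" comments and tests named
# "test_*"; real tools need per-language parsers (comment syntax varies).

def basic_metrics(source: str) -> dict:
    code = comments = tests = 0
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines count as neither code nor comment
        if stripped.startswith("#"):
            comments += 1
        else:
            code += 1
            if stripped.startswith("def test_"):
                tests += 1
    ratio = comments / code if code else 0.0
    return {"code": code, "comments": comments,
            "comment_ratio": ratio, "unit_tests": tests}

sample = """\
# add two numbers
def add(a, b):
    return a + b

# sanity check
def test_add():
    assert add(1, 2) == 3
"""
print(basic_metrics(sample))
```

Even this trivial version shows the exceptions: a docstring or POD block wouldn't be counted as a comment at all.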

I submitted this story hoping it would open a dialogue on measuring coding ability in a semi-automated way.

Granted, those are excellent metrics, but you left out Vi/Emacs flamewars and number of Microsoft jokes. As to lines of code: I once put Temporary Autonomous Zone by Hakim Bey [wikipedia.org] in my .plan. The sysadmin for the system fingered me from a dialup and was rather pissed about having to disconnect.

But seriously, the problem with a lot of the FOSS communities is that there are a great many part-timers. How do you measure the skills of someone who has a day job coding and then has a pet project or three? This wi

Software development metrics are not worthless. They are, however, seriously misunderstood. This is partly why we built Ohloh to focus on Open Source: it's the world's largest testbed of available software development metrics.

One challenge to interpreting development metrics is having a clue about what is 'normal'. Just knowing your FOOBAZ count is X doesn't help much. Once you can compare your FOOBAZ count to 100k other developers, it may begin to give you some helpful perspective. Of course, relying on

I think management probably does grasp that lines of code aren't necessarily proportional to the amount of work; however, they probably want to make sure the developers are doing *some* work.
Personally I would gather 1/(no. of Slashdot comments) :D

What about some slightly deeper tracking into the repository? Figure out how much code a developer committed that had to be changed later.
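That could be approximated with a churn ratio. A toy sketch, assuming we already have per-author counts of lines added and lines later changed (a real tool would derive these by walking blame/annotate history); the history below is invented for illustration:

```python
from collections import defaultdict

# Hypothetical per-author history: (author, lines_added, lines_later_changed).
# A real tool would derive these counts from the repository's blame history.
history = [
    ("alice", 120, 30),   # alice added 120 lines; 30 were later rewritten
    ("bob",    80, 60),
    ("alice",  50, 10),
]

added = defaultdict(int)
churned = defaultdict(int)
for author, new_lines, changed_later in history:
    added[author] += new_lines
    churned[author] += changed_later

for author in sorted(added):
    rate = churned[author] / added[author]
    print(f"{author}: {rate:.0%} of committed lines were later changed")
```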

Not bad, but it depends a lot on the maturity of the project. Many young projects go through several complete rewrites before they find a workable solution that performs well and is scalable, robust, and reliable. And even on mature projects, a lot of the time an entire codebase is scrapped to accommodate new features: think Samba 3.x -> 4.x or GNOME 1.4 -> GNOME 2.0.

OSS development is hardly a competitive sport, so why do this? Kudos is one thing, but that should come from the community, not from some database.

The "Kudos" is given by the community. And I like the inclusion of objective criteria to supplement the popularity contest. [ And if you don't think "number of commits" is an objective criterion, you don't know the meaning of the word "objective". ]

Some people will get some shiny glory and some will feel annoyed because their projects/contributions have not been tracked.

Exactly. There is much more to get out of the code repository. Check the http://www.sourcekibitzer.org/ [sourcekibitzer.org] metrics. It is able to extract a developer's "know-how" score by crawling the repositories. Much better than simple lines of code.

Come on, no matter how you rank people, this just ends in a lawsuit when someone does not like their ranking, especially if the ranking does track errors and poor performance somehow.... It's sure not going to help build community or aid actual development... unless someone is developing a giant pissing contest.

It's a little tricky, because talent is only as good as what's seen. From a business perspective, they don't care so much about how you get there, as long as something is done within the deadline (but they want to know you're working to meet that deadline, so they count your lines of code). The nice thing about open source code is that it is reviewed by other developers as it goes. This site might be worthwhile if it has good input from thoughtful people. Like Slashdot people! The shiny...happy kind...! Righ

The SEI and various other organizations focused on process have a lot of suggestions. There are some simple ones, such as "function points" * "complexity" as a measure of productivity. Or defects per function point as a measure of quality. Or some from the extreme programming world of user stories completed.

But nothing is truly accurate; metrics only guide humans. If they were accurate, managers could be robots.

I always hated the lines-of-code metric because it was so useless. A really good architecture with a good coder leads to much LESS code that looks simple to the casual observer. Similarly, the number of comments might not be very useful either. Spaghetti often has lots of extra comments, while good code is often self-commenting (at least in my commercial realm). Lots of unit tests are often unhelpful if there are no automated system tests.
The best coders balance design, coding and testing in the scop

Here are some more that can be derived automatically:

- Code cleanliness (e.g. consistent camelCase, whitespace, comment styles)
- Statistics of commits over the same section of code (may imply that it is buggy)
- Rate of API breakage: how long does a function last before its name/args/return value changes? This metric could be "weighted" by the prevalence of the function. A high rate implies an unstable design.

After I gave up looking for good single metrics, years ago I invented a "karma" algorithm for use in CVS Monitor (Google it). The karma score is calculated for a single commit, and uses a combination of lines and files added/removed/changed (with some munging so that file moves don't get line scores), then adds the size of the commit message in bytes, and applies a maximum upper score per commit.
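A hypothetical reconstruction of that scheme in a few lines; the weights and cap here are invented for illustration, not CVS Monitor's actual numbers:

```python
# Per-commit "karma" sketch: line + file score plus commit-message length,
# capped. File moves earn no line credit (they would inflate the score).
# Weights (5 per file) and the cap (100) are made up for this example.

def commit_karma(lines_changed: int, files_changed: int,
                 message: str, is_file_move: bool = False,
                 cap: int = 100) -> int:
    line_score = 0 if is_file_move else lines_changed
    score = line_score + 5 * files_changed + len(message.encode())
    return min(score, cap)

print(commit_karma(40, 2, "Fix off-by-one in buffer resize"))  # 40+10+31 = 81
print(commit_karma(5000, 1, "move file", is_file_move=True))   # 0+5+9 = 14
```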

It's worked pretty well, since all the best ways of gaming your karma score also encourage good p

Code to comment ratio is desired at 1:1 (at least in the commercial world)

I believe code-to-comment ratio is one of the things Ohloh tracks -- but it can't even figure that out for everyone.

For example, Perl modules are often documented in POD rather than "normal" comments beginning with #, but Ohloh doesn't know how to parse POD and so considers lots of well-documented modules to be nearly completely uncommented.

(That would be sort of as if they only counted // comments in Java source but not any

Commits are definitely a bad metric. For one of the projects I'm involved with, it shows I have 24 out of something like 800 commits, which doesn't factor in the commits I made while it was still a private project in a repo with other things, and it doesn't count the research and political work required to make the project happen. Also, lots of my contributions have been submitting a patch, which got me a mention in the commit log, but Ohloh doesn't pick up on that.

This reminds me of how academics are increasingly judged. It is more about how many papers you publish and how many other people cite them than about the quality of each paper's work or the standing of the citing party. Accordingly, many authors inflate their 'impact' scores by splitting up papers and publishing non-advancing science. No one can blame them for this, as many are trying to justify themselves to their departments or are still on the postdoc merry-go-round, looking for new jobs every 18 months.

You can't effectively rank developers. First, there are just too many to rank. Even in college football, where thousands of people are paid every day to monitor it, they don't try to rank all of the ~119 Div 1 teams, just the top 25. Secondly, there isn't a simple metric for ranking developers. It's about as smart as saying: look, I did the most work on this project because I wrote the most lines of code.

This could even have a negative effect if developers get concerned about their ranking and try to game t

Most of my contributions were on website documentation, wikis, or mailing lists, which aren't included in these metrics. At the moment, a lot of my commits are done on repositories not directly available to the public. While I don't really need Ohloh to tell me if I've contributed to a project or not, it's still a little annoying.

And what about contributors who submitted patches that had to be committed by someone else? Or people who contribute by providing help on IRC channels, blogs, forums, or other mailing lists?

While Ohloh metrics can be useful, they also need to be taken with a grain of salt, particularly the contributor metrics. They're a bit more useful for measuring a project as a whole (but they still miss a lot of activity).

I am listed as two people with the same pseudonym; my real name is not found. I am listed for two related projects belonging to the same organization. Both of me have the same score albeit for different skills. Ohloh obviously only checks commits to the main branches; my commits and LOCs to an experimental branch of one project would drown my official commits. I won commit status due to my assistance on the mailing lists and a lengthy complicated patch for critical functionality; my name is in the credi

While ohloh metrics can be useful, they also need to be taken with a grain of salt, particularly the contributor metric.

Particularly all of their metrics. Another example: one metric is the ratio of code to comments. This is debatable enough in and of itself (there are an awful lot of people who think code should never be commented, on grounds that code which requires comments must not be written clearly), but on top of that it takes a very narrow view of what constitutes a "comment". For example, a P

TFA says classifieds, digging through their data on request, and subscriptions for monitoring projects/devs.

I imagine they might make quite nice head-hunter "equipment". Think of company X wanting to incorporate SMB connectivity into their closed-source product (and requiring a developer to do so). Instead of digging through or placing classifieds, Ohloh would hook them up with the top 10 contributors to smbfs. This could turn out to be a great deal for both sides. Company X gets a dev really skilled in th

It's a very simple model really, when you think about it. Let's examine their possible train of thought: Sites can sell advertising when they get lots of frequent users. Sites need users to get users. Sites need some kind of user list to bootstrap. Where can you get a big list of users? Why, isn't that open source stuff based on lots of people communicating in the open, over the net? Oh, hey, let's use those suckers. Hmm. How can we make more suckers sign up after the first ones? Hmm... we need

Their website says they have independent investors. Maybe this is just a technology demonstration, or maybe they'll sell subscriptions when it's working well, or maybe they'll allow people to buy homepages to better present their skills, or allow companies to buy a customized search for programmers they want to hire. I can think of a dozen possible ways to make money off this. Remember, a startup virtually never ends up doing what they start out doing. (The Apple guys started out making phone-hacking blue

Would this discourage contributors to open source projects? Now if I put on my resume that I've contributed to an open source project, somebody is going to want to look me up. I have to deal with all that baggage when I just wanted something to do in my spare time.
Also, I'm really not sure I feel comfortable being given an absolute rank. People bring different skills/approaches to different jobs, and I don't think you can credibly say one is better than another. I've worked in teams where everyone respects the different capabilities and limitations of each member. It's sort of like arguing there is an absolute thing known as "intelligence". Is there really such a thing, or do we all just bring different skills/perspectives/approaches to the problems we solve? I'd prefer to think the latter: everyone contributes what they can but has their own limitations. Talking about absolute "intelligence" or "value" seems condescending and elitist.

Few commits means either you're Donald Knuth, or you're not that actively developing your code.

In Open Source active development does tend to mean a reduction in crapness, software wise.

What else it could say I don't know, but since there are few, if any, definitive means by which code quality can be measured (and don't give me that lines-of-code-versus-man-hours rubbish; I heard enough of that nonsense at uni), it's probably a reasonable metric.

Indeed, good point. Any metric can be used to the savvy developer's advantage. Some people track how many bugs a piece of software has after a release... we just group similar bugs into one bug. When you know the rules, you can use them to your advantage.

Agreed. And note that Ohloh appears to, by default, only track 'trunk' branches for Subversion repositories. So if I spend 6 months and 500 commits working on something on a branch, which I later merge into trunk (one commit), that history isn't tracked. Sure, you can add the branches/ dir to the Ohloh 'enlistment' (what a stupid word), but it seems to be at least frowned upon.

And not all commits are code. A decent percentage of the commits in my projects are i18n/l10n-related. Those are even harder

Talk about damning with faint praise. That's how I aspire to be evaluated: lined up naked against a wall while my vital statistics are transcribed by a group of bonobo monkeys. Hey, it's as good a measure as any.

In fact, the monkey-measure is probably better than commit-count, because no matter how my spam box bulges, the monkey-measure is less likely to persuade me to exchange an effective work habit for an ineffective work habit in an effort to sway a useless statistic.

If you have commit access, maybe. Consider the case where you don't have commit access, which means that not only is someone else committing for you, but they're probably committing it all in one big patch rather than lots of little increments.

I know that at work, where I have commit access, I tend to commit all the time -- if I need to not break things for everyone else, I make a branch. But for open source projects, I might do a weekend's worth of coding before sending it in, and even then, someone else g

So in other words, I could commit some of my own code to a CVS repository, find some errors that I missed, fix them, commit it again, decide to add more comments, commit it again, find one more thing I probably could have done differently and then rewrite it, commit it again...

I'm not calling him stupid... That was the example he was trying to get across. By this metric, stupid programmers that commit a lot because of mistakes are rated as highly as highly-motivated, caring programmers who commit a lot because they have a lot of additions to make.

No, you're also a bad (or perhaps just "sloppy") coder if you constantly make mistakes. Sure, everyone makes mistakes. But some mistakes are "stupid" and some aren't. If you commit broken code that doesn't even compile (and you didn't even test it), and then make another commit to make it compile, then was that first commit a "stupid" mistake? I'd say yes.

This commit count thing is meaningless anyway. Some people commit more or less often, depending on personal preference. Just about the only things y

Then I suggest you fail to understand the purpose of versioning systems. Their purpose is not to store perfect code. The purpose is to store the latest changes to code.

No, I'm not failing to understand anything here. My point has *nothing* to do with the purpose of versioning systems. My point is that there are sloppier programmers, and there are neater programmers. A sloppy programmer might carelessly check in code without testing it (or without testing it thoroughly), believing that it works. A sloppy programmer might take 10 iterations to get something right, while a neater programmer might only need 3 iterations to get that same thing right.

This is moderated 'insightful'? Definitely not by real coders. More likely by wannabes like a previous project manager of mine who, whenever he found one of my bugs, complained that I didn't test my code. I just wonder how he knew, when he found one bug, that I hadn't previously removed 500 that I found myself through testing.

So in other words, I could commit some of my own code to a CVS repository, find some errors that I missed, fix them, commit it again, decide to add more comments, commit it again, find one more thing I probably could have done differently and then rewrite it, commit it again...

Your willingness to fix errors, add comments, and do code rewrites puts you in the pantheon of programming gods! Next you're going to tell me you actually write your own legible "how to" user guides in PDF!

If you look around, e.g. the javawoman not happy thread [ohloh.net], it is a bit worrying that an ex-cat-herder from M$ is not only behind the wheel, but clearly has the same marketing speak to shut up unhappy fodder for the soon-to-be-commercialised service: ohloh_goes_open_source [ohloh.net]

"Kudos" is not plural, just a word that happens to end in "s", like "pathos". "Kudo", as used on that site, is as meaningless as "etho" or "mytho". The more frequent references to "many kudos" or other treatments of it as discontinuous are also incorrect, although much less jarring.

That was the first thing I thought of as well when I noticed their "kudos" system. Then I looked it up, and it turns out [reference.com] that kudos actually is the plural of kudo, and you can use them either way. I've never heard of it used in the singular form before, though.

Wouldn't such a system assume that everyone uses only one handle - or their real name - all the time for every project? If so, then a lot of people - who contribute under multiple handles, nicks, whatever you want to call their identities - are going to be missed or severely under-rated.

I would rather not have my real name attached to most of what I've contributed. One, because my code is so damn sloppy that it's embarrassing. Two, because I don't want the hassle of my real life - you know, offline - and my, uh, "digital lives" conflicting with each other. Three, if I was easy to find - online - I run the risk of being pestered with silly tech support questions.

UrCreepyNeighbor, while an accurate description of my personality, is one of many identities I have. Same could be said of almost everyone. I'm sure "HotChic17CA" doesn't use that username when she's talking with her grandmother, for example.

Wouldn't such a system assume that everyone uses only one handle - or, their real name - all the time for every project?

It does not. Two people with the same nick can commit to different projects and they are not counted as one person. And one person can have several different nicks and link them to their own Ohloh id in the project properties.

Wouldn't such a system assume that everyone uses only one handle - or, their real name - all the time for every project?

You can register on the page and link all the different aliases back together to refer to a single person.

I think it's a much bigger issue that all those people sending patches will be ignored, since there isn't really a standard way in most version-tracking systems to keep track of the patch submitter rather than the person who actually commits it into the repository.

I think it's a much bigger issue that all those people sending patches will be ignored, since there isn't really a standard way in most version-tracking systems to keep track of the patch submitter rather than the person who actually commits it into the repository.

That's one of the cool things about git (and maybe others). Each commit has both a "committer name" and "author name" field.
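A quick demo of those fields (assumes git is installed; the names and message are made up):

```shell
# Demo of git's separate author/committer fields: a maintainer applies
# someone else's patch but preserves them as the author.
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git -c user.name="Maintainer" -c user.email="m@example.com" \
    commit -q --allow-empty --author="Patch Author <patch@example.com>" \
    -m "Apply contributed patch"
# %an = author name, %cn = committer name
git log -1 --format='author=%an committer=%cn'
# prints: author=Patch Author committer=Maintainer
```

A tool like Ohloh could credit `%an` instead of `%cn`, which would fix the "maintainer gets all the credit" problem for git-hosted projects at least.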

You only have to comment on Slashdot to tell that this is a really bad idea. You have people modding a comment "troll" because they don't like a stated opinion, for example. You have people modding a first post as "redundant" and a spot-on comment as "offtopic". People suck, especially at judging other people. And on a thing like that, you may have someone who knows absolutely nothing about code making judgements about coders.

It's a stupid idea. It actually sounds like some harebrained idea thought up by a P

If the first post is "FIRST!", Redundant isn't bad. I mean, that's been done how many zillions of times already? At this point it is redundant, in the context of the site as a whole rather than one individual article.

Of course if they gave me "-1 (Idiot)", I'd use that on the "FIRST!" people instead.

I don't think he was referring to the canonical "first post", but to the first posting of an idea/thought/comment, which is subsequently repeated by somebody else. The somebody else should get the redundant mod (given due diligence in checking timestamps and allowing leeway as necessary).

And I'll agree with the GP that this Ohloh thing sounds like it came straight from a PHB.

I won't argue with that, but I've seen first posts that were actually commenting on the topic itself get modded "redundant". In fact, I try not to submit a comment until I see at least one other posted, because otherwise it will be modded to oblivion and nobody will see it, so what's the point of making the effort? Fortunately, it seems that whenever I happen to hit Slashdot right when they've posted something new I'm invited to metamoderate, so by the time I'm done with that there are at least a doze

They are attempting to judge value when there simply is no objective measure for the kind of things they are trying to judge.

This is only half true. You can judge the quality of code. How well does it comply with OO principles? Does the design make sense? Does it look maintainable? Robust? Reusable? How is it documented? There are even some metrics which can be measured by static code analysis programs. Nevertheless, I doubt that for each software project registered with Ohloh a senior software engineer
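Some of those properties can indeed be approximated automatically. A toy example, counting branch keywords as a crude stand-in for cyclomatic complexity (this regex approach is illustration only; real tools such as radon parse the syntax tree):

```python
# Toy static-analysis metric: approximate cyclomatic complexity by
# counting branch keywords (1 for the straight-line path, +1 per
# decision point). Keyword counting over-/under-counts in edge cases;
# it's a sketch, not a production analyzer.

import re

BRANCH_KEYWORDS = r"\b(if|elif|for|while|and|or|except)\b"

def rough_complexity(source: str) -> int:
    return 1 + len(re.findall(BRANCH_KEYWORDS, source))

sample = """
def classify(n):
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    return "positive"
"""
print(rough_complexity(sample))  # 1 + if + elif = 3
```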

I was not referring to the code itself, I was referring to the fact that it is impossible to track who is actually contributing responsibly. For example, one open source project with which I am familiar (and I am sure that many others are similar in this) keeps strict controls on who can commit. Those who do commit, therefore, are generally committing other people's code. How good are they at documenting such? Not.

They are attempting to judge value when there simply is no objective measure for the kind of things they are trying to judge.

I think you're confusing "no objective measure" with "no known algorithms to compute those measures". Just because we don't know how to implement meaningful metrics with a computer or with some formulas doesn't mean they don't exist. The proof of this is that other humans who are very good coders *are* actually, on the whole, pretty good (in fact, the *best* system out of them *a

I was referring to the human factor. A lot of open source projects have only a few people allowed to commit code. Other code that is contributed is committed by those few people. So... who should get the credit for the code?

There is no reliable way to tell, after the fact. Period. And different projects have different commit and documentation standards.

As long as there is no standard way of handling these things across all open source projects (yeah, right), this kind of metric will remain meaningless.

Like the bards of olde, OS devs don't code for money. They code for prestige and fame amongst their fellows! Surely this site will decide who is the greatest dev to walk the earth. And that dev will have his own code set in stone and copied for ages to come. That developer will be legend.

Unless, heaven forbid, the voting is more like the U.S.'s political system.

I work on Amanda, but the site misrepresents my contributions in two important ways, too: first, I commit a lot of other people's patches, so my name appears in the ChangeLog a lot less often than it appears in the commit log. Second, Amanda changed from CVS to Subversion a few years back, and Ohloh doesn't index the old CVS submissions. As a result, the project is marked as just a few years old (it was originally written in '92), and few of the many historical contributors are listed. I would like t

I suppose that really depends on the SCM and how the commits are structured or processed. I, for example, have commit access to one of the Linux kernel subsystems; about half the changes I commit, maybe a bit less, are contributed by users, and they appear properly attributed (to the original patch author) in the kernel git changelog and in Ohloh. As for SCM changes, care must be taken during the conversion... We started out with CVS, and are currently using Mercurial; when the change was made, the whole hist

This measurement is not particularly good - but what software metrics are?

However, it is still a brilliant move, as it will motivate a lot of developers to add projects to Ohloh's database. Developers will add just the projects they have contributed to, so that their ranking will go up.

Quantitative metrics don't work on developers. As soon as a developer learns what the metric is, they are smart enough to game the system. I [commit] can [commit] game [commit] any [commit] system [commit] based [commit] on [commit] commit [commit] counts [commit]. [commit]

I like the dollar figure ohloh attaches to projects. "This is what it would have cost an enterprise to develop this software." It really gives you an appreciation for how much the open source community is giving to the world.

Instead of trying to evaluate free software developers, why not help them instead? We know who they are, their names and emails are just a click away from the About box of the software we use, and most of them are googleable. Some of them aren't so lucky in their life outside free software. Some would appreciate some donated webspace, computer hardware, or other support. I have started AlgoLibre.org [algolibre.org] as a first effort to remind people that some free software developers may need our assistance. If there

My biggest active FOSS project has gone through two forks in its lifetime. This was not because of me, but because of problems when the companies I worked for tried to appropriate my (after-hours) hobby work.
The final fork happened when I started my own company.

Ohloh has it listed [ohloh.net], as well as the second of the two prior forks (both have since died, as the companies couldn't maintain them without me). The newer project is, however, a niche-market project. It allows me to earn a living, but it's hardly the

IANA OSS contributor, but I have been a programming enthusiast since 1980 and a professional programmer for ~16 years. For the last ten years I've reported to an executive (a non-IT boss) on my own productivity and that of my team. Over these years I've had several conversations with my various supervisors (again, non-IT) regarding how to plan & allocate resources for ~30 projects, ranging from static web sites to projects of a much larger nature. I have not found it to be an easy task- setting aside the