Posted by samzenpus on Wednesday June 29, 2011 @03:10PM
from the cheaters-never-win-more-than-4-times dept.

An anonymous reader writes "Rybka, the winner of the last four World Computer Chess Championships, has been found guilty by a panel of 34 chess engine programmers of plagiarizing two open-source chess engines: Crafty and Fruit. The governing body of the WCCC, the International Computer Games Association, is even demanding that Rybka's author, the international chess master and MIT graduate Vasik Rajlich, return the trophies and prize money that he fraudulently won. Rybka will no longer be allowed to compete in the World Championships, and the ICGA is asking other tournaments around the world to do the same."

Sounds like he at least made improvements to them, and isn't that what open source is supposed to be all about? In fact, the article even acknowledges "ICGA isn’t even disqualifying Rybka because it copies Fruit — rather, it’s simply upset that Rajlich claims his engine is original, and refuses to give credit where it’s due." Okay, so maybe he should have given the other coders credit, but why should that disqualify him from winning? He still won. He didn't cheat. He didn't steal the code from the other engines (it was open source). His biggest offense is denying the other coders credit.

I think he should have to share the prize with the other coders (since they contributed code to the final product). But it still doesn't take away from the fact that his fork won. It doesn't justify taking away the win, as if he had cheated. His engine is still the best, open source code and all.

And, nothing against FOSS, but why on earth would you even release code designed for competition as open source, BTW? Aren't you essentially unzipping your fly and telling your competitors all your secrets? Couldn't releasing the source code wait until after the software was "retired" from competition?

The same way that Google caught Bing ripping off search results a while ago: find some idiosyncratic behaviors (e.g. bugs) that serve no practical purpose and are highly unlikely to end up in two independent projects, and demonstrate the same weirdness in each.

No, they caught them by installing the Bing toolbar; Bing was obtaining the results through the rights the users signed away in the EULA. Users do not have the right to sign away the results of Google's search: the users' queries are fair game, but the results Google gives aren't. If I signed my rights away to Microsoft to allow someone to plant cameras and microphones in my house and use the results in advertising, and I listen to my Metallica CD, does that give Microsoft the rights to use Metallica songs in thei

To come to this rather epic and libelous conclusion, the ICGA assembled a 34-person panel of programmers who have competed in past championships to analyze Rybka. Unfortunately, Rybka’s source code has never been available, so reverse engineering and straight-up move-evaluation comparison was used to analyze the originality of Rajlich’s chess engine. The panel unanimously agreed that newer versions of Rybka are based on Fruit — and worse, that the early beta versions were based on Cra

Let's see, should we take the word of Ken Thompson (yes that one), who according to the fine article seems to have been on the investigative panel, or the word of a random slashdot troll? Duhhh...

Actually neither. We should Read The Fine Reports.

Beyond the plain expression of the program, there are lots of artifacts which come through from source to binary. Even if two programmers start in the same language to develop the same algorithm to solve the same problem, they will normally end up with great differences. The exceptions are the stuff of legends [livinginternet.com], and we are talking about 20-line assembly programs. One programmer will choose an integer because it's big enough, another a long because the variable will mostly be used with other longs. This choice will not be optimised out of existence.

Looking at the report, it seems they used various different artifacts and clearly showed similarity between the programs.

who were predisposed to want Rajlich banned from competition (he kept beating them, four competitions in a row).

I know a lot of competitors and they don't want people who beat them banned. Idle speculation and ad homs help nothing.

"except it's not a financially tenable move."

They could get a third party to analyze and compare. And there are striking similarities in behavior. Also, why are you overlooking the fact that they reverse engineered it and found

Similarities alone don't mean plagiarism. Logic can be gleaned from open source code, then applied to other code, perhaps engines. So long as copyright isn't violated, and licensing strictures aren't violated, reading someone's code, understanding the logic, then re-writing it is a hallowed action. For arguments supporting this, go to Groklaw, understand what BSD is (or read about the AT&T-Regents of UCB litigation), and so forth. Ask RMS.

Is this because the compiler does similar things to function calls? It's really NOT proof until you see the source. I can write a lot of code that compilers will optimize to the exact same bytecode. It's not really convincing without *source*.

I can write the same code set, alter it significantly, enough to easily pass the test of copyright, do a make, and have the results be identical in every possible way.

Approaching the logic of a chessboard layout, once having viewed open source code, I could rewrite it to yours, or anyone else's satisfaction. This is without reverse engineering the code, rather, following its logic and using a mime-- but satisfactory analog-- of what it does. Other concepts, like UI, may have protections, but I can alter the

First off: would it kill you to learn some basic HTML? Hard to separate out your comments when you're too lame to even italicize.

I know a lot of competitors and they don't want people who beat them banned. Idle speculation and ad homs help nothing.

Empanel the losers from the competition on a witch hunt against the winner. Sounds like a dick move to me. Definitely doesn't pass the smell test.

They could get a third party to analyze and compare.

Why, precisely, should the Rybka team have to pay for that? It's the ICGA that should have been the ones doing this. FIRST.

And there are striking similarities in behavior.

You can walk into any chess tournament and see "striking similarities in behavior" between members of the same chess club/team, or between players of equal skill. Chess is a logical game, relying on logical formulations. Eventually, like Checkers and Othello were, it'll be solved. The closer the programs get to solving it, the more moves alike they'll make. "Similarities in behavior" of mathematical problem-solving prove nothing. If anything, the fact that his program beat - rather than drew to - the other programs ought to prove that he was NOT using their source code.

by your argument, opening moves in chess would have been the same for hundreds of years.

Funny you should mention that. There are a grand total of 10 logical chess opening moves (8 pawns, 2 knights). Opening with knights has been derided as downright silly for centuries; the only "variation" there comes when it's immediately followed by a pawn push alongside, putting it back into "standard" opening land of a Kingside or Queenside gambit. Openings that begin with the A,H,B,G pawns are rightly derided as virtually useless. Even opening with the C and F pawns is viewed as akin to suicide, since it allows the opponent to open straight into the middle virtually uncontested.

Queen's Gambit openings, of various sorts, have dominated the arena since the early 1400's. The Italian opening was the favorite kingside method for over 300 years, until the Ruy Lopez opening passed it up in popularity. The entirety of "Black Openings" in modern chess for the past 500 years have been attempts to devise responses to these three methods of attack.

So now that I've given you a lesson, run back to your checkers board. The grownups are discussing things.

As someone who is no chess master but can count, you are wrong. Those horsies move in Ls and each one has two possible Ls he could move into. Plus the pawns can move one spot or two spots. So that is 8 more moves. Up to 20 already. There might be more, but I am too busy playing Go to count.
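The count is easy to check mechanically. Here is a minimal sketch, assuming a bare set-of-squares board and a made-up helper name (`count_first_moves` is not from any engine discussed here). It only models the starting position, where pawns and knights are the only pieces that can move.

```python
# Count White's legal first moves from the standard starting position.
# Only pawns and knights can move on move one, so nothing else is modeled.

def count_first_moves():
    # Squares are (file, rank), both 0..7. Ranks 0-1 hold White, 6-7 hold Black.
    occupied = {(f, r) for f in range(8) for r in (0, 1, 6, 7)}

    pawn_moves = 0
    for f in range(8):
        # A pawn on rank 1 may advance one square, or two if both are empty.
        if (f, 2) not in occupied:
            pawn_moves += 1
            if (f, 3) not in occupied:
                pawn_moves += 1

    knight_moves = 0
    jumps = [(1, 2), (2, 1), (-1, 2), (-2, 1),
             (1, -2), (2, -1), (-1, -2), (-2, -1)]
    for f, r in [(1, 0), (6, 0)]:  # knights start on b1 and g1
        for df, dr in jumps:
            nf, nr = f + df, r + dr
            on_board = 0 <= nf < 8 and 0 <= nr < 8
            if on_board and (nf, nr) not in occupied:
                knight_moves += 1

    return pawn_moves, knight_moves


pawns, knights = count_first_moves()
print(pawns, knights, pawns + knights)  # prints: 16 4 20
```

That gives 16 pawn moves plus 4 knight moves, so 20 legal first moves for White, whatever one thinks of their quality.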

Read the fucking report. An analysis of the binary code revealed 60% similarity. So about 60% of the binary was completely identical. Most of the evaluation functions which are unique to Fruit (AKA not done in other algorithms) were mostly identical, usually with only some constants changed. These are functions that not only are unique in their purpose to those in Fruit, but which have the exact same binary code, the same local variables, declared in the same order. That much duplication is absolutely beyond the pale. In earlier versions of Rybka, they found that obsolete functions from the Crafty codebase were in there. So, you are claiming that not only did he magically duplicate most of the functions, he even had the same useless functions just sitting there not being called. And the exact same unit tests for those unused functions. So, there is a long history of blatant copy and pasting, some of it even completely mindless (copying unused functions). Additionally, he was offered the chance to be on the panel and offer his own input without having to release his sourcecode. He refused to respond whatsoever in his own defense.

If you had READ THE FUCKING ARTICLE PROPERLY you would know that he agreed to a bunch of rules which meant that a) his source had to be available and b) his code had to be original. The fact that the other engines are or aren't public domain is irrelevant to whether he should be excluded.

If he refused to disclose that he used open source code then he most likely violated the terms of the open source license and therefore did indeed cheat. Open Source [wikipedia.org] is not the same as Public Domain [wikipedia.org].

Going to the Rybka web site www.rybkachess.com there does not seem to be a way to download the source code, which would be required for releases under the GPL, assuming there is validity to the claim that he's copied other open source efforts.

Cheating, in this case, means violating the rules of the championship:

18th WORLD COMPUTER CHESS CHAMPIONSHIP TOURNAMENT RULES

2. Each program must be the original work of the entering developers. Programming teams whose code is derived from or including game-playing code written by others must name all other authors, or the source of such code, in the details of their submission form

Not necessarily. Most Open Source Software (OSS) licenses (e.g. GPLv2, EPL, etc.) only kick in on redistribution, since you need the license to not be in violation of someone's copyright. You can USE all the OSS you want without complying with the license if you don't redistribute it. On top of that, some OSS licenses don't require that you disclose that there is OSS in your redistributable, nor do they require you to provide source (e.g. 3-clause BSD).

In this case however, he's clearly distributing the binaries. Fruit appears to be LGPLv2.1 and Crafty has some goofball custom pseudo-oss license that requires attribution. So if he did copy the code and redistribute he's not complying with the licenses and in violation of copyright law.

I don't find it all that unusual that 2 different good chess programs might make similar decisions, and they don't have the source to compare so unless someone is going to sue and do discovery the claim of plagiarism is (IMO) premature.

The courts exist to settle just these sorts of conflicts, and banning him on supposition is questionable.. IANAL....

A) As was already pointed out, he sells the binary engine, so he did redistribute it.

B) The rules of the competition require disclosure of any used third party libraries or components. Since he didn't disclose this usage, he violated the competition rules regardless of whether he was complying with the license conditions or not.

Right in main.c
Crafty, copyright 1996-2010 by Robert M. Hyatt, Ph.D., Associate Professor
of Computer and Information Sciences, University of Alabama at Birmingham.

Crafty is a team project consisting of the following members. These are the people involved in the continuing development of this program, there are no particular members responsible for any specific aspect of Crafty.

All rights reserved. No part of this program may be reproduced in any form or by any means, for other than your personal use, without the express written permission of the authors. This program may not be used in whole, nor in part, to enter any computer chess competition without written permission from the authors. Such permission will include the requirement that the program be entered under the name "Crafty" so that the program's ancestry will be known.

Copies of the source must contain the original copyright notice intact.

Any changes made to this software must also be made public to comply with the original intent of this software distribution project. These restrictions apply whether the distribution is being done for free or as part or all of a commercial product. The authors retain sole ownership and copyright on this program except for 'personal use' explained below.

Personal use includes any use you make of the program yourself, either by playing games with it yourself, or allowing others to play it on your machine, and requires that if others use the program, it must be clearly identified as "Crafty" to anyone playing it (on a chess server as one example). Personal use does not allow anyone to enter this into a chess tournament where other program authors are invited to participate. IE you can do your own local tournament, with Crafty + other programs, since this is for your personal enjoyment. But you may not enter Crafty into an event where it will be in competition with other programs/programmers without permission as stated previously.

He didn't even do that, he combined two separate open-source engines, making one better engine. As long as he follows the licence terms for the engines, there is nothing wrong with what was done. Now if the licence said there must be attribution then there is a problem.

He didn't even do that, he combined two separate open-source engines, making one better engine.

According to the allegations, he did not combine two open-source programs into a super-bot. They claim that the current version of his bot (Rybka) is a copy of Fruit, and an earlier version of his software was a copy of Crafty.

As you said, he closed-sourced them and claimed them as his own without giving attribution -- thereby breaking the software licenses of at least one of them (Fruit), which is GPL.

He didn't steal the code from the other engines (it was open source). His biggest offense is denying the other coders credit.

Well, it seems that Fruit [fruitchess.com] is open-source in the sense that people can look at the codebase, but it is not FOSS. The license text (see, e.g. readme in this tarfile [fruitchess.com]) says:

All right reserved. Fruit and PolyGlot may not be distributed as part of any software package, service or web site without prior written permission from the author.

Indeed it looks like a commercial product that you are meant to pay for. Rajlich's engine is closed-source and also commercial [rybkachess.com]. He is not making his code available, so even if Fruit were, say, released under the GPL, he would be in violation of the license. But in fact Fruit is "all rights reserved" so if Rajlich took code from it then he is blatantly violating copyright, and thus breaking the law.

I would think that the competition has a blanket ethics rule that says that you cannot win by breaking the law. So Rajlich, if he did indeed appropriate code, doesn't deserve the wins. (Yes, he obviously did ~something~ to improve upon Fruit, but he still cheated.)

Many competitions have rules regarding dishonorable behavior. If he did use substantial amounts of open-source code without crediting the original authors as seems to be the case, that would be plainly dishonorable and thus grounds for disqualification.

As for the open source aspect, in this case a quick skim of the Fruit site seems to indicate (it's not very clearly worded) that it was initially open source during development and then went to a closed/commercial model after it had some wins under its belt.

First, there is the question of whether he is falsely claiming credit. Second, there is the question of whether or not his is the best.

While there are nearly infinite moves in chess, there are not infinite ways of winning at chess. Last I heard, most systems use a scoring system to evaluate moves. And based on a final score tally, they make a decision. This is the "evidence" they have. But I would submit this to a group of non-chess playing programmers (who are unbiased as to who plays chess better).

What you say is completely disconnected from what actually happened. The conviction didn't happen by observing the engine, the conviction happened because the panel reverse engineered the binary executable and found a huge similarity, including exactly identical implementations of non-obvious functions, identical bugs and identical dead code in the commercial engine and the open source engines.

Rybka/Houdini played a 40-game match recently, and Houdini won by a wide margin of 23.5-16.5. You can see the match here: http://livechess.chessdom.com/site/ [chessdom.com] (Check for TCEC S1 Elite Match)

To be honest, since the advent of computers in chess I lost most of my interest - even in human tournaments. Every top player does lots of preparation by computer analysis these days. If I want to see some interesting and creative games, I usually grab a commented tournament book from the 20s or so, whenever I get my hands on one.

Houdini is based on another open-source engine (Ippolit / RobboLito), which are based on reverse engineering Rybka, which is, well, see the original statement (the article is hopelessly biased and gets some facts wrong).

Many people choose not to disclose their inventions and keep them a trade secret. This is done for a good reason. Disclosure, even under an NDA, doesn't guarantee it won't get disclosed to those you don't want to disclose it to.

In this case we have a panel of 34 programming chess players. Would you want anyone of that group to see your code if you want to keep it away from programming, chess-playing people?

If the author still claims his software is original, he should release the source code to the panel under an NDA strictly for the purposes of evaluation.

Except that would open the members of the panel up to potential future claims that they plagiarized from Rybka. They should have an independent third-party, one that does not write chess engines, audit the three software programs under NDA and return an analysis of how likely it is that Rybka includes code from the other engines (a la SCO vs. IBM.)

Until this third party panel, now equipped with knowledge of the inner workings of three competitive chess engines, develops the most powerful chess engine known to man!...And then gets charged with plagiarism.

According to the article the panel is actually made up of his competitors, past and present. An NDA won't cut it (certainly wouldn't for me anyway). They should have at least given him a more impartial jury.

TFA points out that Rajlich could exonerate himself by showing the source code, but then says that this isn't possible:

It’s a tricky situation, though: with Rybka now outlawed from the WCCC, and with the ICGA asking other tournaments to block its entry, the only real way Rajlich and the rest of the Rybka team can clear their names is to show their source code — a financially untenable move. In short, Rybka is stuck between a rock and a hard place.

This doesn't really hold up. Yes, Rajlich is trying to sell his software, so he can't open-source it to the world. But to exonerate himself he doesn't have to release the source-code to the world; he simply needs to arrange for the source code to be shown to the expert panel. As long as they can both confirm that: (1) the provided source compiles to the binary used in competition, and (2) there is no substantial overlap between the provided source and other known codebases, then he's in the clear. The expert panel doesn't have to retain copies of the source code beyond the review period (all copies could be destroyed).

So, really, it should be possible for Rajlich to demonstrate the originality of his code without releasing it or decreasing his commercial opportunities. The fact that he hasn't done this is strange. In that sense, it sounds to me like the ICGA made the right decision here.

I have an honest question. I'm going to assume the program is compiled into an executable, and not a scripting language like python. How do they determine if code from an open source program was used from the binary program?

From what I gathered from glancing at TFA, the panel was looking at algorithmic similarities, not necessarily code ones. In those cases, there is no need to have the source, since the algorithms will naturally be visible through the program's execution.

From this point on, Rajlich will need to prove that he did not in fact copy Fruit or Crafty, though this may be hard to do so if the above is to be trusted (ie he could've made his own code but used the exact same behaviors by simply looking at how Fruit work

The problem with looking at just algorithm similarities is that every modern chess bot uses some variant of the same algorithm (negamax), so the executable code will superficially look similar. Computers aren't powerful enough to search the entire game tree, so you have to stop after a certain number of levels (15, for instance) and use a heuristic to evaluate the strength of that position.

The main differences between chess bots are found in that heuristic.
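To make the split between shared search and engine-specific heuristic concrete, here is a minimal depth-limited negamax sketch on a toy game rather than chess. Nothing here comes from Fruit, Crafty, or Rybka: the stone-taking game, the function names, and the "0 means unknown" heuristic are all assumptions made for illustration.

```python
def negamax(state, depth, moves, apply_move, evaluate):
    """Score `state` from the point of view of the side to move."""
    legal = moves(state)
    if depth == 0 or not legal:
        # Search cutoff or game over: fall back to the heuristic.
        return evaluate(state)
    # A position is as good for us as the best reply is bad for the opponent.
    return max(
        -negamax(apply_move(state, m), depth - 1, moves, apply_move, evaluate)
        for m in legal
    )


# Toy game (an assumption for this example): a pile of stones, each player
# removes 1 to 3 stones, and whoever takes the last stone wins.
def moves(pile):
    return [n for n in (1, 2, 3) if n <= pile]


def apply_move(pile, n):
    return pile - n


def evaluate(pile):
    # Empty pile: the opponent just took the last stone, so the side to
    # move has lost. Anything else is scored 0, meaning "unknown".
    return -1 if pile == 0 else 0


deep_loss = negamax(4, 15, moves, apply_move, evaluate)  # searched to the end
deep_win = negamax(5, 15, moves, apply_move, evaluate)
shallow = negamax(5, 1, moves, apply_move, evaluate)     # cut off after one ply
print(deep_loss, deep_win, shallow)  # prints: -1 1 0
```

The `negamax` function would look much the same in any engine; swapping in a different `evaluate` is what changes the playing strength once the search is cut off, as the shallow call shows.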

According to the actual report [chessvibes.com] the heuristic is obviously based on Fruit's, which is what they're really angry about.

I have an honest question. I'm going to assume the program is compiled into an executable, and not a scripting language like python. How do they determine if code from an open source program was used from the binary program?

Compile it with debugging symbols and compare to the open source program compiled with debugging symbols and compare the symbol tables. How odd that so many functions are exactly the same length and have exactly the same arguments. Run both through a profiler and notice any identical control flow loops. I suspect there's a way to ask the GCC optimizer to compare the pseudocode before it gets assembled. Heck, just rub the raw binaries against each other and look for matches. It would be hilarious to ask/force him to compile and/or link mixmaster style

Ask a "windows security researcher" dude how he identifies a file with a virus. If he says, "use norton" then fire him and repeat. Eventually you'll find someone who knows how to use the binary equivalent of "substr".
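A rough idea of what "the binary equivalent of substr" could look like: collect every fixed-length byte sequence from one file and see which of them also turn up in the other. This is only a sketch under assumptions. The function name, the window size of 8, and the sample byte strings are invented here, and it is not a description of how the panel actually did its analysis.

```python
def shared_chunks(a: bytes, b: bytes, n: int = 8) -> set:
    """Return every n-byte sequence that occurs in both a and b."""
    grams_a = {a[i:i + n] for i in range(len(a) - n + 1)}
    return {
        b[i:i + n]
        for i in range(len(b) - n + 1)
        if b[i:i + n] in grams_a
    }


# Stand-in "binaries": different padding around one identical 14-byte run.
first = b"\x90\x90" + b"EVALUATE_PAWNS" + b"\x00\x01"
second = b"\xcc" + b"EVALUATE_PAWNS" + b"\xff"

common = shared_chunks(first, second)
print(len(common))  # prints: 7
```

On real executables, raw byte matches are noisy, because different compilers and flags change the output, so a match count like this would be a starting point for closer inspection rather than proof.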

How do they determine if code from an open source program was used from the binary program?

This would be determined by isolating a version of the executable used in competition and then asking the developers to run through a procedure using only the source code to create an exact duplicate of the competition binary. Comparing that the 1's and 0's in two different executables are identical is fairly trivial.
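The "compare the 1's and 0's" step is indeed the easy part. A minimal sketch, assuming two files on disk; the file names and contents below are stand-ins created just for the demo, not real engine binaries.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large binaries need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_binary(path_a, path_b):
    """True if the two files are byte-for-byte identical (barring collisions)."""
    return sha256_of(path_a) == sha256_of(path_b)


# Demo with stand-in files; real use would point at the competition binary
# and the binary rebuilt from the submitted source.
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "competition.bin": b"\x7fELF-demo",
        "rebuilt.bin": b"\x7fELF-demo",
        "patched.bin": b"\x7fELF-dem0",
    }
    for name, data in samples.items():
        with open(os.path.join(tmp, name), "wb") as f:
            f.write(data)
    match = same_binary(os.path.join(tmp, "competition.bin"),
                        os.path.join(tmp, "rebuilt.bin"))
    mismatch = same_binary(os.path.join(tmp, "competition.bin"),
                           os.path.join(tmp, "patched.bin"))

print(match, mismatch)  # prints: True False
```

The hard part is the rebuild, not the comparison: getting a bit-identical executable generally requires the same compiler version and build settings that produced the original.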

He doesn't even have to show the code to the expert panel (who are all competitors). He could agree with the panel on an independent arbitrator, and show it to them. I don't think they'd have to be experts in the field of chess programming to spot copied code, and it shouldn't matter if he's copied algorithms.

Isn't copied algorithms the stereotypical way to catch programming school plagiarizers? Why look, one smart-ish guy, and his three moron drinking buddies, all had exactly the same picket fence error in exactly the same place... What a coincidence? This does not work on tiny toy programs, but something big enough to win at chess is probably big enough.

Yeah, TFA sounds pretty seriously confused on that point. Commercial software sourcecode is, in part or in full, shown to Customers Who Matter all the time. Sometimes, it is even made publicly available, but under a license that forbids much of anything other than inspecting it. It isn't rocket surgery.

The rules of the International Computer Games Association that hosts the championship state that the program must be an original work of the developers. If the program is derived from other sources, they must be named together with the original authors.

What aside is there? It's fairly clear that his claiming the work of others as his own is the issue.

Here's a quote from an open letter announcing the ban.

"Each program must be the original work of the entering developers. Programming teams whose code is derived from or including game-playing code written by others must name all other authors, or the source of such code, in their submission details."

Crafty is not open source software, though its license has similarities to an open source software license.
Crafty is for "personal use only", which means that it fails the Open Source Definition [opensource.org] criteria "No Discrimination Against Fields of Endeavor".

Crafty's main.c file says: "All rights reserved. No part of this program may be reproduced in any form or by any means, for other than your personal use, without the express written permission of the authors. This program may not be used in whole, nor in

The article makes the bold claim that "IBM's Deep Blue cheated to beat Garry Kasparov", but the link they give mentions merely that Kasparov made such an accusation, and that the accusation was repeated in a documentary. On what basis did he make such a claim?

- He thought humans must have intervened in the middle of the game because the machine did something he didn't expect. But no actual evidence whatsoever was provided for this serious charge.
- He asked for the machine's logs, but IBM refused to provide them. If he wanted the logs, he should have made that part of the agreement beforehand. IBM probably withheld them at the time to preserve secrets on exactly how the machine worked, which could have given Kas

Maybe chess-playing people just don't like losing to computers. After all, the article mentioned by the OP states that:

Not since IBM’s Deep Blue cheated to beat Garry Kasparov in 1997 has the world of computer chess been so uproarious!

As if it were a fact. Was this ever found to be the case? I thought it was only alleged by Kasparov and never proven. Since the 34-person panel of chess-playing programmers never saw the source code to Rybka, yet still concluded that different versions were plagiarizer

Humans haven't been able to compete against computers for quite a few years. Even a cheap netbook can plow through more levels of the game tree than Deep Blue could, and Deep Blue was even using custom hardware specifically designed for chess. It's not a simple case of a sore loser.