Re: Why there is no interest in Computer with odds Vs Humans match?

Both my son and I have played many games against Komodo, Komodo MCTS, and Lc0 11248 (the one that's good at handicap play) at a wide variety of handicaps and time limits. I find "slow blitz" (usually 5' + 5") to be the most enjoyable

Excuse me but that's just a really sad statement. A father and his son gather around a chess board, and instead of playing against one another, they'd rather turn on some engine and play against it, and they find that the most enjoyable, more than playing against each other.

With so many humans to play against, playing against a cold machine without feelings is just pathetic, and I think it should only be done by people without access to the Internet, friends, family, or a partner to play with them.

Reminds me of cases where loving couples didn't work because one was obsessed with their cellphone. We, as humanity, have lost something when people would rather interact with an engine than with another human being. Especially father and son.

Re: Why there is no interest in Computer with odds Vs Humans match?

Ray lives 3000 miles away in Vancouver, Canada. I don't play against Komodo while he is visiting or even observing. I like to play against real people over the board. But for me there is no particular advantage to playing against an unknown person online as opposed to playing against an engine, except that I don't need a handicap to play the person!

Well he did play a blitz handicap match with IM Lawrence Trent, as did MVL. But I suppose you mean at a rapid or slower TC, not blitz. I think Carlsen would do so if someone sponsored such a match. He wouldn't have anything to lose really.

Re: Why there is no interest in Computer with odds Vs Humans match?

But for me there is no particular advantage to playing against an unknown person online as opposed to playing against an engine

That can be fixed by getting to know the person, several of my best online friends were met playing with them online like this. The computer I'm using currently was donated by such a friend, and I'm still meeting new people and befriending them like this.

So random online strangers stay strangers by your own choice. If chess is a fun activity, you're benefiting them by playing and showing them your moves (something they can't get anywhere else), and the other way around (there's no other way to know what they would have played in those positions unless you play them). None of this happens with chess engines, so the waste is that only half the people are having fun (...if that; I never found much fun playing the engines...)

Chess is more than just moves played on the board; it's about the people who enjoy the game and play each other, which is the entire point. "This person is valuable because I know them, and this one isn't valuable because I don't know them" is depersonalizing. That random girl you'd play on a chess server is as much a person as your son, and it's not her fault that you don't know her. Over the board or not.

Re: Why there is no interest in Computer with odds Vs Humans match?

The networks are trained on very fast games, vastly faster than games that would be played at human time limits. So they will play in a way that optimizes win prob. against an opponent much weaker than the level of the network at a realistic time limit. Whether that is above or below the level of Nakamura (for example) at a realistic time limit is hard to say, but I suspect it's close enough for the purposes we are talking about. The problem with the newer networks is not their strength, but as has been explained here by dkappe it is that resignation was introduced so they stopped training on piece down positions usually. So I suspect that Lc0 11248 plays pretty close to optimally for serious games against a top-ten human player at some rapid TC regardless of the handicap, because its training happens to emulate such an opponent fairly well.

"Optimally" there is just in the "bluffing" sense. We'd expect a network trained on an equal number of games, but of superior quality, to be a superior network in the "objectively correct" sense and then it would maybe "bluff" less, unless the networks bluffed strictly only when the position was lost. We'd expect that even if all training games are carried out till mate.

I think there are two unrelated issues at work here. The NN training will tend to make the engine play like a player of the same level as the training games, so the higher the level of the training games, the more objectively correct the play will be. The other issue is that MCTS causes play to automatically "bluff" to some degree, because it doesn't assume that the best response will always be played. So Komodo MCTS does benefit from this when playing humans or weaker engines, unlike A/B engines, while Lc0 benefits from both of the above when playing weaker opposition. But oddly Lc0 performs worse (Elo-wise) against weaker engines, despite this "bluffing".

Yes. I think part of the issue is what we mean by "bluff". If a move is objectively incorrect and gets punished, we might call it a "blunder". If it's objectively incorrect and succeeds, we might call it a "bluff", but in human games it depends on what the player knew. If the human player believed it was correct, we might call that a mutual blunder. It's only when the player knew it was incorrect and tried to swindle his opponent that we'd call it a "bluff".

The MCTS overlooking of better replies is more analogous to a miscalculating human believing his incorrect play was correct. I guess even an NN-based incorrect move that succeeds isn't quite the same as a human "bluff", but it feels like a bluff because the NN engine succeeds with it through its brute-force calculating advantage over humans.

Re: Why there is no interest in Computer with odds Vs Humans match?

I don't agree about MCTS; I would say that when MCTS plays an inferior move because it calculates that most reasonable replies will lose even though it sees one that is likely to win, I think that is perfectly analogous to a human bluff. MCTS is "hoping" that the opponent won't find the correct move.

Re: Why there is no interest in Computer with odds Vs Humans match?

If it sees a clear refutation of the move, will it really still play the bluff?

Re: Why there is no interest in Computer with odds Vs Humans match?

Yes. Of course the devil is in the details, but it basically chooses its move by a weighted average of all plausible replies, so if one is bad for the engine but the other 29 are good for the engine, it might take the chance, depending on how "bad" and "good" and the number of visits. Remember, engines don't evaluate lines as won or drawn, but just as a win prob. (or centipawn score, which are convertible into each other). So it's not a simple question.
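To put rough numbers on the "1 bad reply vs. 29 good replies" case, here is a minimal Python sketch. The win probabilities, visit counts, and function names are all made up for illustration and are not taken from Komodo MCTS or Lc0. It compares a minimax-style value (assume the best reply is always found) with a visit-weighted average over the replies.

```python
def minimax_value(reply_win_probs):
    """A/B-style view: assume the opponent finds the reply that is worst for us."""
    return min(reply_win_probs)


def visit_weighted_value(reply_win_probs, visits):
    """MCTS-style view: average our win prob. over replies, weighted by visits."""
    total = sum(visits)
    return sum(p * v for p, v in zip(reply_win_probs, visits)) / total


# Hypothetical "bluff" move: one refutation leaves us at 0.20,
# the other 29 replies leave us at 0.80.
bluff_probs = [0.20] + [0.80] * 29
# Hypothetical solid alternative: every reply leaves us at 0.55.
solid_value = 0.55

# Case 1: the search has spread its visits evenly over the replies.
even_visits = [10] * 30
bluff_even = visit_weighted_value(bluff_probs, even_visits)

# Case 2: the search has piled most of its visits onto the refutation.
focused_visits = [710] + [10] * 29
bluff_focused = visit_weighted_value(bluff_probs, focused_visits)

print(f"minimax value of bluff:       {minimax_value(bluff_probs):.3f}")
print(f"averaged, even visits:        {bluff_even:.3f}")
print(f"averaged, visits on refuter:  {bluff_focused:.3f}")
print(f"solid move:                   {solid_value:.3f}")
```

With even visits the bluff averages well above the solid move, so it could get played; once most visits go to the refutation, its average drops below the solid move. That is the "depends on the number of visits" part, at least in this toy model.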

Re: Why there is no interest in Computer with odds Vs Humans match?

When I posted before, I was assuming that the number of visits would work out in a way that would avoid bluffing. Later, I thought different chess engines may be doing MCTS differently, so there might be differences in bluffing.

Does Komodo MCTS use UCT? I'm not sure from what Mark wrote a few months ago.

For MCTS you need to keep a tree in memory. In Komodo MCTS, a node in the tree consists of a visit count, a sum of win probabilities, a policy value, a link to the next sibling, and a link to a child node. The key is determining which node to visit. So you start at the root node (the present position), look at all its children, and apply a formula like UCB1 to decide which node to select. Then you look at the selected node's children, again using UCB1, and so on until you hit one of these:

a. Checkmate
b. Stalemate
c. 50 move draw
d. rep draw
e. leaf node

A leaf node has no children yet, so if that is selected, you generate its children. Lc0/Leela/AlphaZero all use a NN to generate the win prob. at this point and fill in policy network information in the children. Then you back up the win probability, remembering to flip it as you go up the tree so it is from the point of view of the side to move at each node. If you use a win prob. in the range 0.0..1.0, you just return 1.0 - winprob as you go up the tree to the root. At each node you increase the visit count and add the win prob. to the node's win prob. sum.
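The select / expand / back-up loop described above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not Komodo's or Lc0's actual code: the `Node` fields mirror the description (visit count, win prob. sum, policy value, sibling link, child link), `toy_evaluate` is a hypothetical stand-in for the NN on a made-up two-ply tree, plain UCB1 is used for selection (so the stored policy value goes unused), and terminal positions are simply modelled as nodes that never get children.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    policy: float = 1.0               # prior for the move leading here (unused by plain UCB1)
    visits: int = 0                   # visit count
    win_sum: float = 0.0              # sum of win probs, from the side to move at this node
    child: Optional["Node"] = None    # link to the first child
    sibling: Optional["Node"] = None  # link to the next sibling


def children(node):
    """Walk the child/sibling links of a node."""
    c = node.child
    while c is not None:
        yield c
        c = c.sibling


def ucb1(parent, ch, c=1.4):
    """Selection score of a child as seen by the side to move at the parent."""
    if ch.visits == 0:
        return math.inf  # try every child at least once
    # The child's sum is from the opponent's point of view, so flip it.
    mean_for_parent = 1.0 - ch.win_sum / ch.visits
    return mean_for_parent + c * math.sqrt(math.log(parent.visits) / ch.visits)


def toy_evaluate(depth):
    """Hypothetical NN stand-in: (win prob. for side to move, child priors)."""
    if depth >= 2:
        return 0.5, []          # pretend the position is terminal
    return 0.5, [0.5, 0.5]      # two legal moves with equal priors


def run_iteration(root, evaluate=toy_evaluate):
    # 1. Selection: descend until a node without children is reached.
    path, node = [root], root
    while node.child is not None:
        parent = node
        node = max(children(parent), key=lambda ch: ucb1(parent, ch))
        path.append(node)
    # 2. Expansion: evaluate the leaf and create its children.
    win_prob, priors = evaluate(len(path) - 1)
    prev = None
    for p in priors:
        new = Node(policy=p)
        if prev is None:
            node.child = new
        else:
            prev.sibling = new
        prev = new
    # 3. Back-up: flip the win prob. at every step towards the root.
    for n in reversed(path):
        n.visits += 1
        n.win_sum += win_prob
        win_prob = 1.0 - win_prob


root = Node()
for _ in range(50):
    run_iteration(root)
print(root.visits, [ch.visits for ch in children(root)])
```

Every iteration passes through the root, and every iteration after the first passes through exactly one of its children, so the children's visit counts add up to one less than the root's.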

I see in that thread reference to the CP wiki, which says

UCT ... deals with the flaw of Monte-Carlo Tree Search, when a program may favor a losing move with only one or a few forced refutations, but due to the vast majority of other moves provides a better random playout score than other, better moves.

UCT = Xj + C * sqrt(ln(n) / nj),
where:
Xj is the win ratio of the child
n is the number of times the parent has been visited
nj is the number of times the child has been visited
C is a constant to adjust the amount of exploration and incorporates the sqrt(2) from the UCB1 formula.
The first component of the UCB1 formula above corresponds to exploitation, as it is high for moves with high average win ratio. The second component corresponds to exploration, since it is high for moves with few simulations.
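As a quick numeric illustration of the two components, here is a small Python sketch. The win ratios and visit counts are invented for the example; `uct_score` is just a name chosen here, and C is set to sqrt(2) so that the score matches plain UCB1.

```python
import math


def uct_score(win_ratio, parent_visits, child_visits, c=math.sqrt(2)):
    """Exploitation term (win ratio) plus exploration term (shrinks with child visits)."""
    exploration = c * math.sqrt(math.log(parent_visits) / child_visits)
    return win_ratio + exploration


# Hypothetical children of one node: name -> (win ratio Xj, visits nj).
stats = {"a": (0.60, 90), "b": (0.50, 9), "c": (0.40, 1)}
n = 100  # visits of the parent

scores = {name: uct_score(x, n, nj) for name, (x, nj) in stats.items()}
best = max(scores, key=scores.get)

# With C = 0 only the win ratio matters (pure exploitation).
greedy = max(stats, key=lambda name: uct_score(*stats[name][:1], n, stats[name][1], c=0.0))

for name, s in scores.items():
    print(f"{name}: {s:.3f}")
print("selected with exploration:", best)
print("selected without exploration:", greedy)
```

Move "a" has the best win ratio, but the barely visited "c" gets the largest exploration bonus and is selected next. As its visit count grows the bonus shrinks, which is how a forced refutation that random sampling missed eventually gets enough visits to drag the move's score down.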