Hi, I'm currently developing a chess engine in C++ (here is the GitHub repository: https://github.com/crybot/NapoleonPP). It is called NapoleonPP (I will probably change the name...) and it is based on magic-bitboard move generation.

I've recently completed move generation, which I think is fast enough (it can calculate perft 6 in about 1.2 seconds without special improvements like threading or transposition tables). So I started working on search and evaluation. My primary goal was to build a good search function, so for now I just created a very simple evaluation function that only considers material, the mobility of some pieces, and center control.

Now I'm stuck...

I tried to play against the program, and I must say it plays decent chess, but it is so slow... at some point it takes more than 10 seconds to finish a search even below ply 6...

What I need to know now is what to do first to improve my engine's speed.

Move ordering is probably the most important part, especially in the quiescence search. MVV-LVA is a simple and effective ordering for captures. For non-captures, search killer moves first and then everything else in history-heuristic order. You can then try moving captures with negative SEE to the end of the move list, so they are searched after non-captures.

Ah, and more important than most of those things: you should store the best move in the hash table and try it first in case of a hash hit where you cannot simply return a score (because the stored depth is insufficient or the bound is not useful).
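Putting the two points above together, one possible move-ordering scorer might look like the sketch below. The `Move` fields and the numeric priority bands are illustrative assumptions, not NapoleonPP's actual code; the only fixed idea is the priority: hash move first, then MVV-LVA captures, then killers, then history.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical move representation for the sketch (not the engine's real one).
enum Piece { Pawn = 0, Knight, Bishop, Rook, Queen, King, None };

struct Move {
    Piece piece;      // moving piece
    Piece captured;   // None for quiet moves
    bool isHashMove;  // matched the transposition-table move
    bool isKiller;    // matched a killer slot at this ply
    int historyScore; // from the history heuristic; assumed to stay below 90000
};

// Priority bands: hash move > MVV-LVA captures > killers > history order.
int orderingScore(const Move& m) {
    if (m.isHashMove) return 1000000;
    if (m.captured != None) {
        // MVV-LVA: most valuable victim first; least valuable attacker breaks ties.
        static const int value[] = { 100, 300, 300, 500, 900, 10000 };
        return 100000 + value[m.captured] * 10 - value[m.piece] / 100;
    }
    if (m.isKiller) return 90000;
    return m.historyScore;
}

void sortMoves(std::vector<Move>& moves) {
    std::sort(moves.begin(), moves.end(), [](const Move& a, const Move& b) {
        return orderingScore(a) > orderingScore(b);
    });
}
```

In a real engine you would usually lazily pick the next-best move instead of fully sorting, since a cutoff often happens after the first move or two.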

By the way, perft 6 in 1.2 seconds is pretty impressive, if it's from the starting position. My own engine fits a similar description to yours, and my perft is about 6 times slower than that (although we are not using the same hardware, of course).

Thanks for your answer; things are much clearer now, but I still have some doubts. Regarding transposition tables: should I save leaf nodes during alpha-beta (and maybe also during quiescence search)? I tried running the program without saving them and it seems to work faster.

That's a minor detail. In my own engine, I don't use the transposition tables at all in quiescence search, but I know others do it. You will just have to test it. But it's not critical.

For storing the best move, can I just use the index of the move inside the move array?

I used to do that in my old engine (Ruy-López), but that requires you to generate all the moves before you even try the move from the transposition table. Since this move very often produces a beta cutoff, it's worth writing a function that checks the validity of a move in a given position, to avoid generating the move list altogether.

P.S. I forgot one important thing: how can I verify that my transposition table is working correctly?

That's a very difficult question. A few ideas:

* You can run a search up to depth D (I assume you are using iterative deepening) and then rerun it without clearing the transposition tables: it should be almost instantaneous.
* You can run a long search on a position with and without transposition tables, and the version with transposition tables should generally get to the same depth faster.
* You can match your program against itself, with and without transposition tables, at a fixed depth, and the results should be even.

I recommend you spend some time trying better replacement schemes. I use buckets of size 4, which seems to work very well.
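A bucket scheme like the one mentioned above could be sketched as follows. The entry fields and the depth-preferred eviction policy are assumptions for illustration (real engines also weigh entry age, bound type, etc.); the key idea is that each hash index maps to a group of 4 slots and the shallowest entry is evicted on overflow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative hash entry; not the poster's actual layout.
struct HashEntry {
    uint64_t key;
    int16_t score;
    uint16_t move;
    uint8_t depth;
    uint8_t flags; // exact score / lower bound / upper bound
};

class TranspositionTable {
public:
    static const int BucketSize = 4;

    explicit TranspositionTable(size_t buckets) : table(buckets * BucketSize) {}

    HashEntry* probe(uint64_t key) {
        HashEntry* b = bucket(key);
        for (int i = 0; i < BucketSize; ++i)
            if (b[i].key == key) return &b[i];
        return nullptr;
    }

    void store(uint64_t key, int16_t score, uint16_t move, uint8_t depth, uint8_t flags) {
        HashEntry* b = bucket(key);
        HashEntry* victim = &b[0];
        for (int i = 0; i < BucketSize; ++i) {
            if (b[i].key == key) { victim = &b[i]; break; } // update in place
            if (b[i].depth < victim->depth) victim = &b[i]; // else evict shallowest
        }
        *victim = { key, score, move, depth, flags };
    }

private:
    HashEntry* bucket(uint64_t key) {
        size_t numBuckets = table.size() / BucketSize;
        return &table[(key % numBuckets) * BucketSize];
    }
    std::vector<HashEntry> table; // zero-initialized: empty slots have depth 0
};
```

Since empty slots have depth 0, they are always filled before any real entry is evicted.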


You said that you can encode a move in 16 bits, but as far as I know the minimum is 32 bits (how do you handle castling and en passant?). That aside, I ran some tests while working on the move generator, and it turned out that my move encoding (a class with four byte-sized fields) is faster than using a single 32-bit integer. Should I change the encoding to save some memory in the transposition table?


To try point 1 I should first implement iterative deepening (when should I do that?).

I will try points 2 and 3 right now.

Thank you for your detailed answers.

P.S. I've just implemented MVV-LVA and the search is now MUCH faster! Thank you.


That sounds cheaper than what I implemented now; I'll try it. You said that you can encode a move in 16 bits, but as far as I know the minimum is 32 bits (how do you handle castling and en passant?).

Castling and en passant are encoded in the 4-bit "move type" described in the page I linked to. That plus the 6-bit "to" and "from" squares gets you to 16 bits.
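That layout could be sketched like this. The specific move-type codes below are one common convention (the exact table is in the page referred to above, which is not reproduced here); the fixed part is the bit layout: 6 bits "from", 6 bits "to", 4 bits "type".

```cpp
#include <cassert>
#include <cstdint>

// One common assignment of the 4-bit move-type field (illustrative):
enum MoveType : uint16_t {
    Quiet = 0, DoublePawnPush, KingCastle, QueenCastle,
    Capture, EnPassantCapture,
    KnightPromotion = 8, BishopPromotion, RookPromotion, QueenPromotion
    // promotion-with-capture variants fill the remaining codes
};

// bits 0-5: from square, bits 6-11: to square, bits 12-15: move type
inline uint16_t encodeMove(int from, int to, int type) {
    return uint16_t(from | (to << 6) | (type << 12));
}

inline int moveFrom(uint16_t m) { return m & 0x3F; }
inline int moveTo(uint16_t m)   { return (m >> 6) & 0x3F; }
inline int moveType(uint16_t m) { return m >> 12; }
```

Castling and en passant need no extra squares: the type code says what to do, and the rook movement or captured-pawn square is implied by the "from"/"to" pair.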

That aside, I ran some tests while working on the move generator, and it turned out that my move encoding (a class with four byte-sized fields) is faster than using a single 32-bit integer. Should I change the encoding to save some memory in the transposition table?

No, you can always convert to and from the 16-bit format just for transposition-table access. That shouldn't be too slow.


To try point 1 I should first implement iterative deepening (when should I do that?).

Now would be a good time. It's basically impossible to implement time control without it, and it's very hard to work with an engine that doesn't control time properly.
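A minimal iterative-deepening driver might look like the sketch below. `searchRoot` stands in for the engine's alpha-beta root search and the half-budget stopping rule is a simplistic assumption; real engines also poll the clock inside the search and abort mid-iteration.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Illustrative result type; a real engine would carry a full PV.
struct SearchResult { int score; uint16_t bestMove; };

using Clock = std::chrono::steady_clock;

SearchResult iterativeDeepening(SearchResult (*searchRoot)(int depth),
                                int maxDepth,
                                std::chrono::milliseconds budget) {
    Clock::time_point start = Clock::now();
    SearchResult best = searchRoot(1); // depth 1 always completes, so a move is always available
    for (int depth = 2; depth <= maxDepth; ++depth) {
        // Crude heuristic: don't start an iteration we probably cannot finish.
        if (Clock::now() - start > budget / 2) break;
        // Each completed iteration seeds move ordering for the next one
        // through the transposition table's stored best moves.
        best = searchRoot(depth);
    }
    return best;
}
```

The key property for time control is that a fully searched result from the previous depth is always in hand when the clock runs out.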



I've just implemented a basic iterative deepening (without time control, just for testing the engine) and started checking the correctness of the transposition table.

I have found a few bugs and the engine seems to work properly now:

* You can run a search up to depth D (I assume you are using iterative deepening) and then rerun it without clearing the transposition tables: it should be almost instantaneous.
* You can run a long search on a position with and without transposition tables, and the version with transposition tables should generally get to the same depth faster.
* You can match your program against itself, with and without transposition tables, at a fixed depth, and the results should be even.

Points 1 and 2 are OK, but at point 3 I noticed that the engine doesn't produce the same final state after self-play with and without the transposition table.

I also tried to build a perft routine with the transposition table and, probably because of some kind of collision, at depth 6 I got a node count of 119060316 instead of 119060324 (which I can obtain if I reduce the table size...).

I think the Zobrist keys are OK, so what's going on?

The last thing I have doubts about is the principal variation: I'm trying to improve move ordering and it seems that principal-variation moves are the most important moves to try during alpha-beta search, but with a transposition table implemented, aren't they already the first moves tried (as best moves)?

The data structure seems correct. The only comment I have is that if you switch to using a 16-bit representation for the move, the size of a hash entry will go from 24 bytes to 16 bytes (this depends on the compiler, of course).
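The 24-byte vs. 16-byte figures can be checked with a sketch like the one below, assuming a typical 64-bit ABI where `uint64_t` has 8-byte alignment (exact sizes are compiler- and platform-dependent, as noted above). These layouts are illustrative, not the engine's actual structs.

```cpp
#include <cassert>
#include <cstdint>

struct EntryWide {        // with a 32-bit move
    uint64_t key;         // 8 bytes
    int32_t score;        // 4 bytes
    uint32_t move;        // 4 bytes
    uint8_t depth;        // 1 byte
    uint8_t flags;        // 1 byte -> 18 bytes of data, padded to a
};                        // multiple of the 8-byte alignment: 24 total

struct EntryPacked {      // with a 16-bit move
    uint64_t key;         // 8 bytes
    int32_t score;        // 4 bytes
    uint16_t move;        // 2 bytes
    uint8_t depth;        // 1 byte
    uint8_t flags;        // 1 byte -> exactly 16 bytes, no padding
};
```

A pleasant side effect of the 16-byte entry is that a 4-entry bucket fits exactly in one 64-byte cache line.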


I know; I planned to change the move encoding not just for the hash entry but also for move generation. It will take some time, so I'll do it when I can.

Anyway, a HashEntry is now an 18-byte object.

Can you tell me something about what I wrote in my previous post?


It doesn't... but what it does do (EDIT: by default) is align 32-bit ints on a 4-byte boundary and 16-bit ints on a 2-byte boundary. So 2 bytes are wasted whenever a 2-byte member is followed by an int.

You get the best packing in classes and structs by having the largest members first in the struct/class. Changing the order shouldn't require changes to any client code.
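The effect of member ordering can be demonstrated with two structs holding the same fields (sizes assume the usual alignment rules described above; a compiler is free to lay things out differently):

```cpp
#include <cassert>
#include <cstdint>

struct Scattered {
    uint16_t a;  // 2 bytes, then 2 bytes of padding to align b
    uint32_t b;  // 4 bytes
    uint16_t c;  // 2 bytes, then 2 bytes of trailing padding
};               // (struct alignment is 4): 12 bytes total

struct LargestFirst {
    uint32_t b;  // 4 bytes
    uint16_t a;  // 2 bytes
    uint16_t c;  // 2 bytes -> no padding: 8 bytes total
};
```

Same data, one third smaller, just by declaring the largest members first.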

Edited by Paradigm Shifter, 26 April 2013 - 10:02 AM.


You can't expect the state to be the same (see the page on search instability: http://chessprogramming.wikispaces.com/Search+Instability). But you can expect that playing at a fixed depth should produce similar playing strength with and without transposition tables, which is what I suggested you test.


Sorry then, I misunderstood what you said.

Now I think I have no more problems or doubts with the search; I just have to improve move ordering and try to apply more advanced search strategies.