Hello. I would like to try implementing the improved UCB algorithm
(from the article "UCB revisited: Improved regret bounds for the stochastic
multi-armed bandit problem"),
but I am a little lost in the code. Where is the main part where the
UCB algorithm is implemented?
I read Petr Baudis' Master's Thesis <http://pasky.or.cz/go/prace.pdf>, and
understand that the main policy of Pachi is RAVE, but that
you implemented UCB too. However, I can't find the code!
Thanks,
JM