We introduce a novel algorithm called Upper Confidence Weighted Learning (UCWL) for online multiclass learning from binary feedback. UCWL combines the Upper Confidence Bound (UCB) framework with the Soft Confidence Weighted (SCW) online learning scheme. UCWL achieves state-of-the-art performance, especially on noisy and non-separable data, at low computational cost. Estimated confidence intervals are used for informed exploration, which enables faster learning than either uninformed exploration or no exploration at all. The targeted application setting is human-robot interaction (HRI), in which a robot learns to classify its observations while a human teaches it by providing only binary feedback (e.g., right/wrong). Results from an HRI experiment and on two benchmark datasets show that UCWL outperforms other algorithms in the online binary feedback setting. Surprisingly, it sometimes even beats state-of-the-art algorithms that receive full feedback, while UCWL receives only binary feedback on the same data.
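To make the idea of confidence-guided exploration under binary feedback concrete, the following is a minimal Python sketch. It is not the UCWL algorithm itself. It assumes a diagonal covariance per class, and it substitutes a simpler AROW-style confidence-weighted update for the SCW update named above. The class name `UCBBinaryFeedbackLearner`, the exploration weight `eta`, and the regularizer `r` are hypothetical and chosen only for illustration.

```python
import math


class UCBBinaryFeedbackLearner:
    """Illustrative sketch only: UCB-style class selection with a
    diagonal AROW-style update (an assumed stand-in for SCW)."""

    def __init__(self, n_classes, n_features, eta=1.0, r=1.0):
        self.eta = eta  # exploration weight (hypothetical parameter)
        self.r = r      # AROW-style regularizer (hypothetical parameter)
        # Per-class mean weights and per-class diagonal variances.
        self.mu = [[0.0] * n_features for _ in range(n_classes)]
        self.sigma = [[1.0] * n_features for _ in range(n_classes)]

    def _score_and_width(self, k, x):
        # Mean score and confidence width sqrt(x^T Sigma_k x) for class k.
        score = sum(m * xi for m, xi in zip(self.mu[k], x))
        var = sum(s * xi * xi for s, xi in zip(self.sigma[k], x))
        return score, math.sqrt(var)

    def predict(self, x, explore=True):
        # Choose the class with the highest upper confidence bound;
        # with explore=False, choose by mean score alone.
        best_k, best_val = 0, -math.inf
        for k in range(len(self.mu)):
            score, width = self._score_and_width(k, x)
            val = score + (self.eta * width if explore else 0.0)
            if val > best_val:
                best_k, best_val = k, val
        return best_k

    def update(self, x, k, correct):
        # Binary feedback: only the predicted class k is updated,
        # with target +1 if the teacher said "right", else -1.
        y = 1.0 if correct else -1.0
        score, width = self._score_and_width(k, x)
        loss = max(0.0, 1.0 - y * score)  # hinge loss on the margin
        if loss == 0.0:
            return
        beta = 1.0 / (width * width + self.r)
        alpha = loss * beta
        for i, xi in enumerate(x):
            s = self.sigma[k][i]
            self.mu[k][i] += alpha * y * s * xi
            self.sigma[k][i] = s - beta * s * s * xi * xi  # shrink uncertainty


# Toy usage: three classes, one-hot inputs, teacher says only right/wrong.
inputs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
learner = UCBBinaryFeedbackLearner(n_classes=3, n_features=3)
for _ in range(10):
    for true_label, x in enumerate(inputs):
        guess = learner.predict(x)
        learner.update(x, guess, correct=(guess == true_label))

predictions = [learner.predict(x, explore=False) for x in inputs]
print(predictions)
```

In this sketch, a class that has rarely been tried on a given kind of input keeps a wide confidence interval, so it is chosen until feedback either confirms it or pushes its score down. This is the sense in which exploration is informed rather than random.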