I propose a patch that accelerates the UTF-16 decoder. With PEP 393 the UTF-16 decoder became several times slower (3-4x); this patch restores performance to the level of Python 3.2 and even beyond (+10-30% over 3.2).
In addition, it fixes a few bugs in the UTF-16 decoder. As a side effect, other decoders may be accelerated as well.

Here are two new patches. The check for out-of-range characters has been
moved, which makes the code simpler. In theory this slightly slows down
decoding of short UCS1 strings (up to 1 and 3 chars on 32- and 64-bit
platforms, respectively), but in practice there is no difference. The
second patch differs from the first in that the masks are not calculated
but specified explicitly. I am not sure that this improves readability.
The committer has the choice.
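
To illustrate the difference between calculated and explicitly specified masks, here is a minimal Python sketch. It is not the code from either patch: the names `computed_mask`, `MASK_32`, `MASK_64` and `all_units_below_0x100` are hypothetical, and the input is assumed to be little-endian UTF-16 of even length. It tests one machine word of code units at a time and handles the leftover units (at most 1 for a 4-byte word, at most 3 for an 8-byte word) one by one.

```python
def computed_mask(word_size, unit_mask=0xFF00):
    """Build a word-wide mask by repeating a 16-bit pattern (hypothetical helper)."""
    mask = 0
    for i in range(word_size // 2):
        mask |= unit_mask << (16 * i)
    return mask


# The same masks written out explicitly, one per word size.
MASK_32 = 0xFF00FF00
MASK_64 = 0xFF00FF00FF00FF00


def all_units_below_0x100(data, word_size=8):
    """Return True if every UTF-16-LE code unit in data is below 0x100.

    Assumes len(data) is even. Full words are tested with a single AND;
    the remaining code units are tested individually.
    """
    mask = computed_mask(word_size)
    full = len(data) - len(data) % word_size
    for i in range(0, full, word_size):
        word = int.from_bytes(data[i:i + word_size], "little")
        if word & mask:          # some code unit has a non-zero high byte
            return False
    for i in range(full, len(data) - 1, 2):
        if data[i + 1]:          # high byte of a leftover code unit
            return False
    return True
```

Both variants give the same mask values; the choice only affects how the constants read in the source.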

Thank you, Antoine. Now only issue14625 is waiting for review.
> changeset: 77012:3430d7329a3b
> +* UTF-8 and UTF-16 decoding is now 2x to 4x faster.
In fact, UTF-16 decoding is now at most 25% faster than in Python 3.2 on my computers (and sometimes still a little slower). It is 2x to 4x faster only compared to the earlier Python 3.3, which had been slowed down by PEP 393.
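
Such comparisons can be reproduced by running the same script under each interpreter version. The following is a rough sketch, not the benchmark used for the numbers above; the function name `bench_utf16_decode`, the sample strings and the repeat counts are assumptions chosen for illustration, and the results depend on the machine and build.

```python
import timeit


def bench_utf16_decode(text, number=200, repeat=5):
    """Return the best time per call, in seconds, for decoding text as UTF-16-LE."""
    data = text.encode("utf-16-le")
    timer = timeit.Timer(lambda: data.decode("utf-16-le"))
    return min(timer.repeat(repeat=repeat, number=number)) / number


# One sample per character range, since decoding speed differs between them.
samples = {
    "ascii": "A" * 10000,
    "latin1": "\xe9" * 10000,
    "bmp": "\u20ac" * 10000,
    "astral": "\U0001f600" * 10000,
}

for name, text in samples.items():
    print("%-7s %.2f usec per decode" % (name, bench_utf16_decode(text) * 1e6))
```

Taking the minimum over several repeats reduces the influence of other activity on the machine.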