10.8.5 The binary Collation Compared to _bin Collations

This section describes how the binary
collation for binary strings compares to the
_bin collations for nonbinary strings.

Binary strings (as stored using the BINARY, VARBINARY, and BLOB
data types) have a character set and collation named binary.
Binary strings are sequences of bytes, and the numeric values of
those bytes determine comparison and sort order.
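
As a minimal sketch of byte-based comparison, the following statement
compares binary string literals. The _binary introducer is used here
only to force the literals to be binary strings; the values and the
column aliases are illustrative.

```sql
-- 'a' is byte 0x61 and 'A' is byte 0x41, so the two strings differ.
-- 'B' is byte 0x42, which is numerically less than 0x61.
SELECT _binary 'a' = _binary 'A' AS bytes_equal,
       _binary 'B' < _binary 'a' AS b_sorts_first;
```

Under these assumptions, bytes_equal should be 0 and b_sorts_first
should be 1, because only the numeric byte values are compared.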

Nonbinary strings (as stored using the CHAR, VARCHAR, and TEXT
data types) have a character set and collation other than binary.
A given nonbinary character set can have several collations, each
of which defines a particular comparison and sort order for the
characters in the set. One of these is the binary collation for
the character set, indicated by a _bin suffix in the collation
name. For example, the binary collations for latin1 and utf8 are
named latin1_bin and utf8_bin, respectively.
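
To see which collations a server offers for a character set, including
its _bin collation, statements such as the following can be used. This
is only a sketch; the exact rows returned depend on the server version
and build.

```sql
-- List every collation available for latin1; latin1_bin is the
-- binary collation for that character set.
SHOW COLLATION WHERE Charset = 'latin1';

-- The binary character set has a single collation, also named binary.
SHOW COLLATION WHERE Charset = 'binary';
```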

The binary collation differs from the
_bin collations in several respects.

The unit for comparison and sorting.
Binary strings are sequences of bytes. For the
binary collation, comparison and sorting
are based on numeric byte values. Nonbinary strings are
sequences of characters, which might be multibyte. Collations
for nonbinary strings define an ordering of the character
values for comparison and sorting. For the
_bin collation, this ordering is based on
numeric character code values, which is similar to ordering
for binary strings except that character code values might be
multibyte.
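
The difference in units can be illustrated by measuring the same two
bytes as a nonbinary string and as a binary string. This is a sketch:
the hexadecimal literal and the column aliases are chosen for
illustration, and on recent servers utf8 is an alias for utf8mb3 and
may produce a deprecation warning.

```sql
-- X'C3A9' is the two-byte utf8 encoding of the single character é.
-- As a utf8 string it is measured in characters; as a binary string
-- it is measured in bytes.
SELECT CHAR_LENGTH(_utf8 X'C3A9')   AS as_utf8_string,
       CHAR_LENGTH(_binary X'C3A9') AS as_binary_string;
```

The first value should be 1 (one character) and the second 2 (two
bytes), which is the same unit each kind of string uses for comparison
and sorting.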

Character set conversion.
A nonbinary string has a character set and is automatically
converted to another character set in many cases, even when
the string has a _bin collation. One such case is assigning
column values from another column that has a different
character set.

For binary string columns, no conversion occurs. For the
preceding cases, the string value is copied byte-wise.
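
The contrast between conversion and byte-wise copying can be sketched
as follows. The table names (t1, t2) and column names are hypothetical
and exist only for this illustration.

```sql
-- Source column: nonbinary, latin1 with the latin1_bin collation.
CREATE TABLE t1 (
  latin1_column CHAR(10) CHARACTER SET latin1 COLLATE latin1_bin
);

-- Destination: one nonbinary utf8 column and one binary string column.
CREATE TABLE t2 (
  utf8_bin_column CHAR(10) CHARACTER SET utf8 COLLATE utf8_bin,
  binary_column   VARBINARY(10)
);

-- X'E9' is the single-byte latin1 encoding of é.
INSERT INTO t1 (latin1_column) VALUES (_latin1 X'E9');

-- Assign the same latin1 value to both destination columns.
INSERT INTO t2 (utf8_bin_column, binary_column)
  SELECT latin1_column, latin1_column FROM t1;

-- Inspect the stored bytes of each column.
SELECT HEX(utf8_bin_column), HEX(binary_column) FROM t2;
```

The utf8 column should show C3A9, because the value was converted from
latin1 to utf8 on assignment. The binary column should show E9,
because the bytes were copied unchanged.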

Lettercase conversion.
Collations for nonbinary character sets provide information
about lettercase of characters, so characters in a nonbinary
string can be converted from one lettercase to another. This
holds even for _bin collations, which order characters by
code value and so do not use lettercase for ordering.
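
A short sketch of lettercase conversion under a _bin collation,
contrasted with a binary string, follows. The literals are
illustrative, and the way a client displays binary results (for
example, as hexadecimal) can vary with client settings.

```sql
-- Nonbinary strings with a _bin collation still support
-- lettercase conversion.
SET NAMES latin1 COLLATE latin1_bin;
SELECT LOWER('aA'), UPPER('zZ');

-- With binary strings, lettercase does not apply to the bytes;
-- convert to a nonbinary character set first to change case.
SET NAMES binary;
SELECT LOWER('aA'), LOWER(CONVERT('aA' USING latin1));
```

The first query should return aa and ZZ. In the second query, LOWER()
applied directly to the binary string should leave it as aA, whereas
the converted value should come back as aa.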