The mbrtowc() function examines at most
n bytes of the multibyte character byte
string pointed to by s, converts those bytes
to a wide character, and stores the wide character in the wchar_t object
pointed to by wc if
wc is not
NULL and s
points to a valid character.

Conversion happens in accordance with the conversion state described by the
mbstate_t object pointed to by mbs. The
mbstate_t object must be initialized to zero before the application's first
call to mbrtowc(). If the previous call to
mbrtowc() did not return (size_t)-1, the
mbstate_t object can safely be reused without reinitialization.
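As an illustration of the initialization rule above, here is a minimal sketch that converts a whole NUL-terminated string with a single zeroed mbstate_t. The helper name to_wide() and its error convention are hypothetical and not part of any library.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: convert the multibyte string src into dst.
 * Returns the number of wide characters stored, not counting the
 * terminating L'\0', or (size_t)-1 if src is invalid or dst is too small.
 */
size_t
to_wide(wchar_t *dst, size_t dstlen, const char *src)
{
	mbstate_t mbs;
	size_t n, left, i = 0;

	assert(dst != NULL && src != NULL);
	memset(&mbs, 0, sizeof(mbs));	/* zero before the first call */
	left = strlen(src) + 1;		/* include the terminating NUL */
	while (i < dstlen) {
		n = mbrtowc(&dst[i], src, left, &mbs);
		if (n == (size_t)-1 || n == (size_t)-2)
			return (size_t)-1;	/* invalid or truncated input */
		if (n == 0)
			return i;		/* dst[i] now holds L'\0' */
		src += n;			/* same mbs is reused below */
		left -= n;
		i++;
	}
	return (size_t)-1;			/* dst too small */
}
```

The state object is zeroed once and then reused for every call, which is safe as long as no call returns (size_t)-1.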

The behaviour of mbrtowc() is affected by the
LC_CTYPE category of the current locale. If
the locale is changed without reinitialization of the mbstate_t object pointed
to by mbs, the behaviour of
mbrtowc() is undefined.
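One way to respect this rule is to reset the conversion state whenever the locale is switched. The following sketch assumes a hypothetical helper named switch_ctype(); only setlocale(3) and memset(3) are real interfaces.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: change LC_CTYPE to the named locale and, on
 * success, reinitialize the conversion state so that it is never
 * carried across a locale change.  Returns 1 on success, or 0 if the
 * locale is unavailable, in which case mbs is left untouched.
 */
int
switch_ctype(const char *name, mbstate_t *mbs)
{
	assert(name != NULL && mbs != NULL);
	if (setlocale(LC_CTYPE, name) == NULL)
		return 0;
	memset(mbs, 0, sizeof(*mbs));	/* back to the initial state */
	return 1;
}
```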

Unlike mbtowc(3),
mbrtowc() will accept an incomplete byte
sequence pointed to by s which does not form
a complete character but is potentially part of a valid character. In this
case, mbrtowc() consumes all such bytes.
The conversion state saved in the mbstate_t object pointed to by
mbs will be used to restart the suspended
conversion during the next call to
mbrtowc().
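The restart behaviour can be sketched by feeding input one byte at a time, as might happen when reading from a pipe or socket. The helper name count_bytewise() is hypothetical; the sketch assumes the caller has already selected the desired locale.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: count the characters in buf while passing only
 * one byte per call.  Returns (size_t)-1 if the input is invalid or
 * ends in the middle of a character.
 */
size_t
count_bytewise(const char *buf, size_t len)
{
	mbstate_t mbs;
	size_t i, n, count = 0;
	int pending = 0;

	assert(buf != NULL);
	memset(&mbs, 0, sizeof(mbs));
	for (i = 0; i < len; i++) {
		n = mbrtowc(NULL, &buf[i], 1, &mbs);
		if (n == (size_t)-1)
			return (size_t)-1;	/* illegal byte sequence */
		if (n == (size_t)-2) {
			pending = 1;		/* byte consumed, state saved */
			continue;
		}
		pending = 0;			/* a character was completed */
		count++;
	}
	return pending ? (size_t)-1 : count;
}
```

In a locale with a multibyte encoding, the calls that return (size_t)-2 store the partial character in mbs, and the call that supplies the final byte returns the number of bytes that completed it.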

In state-dependent encodings, s may point to a
special sequence of bytes called a “shift sequence”. Shift
sequences switch between character code sets available within an encoding
scheme. One encoding scheme using shift sequences is ISO/IEC 2022-JP, which
can switch e.g. from ASCII (which uses one byte per character) to JIS X 0208
(which uses two bytes per character). Shift sequence bytes correspond to no
individual wide character, so mbrtowc()
treats them as if they were part of the subsequent multibyte character.
Therefore they do contribute to the number of bytes in the multibyte
character.

Special cases in interpretation of arguments are as follows:

wc == NULL

The conversion from a multibyte character to a wide character is performed
and the conversion state may be affected, but the resulting wide character
is discarded.

This can be used to find out how many bytes are contained in the multibyte
character pointed to by s.

s == NULL

mbrtowc() ignores wc and n, and behaves equivalently to

mbrtowc(NULL, "", 1, mbs);

which attempts to use the mbstate_t object pointed to by
mbs to start or continue conversion using
the empty string as input, and discards the conversion result.

If conversion succeeds, this call always returns zero. Unlike
mbtowc(3), the value
returned does not indicate whether the current encoding of the locale is
state-dependent, i.e. uses shift sequences.

mbs == NULL

mbrtowc() uses its own internal state
object to keep the conversion state, instead of an mbstate_t object
pointed to by mbs. This internal
conversion state is initialized once at program startup. It is not safe to
call mbrtowc() again with a
NULL mbs argument if
mbrtowc() returned (size_t)-1 because
at this point the internal conversion state is undefined.

Calling any other function in libc never
changes the internal conversion state object of
mbrtowc().
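The wc == NULL and s == NULL cases above can be sketched as follows. Both helper names are hypothetical, and each caller supplies its own mbstate_t rather than relying on the internal state object.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/*
 * wc == NULL: return the byte length of the first character in the
 * n bytes at s.  The converted wide character is discarded.
 */
size_t
first_char_len(const char *s, size_t n)
{
	mbstate_t mbs;

	assert(s != NULL);
	memset(&mbs, 0, sizeof(mbs));
	return mbrtowc(NULL, s, n, &mbs);
}

/*
 * s == NULL: wc and n are ignored and the call behaves like
 * mbrtowc(NULL, "", 1, mbs).  Returns zero if conversion succeeds.
 */
size_t
probe_state(mbstate_t *mbs)
{
	assert(mbs != NULL);
	return mbrtowc(NULL, NULL, 0, mbs);
}
```

The value returned by first_char_len() may also be (size_t)-1 or (size_t)-2, so callers should check for those before using it as a length.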

0

The bytes pointed to by s form a
terminating NUL character. If wc is not
NULL, a NUL wide character has been
stored in the wchar_t object pointed to by
wc.

positive

s points to a valid character, and the
value returned is the number of bytes completing the character. If
wc is not
NULL, the corresponding wide character
has been stored in the wchar_t object pointed to by
wc.

(size_t)-1

s points to an illegal byte sequence
which does not form a valid multibyte character in the current locale.
mbrtowc() sets
errno to EILSEQ. The conversion state
object pointed to by mbs is left in an
undefined state and must be reinitialized before being used again.

Because applications using mbrtowc() are
shielded from the specifics of the multibyte character encoding scheme, it
is impossible to repair byte sequences containing encoding errors. Such
byte sequences must be treated as invalid and potentially malicious input.
Applications must stop processing the byte string pointed to by
s and either discard any wide characters
already converted, or cope with truncated input.

(size_t)-2

s points to an incomplete byte sequence
of length n which has been consumed and
contains part of a valid multibyte character. The character may be
completed by calling mbrtowc() again
with s pointing to one or more subsequent
bytes of the multibyte character and mbs
pointing to the conversion state object used during conversion of the
incomplete byte sequence.
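The four classes of return value can be told apart as in the following sketch. The enum mb_result and the function classify() are hypothetical names introduced for illustration only.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

enum mb_result { MB_NUL, MB_CHAR, MB_INVALID, MB_INCOMPLETE };

/*
 * Hypothetical wrapper: perform one mbrtowc() call and report which
 * kind of result it produced.  *used receives the number of bytes of
 * s that the caller should skip before the next call.
 */
enum mb_result
classify(const char *s, size_t n, mbstate_t *mbs, wchar_t *wc, size_t *used)
{
	size_t r;

	assert(s != NULL && mbs != NULL && used != NULL);
	r = mbrtowc(wc, s, n, mbs);
	if (r == 0) {
		*used = 0;		/* terminating NUL: stop here */
		return MB_NUL;
	}
	if (r == (size_t)-1) {
		/* errno is EILSEQ and *mbs is undefined: reinitialize it */
		memset(mbs, 0, sizeof(*mbs));
		*used = 0;
		return MB_INVALID;
	}
	if (r == (size_t)-2) {
		*used = n;		/* all n bytes consumed into *mbs */
		return MB_INCOMPLETE;
	}
	*used = r;			/* r bytes completed the character */
	return MB_CHAR;
}
```

After MB_INVALID the caller is expected to stop processing the input, as described above; the state is reset only so that the object can be reused for unrelated input.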

mbrtowc() is not suitable for programs that
care about internals of the character encoding scheme used by the byte string
pointed to by s.

It is possible that mbrtowc() fails because
of locale configuration errors. An “invalid” character sequence
may simply be encoded in a different encoding than that of the current locale.

The special cases for s == NULL and
mbs == NULL do not make any sense. Instead of
passing NULL for
mbs,
mbtowc(3) can be used.

Earlier versions of this man page implied that calling
mbrtowc() with a
NULL s
argument would always set mbs to the initial
conversion state. But this is true only if the previous call to
mbrtowc() using
mbs did not return (size_t)-1 or (size_t)-2.
It is recommended to zero the mbstate_t object instead.