mysql_set_charset

(PHP 5 >= 5.2.3)

mysql_set_charset — Sets the client character set

Warning

This extension was deprecated in PHP 5.5.0, and it was removed in PHP 7.0.0.
Instead, the MySQLi or PDO_MySQL extension should be used.
See also MySQL: choosing an API guide and
related FAQ for more information.
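Alternatives to this function include mysqli_set_charset() and, with PDO_MySQL, adding charset to the connection string (for example charset=utf8).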
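A hedged sketch of the MySQLi and PDO_MySQL replacements follows. The helper names `build_mysql_dsn` and `open_mysqli_utf8`, the host, the credentials, and the choice of `utf8mb4` are assumptions for illustration and do not come from this page.

```php
<?php
// Hypothetical helper: build a PDO_MySQL DSN that carries the charset.
function build_mysql_dsn($host, $dbname, $charset = 'utf8mb4')
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

// Hypothetical helper: open a MySQLi connection and set its charset.
// Requires the mysqli extension and a reachable server.
// On PHP 8.1+ mysqli throws exceptions by default instead of returning false.
function open_mysqli_utf8($host, $user, $password, $dbname)
{
    $link = mysqli_connect($host, $user, $password, $dbname);
    if (!$link) {
        return false;
    }
    // mysqli_set_charset() returns false on failure.
    if (!mysqli_set_charset($link, 'utf8mb4')) {
        mysqli_close($link);
        return false;
    }
    return $link;
}

// PDO usage (not executed here, it needs a live server):
// $pdo = new PDO(build_mysql_dsn('localhost', 'emp_feedback'), 'user', 'password');
```

Setting the charset through the API, rather than with a SET NAMES query, also lets the client library know the encoding, which matters for escaping functions.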

Description

Parameters

charset

A valid character set name.

link_identifier

The MySQL connection. If the
link identifier is not specified, the last link opened by
mysql_connect() is assumed. If no such link is found, it
will try to create one as if mysql_connect() had been called
with no arguments. If no connection is found or established, an
E_WARNING level error is generated.

User Contributed Notes

I needed to access the database from within one particular webhosting service. Pages are UTF-8 encoded and data received by forms should be inserted into database without changing the encoding. The database is also in UTF-8.

Neither SET character set 'utf8' nor SET names 'utf8' worked properly here, so this workaround made sure all variables are set to UTF-8.

Once this is set, we no longer need to manually encode the text into UTF-8 using utf8_encode() or other functions. Arabic (or any other UTF-8 supported) text can be passed directly to the database and is handled automatically. For example:

<?php
$link = mysql_connect('localhost', 'user', 'password');
mysql_set_charset('utf8', $link);
$db_selected = mysql_select_db('emp_feedback', $link);
if (!$db_selected) {
    die('Database access error : ' . mysql_error());
}
$query = "INSERT INTO feedback ( EmpName, Message ) VALUES ('$_empName','$_message')";
mysql_query($query) or die('Error, Feedback insert into database failed');
?>

Note that here $_empName is stored in English while $_message is in Arabic.
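The query in this note interpolates `$_empName` and `$_message` directly into the SQL string. A hedged sketch of the same insert using the MySQLi replacement and a prepared statement is shown here; the function name `insert_feedback` is hypothetical, and `$link` is assumed to be an already-open mysqli connection to the `emp_feedback` database.

```php
<?php
// Hypothetical rewrite of the insert above using MySQLi.
// $link is assumed to be an open mysqli connection.
function insert_feedback($link, $empName, $message)
{
    // Set the connection charset first so the bound strings are read as UTF-8.
    if (!mysqli_set_charset($link, 'utf8')) {
        return false;
    }

    $stmt = mysqli_prepare($link, 'INSERT INTO feedback (EmpName, Message) VALUES (?, ?)');
    if (!$stmt) {
        return false;
    }

    // 'ss' marks both parameters as strings; values are sent separately from the SQL.
    mysqli_stmt_bind_param($stmt, 'ss', $empName, $message);
    $ok = mysqli_stmt_execute($stmt);
    mysqli_stmt_close($stmt);

    return $ok;
}
```

Bound parameters remove the need to escape the English and Arabic values by hand.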

I just hope that the text below will help someone who is struggling with charset encoding, especially when the php-charset is different from the mysql-charset. Let me add that I really think the php man-pages on the mysql-functions are lacking a lot of detail on these important issues. Could someone add some useful text here?

3. Set your Apache environment to utf-8 by adding 'AddDefaultCharset utf-8' to your .htaccess. If you do not use Apache, add 'default_charset = "utf-8"' to your php.ini. Do one or the other (not both); php will use the Apache setting where needed.

5. Check that the above settings are being honoured by checking the 'page info' of your pages in Firefox. It should show 2 (!!) utf-8 entries.

======== all of the above so far has nothing to do with mysql ;-) ======
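The php.ini setting from step 3 can also be applied and read back from PHP itself. This is a minimal sketch, assuming the host allows `ini_set()` for `default_charset`; it is not a substitute for the Firefox check in step 5.

```php
<?php
// Ask PHP to advertise UTF-8 in the Content-Type header it sends by default.
ini_set('default_charset', 'UTF-8');

// Read the effective value back, e.g. when debugging a shared host.
$effective = ini_get('default_charset');
echo "default_charset = ", $effective, "\n";
```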

6. Do *NOT* (repeat NOT!) set the 'names' (set names *) or _ANY_ 'character set' (set character set *) (contrary to what they tell you on these pages).

7. Check the previous item by listing the results of the mysql query 'SHOW session VARIABLES'. All char_sets here should say 'latin1', except for the system one which is always 'utf8'. All collations should say 'latin1_*'. Furthermore the php function mysql_client_encoding() should also return latin1 (though I don't understand why; what does this value mean, I would think if php (being the client) is utf8 encoded this would be utf8?)

8. Finally, test the above by storing this string in your db and outputting it in your webpage: 'Iñtërnâtiônàlizætiøn and €'.
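The check in step 7 can be scripted rather than read by eye. The sketch below uses the MySQLi replacement extension; the helper name `charset_session_variables` is hypothetical and `$link` is assumed to be an open mysqli connection.

```php
<?php
// Hypothetical helper: return the charset and collation session variables
// as an array of name => value. $link is assumed to be an open mysqli connection.
function charset_session_variables($link)
{
    $vars = array();
    foreach (array('character_set%', 'collation%') as $pattern) {
        $result = mysqli_query($link, "SHOW SESSION VARIABLES LIKE '$pattern'");
        if (!$result) {
            continue;
        }
        // SHOW VARIABLES returns two columns: Variable_name and Value.
        while ($row = mysqli_fetch_assoc($result)) {
            $vars[$row['Variable_name']] = $row['Value'];
        }
        mysqli_free_result($result);
    }
    return $vars;
}

// Usage (needs a live connection):
// print_r(charset_session_variables($link));
```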

Now what was interesting during testing and debugging of the above findings was:

1. If I ran 'mysql_set_charset('utf8')' _OR_ 'mysql_query("SET NAMES 'utf8'");' and then ran a query containing 'where char_column = 'abc'', it would die with 'Illegal mix of collations'.

2. If I ran 'mysql_query("SET character_set_client = 'utf8';"); mysql_query("SET character_set_results = 'utf8';")' the query would work BUT the non-ascii characters would show scrambled in the browser.

3. BUT these 2 points above work just fine on my local dev-machine (php 5.2.3 & mysql 5.0.45)!!!!!!!!

This draws me to these 3 conclusions:

1. The Php-mysql-function library (5.2.+) does a fine job translating utf-8 queries & results to/from latin1! It's better to let php handle this for you than to have mysql do this.

2. Mysql (4.0.+) has 1 or more bugs (well, let's say unfinished features) that involve the charset-translations that are solved in 5.0.+.

3. It is not well enough documented! (Otherwise I would not have had to write this)

One last remark: clearly, characters that exist in utf8 and not in latin1 (and vice versa) will get lost during utf8-latin1-utf8 translation.
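That loss can be reproduced without a database. The sketch below assumes the mbstring extension is available and uses Windows-1252 as a stand-in for MySQL's latin1 (which corresponds to cp1252); the function name `round_trip` is made up for illustration.

```php
<?php
// Round-trip a UTF-8 string through a single-byte encoding and back.
// Requires the mbstring extension.
function round_trip($utf8, $legacy)
{
    $bytes = mb_convert_encoding($utf8, $legacy, 'UTF-8');
    return mb_convert_encoding($bytes, 'UTF-8', $legacy);
}

// Every character in this string exists in Windows-1252.
$latin = "Iñtërnâtiônàlizætiøn";

// None of the characters in this string exist in Windows-1252.
$arabic = "شكرا";

// The first string survives the round trip, the second does not.
var_dump(round_trip($latin, 'Windows-1252') === $latin);
var_dump(round_trip($arabic, 'Windows-1252') === $arabic);
```

Characters that cannot be represented are replaced by mbstring's substitute character, so the original text cannot be recovered afterwards.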

If any of the above is not correct or not complete feel free to correct this! (Or better yet, add a chapter to the php manual :-)

I need to retract most of my earlier post. What I found out afterwards is this:

1. If you do not use mysql_set_charset, mysql will NOT do any translations and thus stores a utf8-character-byte as is. If you then retrieve this byte from the db and output it in a utf8 page it will show just fine, BUT if other apps query this byte (expecting to find a latin1 byte) they will go wrong.

2. The 'bug' mentioned before only occurs if you use a ucase or lcase function in your statement (like: latin1_col = ucase('utf8 string')).
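Point 1 can be illustrated without a database: the sketch below shows what another application sees when it treats stored UTF-8 bytes as latin1. It assumes the mbstring extension and again uses Windows-1252 as a stand-in for MySQL's latin1.

```php
<?php
// Requires the mbstring extension.

// "ñ" encoded as UTF-8 is the two bytes C3 B1, stored untranslated.
$original = "ñ";

// Another application treats the stored bytes as Windows-1252 (MySQL latin1)
// and converts them to UTF-8 for display.
$misread = mb_convert_encoding($original, 'UTF-8', 'Windows-1252');

echo bin2hex($original), "\n"; // c3b1
echo bin2hex($misread), "\n";  // c383c2b1, which renders as "Ã±"
```

Each original byte is read as its own latin1 character, which is why one character turns into two.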