CakePHP and Character Set in the Database
Today I found something in the CakePHP which saves me a lot of effort when I need to migrate the database as well as searching for non-English characters from the Database. What is the story:
I had 2-3 projects build with Cake which were for non-English environments. That means the localization of the projects as well as storing proper data into DB with proper encoding. The current project is the same.
In these projects I saw that the data was not stored properly in the MySQL. From the Cake project interface everything looks ok, but when I open phpMyAdmin – the chars looked like strange symbols /not ??? neither ӗ&୶/. So far I ignored this because as far it is working from CakePHP interface for me everything was ok.
Until today when I need a Search functionality which should get results even if the users searched by lowercase or uppercase. The example is: if you have an entry in the database “CakePHP” and when you searching for “cakephp” or “CAKEPHP” both results to extract “CakePHP” from the database. Looking pretty easy, right? Well, without the proper encoding was quite difficult. How I solve that problem.
Here is the solution:
In the config/database.php file as you probably know there are 2 connections set and they look something like this:
var $default = array(
'driver' => 'mysql',
'persistent' => false,
'host' => 'localhost',
'login' => 'user',
'password' => 'password',
'database' => 'project_name',
'prefix' => ''
);
var $test = array(
'driver' => 'mysql',
'persistent' => false,
'host' => 'localhost',
'login' => 'user',
'password' => 'password',
'database' => 'project_name-test',
'prefix' => ''
);
}
When I check the DBO file, I noticed that there is a missing node in these arrays. It’s optional, but I think it’s good to be placed even as empty variable when the default configuration is downloaded from svn.
This trigger a function which execute a query like this
after the connection is established, where encoding in my case is UTF8. With this the data is stored properly.
the new configuration will look like this:
var $default = array(
'driver' => 'mysql',
'persistent' => false,
'host' => 'localhost',
'login' => 'user',
'password' => 'password',
'database' => 'project_name',
'prefix' => '',
'encoding' => 'utf8'
);
var $test = array(
'driver' => 'mysql',
'persistent' => false,
'host' => 'localhost',
'login' => 'user',
'password' => 'password',
'database' => 'project_name-test',
'prefix' => '',
'encoding' => 'utf8'
);
}
Now the data is stored properly in the database and it’s possible to use functions in MySQL such as UPPER(), LOWER() etc.

Good to know. I used to put
$this->execute(’SET NAMES utf8′);
in app_model.php, but this is obviously more elegant
rp.
thx for a very nice post.
i’ve been struggling with UTF-8 for quite a while already. however i still not sure about the different b/w utf8, UTF-8 utf-8
any1 has any idea of which one we’d use?
Tom, basically this is the syntax of the database format. if you look in the code of MySQL Database connector you will see there is a function:
return $this->_execute('SET NAMES ' . $enc) != false;
}
which do the magic
So UTF8 or utf8 – it doesn’t matter in this case, about utf-8 – it is the HTML encoding
Hope this make sense
Thanks man! 5 hours of my life