In MySQL, the most reliable way to create a database with UTF-8 encoding is to name the character set and collation explicitly instead of relying on server defaults:
CREATE DATABASE dbname
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci;
In MySQL, utf8 (an alias for utf8mb3) is a restricted form of UTF-8 that stores at most 3 bytes per character. For full Unicode support, including emoji and other 4-byte characters, always use utf8mb4:
-- This supports complete Unicode including emoji:
CREATE DATABASE modern_app
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci;
-- Whereas this is limited to 3-byte characters (no emoji):
CREATE DATABASE legacy_app
CHARACTER SET utf8
COLLATE utf8_general_ci;
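To see the difference in practice, here is a minimal sketch. The table emoji_demo and its columns are hypothetical names chosen for illustration, and the client is assumed to send UTF-8 text over a utf8mb4 connection.

```sql
-- Run inside any scratch database, e.g. after: USE modern_app;
-- Assumes the client terminal sends UTF-8 text.
SET NAMES utf8mb4;

-- Hypothetical table used only for this illustration.
CREATE TABLE emoji_demo (
    id INT AUTO_INCREMENT PRIMARY KEY,
    note VARCHAR(50)
) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

INSERT INTO emoji_demo (note) VALUES ('thumbs up 👍');

-- CHAR_LENGTH counts characters; LENGTH counts bytes.
SELECT note, CHAR_LENGTH(note) AS chars, LENGTH(note) AS bytes
FROM emoji_demo;
```

The emoji counts as one character but occupies four bytes, so bytes should come out three higher than chars. With the column declared as CHARACTER SET utf8 instead, the insert would be expected to fail or lose the emoji, depending on the SQL mode.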
Here are some real-world implementations:
-- Basic web application database
CREATE DATABASE web_app_db
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci;
-- With additional options (DEFAULT ENCRYPTION requires MySQL 8.0.16 or later)
CREATE DATABASE ecommerce
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci
DEFAULT ENCRYPTION='N';
After creation, verify the character set with:
SELECT default_character_set_name, default_collation_name
FROM information_schema.SCHEMATA
WHERE schema_name = 'your_database_name';
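The query above reports only the database default. Individual tables and columns can carry their own character set, so it may be worth checking those too. A sketch, reusing the same placeholder database name:

```sql
-- Collation of each table in the database.
SELECT table_name, table_collation
FROM information_schema.TABLES
WHERE table_schema = 'your_database_name';

-- Character set and collation of each text column.
-- Non-text columns have NULL here, so they are filtered out.
SELECT table_name, column_name, character_set_name, collation_name
FROM information_schema.COLUMNS
WHERE table_schema = 'your_database_name'
  AND character_set_name IS NOT NULL;
```

Any row that still shows utf8, utf8mb3 or latin1 was created under a different default and will not store 4-byte characters.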
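To change the default character set of an existing database (this sets only the default for tables created afterwards; existing tables keep their own character set until they are converted):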
ALTER DATABASE existing_db
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci;
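Tables that already exist can be converted individually with ALTER TABLE ... CONVERT TO. A sketch, assuming a hypothetical table named customers inside existing_db; take a backup first, since the statement rewrites the table:

```sql
-- Convert one existing table, including its text columns.
-- "customers" is a hypothetical table name.
ALTER TABLE existing_db.customers
    CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

-- Generate the same statement for every base table in the database.
-- This only prints the statements; review them, then run them yourself.
SELECT CONCAT('ALTER TABLE `', table_schema, '`.`', table_name,
              '` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;') AS stmt
FROM information_schema.TABLES
WHERE table_schema = 'existing_db'
  AND table_type = 'BASE TABLE';
```

Generating the statements rather than executing them directly keeps the conversion reviewable, which helps on large tables where each ALTER can take a while.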
For data to round-trip correctly, the connection must use utf8mb4 as well, so also set the connection character set:
-- In MySQL client
SET NAMES utf8mb4;
-- In PHP PDO
$pdo = new PDO('mysql:host=hostname;dbname=dbname;charset=utf8mb4',
'username', 'password');
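To confirm what the current session actually negotiated, the session variables can be inspected. A short sketch:

```sql
SET NAMES utf8mb4;

-- character_set_client, character_set_connection and
-- character_set_results should now all report utf8mb4.
SHOW SESSION VARIABLES LIKE 'character_set_%';

-- SET NAMES also picks the default collation for the character set.
SHOW SESSION VARIABLES LIKE 'collation_connection';

-- To pin the connection collation explicitly instead:
SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci;
```

The default connection collation for utf8mb4 differs between server versions, which is why the explicit COLLATE form can be useful when comparisons must behave the same everywhere.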
When working with multilingual applications, setting the proper character encoding is crucial. MySQL's default character set depends on the server version and configuration (it was latin1 before MySQL 8.0), so do not rely on it. Here's the complete syntax:
CREATE DATABASE database_name
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci;
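To find out what a CREATE DATABASE without explicit options would use on a given server, the server-level defaults can be queried. A minimal sketch (the column aliases are arbitrary):

```sql
-- Server-wide defaults applied when CREATE DATABASE names no character set.
SELECT @@character_set_server AS server_charset,
       @@collation_server     AS server_collation;
```

If this reports anything other than utf8mb4, the explicit CHARACTER SET and COLLATE clauses above are what guarantee the intended encoding.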
The utf8mb4 character set is what most developers actually need, rather than just utf8:
- Supports full Unicode including emojis (4-byte UTF-8)
- Backward compatible with MySQL's 3-byte utf8 (utf8mb3): it is a superset, so existing utf8 data converts without loss
- The default character set since MySQL 8.0, where utf8mb3 is deprecated
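The byte limits in the list above can be checked on the server itself. A small sketch:

```sql
-- Lists the UTF-8 character sets the server knows about.
-- The Maxlen column is the maximum number of bytes per character.
SHOW CHARACTER SET LIKE 'utf8%';
```

The 3-byte character set may be listed as utf8 or utf8mb3 depending on the server version; utf8mb4 is the one with Maxlen 4.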
Basic database creation:
CREATE DATABASE multilingual_app
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci;
With additional options:
CREATE DATABASE ecommerce_db
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci
DEFAULT ENCRYPTION='Y';
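The encryption default can be checked after creation. A sketch, assuming MySQL 8.0.16 or later, where the DEFAULT ENCRYPTION clause and the corresponding information_schema column are available:

```sql
-- Schema-level encryption default for the database created above.
SELECT schema_name, default_encryption
FROM information_schema.SCHEMATA
WHERE schema_name = 'ecommerce_db';

-- Server-wide default used when the clause is omitted.
SHOW VARIABLES LIKE 'default_table_encryption';
```

Note that tables created in a schema with DEFAULT ENCRYPTION='Y' are encrypted by default, which generally requires a keyring plugin or component to be configured on the server.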