How to Create a MySQL Database with UTF-8 Character Set: Complete Command Guide



When working with MySQL, the most reliable way to create a database with UTF-8 character encoding is:

CREATE DATABASE dbname 
CHARACTER SET utf8mb4 
COLLATE utf8mb4_unicode_ci;

In MySQL, utf8 is an alias for utf8mb3, a subset of UTF-8 that stores at most 3 bytes per character and therefore covers only the Basic Multilingual Plane. For full Unicode support, including emoji and other supplementary characters, always use utf8mb4:

-- This supports complete Unicode including emoji:
CREATE DATABASE modern_app 
CHARACTER SET utf8mb4 
COLLATE utf8mb4_unicode_ci;

-- Whereas this is limited to 3-byte characters (utf8 is an alias for the deprecated utf8mb3):
CREATE DATABASE legacy_app 
CHARACTER SET utf8 
COLLATE utf8_general_ci;
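
To make the 3-byte limit concrete, here is a minimal Python sketch that counts how many bytes a character needs in UTF-8. The helper names utf8_byte_length and fits_in_utf8mb3 are hypothetical and exist only for this illustration; the check assumes that "fits in utf8mb3" simply means "at most 3 bytes per character".

```python
def utf8_byte_length(ch: str) -> int:
    """Return how many bytes the text `ch` occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def fits_in_utf8mb3(text: str) -> bool:
    """Return True if every character needs at most 3 UTF-8 bytes.

    Assumption for this sketch: that is the only condition for a string
    to be storable in a MySQL utf8 (utf8mb3) column.
    """
    return all(utf8_byte_length(ch) <= 3 for ch in text)


# Sample characters written as escapes: A, e-acute, euro sign, a CJK
# ideograph, and an emoji (the only one outside the Basic Multilingual Plane).
samples = ["A", "\u00e9", "\u20ac", "\u65e5", "\U0001F600"]

for ch in samples:
    print(f"U+{ord(ch):04X}: {utf8_byte_length(ch)} byte(s), "
          f"fits utf8mb3: {fits_in_utf8mb3(ch)}")
```

Only the emoji needs 4 bytes, which is why it can be stored in utf8mb4 but not in utf8.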

Here are some real-world implementations:

-- Basic web application database
CREATE DATABASE web_app_db
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci;

-- With additional options (DEFAULT ENCRYPTION requires MySQL 8.0.16 or later)
CREATE DATABASE ecommerce 
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci
DEFAULT ENCRYPTION='N';
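
If the statement is generated from a script (for example in a provisioning tool), the database name should be validated before it is placed into the SQL text, because identifiers cannot be passed as bound parameters. The following is a rough sketch under that assumption; build_create_database is a hypothetical helper, not part of MySQL or any driver, and its prefix check on the collation is a simplification.

```python
import re

# Hypothetical helper for illustration only.
# Accept only letters, digits and underscores, a deliberately strict rule.
_NAME_RE = re.compile(r"[A-Za-z0-9_]+")


def build_create_database(name: str,
                          charset: str = "utf8mb4",
                          collation: str = "utf8mb4_unicode_ci") -> str:
    """Return a CREATE DATABASE statement with an explicit charset and collation."""
    for label, value in (("database name", name),
                         ("character set", charset),
                         ("collation", collation)):
        if not _NAME_RE.fullmatch(value):
            raise ValueError(f"invalid {label}: {value!r}")
    # Simplified sanity check: collation names normally start with the
    # name of the character set they belong to.
    if not collation.startswith(charset + "_"):
        raise ValueError(f"collation {collation!r} does not match {charset!r}")
    return (f"CREATE DATABASE `{name}` "
            f"CHARACTER SET {charset} COLLATE {collation};")


print(build_create_database("web_app_db"))
```

The returned string can then be executed through whichever MySQL client or driver the script already uses.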

After creation, verify the character set with:

SELECT default_character_set_name, default_collation_name
FROM information_schema.SCHEMATA
WHERE schema_name = 'your_database_name';

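To change the default character set of an existing database (this affects only tables created afterwards; existing tables keep their character set until converted with ALTER TABLE ... CONVERT TO CHARACTER SET):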

ALTER DATABASE existing_db
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci;
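
ALTER DATABASE changes only the database defaults, so tables that already exist have to be converted one by one. As a minimal sketch, the hypothetical helper below generates an ALTER TABLE ... CONVERT TO CHARACTER SET statement for each table name; the table names users and orders are made up for the example.

```python
def build_convert_statements(tables,
                             charset="utf8mb4",
                             collation="utf8mb4_unicode_ci"):
    """Return one ALTER TABLE ... CONVERT TO statement per table name.

    Hypothetical helper for illustration; it only builds SQL text and
    does not connect to a server.
    """
    statements = []
    for table in tables:
        # Quote the identifier with backticks, doubling any embedded backtick.
        quoted = "`" + table.replace("`", "``") + "`"
        statements.append(
            f"ALTER TABLE {quoted} "
            f"CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
        )
    return statements


# Example table names are assumptions, not taken from a real schema.
for stmt in build_convert_statements(["users", "orders"]):
    print(stmt)
```

Converting rewrites each table, so it is worth running the generated statements against a backup or staging copy first.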

For proper operation, also set the connection character set:

-- In MySQL client
SET NAMES utf8mb4;

-- In PHP PDO
$pdo = new PDO('mysql:host=hostname;dbname=dbname;charset=utf8mb4', 
               'username', 'password');

When working with multilingual applications, setting the proper character encoding is crucial. Depending on the server version and configuration, MySQL's default character set might not be UTF-8 (before MySQL 8.0 the default was latin1). Here's the basic syntax:

CREATE DATABASE database_name
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci;

The utf8mb4 character set is what most developers actually need, rather than MySQL's utf8:

  • Supports full Unicode, including emoji (up to 4 bytes per character)
  • Backward compatible with MySQL's utf8 (utf8mb3): characters of up to 3 bytes are encoded identically
  • Default character set since MySQL 8.0, while utf8mb3 is deprecated

Basic database creation:

CREATE DATABASE multilingual_app
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci;

With additional options (DEFAULT ENCRYPTION requires MySQL 8.0.16 or later, and 'Y' needs a configured keyring):

CREATE DATABASE ecommerce_db
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci
DEFAULT ENCRYPTION='Y';
