Many developers encounter encoding issues when piping UTF-8 output to the less command. While terminal emulators typically handle Unicode characters properly, less under its default configuration may display escaped byte sequences such as <E2><82><AC> instead of the actual UTF-8 characters.
First, let's verify your current environment settings:
$ locale
LANG="en_US.UTF-8"
LC_COLLATE="C"
LC_CTYPE="C"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL="C"
The key issue here is that LC_ALL is set to "C". LC_ALL overrides LANG and every other LC_* variable, so LC_CTYPE is effectively "C" as well, which tells the system to use ASCII rather than UTF-8 for character handling.
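The listing above shows LANG set to a UTF-8 locale while LC_ALL is "C". As a minimal sketch, the POSIX precedence rule (LC_ALL first, then the specific category such as LC_CTYPE, then LANG) can be written out to show which value ends up governing character handling. The function name effective_ctype is hypothetical, and the final fallback to "C" is an assumption about the platform default.

```shell
# Resolve which locale setting governs character handling.
# Precedence: LC_ALL, then LC_CTYPE, then LANG; unset or empty values are skipped.
# effective_ctype is a hypothetical helper name used only for illustration.
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        # Assumed fallback when nothing is set; the real default is platform-defined.
        printf 'C\n'
    fi
}

# Report the value for the current environment.
effective_ctype
```

With the settings from the listing, this reports C even though LANG names a UTF-8 locale, which is why changing LANG alone has no effect.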
1. Temporary Solution: Using LESSCHARSET
You can explicitly tell less to use UTF-8 encoding:
$ echo -e '\xe2\x82\xac' | LESSCHARSET=utf-8 less
Or make it permanent by adding to your shell configuration:
export LESSCHARSET=utf-8
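The echo -e test above depends on the shell's built-in echo, which does not behave the same everywhere. A more portable sketch uses printf with octal escapes and checks the bytes with od before involving a pager at all. The helper name euro_sign is hypothetical.

```shell
# Print the euro sign (U+20AC) as its three UTF-8 bytes: E2 82 AC.
# Octal escapes are used because POSIX printf does not define \x escapes.
# euro_sign is a hypothetical helper name.
euro_sign() {
    printf '\342\202\254\n'
}

# Dump the raw bytes to confirm what is being sent down the pipe.
euro_sign | od -An -tx1
```

Once the bytes are confirmed, the same helper can feed the pager: euro_sign | LESSCHARSET=utf-8 less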
2. Permanent Solution: Fixing Locale Settings
A more thorough solution is to properly configure your locale settings:
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
Add these lines to your ~/.bash_profile or ~/.zshrc (depending on your shell).
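Exporting a locale that is not installed produces warnings instead of fixing anything, so a cautious version of the snippet can check the output of locale -a first. This is a sketch under assumptions: pick_utf8_locale is a hypothetical helper, and it only looks for the two common spellings en_US.UTF-8 and en_US.utf8.

```shell
# pick_utf8_locale is a hypothetical helper: it reads locale names
# (one per line, as printed by `locale -a`) and prints the first
# en_US UTF-8 entry, accepting both "UTF-8" and "utf8" spellings.
pick_utf8_locale() {
    grep -i -E '^en_US\.utf-?8$' | head -n 1
}

# Export the locale only if the system actually provides it.
chosen=$(locale -a 2>/dev/null | pick_utf8_locale || true)
if [ -n "$chosen" ]; then
    export LANG="$chosen" LC_ALL="$chosen"
fi
```

If nothing matches, the environment is left untouched, and locale -a can be inspected by hand to find another UTF-8 locale.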
3. Alternative: Using less with -r flag
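The -r (raw control characters) option makes less pass bytes through to the terminal unchanged, so the terminal decodes the UTF-8 itself. less can then no longer track what is on screen, so long lines may be split in the wrong place; treat this as a workaround rather than a fix: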
$ echo -e '\xe2\x82\xac' | less -r
After making changes, test with various UTF-8 characters:
$ echo -e '\xe2\x82\xac \xf0\x9f\x98\x80 \xe0\xa4\xb9' | less
Should display: € 😀 ह
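When the output still looks wrong, it helps to rule out the data itself before blaming the pager. A minimal sketch, assuming iconv is installed: converting UTF-8 to UTF-8 succeeds only when the input is well-formed. The helper name is_valid_utf8 is hypothetical.

```shell
# is_valid_utf8 is a hypothetical helper: it reads stdin and succeeds
# only if iconv accepts the bytes as well-formed UTF-8.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# The same three test characters as above, written with portable octal escapes.
if printf '\342\202\254 \360\237\230\200 \340\244\271\n' | is_valid_utf8; then
    echo "input is valid UTF-8; any display problem is in the pager or terminal"
else
    echo "input is not valid UTF-8; fix the data first"
fi
```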
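If you're still having issues, the installed less may simply be very old. UTF-8 handling is built into less and needs no special build flag, so installing a current version is usually enough:
brew install less
Or on systems without Homebrew, build it from source (the --with-regex=pcre option below only selects the regex library used for searching and is optional):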
wget https://www.greenwoodsoftware.com/less/less-590.tar.gz
tar xvf less-590.tar.gz
cd less-590
./configure --with-regex=pcre
make
sudo make install
When working with UTF-8 encoded text in Mac Terminal, you might encounter display issues specifically with the less command. While direct terminal output shows characters correctly, piping through less renders them as escaped sequences.
$ locale
LANG="en_US.UTF-8"
LC_COLLATE="C"
LC_CTYPE="C"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL="C"
The key issue lies in your locale settings: LC_ALL is set to "C", which overrides LANG and forces every category, including LC_CTYPE, to "C" instead of a UTF-8 locale.
Option 1: Force UTF-8 mode in less
echo -e '\xe2\x82\xac' | LESSCHARSET=utf-8 less
Option 2: Set environment variables before using less
export LC_CTYPE=en_US.UTF-8
export LC_ALL=en_US.UTF-8
echo -e '\xe2\x82\xac' | less
Add these lines to your shell configuration file (~/.bashrc, ~/.zshrc, etc.):
# Set locale to UTF-8
export LC_CTYPE=en_US.UTF-8
export LC_ALL=en_US.UTF-8
# Default less options for UTF-8
export LESSCHARSET=utf-8
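To see why LESSCHARSET works regardless of the locale, the selection order described in the less manual can be approximated in a few lines. This is only a sketch: less_charset_guess is a hypothetical name, LESSCHARDEF is ignored, and the exact handling of empty variables may differ between less versions.

```shell
# less_charset_guess is a hypothetical helper approximating how less
# chooses its character set, as described in its manual page.
less_charset_guess() {
    # An explicit LESSCHARSET always wins.
    if [ -n "${LESSCHARSET:-}" ]; then
        printf '%s\n' "$LESSCHARSET"
        return
    fi
    # Otherwise use the first non-empty locale variable, in precedence order.
    for value in "${LC_ALL:-}" "${LC_CTYPE:-}" "${LANG:-}"; do
        if [ -n "$value" ]; then
            break
        fi
    done
    # less looks for a UTF-8 marker in that value.
    case "$value" in
        *UTF-8*|*UTF8*|*utf-8*|*utf8*) printf 'utf-8\n' ;;
        *) printf 'locale-default\n' ;;
    esac
}

less_charset_guess
```

With LC_ALL="C" and no LESSCHARSET, the sketch reports locale-default, which matches the escaped output seen earlier; setting either LESSCHARSET or a UTF-8 LC_ALL changes the answer to utf-8.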
After making changes, verify with:
$ locale
$ echo -e '\xe2\x82\xac' | less
You should now see the euro symbol (€) displayed correctly in both direct output and when piped to less.
If the issue persists, consider:
- Updating your terminal emulator
- Using most as an alternative pager
- Checking font support in your terminal
- Testing with different UTF-8 characters: 日本語, русский, 中文
- Checking terminal encoding settings (Preferences → Encodings)
- Trying different fonts like Menlo or Monaco