NAME
locale - character encoding and localization conventions
SYNOPSIS
locale [-a | -m | charmap]
DESCRIPTION
If the locale utility is invoked without any arguments, the current locale configuration is shown. Values for categories that are not set in the environment or that are overridden by LC_ALL are displayed between double quotes.
The options are as follows:
-a
- Display a list of supported locales.
-m
- Display a list of supported character encodings. On OpenBSD, this always returns UTF-8 only.
charmap
- Display the currently selected character encoding. On OpenBSD, this returns either US-ASCII or UTF-8.
A locale is a set of environment variables telling programs which character encoding, language and cultural conventions the user prefers. Programs in the OpenBSD base system ignore the locale except for the character encoding, and it is not recommended to use any of these variables except that the following non-default setting is supported as an option:
export LC_CTYPE=en_US.UTF-8
Programs installed from packages(7) may or may not change behavior according to the locale. Many programs use the X/Open System Interfaces naming scheme for the values of the variables listed below, which is language[_TERRITORY][.encoding][@modifier].
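The naming scheme can be illustrated with a small parser. This is only a sketch in Python: the names LocaleName and parse_locale_name are hypothetical and not part of any OpenBSD interface, and the regular expression assumes that the separators '_', '.', and '@' never occur inside a component.

```python
import re
from typing import NamedTuple, Optional


class LocaleName(NamedTuple):
    """Components of language[_TERRITORY][.encoding][@modifier]."""
    language: str
    territory: Optional[str]
    encoding: Optional[str]
    modifier: Optional[str]


# One mandatory component followed by three optional ones, each
# introduced by its own separator character.
_PATTERN = re.compile(
    r"^(?P<language>[^_.@]+)"
    r"(?:_(?P<territory>[^.@]+))?"
    r"(?:\.(?P<encoding>[^@]+))?"
    r"(?:@(?P<modifier>.+))?$"
)


def parse_locale_name(value: str) -> Optional[LocaleName]:
    """Return the components of value, or None if it does not fit the scheme."""
    match = _PATTERN.match(value)
    if match is None:
        return None
    return LocaleName(**match.groupdict())
```

For example, parse_locale_name("en_US.UTF-8") yields the language "en", the territory "US", and the encoding "UTF-8", with no modifier.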
The behavior of some library functions may also depend on the locale, and it does on most other operating systems. The OpenBSD C library tends to avoid locale-dependent behavior except with respect to character encoding. See the manual pages of individual functions for details.
The character encoding locale LC_CTYPE instructs programs which character encoding to assume for text input and to use for text output. A character encoding maps each character of a given character set to a byte sequence suitable for storing or transmitting the character.
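To make the mapping from characters to byte sequences concrete, the following Python sketch prints the bytes that two encodings produce for a few characters. The helper name encoded_bytes is hypothetical; the encoders are the ones built into Python, not those of the OpenBSD C library.

```python
from typing import List


def encoded_bytes(text: str, encoding: str) -> List[str]:
    """Return the encoded form of text as a list of hexadecimal byte values."""
    return [f"0x{byte:02x}" for byte in text.encode(encoding)]


# An ASCII letter occupies one byte in both US-ASCII and UTF-8.
print(encoded_bytes("A", "ascii"))        # ['0x41']
print(encoded_bytes("A", "utf-8"))        # ['0x41']

# Characters outside US-ASCII need several bytes in UTF-8.
print(encoded_bytes("\u00e9", "utf-8"))   # ['0xc3', '0xa9']
print(encoded_bytes("\u20ac", "utf-8"))   # ['0xe2', '0x82', '0xac']
```

Requesting the US-ASCII encoding for a character such as U+00E9 raises UnicodeEncodeError in Python, because that character set has no such character.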
The OpenBSD base system supports two locales: the default of LC_CTYPE=C selects the US-ASCII character set and encoding, treating the bytes 0x80 to 0xff as non-printable characters of application-specific meaning. LC_CTYPE=POSIX is an alias for LC_CTYPE=C. The alternative of LC_CTYPE=en_US.UTF-8 selects the UTF-8 encoding of the Unicode character set, which is supported by many parts of the system, but not yet fully supported by all parts.
If the value of LC_CTYPE ends in ‘.UTF-8’, programs in the OpenBSD base system ignore the beginning of it, treating for example zh_CN.UTF-8 exactly like en_US.UTF-8. Programs from packages(7) may, however, distinguish between such values. If the value of LC_CTYPE is unsupported, programs and libraries in the OpenBSD base system fall back to LC_CTYPE=C.
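The selection rule described above can be modeled in a few lines of Python. This is a sketch of the rule as stated in this manual, not the C library's implementation; the function name base_system_charmap is hypothetical, and the case-sensitive suffix match is an assumption.

```python
def base_system_charmap(lc_ctype: str) -> str:
    """Model which encoding the base system selects for an LC_CTYPE value.

    Assumption: the '.UTF-8' suffix is matched case-sensitively.
    """
    # Only the suffix matters; the language and territory are ignored.
    if lc_ctype.endswith(".UTF-8"):
        return "UTF-8"
    # "C", "POSIX", and every unsupported value select US-ASCII.
    return "US-ASCII"
```

Under this model, base_system_charmap("zh_CN.UTF-8") and base_system_charmap("en_US.UTF-8") both return "UTF-8", while "C", "POSIX", and an unsupported value such as "de_DE.ISO8859-1" all return "US-ASCII".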
Some programs, for example write(1), deliberately ignore the locale and always use US-ASCII only. See the manual pages of individual programs for details.
ENVIRONMENT
The locale configuration consists of the following environment variables:
LC_ALL
- Overrides all other LC_* variables below.
LC_COLLATE
- Intended to affect collation order. It may for example affect alphabetic sorting, regular expressions including equivalence classes, and the strcoll(3) and strxfrm(3) functions.
LC_CTYPE
- Intended to affect character encoding, character classification, and case conversion. For example, it is used by mbtowc(3), iswctype(3), iswalnum(3), towlower(3), fgetwc(3), fputwc(3), printf(3), and scanf(3).
LC_MESSAGES
- Intended to affect the output of informative and diagnostic messages and the interpretation of interactive responses, in particular regarding the language. It is used by catopen(3).
LC_MONETARY
- Intended to affect monetary formatting.
LC_NUMERIC
- Intended to affect numeric, non-monetary formatting, for example the radix character and thousands separators. On other operating systems, it may for example affect printf(3), scanf(3), and strtod(3).
LC_TIME
- Intended to affect date and time formats. It may for example affect strftime(3).
LANG
- Fallback if any of the above is unset.
NLSPATH
- Used by catopen(3) to locate message catalogs.
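The precedence among these variables can be sketched as a lookup in Python. The function name effective_locale is hypothetical. Treating an empty value like an unset one follows POSIX, and the final fallback to "C" reflects the default described in this manual.

```python
from typing import Mapping

CATEGORIES = (
    "LC_COLLATE", "LC_CTYPE", "LC_MESSAGES",
    "LC_MONETARY", "LC_NUMERIC", "LC_TIME",
)


def effective_locale(category: str, env: Mapping[str, str]) -> str:
    """Return the locale name that applies to category in environment env."""
    # LC_ALL overrides the category variable, which overrides LANG.
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name, "")
        if value:  # an empty value counts as unset
            return value
    return "C"  # default when nothing is set
```

Passing os.environ as env gives the value for the current process. For example, with LANG=de_DE.UTF-8 and LC_CTYPE=en_US.UTF-8 set, LC_CTYPE resolves to en_US.UTF-8 and LC_TIME resolves to de_DE.UTF-8.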
FILES
- /usr/share/locale/UTF-8/LC_CTYPE
- Character classification, case conversion, and character display width database in mklocale(1) binary output format used by setlocale(3).
- /usr/local/share/locale/
- Localization data for packages(7), in particular LC_MESSAGES catalogs in GNU gettext format.
- /usr/local/share/nls/
- Localization data for packages(7), in particular LC_MESSAGES catalogs in catopen(3) format.
- /usr/src/share/locale/ctype/en_US.UTF-8.src
- Character classification, case conversion, and character display width database in mklocale(1) input format.
- /usr/libdata/perl5/unicore/
- Complete Unicode data used for generating the above database.
- /usr/src/gnu/usr.bin/perl/lib/unicore/UnicodeData.txt
- The most important parts of Unicode data in a compact, more easily human-readable format.
EXIT STATUS
The locale utility exits 0 on success, and >0 if an error occurs.
SEE ALSO
mklocale(1), setlocale(3), Unicode::UCD(3p)
Related ports: converters/libiconv, devel/gettext, textproc/icu4c
STANDARDS
With respect to locale support, most libraries and programs in the OpenBSD base system, including the locale utility, implement a subset of the IEEE Std 1003.1-2008 (“POSIX.1”) specification.
HISTORY
The locale utility was first standardized in the X/Open Portability Guide Issue 4 (“XPG4”).
It was rewritten from scratch for OpenBSD 5.4 during the 2013 Toronto hackathon.
AUTHORS
Stefan Sperling <stsp@openbsd.org> with contributions from Philip Guenther <guenther@openbsd.org> and Jeremie Courreges-Anglas <jca@openbsd.org>. This manual page was written by Ingo Schwarze <schwarze@openbsd.org>.
BUGS
The locale concept is inadequate for inter-process communication. Two processes exchanging text, for example over a network, using sockets, in shared memory, or even using plain text files, always need a protocol-specific way to negotiate the character encoding used.
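As an illustration of what such a protocol-specific mechanism might look like in its simplest form, the following Python sketch declares the encoding in-band instead of relying on either side's locale. The message layout (an ASCII header line "charset=NAME" followed by the payload) and the function names are invented for this example and do not correspond to any existing protocol.

```python
def pack_message(text: str, charset: str = "utf-8") -> bytes:
    """Prefix the encoded text with an ASCII header naming its encoding."""
    header = f"charset={charset}\n".encode("ascii")
    return header + text.encode(charset)


def unpack_message(data: bytes) -> str:
    """Decode a message using the encoding named in its header line."""
    header, _, payload = data.partition(b"\n")
    key, _, charset = header.decode("ascii").partition("=")
    if key != "charset" or not charset:
        raise ValueError("missing charset declaration")
    # The receiver's own locale plays no role in decoding.
    return payload.decode(charset)
```

The receiver recovers the text regardless of its own LC_CTYPE setting, because the sender states the encoding explicitly.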
The list of supported locales is perpetually incomplete.