Codebase list libencode-locale-perl / 2c880c5
More initial description of the problem this module solves (Gisle Aas, 13 years ago)
1 changed file(s) with 38 addition(s) and 14 deletion(s).
128 128 $string = decode(locale => $bytes);
129 129 $bytes = encode(locale => $string);
130 130
131 binmode(STDIN, ":encoding(console_in)");
132 binmode(STDOUT, ":encoding(console_out)");
133 binmode(STDERR, ":encoding(console_out)");
131 if (-t) {
132 binmode(STDIN, ":encoding(console_in)");
133 binmode(STDOUT, ":encoding(console_out)");
134 binmode(STDERR, ":encoding(console_out)");
135 }
134 136
135 137 # Processing file names passed in as arguments
136 138 $file = decode(locale => $ARGV[0]);
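
The synopsis lines in this hunk can be tried out with a short script. What follows is only a minimal sketch, assuming C<Encode::Locale> is installed from CPAN; the helper `set_console_layer` is a hypothetical name introduced for illustration and is not part of the module. It applies the console layers only to handles that are attached to a terminal, then decodes the command-line arguments and encodes them again.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use Encode::Locale;    # registers the "locale" and "console_*" aliases with Encode

# Hypothetical helper (not part of Encode::Locale): add an encoding layer
# to a handle only when that handle is attached to a terminal, so piped
# or redirected data is left as raw bytes.
sub set_console_layer {
    my ($fh, $alias) = @_;
    return 0 unless -t $fh;
    binmode($fh, ":encoding($alias)") or die "binmode failed: $!";
    return 1;
}

set_console_layer(\*STDIN,  "console_in");
set_console_layer(\*STDOUT, "console_out");
set_console_layer(\*STDERR, "console_out");

# Arguments arrive in @ARGV as bytes; decode them before treating them as text.
my @names = map { decode(locale => $_) } @ARGV;

# Encode again when a byte-oriented interface needs the name back.
my @bytes = map { encode(locale => $_) } @names;
```

A bare `-t`, as in the hunk above, tests STDIN only. Checking each handle separately is a design choice of this sketch, made so that redirecting one stream does not affect the layers on the others.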
140 142
141 143 =head1 DESCRIPTION
142 144
143 Perl uses Unicode to represent strings internally but many of the interfaces it
144 has to the outside world are still byte based. Programs therefore need to decode
145 strings that enter the program from the outside and encode them again on the way
146 out.
147
148 The POSIX locale system is used to specify both the language conventions to use
149 and the preferred character set to consume and output. This module looks up the
150 charset (called a CODESET in the locale jargon) and arranges for the L<Encode>
151 module to know this encoding under the name "locale".
152
153 In addition the following functions and variables are provided:
145 In many applications it's wise to let Perl use Unicode for the strings it
146 processes. Most of the interfaces Perl has to the outside world are still byte
147 based. Programs therefore need to decode byte strings that enter the program
148 from the outside and encode them again on the way out.
149
150 The POSIX locale system is used to specify both the language conventions
151 requested by the user and the preferred character set to consume and
152 output. The C<Encode::Locale> module looks up the charset and encoding (called
153 a CODESET in the locale jargon) and arranges for the L<Encode> module to know
154 this encoding under the name "locale". It means bytes obtained from the
155 environment can be converted to Unicode strings by calling C<<
156 Encode::decode(locale => $bytes) >> and converted back again with C<<
157 Encode::encode(locale => $string) >>.
158
159 Where file system interfaces pass file names in and out of the program we also
160 need care. The trend is for operating systems to use a fixed file encoding
161 that doesn't actually depend on the locale; and this module determines the most
162 appropriate encoding for file names. The L<Encode> module will know this
163 encoding under the name "locale_fs". For traditional Unix systems this will
164 be an alias to the same encoding as "locale".
165
166 For programs running in a terminal window (called a "Console" on some systems)
167 the "locale" encoding is usually a good choice for what to expect as input and
168 output. Some systems allow us to query the encoding set for the terminal and
169 C<Encode::Locale> will do that if available and make these encodings known
170 under the C<Encode> aliases "console_in" and "console_out". For systems where
171 we can't determine the terminal encoding these will be aliased as the same
172 encoding as "locale". The advice is to use "console_in" for input known to
173 come from the terminal and "console_out" for output known to go to the
174 terminal.
175
176 In addition to arranging for various Encode aliases the following functions and
177 variables are provided:
154 178
155 179 =over
156 180