More initial description of the problem this module solves
Gisle Aas
13 years ago
128 | 128 | $string = decode(locale => $bytes); |
129 | 129 | $bytes = encode(locale => $string); |
130 | 130 | |
131 | binmode(STDIN, ":encoding(console_in)"); | |
132 | binmode(STDOUT, ":encoding(console_out)"); | |
133 | binmode(STDERR, ":encoding(console_out)"); | |
131 | if (-t) { | |
132 | binmode(STDIN, ":encoding(console_in)"); | |
133 | binmode(STDOUT, ":encoding(console_out)"); | |
134 | binmode(STDERR, ":encoding(console_out)"); | |
135 | } | |
134 | 136 | |
135 | 137 | # Processing file names passed in as arguments |
136 | 138 | $file = decode(locale => $ARGV[0]); |
140 | 142 | |
141 | 143 | =head1 DESCRIPTION |
142 | 144 | |
143 | Perl uses Unicode to represent strings internally but many of the interfaces it | |
144 | has to the outside world is still byte based. Programs therefore needs to decode | |
145 | strings that enter the program from the outside and encode them again on the way | |
146 | out. | |
147 | ||
148 | The POSIX locale system is used to specify both the language conventions to use | |
149 | and the prefered character set to consume and output. This module looks up the | |
150 | charset (called a CODESET in the locale jargon) and arrange for the L<Encode> | |
151 | module to know this encoding under the name "locale". | |
152 | ||
153 | In addition the following functions and variables are provided: | |
145 | In many applications it's wise to let Perl use Unicode for the strings it | |
146 | processes. Most of the interfaces Perl has to the outside world are still byte | |
147 | based. Programs therefore need to decode byte strings that enter the program | |
148 | from the outside and encode them again on the way out. | |
149 | ||
150 | The POSIX locale system is used to specify both the language conventions | |
151 | requested by the user and the preferred character set to consume and | |
152 | output. The C<Encode::Locale> module looks up the charset and encoding (called | |
153 | a CODESET in the locale jargon) and arranges for the L<Encode> module to know | |
154 | this encoding under the name "locale". It means bytes obtained from the | |
155 | environment can be converted to Unicode strings by calling C<< | |
156 | Encode::decode(locale => $bytes) >> and converted back again with C<< | |
157 | Encode::encode(locale => $string) >>. | |
158 | ||
159 | Where file system interfaces pass file names in and out of the program we also | |
160 | need to take care. The trend is for operating systems to use a fixed file encoding | |
161 | that doesn't actually depend on the locale; and this module determines the most | |
162 | appropriate encoding for file names. The L<Encode> module will know this | |
163 | encoding under the name "locale_fs". For traditional Unix systems this will | |
164 | be an alias to the same encoding as "locale". | |
165 | ||
166 | For programs running in a terminal window (called a "Console" on some systems) | |
167 | the "locale" encoding is usually a good choice for what to expect as input and | |
168 | output. Some systems allow us to query the encoding set for the terminal and | |
169 | C<Encode::Locale> will do that if available and make these encodings known | |
170 | under the C<Encode> aliases "console_in" and "console_out". For systems where | |
171 | we can't determine the terminal encoding these will be aliased as the same | |
172 | encoding as "locale". The advice is to use "console_in" for input known to | |
173 | come from the terminal and "console_out" for output known to go to the | |
174 | terminal. | |
175 | ||
176 | In addition to arranging for various Encode aliases the following functions and | |
177 | variables are provided: | |
154 | 178 | |
155 | 179 | =over |
156 | 180 |