Commit 2c880c52e0ed136495a20906bf4acef0ebe1f36d - libencode-locale-perl

More initial description of the problem this module solves

Gisle Aas, 13 years ago

1 changed file with 38 additions and 14 deletions.

lib/Encode/Locale.pm

128	128	$string = decode(locale => $bytes);
129	129	$bytes = encode(locale => $string);
130	130
131		binmode(STDIN, ":encoding(console_in)");
132		binmode(STDOUT, ":encoding(console_out)");
133		binmode(STDERR, ":encoding(console_out)");
	131	if (-t) {
	132	    binmode(STDIN, ":encoding(console_in)");
	133	    binmode(STDOUT, ":encoding(console_out)");
	134	    binmode(STDERR, ":encoding(console_out)");
	135	}
134	136
135	137	# Processing file names passed in as arguments
136	138	$file = decode(locale => $ARGV[0]);

140	142
141	143	=head1 DESCRIPTION
142	144
143		Perl uses Unicode to represent strings internally but many of the interfaces it
144		has to the outside world is still byte based. Programs therefore needs to decode
145		strings that enter the program from the outside and encode them again on the way
146		out.
147
148		The POSIX locale system is used to specify both the language conventions to use
149		and the prefered character set to consume and output. This module looks up the
150		charset (called a CODESET in the locale jargon) and arrange for the L<Encode>
151		module to know this encoding under the name "locale".
152
153		In addition the following functions and variables are provided:
	145	In many applications it's wise to let Perl use Unicode for the strings it
	146	processes. Most of the interfaces Perl has to the outside world are still byte
	147	based. Programs therefore need to decode byte strings that enter the program
	148	from the outside and encode them again on the way out.
	149
	150	The POSIX locale system is used to specify both the language conventions
	151	requested by the user and the preferred character set to consume and
	152	output. The C<Encode::Locale> module looks up the charset and encoding (called
	153	a CODESET in the locale jargon) and arranges for the L<Encode> module to know
	154	this encoding under the name "locale". This means bytes obtained from the
	155	environment can be converted to Unicode strings by calling C<<
	156	Encode::decode(locale => $bytes) >> and converted back again with C<<
	157	Encode::encode(locale => $string) >>.
	158
	159	Where file system interfaces pass file names in and out of the program we also
	160	need to take care. The trend is for operating systems to use a fixed file name
	161	encoding that doesn't actually depend on the locale, and this module determines
	162	the most appropriate encoding for file names. The L<Encode> module will know
	163	this encoding under the name "locale_fs". For traditional Unix systems this
	164	will be an alias for the same encoding as "locale".
	165
	166	For programs running in a terminal window (called a "Console" on some systems)
	167	the "locale" encoding is usually a good choice for what to expect as input and
	168	output. Some systems allow us to query the encoding set for the terminal and
	169	C<Encode::Locale> will do that if available and make these encodings known
	170	under the C<Encode> aliases "console_in" and "console_out". For systems where
	171	we can't determine the terminal encoding these will be aliased to the same
	172	encoding as "locale". The advice is to use "console_in" for input known to
	173	come from the terminal and "console_out" for output known to go to the
	174	terminal.
	175
	176	In addition to arranging for various Encode aliases the following functions and
	177	variables are provided:
154	178
155	179	=over
156	180