espa~nol.dicc Release 1.11 -- A Spanish (Español) dictionary for use
with ispell 3.1.13 or later

Copyright (c) 1994 1995 1996 1999 2001 2005 2008 2010 Santiago Rodriguez and Jesus Carretero

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details. This software can be
    obtained from http://www.datsi.fi.upm.es/~coes/

This set of data files implements a Spanish (castellano) dictionary to
be used with the international ispell program, version 3.1.13 or
later. The dictionary contains approximately 54,000 roots.

If you want to use the Spanish dictionary, you have to undefine the
NO8BIT macro in the local.h configuration file of ispell.
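
Before building ispell it can help to check how local.h treats NO8BIT.
The following is a minimal sketch and not part of this distribution; the
function name check_no8bit and the default path ./local.h are
assumptions made for illustration. It only inspects the one file it is
given, so a definition elsewhere in the ispell sources would go unnoticed.

```shell
#!/bin/sh
# Hypothetical helper: report whether a local.h file still defines NO8BIT.
# The default path (./local.h) is an assumption; pass your own path as $1.
check_no8bit() {
    file="${1:-local.h}"
    # Look for an active "#define NO8BIT" line (leading blanks allowed).
    if grep -Eq '^[[:space:]]*#[[:space:]]*define[[:space:]]+NO8BIT([[:space:]]|$)' "$file"; then
        echo "NO8BIT is defined in $file: undefine it before building"
        return 1
    fi
    echo "NO8BIT is not defined in $file"
    return 0
}
```

For example, check_no8bit ./local.h run from the ispell source directory.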


Contents:

	1. Uncompressing the package.
	2. Building the dictionary.
	3. Installing the dictionary.
	4. Supported Formatters.
	5. MSDOS Dictionary.

	1. The distribution is included in the espa~nol.tar.gz or
espa~nol-1.11.tar.gz file. To extract sources that are in files ending
in `.tar.gz' you can use the command
      gzip -d < espa~nol.tar.gz | tar xf -

where `espa~nol.tar.gz' is the name of the file.

The archive expands to the following files:

	espa~nol.aff: Affixes file.

	espa~nol.words: Contains a list of words that appear in the
official Spanish dictionary (Diccionario de la Real Academia Espa'nola
de la Lengua, 22nd edition).

	espa~nol.comp: Contains a list of words that do not appear in the
official dictionary but are used in computer-related texts.

	espa~nol.nofl: Contains a list of words that do not appear in the
official dictionary but are used in everyday Spanish and are considered
"correct" words.

	antiguas.words: Contains a list of words that appear in the
official Spanish dictionary but are archaic and no longer in
current use.

	espa~nol.words+: Contains the expanded list of words generated
	from the espa~nol.words and espa~nol.comp word files.

	e~nes: Script that replaces 'n and 'N with ~n and ~N in
espa~nol.aff, espa~nol.words and espa~nol.words+. If you write this
letter in the second form (~n and ~N), you have to run this script. The
script uses the sed utility and has been tested with GNU sed
version 2.05. To run it, make sure that GNU sed is installed
and type:

		make e~ne
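
The substitution described above can be sketched with sed as follows.
This is only an illustration and NOT the e~nes script shipped in the
package; the function name enye_to_tilde is made up for this example.

```shell
#!/bin/sh
# Illustrative only: not the e~nes script from the distribution.
# Reads text on standard input and rewrites the 'n / 'N coding of the
# letter as ~n / ~N, writing the result to standard output.
enye_to_tilde() {
    sed -e "s/'n/~n/g" -e "s/'N/~N/g"
}
```

A word such as a'no would come out as a~no. The sketch works as a
filter and leaves the original files untouched.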

	Makefile: Makefile for building the hash file
(espa~nol.hash) from the affix file and the espa~nol.words file.
Building the hash file this way needs about 50 MB of paging space and
100 MB of temporary disk space. Please ensure that you have
enough disk space in the tmp partition (usually /usr/tmp).
If you do not, set the TMPDIR environment variable to
a path where you can allocate 100 MB of temporary disk storage.
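
A quick way to check the temporary space before building is sketched
below. It is not part of the package: the function name has_tmp_space,
the use of "df -Pk" and the example path are assumptions for
illustration only.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a directory has at least the requested
# amount of free space. The default of 102400 KB corresponds to the
# 100 megabytes of temporary storage mentioned above.
has_tmp_space() {
    dir="${1:-${TMPDIR:-/usr/tmp}}"
    need_kb="${2:-102400}"
    # "df -P" prints one POSIX-format line per filesystem;
    # field 4 of that line is the available space in 1024-byte blocks.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "${avail_kb:-0}" -ge "$need_kb" ]
}

# Example use (the path /var/tmp/ispell is an assumption):
#   has_tmp_space /var/tmp/ispell && TMPDIR=/var/tmp/ispell make
```

Setting TMPDIR on the make command line, as in the comment, affects only
that one build and leaves the rest of the session unchanged.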

	2. To create espa~nol.hash, just type:

		make

	Quick building: If you want to create the espa~nol.hash from
	the expanded word list (espa~nol.words+), just type:

		make build

	This method needs much less temporary space.

	The size of the Spanish dictionary (espa~nol.hash) is about
3.9 Mbytes. If your file is much bigger, the cause is probably the
sort command of the operating system (Solaris 2.7 has this problem). In
this case we recommend installing the GNU textutils package and making
sure that the sort command you use is the textutils one.
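
To compare the result with the figure above, the size of the hash file
can be printed in megabytes. The helper below is a sketch, not part of
the package, and the name size_mb is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: print the size of a file in megabytes
# (1 megabyte taken as 1048576 bytes), with one decimal place.
size_mb() {
    bytes=$(wc -c < "$1")
    # LC_ALL=C keeps the decimal separator a dot regardless of locale.
    LC_ALL=C awk -v b="$bytes" 'BEGIN { printf "%.1f\n", b / 1048576 }'
}

# Example use:
#   size_mb espa~nol.hash
# To see which sort program comes first in the PATH:
#   command -v sort
```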

	3. To install the hash file, become root and type:

		make install

	4. Spanish acute characters may be coded in different ways.
	The following formatters are supported:

	Default formatter: The acute characters are coded as follows:

		'a	á
		'e	é
		'i	í
		'o	ó
		'u	ú
		'n	ñ
		"u	ü
		'A	Á
		'E	É
		'I	Í
		'O	Ó
		'U	Ú
		'N	Ñ
		"U	Ü

	TeX formatter: The acute characters are coded as follows:

		\'a	á
		\'e	é
		\'{\i}	í
		\'o	ó
		\'u	ú
		\'n	ñ
		\"u	ü
		\'A	Á
		\'E	É
		\'{\I}	Í
		\'O	Ó
		\'U	Ú
		\'N	Ñ
		\"U	Ü

	plainTeX formatter: The acute characters are coded as follows:

		\'{a}	á
		\'{e}	é
		\'{\i}	í
		\'{o}	ó
		\'{u}	ú
		\'{n}	ñ
		\"{u}	ü
		\'{A}	Á
		\'{E}	É
		\'{\I}	Í
		\'{O}	Ó
		\'{U}	Ú
		\'{N}	Ñ
		\"{U}	Ü

	latin1 formatter: The acute characters are coded as specified
	in the iso_8859_1 code.

	utf8 formatter: The acute characters are coded as specified
	in the utf8 code.

	msdos formatter: The acute characters are coded as specified
	in the extended ASCII MSDOS code.

	html formatter: The acute characters are coded as follows:

		&aacute;	á
		&eacute;	é
		&iacute;	í
		&oacute;	ó
		&uacute;	ú
		&ntilde;	ñ
		&uuml;		ü
		&Aacute;	Á
		&Eacute;	É
		&Iacute;	Í
		&Oacute;	Ó
		&Uacute;	Ú
		&Ntilde;	Ñ
		&Uuml;		Ü

	To run ispell with one of the formatters above, type:

		ispell -T <formatter> -d espa~nol <file>
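
As an illustration, the command line can be assembled for a given
formatter and file. The wrapper below is hypothetical. The formatter
names are copied from the list above, but the exact spelling that -T
accepts is defined in espa~nol.aff, so treat the names as assumptions
and check them against that file.

```shell
#!/bin/sh
# Hypothetical wrapper: print the ispell command line for a formatter.
# It only prints the command; it does not run ispell.
ispell_es_cmd() {
    formatter="$1"
    file="$2"
    case "$formatter" in
        TeX|plainTeX|latin1|utf8|msdos|html)
            echo "ispell -T $formatter -d espa~nol $file" ;;
        *)
            echo "unknown formatter: $formatter" >&2
            return 1 ;;
    esac
}

# Example use (carta.txt is a made-up file name):
#   ispell_es_cmd latin1 carta.txt
```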

	5. The espa~nol.hash file is available for MSDOS users at:

	http://www.datsi.fi.upm.es/~coes/

	Note that the affix list and the word list are still under
development. If you find words that do not appear in the word list,
or words that should not appear in it, please send a message to
espanol-bugs@datsi.fi.upm.es

Santiago Rodriguez & Jesus Carretero
Departamento de Arquitectura
y Tecnologia de Sistemas Informaticos (DATSI)
Universidad Politecnica de Madrid
December 1994
January 1995
February 1995
November 1996
April 1999
June 2001
March 2005
November 2005
May 2008
November 2010
Email: srodri@fi.upm.es, jesus.carretero@uc3m.es