Contributors - ispell (f415ee52-d38b-40b7-9620-58e0205650b2/main)

Tree @f415ee52-d38b-40b7-9620-58e0205650b2/main (Download .tar.gz)

Contributors @f415ee52-d38b-40b7-9620-58e0205650b2/main — raw · history · blame

Ispell has a long and convoluted history.  I have tried to track down
as much as possible about it and condense it below.

THE DEVELOPMENT OF SPELL-CHECKING AND THE FIRST ISPELL

The following background information on spelling checkers in general,
and ispell in particular, was provided to me by Les Earnest
(les@dec-lite.stanford.edu):

> The earliest spelling checker (of sorts) of which I am aware was in a
> program that attempted to automatically receive human-keyed Morse
> code, which can be ambiguous because of the variable timing between
> dots, dashes, intercharacter pauses, and interword pauses.  This
> program didn't use a full dictionary; instead, used a table of
> digraphs (two-letter sequences) that occur in English and barred
> improper letter sequences.  This program was written by someone at MIT
> Lincoln Lab around 1959 and, I think, ran on the TX-2 computer there.
> Unfortunately, I don't remember his name.  I might still have the
> paper he wrote in my files but it would take a major search to find it
> and I might not succeed.
>
> A program that I wrote in 1961 to read cursive writing contained a
> real spelling checker, using the 10,000 most common English words.
> It is reported in:
>   L. Earnest, "Machine Recognition of Cursive Writing," Information
>   Processing 62, (Proc. IFIP Congress 1962, Munich), North-Holland,
>   Amsterdam, 1963.
> and
>   N. Lindgren, ``Machine Recognition of Human Language, Part III -
>   Cursive Script Recognition'', IEEE Spectrum, May 1965.
>
> I brought that dictionary to Stanford and got a PhD student to write
> a spelling checker for text in Lisp running on our PDP-6 computer at
> the Stanford Artificial Intelligence Lab around 1967.
> Unfortunately, I do not remember which student it was; it could have
> been Gil Falk.  It was a rather simple program (certainly much
> simpler than the earlier cursive writing program) and I didn't think
> of it as a significant development at the time.
>
> [Later], I got another PhD student, Ralph Gorin, to do a better and
> faster spelling checker sometime in the early '70s, still using my
> old dictionary.  Ralph later wrote an article about it in CACM.  I
> believe that he later augmented the dictionary.

[note: Ralph has since informed me that he wrote no such article.  The
program was called SPELL and was written in 1971.  Ralph provided me
with a reference to "Computer Programs for Spelling Correction", by
James L. Peterson, Springer-Verlag, Berlin, 1980, No. 96 in the series
"Lecture Notes in Computer Science."  This book states that Ralph's
SPELL program, which was the direct ancestor of ispell, was the first
computer program written for checking the spelling of text documents.
The book is also a good source of references on spelling programs.]

> ...
>    
> [Ispell] was originally written in PDP-10 assembly language and ran
> under the WAITS operating system, which is similar to TOPS-10 but existed
> only on SAIL (a dual processor KA10/PDP-6 system).  It was and is called
> SPELL on that machine.  It later was modified to run under Tenex and
> TOPS-20.

[Ralph mentions that SPELL was also ported to MIT's ITS and TOPS-10.]
   
The Tenex version of ispell was later revised by W. E. Matson (1974),
and Bill Ackerman (1978).  Bill has provided the following information:

> I came across the SPELL program in 1978 on ITS.  It was a port from
> Stanford, and had the names Ralph Gorin (approximately 1971) and
> Wayne Matson (1974) associated with it.  I did 3 things to it:
>
>    Rewrote it as a native program for ITS, and, shortly thereafter,
>       TOPS-20.  (I never did anything for TOPS-10, and am not aware
>       that it ever ran on TOPS-10, though it may have.)
>
>    Replaced the heuristics for suffix removal, which I found unreliable
>       and unsatisfactory, with an algorithm that was driven by specific
>       suffix flags in the dictionary.  This way, the dictionary would have
>       complete control over what words were legal, and there would be no
>       spurious hits.
>
>    Apparently most importantly, though I had no idea at time, gave it
>       the name "ISPELL", for "ITS version of spell", since I didn't
>       consider myself authorized to throw away an existing program
>       and overwrite it with a new one under the same name.
>
> I have not followed the history of the program since then, and do not know
> if it still uses the "suffix flags" in its dictionary.  But if it does,
> I introduced them.  The Ispell algorithm that uses those flags to make
> accurate decisions about the legality of words was documented in great
> detail in James Peterson's Springer-Verlag book.  (He spent a semester
> at MIT while working on the book, and I provided him with a lot of
> information and documentation at that time.)
>
>                            Bill Ackerman
>                            wba@apollo.hp.com

Michael Adler adds:

> I did work on ispell in 1982.  Actually, I stole the ispell
> dictionary and suffix compression algorithm and wrote a spelling
> checker for CP/M in 8080 assembler that I very creatively called "SPELL."
> By sorting the dictionary alphabetically and using a difference encoding
> I managed to pack the entire dictionary that Bill was using in about
> 56Kb.  The CP/M program read a document, sorted all the words alphabetically
> and then checked them.  It then reread the document and compared words as
> it found them against the in memory, sorted and checked words.  SPELL was
> around in the public domain on CP/M.
>
> I was in high school at the time and talked to Bill only over email.
> We wound up in the same compiler group at Apollo in the late 80's by
> coincidence.

DEVELOPMENT OF THE C/UNIX VERSION OF ISPELL

In 1983, Pace Willisson (pace@prep.ai.mit.edu) wrote a C/Unix version
from scratch, based on the ispell documentation.

In 1987, Walt Buehring revised and enhanced ispell, and posted it to the
Usenet along with a dictionary.  In addition, Walt wrote the first version
of "ispell.el", the emacs interface.

Geoff Kuenning (geoff@ITcorp.com, that's me, and by the way I
pronounce it "Kenning"; the "u" is silent) picked up this version,
fixed some bugs, and added further enhancements, all of which made me
the de-facto ispell maintainer for the net.  I also put quite a bit of
work into improving the quality of the dictionaries.  In 1987 I began
work on the "munchlist" script, which I originally intended to be used
to add flags to personal dictionary entries.  At the same time I was
studying German, and wanted to use ispell to check the papers I was
writing for that class.  After thinking about it for some time, I
realized that the suffix flags could be table-driven, which would both
add flexibility and would get rid of certain difficult-to-find bugs.
In 1988 I rewrote major portions of the code to do this, resulting in
the first multi-lingual version.  Ole Bjoern Hessen (obh@ifi.uio.no)
in Norway alpha-tested this version and provided several important
enhancements.

Bob Devine (vianet!devine) provided two larger dictionaries (which
became the basis for english.1 and english.2) to me for inclusion
with subsequent releases.

Ashwin Ram (ram@cs.yale.edu) made substantial enhancements to Walt
Buehring's emacs interface, and provided them to me for inclusion
with an earlier release.

The emacs interface was then completely overhauled by Ken Stevens
(stevens@hplabs.hp.com), who also beta-tested the software and
without whom this posting would not have been possible.  If there's a
feature in the emacs interface that you like, you probably have Ken to
thank for it.  His efforts have been tireless for many years.

Martin Boyer made major contributions to the munchlist script,
including producing a version that runs under perl (see
languages/Where for instructions on how to get that version).
Philippe-Andre Prindeville provided xspell (a Motif-based X
interface), and Moritz Willers provided a NeXTStep interface.

DEVELOPMENT AND DEATH OF ISPELL 4.0

Meanwhile, and unbeknownst to me, Pace Willisson was working on his
own improvements to ispell.  He focused primarily on dictionary size
and startup time.  His solution was a dictionary compression algorithm
that detected and encoded frequent letter pairs.  This also reduced
the time needed to read it in.  Pace also changed some internal data
structures to improve startup time.  Pace and I eventually discovered
each other's efforts, and discussed re-merging our changes, but we
decided that there would be too much work involved.  This was partly
because I was close to a release and didn't want to delay it with an
extensive and error-prone merge.

In late 1992 (if my memory serves correctly), Richard Stallman
contacted me, asking for permission to distribute ispell as part of
the GNU suite.  I responded that he was welcome to distribute it, but
that I was not willing to place my software under the Gnu Public
License.  Through a misunderstanding, neither of us considered the
possibility of finding a compromise license that both could live with.
So Richard started a search for an alternate version, and found Pace
working right in his back yard.

I have been told that when FSF first learned of Pace's version, they
again considered using International Ispell instead because it was
both more popular and more capable, but this idea was rejected due to
the license misunderstanding.  Instead, FSF enhanced Pace's version
somewhat and called it ispell 4.0, apparently in the hopes that by
numbering the version higher, it would become the standard.

When ispell 4.0 was released, much confusion ensued.  Many ispell
users innocently "upgraded" to 4.0 and then screamed when they could
not find features to which they had grown accustomed.  Europeans in
general were angered by the apparent provincialism shown by the
"dropping" of international support.  I found myself inundated with
questions about a version I had never heard of or seen.

One of the earliest and most common suggestions was that FSF should
rename their version "gispell".  This had a lot of precedent, both in
the naming of other FSF utilities and in the then-recent change of the
suffix used by gzip from ".z" to ".gz".  Unfortunately, the FSF
refused to do this.  I may have inadvertently contributed to this
refusal with a Usenet posting in which I tried to clarify what had
happened, pointing out that the FSF version was more recently related
to Pace's than my own.  This may have been seen as an acknowledgment
that FSF should have the rights to the name "ispell," and that I
should rename my version.

A flame war arose, and I decided that the only way to solve the
problem was to rename my version to eliminate the confusion.  However,
at about the same time Richard Stallman and I began negotiating via
e-mail.  We itemized and clarified his objections to my license, and I
learned from a third party that FSF is willing to distribute software
that falls under the University of California license (also known as
the Berkeley license).  Richard and I agreed that if I changed my
license to be a paraphrase of the UC license, FSF would be willing to
distribute my version with no changes.  Since then, ispell 4.0 has
been dropped by FSF and has pretty well disappeared from the net,
leaving International Ispell as the version of choice for nearly everyone.

OTHER CONTRIBUTORS

Numerous other people have contributed enhancements, suggestions, and
bug fixes.  I used to attempt to keep track of all of them, and list
their names here.  However, I found that keeping the list of names
correct was as time-consuming as fixing bugs, so I finally decided
that the list had grown too long, and stopped keeping it current.
This is unfortunate, because many people have made significant
contributions to ispell and I would like to have a way to acknowledge
them.  So please remember that hundreds of people have helped in this
effort, and that I appreciate each and every one of them.