GaelSpell: English summary
Kevin P. Scannell
Summary
This page has been provided as an aid to package maintainers and others who might be unable to read the GaelSpell home page, which is entirely in Irish. It is in no sense a translation of the Irish page (which contains much more detailed descriptions).
For Linux users, we have packages which provide Irish language support for the most widely used spellcheckers in the Open Source community: ispell-gaeilge for Geoff Kuenning's International Ispell, aspell-gaeilge for Kevin Atkinson's Aspell, and hunspell-gaeilge for the OpenOffice.org spellchecker. The word lists are identical, just packaged differently.
Diarmaid Mac Mathúna has also repackaged the same underlying word list for use on Windows machines; it is available (under the GPL) from the GaelSpell site: www.gaelspell.com.
Features
- Large Word List. There are around 330,000 words in the database; this is, by my estimates, about five times larger than the new Irish spellchecker released by Microsoft (I can't tell for sure -- it's closed-source!). The coverage is equivalent to a dictionary with around 26,000 headwords -- almost twice as big as a typical pocket dictionary (e.g. the Oxford or the Collins Gem).
- Grammatical Completeness. I have written software which generates every inflected form of a dictionary headword when provided with a limited amount of grammatical information. For instance, by adding the word fuaimnigh to the underlying database as a second conjugation verb, 87 inflected forms are added to the word list (all verb endings plus lenition, eclipsis, prefix "d'", etc.).
- Accuracy. The only absolute rule when generating a spellchecker is that there should be no misspelled words in the basic word lists. Every word has been checked against print sources at least once. The software which generates the inflected forms has been tested in various ways, including through the use of the shell script "igcheck", which checks a word list for letter combinations that are illegal or "pre-standard" in Irish. The other word lists I've seen contain anywhere from 10% to 40% English or misspelled Irish words.
- Frequent Updates. I have provided major updates every six months or so since the initial release and plan to continue this for the foreseeable future. Candidates for addition to the word list are harvested via statistical methods as part of the Crúbadán web crawling project; this is an effective way of keeping up with the latest terminology. I have also been adding words from the print dictionaries published by An Gúm and the resources available from acmhainn.ie.
- Dialect support (ispell only). There are three different installation options included with the ispell-gaeilge package, described below under Alternate Models.
- Phonetic support (aspell only). The file gaeilge_phonet.dat provides a complete "coarse" encoding of the pronunciation of Irish. This allows aspell to make more intelligent suggestions when it comes across a misspelled word. For instance, where ispell gives no suggestions for the pre-standard imfhiosach, aspell uses the phonetics file to encode this as "*M*S*K", thereby recognizing and suggesting the correct spelling iomasach.
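The phonetic matching described in the last item can be illustrated with a toy encoder. This is only a sketch: the three rules below (silent "fh", "ch" mapped to K, vowel runs collapsed to "*") are assumptions chosen to reproduce the one example given above, and they are not the actual contents of gaeilge_phonet.dat. The function names coarse_code and suggest are hypothetical.

```python
import re

# Irish vowels, with and without the acute accent (fada).
VOWELS = "aáeéiíoóuú"

def coarse_code(word: str) -> str:
    """Toy 'coarse' phonetic code; NOT the real gaeilge_phonet.dat rules."""
    w = word.lower()
    w = w.replace("fh", "")             # assumed rule: lenited f is silent
    w = w.replace("ch", "K")            # assumed rule: ch is coded as K
    w = re.sub(f"[{VOWELS}]+", "*", w)  # assumed rule: any vowel run becomes *
    return w.upper()

def suggest(misspelling: str, word_list: list[str]) -> list[str]:
    """Return the words whose coarse code matches that of the misspelling."""
    target = coarse_code(misspelling)
    return [w for w in word_list if coarse_code(w) == target]

print(coarse_code("imfhiosach"))  # *M*S*K
print(suggest("imfhiosach", ["iomasach", "muid", "bruitíneach"]))  # ['iomasach']
```

Because both spellings collapse to the same code, a lookup by code finds the standard form even though the two strings differ in several letters.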
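The initial mutations mentioned under Grammatical Completeness (lenition, eclipsis, and the prefix "d'") follow regular spelling rules, so a small sketch can show how one verb form fans out into several word-list entries. This is an illustration only, not the GaelSpell generator: it assumes lowercase input, ignores verb endings entirely, and does not handle eclipsis of vowels. All function names are hypothetical.

```python
VOWELS = set("aáeéiíoóuú")
LENITABLE = set("bcdfgmpst")
ECLIPSIS = {"b": "mb", "c": "gc", "d": "nd", "f": "bhf",
            "g": "ng", "p": "bp", "t": "dt"}

def lenite(word: str) -> str:
    """Insert 'h' after a lenitable initial consonant."""
    if not word or word[0] not in LENITABLE:
        return word
    # s is lenited only before a vowel or l, n, r (so sc-, sp-, st- stay as is).
    if word[0] == "s" and (len(word) < 2 or word[1] not in VOWELS | set("lnr")):
        return word
    return word[0] + "h" + word[1:]

def eclipse(word: str) -> str:
    """Prefix the eclipsing consonant(s); vowel-initial words are left alone here."""
    if word and word[0] in ECLIPSIS:
        return ECLIPSIS[word[0]] + word[1:]
    return word

def with_d_prefix(word: str) -> str:
    """Lenite, then add d' before a vowel or before fh."""
    lenited = lenite(word)
    if lenited and (lenited[0] in VOWELS or lenited.startswith("fh")):
        return "d'" + lenited
    return lenited

def initial_mutations(word: str) -> set[str]:
    """All initial-mutation variants of a single form."""
    return {word, lenite(word), eclipse(word), with_d_prefix(word)}

print(sorted(initial_mutations("fuaimnigh")))
```

Applying variants like these to every verb ending is one way a single headword could expand into dozens of forms; the actual count of 87 depends on the full set of endings, which this sketch does not model.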
Alternate Models
The default word list conforms strictly to standardized Irish spelling. You can generate either a "literary" or "dialect" model (ispell only) by changing the variable INSTALLATION at the top of the Makefile to gaeilgelit or gaeilgemor and using "make" as usual.
The gaeilgelit model contains many obsolete or obscure (but standardly spelled) words which are probably best left out of any good Irish spellchecker. For instance, brúitíneach (a stumpy or stuffy person in Ó Dónaill) is a likely misspelling of the much more common word bruitíneach (the measles). Other typical "dangerous" word pairs: deirc for déirc, múid for muid, etc.
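The "dangerous" pairs above all differ only in the placement of an acute accent (fada), so one way to surface such candidates in a word list is to group words by their accent-stripped form. The following is a minimal sketch under that assumption; it is not part of the GaelSpell tooling, and the function names are hypothetical.

```python
import unicodedata
from collections import defaultdict

COMBINING_ACUTE = "\u0301"

def strip_fada(word: str) -> str:
    """Remove acute accents by decomposing and dropping the combining mark."""
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(ch for ch in decomposed if ch != COMBINING_ACUTE)
    return unicodedata.normalize("NFC", stripped)

def accent_confusables(words: list[str]) -> list[list[str]]:
    """Group words that become identical once accents are removed."""
    groups = defaultdict(set)
    for w in words:
        w = unicodedata.normalize("NFC", w.lower())
        groups[strip_fada(w)].add(w)
    # Keep only groups with at least two distinct spellings.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)

words = ["brúitíneach", "bruitíneach", "deirc", "déirc", "múid", "muid", "iomasach"]
for group in accent_confusables(words):
    print(group)
```

Each group printed is a set of candidates for manual review; deciding which member is too obscure to keep in the default list still requires judgment about word frequency.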
The gaeilgemor model, on the other hand, contains non-standard or dialect spellings (alongside the standard spellings) and accepts non-standard inflections of verbs. This greatly reduces its effectiveness as a spellchecking tool; indeed, anyone who uses non-standard forms so frequently that he or she finds the standard model inadequate will likely disagree with the very concept of an Irish spellchecker in the first place!
With all this in mind, I strongly urge installers to make the standard model the default on their systems.

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.