.TH hunspell 1 "2011-01-21"
.LO 1
.SH NAME
hunspell \- spell checker, stemmer and morphological analyzer
.SH SYNOPSIS
hunspell [\-1aDGHhLlmnrstvw] [\-\-check\-url] [\-d dict[,dict2,...]] [\-\-help] [\-i enc] [\-P password] [\-p dict] [\-vv] [\-\-version] [file(s)]
.SH DESCRIPTION
.I Hunspell
is fashioned after the
.I Ispell
program.  The most common usage is "hunspell" or "hunspell filename".
Without a filename parameter, hunspell checks the standard input.
Typing "cat" and "exsample" on two input lines produces an asterisk
(meaning that "cat" is a correct word) and a line with corrections:
.PP
.RS
.nf
$ hunspell -d en_US
Hunspell 1.2.3
*
& exsample 4 0: example, examples, ex sample, ex-sample
.fi
.RE
.PP
Correct words are marked with '*', '+' or '\-', and unrecognized
words are marked with '#' or '&' in the output lines (see later).
(Close the standard input with Ctrl-d on Unix/Linux and
Ctrl-Z Enter or Ctrl-C on Windows.)
.PP
With filename parameters,
.I hunspell
will display each word of the files which does not appear in the dictionary at the
top of the screen and allow you to change it.  If there are "near
misses" in the dictionary, then they are
also displayed on following lines.
Finally, the line containing the
word and the previous line
are printed at the bottom of the screen.  If your terminal can
display in reverse video, the word itself is highlighted.  You have the
option of replacing the word completely, or choosing one of the
suggested words. Commands are single characters as follows
(case is ignored):
.PP
.RS
.IP R
Replace the misspelled word completely.
.IP Space
Accept the word this time only.
.IP A
Accept the word for the rest of this
.I hunspell
session.
.IP I
Accept the word, capitalized as it is in the
file, and update private dictionary.
.IP U
Accept the word, and add an uncapitalized (actually, all lower-case)
version to the private dictionary.
.IP S
Ask for a stem and a model word and store them in the private dictionary.
The stem will also be accepted with the affixes of the model word.
.IP 0-\fIn\fR
Replace with one of the suggested words.
.IP X
Write the rest of this file, ignoring misspellings, and start next file.
.IP Q
Exit immediately and leave the file unchanged.
.IP ^Z
Suspend hunspell.
.IP ?
Give help screen.
.RE
.SH OPTIONS
.IP \fB\-1\fR
Check only the first field of each line (fields are delimited by tabs).
.IP \fB\-a\fR
The
.B \-a
option
is intended to be used from other programs through a pipe.  In this
mode,
.I hunspell
prints a one-line version identification message, and then begins
reading lines of input.  For each input line,
a single line is written to the standard output for each word
checked for spelling on the line.  If the word
was found in the main dictionary, or your personal dictionary, then the
line contains only a '*'.  If the word was found through affix removal,
then the line contains a '+', a space, and the root word. 
If the word was found through compound formation (concatenation of two
words), then the line contains only a '\-'.
.IP ""
If the word
is not in the dictionary, but there are near misses, then the line
contains an '&', a space, the misspelled word, a space, the number of
near misses,
the number of
characters between the beginning of the line and the
beginning of the misspelled word, a colon, another space,
and a list of the near
misses separated by
commas and spaces.
.IP ""
Also, each near miss or guess is capitalized the same as the input
word unless such capitalization is illegal;
in the latter case each near miss is capitalized correctly
according to the dictionary.
.IP ""
Finally, if the word does not appear in the dictionary, and
there are no near misses, then the line contains a '#', a space,
the misspelled word, a space,
and the character offset from the beginning of the line.
The output for each line of text input is terminated
with an additional blank line, indicating that
.I hunspell
has completed processing the input line.
.IP ""
These output lines can be summarized as follows:
.RS
.IP OK:
*
.IP Root:
+ <root>
.IP Compound:
\-
.IP Miss:
& <original> <count> <offset>: <miss>, <miss>, ...
.IP None:
# <original> <offset>
.RE
.IP ""
For example, a dummy dictionary containing the words "fray", "Frey",
"fry", and "refried" might produce the following response to the
command "echo 'frqy refries' | hunspell \-a":
.RS
.nf
(#) Hunspell 0.4.1 (beta), 2005-05-26
& frqy 3 0: fray, Frey, fry
& refries 1 5: refried
.fi
.RE
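.IP ""
A driving program might decode these result lines as in the following
minimal Python sketch. It is an illustration only, not part of hunspell:
the function name parse_result_line and the returned keys are
hypothetical, and normal (not verbose-correction) mode is assumed.
.RS
.nf
```python
def parse_result_line(line):
    """Decode one result line of hunspell -a (normal mode)."""
    line = line.rstrip()
    if not line:
        return None  # blank line: end of the results for one input line
    kind, rest = line[0], line[1:].strip()
    if kind == "*":
        return {"status": "ok"}
    if kind == "+":
        return {"status": "root", "root": rest}
    if kind == "-":
        return {"status": "compound"}
    if kind == "&":
        # & <original> <count> <offset>: <miss>, <miss>, ...
        head, misses = rest.split(": ", 1)
        word, count, offset = head.split(" ")
        return {"status": "miss", "word": word, "count": int(count),
                "offset": int(offset), "suggestions": misses.split(", ")}
    if kind == "#":
        # # <original> <offset>
        word, offset = rest.split(" ")
        return {"status": "none", "word": word, "offset": int(offset)}
    return {"status": "other", "text": line}  # e.g. the version message
```
.fi
.RE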
.IP ""
This mode
is also suitable for interactive use when you want to figure out the
spelling of a single word (but this is the default behavior of hunspell
without -a, too).
.IP ""
When in the
.B \-a
mode,
.I hunspell
will also accept lines of single words prefixed with any
of '*', '&', '@', '+', '\-', '~', '#', '!', '%', '`', or '^'.
A line starting with '*' tells
.I hunspell
to insert the word into the user's dictionary (similar to the I command).
A line starting with '&' tells
.I hunspell
to insert an all-lowercase version of the word into the user's
dictionary (similar to the U command).
A line starting with '@' causes
.I hunspell
to accept this word in the future (similar to the A command).
A line starting with '+', followed immediately by
.B tex
or
.B nroff
will cause
.I hunspell
to parse future input according to the syntax of that formatter.
A line consisting solely of a '+' will place
.I hunspell
in TeX/LaTeX mode (similar to the
.B \-t
option) and '\-' returns
.I hunspell
to nroff/troff mode (but these commands are obsolete).
However, the string character type is
.I not
changed;
the '~' command must be used to do this.
A line starting with '~' causes
.I hunspell
to set internal parameters (in particular, the default string
character type) based on the filename given in the rest of the line.
(A file suffix is sufficient, but the period must be included.
Instead of a file name or suffix, a unique name, as listed in the language
affix file, may be specified.)
However, the formatter parsing is
.I not
changed;  the '+' command must be used to change the formatter.
A line prefixed with '#' will cause the
personal dictionary to be saved.
A line prefixed with '!' will turn on
.I terse
mode (see below), and a line prefixed with '%' will return
.I hunspell
to normal (non-terse) mode.
A line prefixed with '`' will turn on verbose-correction mode (see below);
this mode can only be disabled by turning on terse mode with '%'.
.IP ""
Any input following the prefix
characters '+', '\-', '#', '!', '%', or '`' is ignored, as is any input
following the filename on a '~' line.
To allow spell-checking of lines beginning with these characters, a
line starting with '^' has that character removed before it is passed
to the spell-checking code.
It is recommended that programmatic interfaces prefix every data line
with an uparrow ('^') to protect themselves against future changes in
.IR hunspell .
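.IP ""
The recommendation above can be sketched in Python as follows. This is
an illustration only: the helper name protect_lines is hypothetical, and
the caller is assumed to send the returned lines to a running
"hunspell \-a" process itself.
.RS
.nf
```python
def protect_lines(text):
    """Return the lines of text, each prefixed with '^'.

    With the prefix, a data line that happens to start with a command
    character such as '*' or '!' is spell-checked instead of executed.
    """
    return ["^" + line for line in text.splitlines()]
```
.fi
.RE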
.IP ""
To summarize these:
.IP ""
.RS
.IP *
Add to personal dictionary
.IP &
Add all-lowercase version to personal dictionary
.IP @
Accept word, but leave out of dictionary
.IP #
Save current personal dictionary
.IP ~
Set parameters based on filename
.IP +
Enter TeX mode
.IP \-
Exit TeX mode
.IP !
Enter terse mode
.IP %
Exit terse mode
.IP "`"
Enter verbose-correction mode
.IP ^
Spell-check rest of line
.fi
.RE
.IP ""
In
.I terse
mode,
.I hunspell
will not print lines beginning with '*', '+', or '\-', all of which
indicate correct words.
This significantly improves running speed when the driving program is
going to ignore correct words anyway.
.IP ""
In
.I verbose-correction
mode,
.I hunspell
includes the original word immediately after the indicator character
in output lines beginning with '*', '+', and '\-', which simplifies
interaction for some programs.

.IP \fB\-\-check\-url\fR
Check URLs, e-mail addresses and directory paths.

.IP \fB\-D\fR
Show the detected path of the loaded dictionary, and list the
search path and the available dictionaries.

.IP \fB\-d\ dict,dict2,...\fR
Set dictionaries by their base names with or without paths.
Example of the syntax:
.PP          
\-d en_US,en_geo,en_med,de_DE,de_med
.PP          
en_US and de_DE are base dictionaries: each consists of
an aff and dic file pair (en_US.aff, en_US.dic and de_DE.aff, de_DE.dic).
en_geo, en_med and de_med are special dictionaries, that is, dictionaries
without an affix file. Special dictionaries are optional extensions
of the base dictionaries, usually containing special (medical, legal etc.)
terms. There is no naming convention for special dictionaries
apart from the ".dic" extension: a dictionary without an affix file
extends the preceding base dictionary (the right order of the
parameter list is needed for good suggestions). The first
item of the \-d parameter list must be a base dictionary.
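.PP
The ordering rule can be illustrated with a short Python sketch. It is
not how hunspell itself is implemented: the function name
group_dictionaries and the caller-supplied test has_affix_file are
hypothetical, and the sketch only restates the rule described above.
.RS
.nf
```python
def group_dictionaries(d_argument, has_affix_file):
    """Group a -d list into (base, [special dictionaries]) pairs.

    has_affix_file is a caller-supplied function (hypothetical) that
    reports whether a base name has a matching affix file.
    """
    groups = []
    for name in d_argument.split(","):
        if has_affix_file(name):
            groups.append((name, []))       # a new base dictionary
        elif groups:
            groups[-1][1].append(name)      # extends the preceding base
        else:
            raise ValueError("first item of -d must be a base dictionary")
    return groups
```
.fi
.RE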

.IP \fB\-G\fR
Print only correct words or lines.

.IP \fB\-H\fR
The input file is in SGML/HTML format.

.IP \fB\-h,\ \-\-help\fR
Short help.

.IP \fB\-i\ enc\fR
Set input encoding.

.IP \fB\-L\fR
Print lines with misspelled words.

.IP \fB\-l\fR
The "list" option
is used to produce a list of misspelled words from the standard input.

.IP \fB\-m\fR
Analyze the words of the input text (see also hunspell(4) about
morphological analysis). If the dictionary has no morphological data,
show the flags of the affixes of the word forms instead, for dictionary
developers.

.IP \fB\-n\fR
The input file is in nroff/troff format.

.IP \fB\-P\ password\fR
Set password for encrypted dictionaries.

.IP \fB\-p\ dict\fR
Set path of personal dictionary.
The default dictionary depends on the locale settings. The
following environment variables are searched: LC_ALL,
LC_MESSAGES, and LANG. If none are set then the default personal
dictionary is $HOME/.hunspell_default.

If
.I \-d
or the
.I DICTIONARY
environment variable is set, the personal dictionary is
.BR $HOME/.hunspell_dicname .

.IP \fB\-r\fR
Warn about rare words, which are also potential spelling mistakes.

.IP \fB\-s\fR
Stem the words of the input text (see also hunspell(4) about
stemming). The result depends on the dictionary data.

.IP \fB\-t\fR
The input file is in TeX or LaTeX format.

.IP \fB\-v,\ \-\-version\fR
Print version number.

.IP \fB\-vv\fR
Print ispell(1) compatible version number.

.IP \fB\-w\fR
Print misspelled words (= lines) from input that has one word per line.

.SH EXAMPLES
.TP
.B hunspell \-d en_US english.html
.TP
.B hunspell \-d en_US,en_US_med medical.txt
.TP
.B hunspell \-d ~/openoffice.org2.4/share/dict/ooo/de_DE
.TP
.B hunspell *.html
.TP
.B hunspell \-l text.html
.SH ENVIRONMENT
.TP
.B DICTIONARY
Similar to 
.I \-d. 
.TP
.B DICPATH
Dictionary path.
.TP
.B WORDLIST
Equivalent to 
.I \-p.
.SH FILES
The default dictionary depends on the locale settings. The
following environment variables are searched: LC_ALL,
LC_MESSAGES, and LANG. If none are set then the following
fallbacks are used:

.BI /usr/share/myspell/default.aff
Path of default affix file. See hunspell(4).
.PP
.BI /usr/share/myspell/default.dic
Path of default dictionary file.
See hunspell(4).
.PP
.BI $HOME/.hunspell_default.
Default path to personal dictionary.
.SH SEE ALSO
.B hunspell(3), hunspell(4)
.SH AUTHOR
The author of the Hunspell executable is László Németh. For the Hunspell library,
see hunspell(3).
.PP
This manual is based on Ispell's manual. See ispell(1).
.SH BUGS
There are some layout problems with long lines.