Hunspell 3

Hunspell is a spell checker, stemmer, and morphological analyzer library. It provides functions for spell checking, suggesting corrections, stemming words, analyzing morphology, and generating morphological variants. It works by loading affix and dictionary files, and supports additional dictionaries. It also has functions for adding and removing words and getting encoding and version information.

hunspell(3) hunspell(3)

NAME
hunspell - spell checking, stemming, morphological generation and analysis
SYNOPSIS
#include <hunspell/hunspell.hxx> /* or */
#include <hunspell/hunspell.h>

Hunspell(const char *affpath, const char *dpath);

Hunspell(const char *affpath, const char *dpath, const char * key);

~Hunspell();

int add_dic(const char *dpath);

int add_dic(const char *dpath, const char *key);

int spell(const char *word);

int spell(const char *word, int *info, char **root);

int suggest(char***slst, const char *word);

int analyze(char***slst, const char *word);

int stem(char***slst, const char *word);

int stem(char***slst, char **morph, int n);

int generate(char***slst, const char *word, const char *word2);

int generate(char***slst, const char *word, char **desc, int n);

void free_list(char ***slst, int n);

int add(const char *word);

int add_with_affix(const char *word, const char *example);

int remove(const char *word);

char * get_dic_encoding();

const char * get_wordchars();

unsigned short * get_wordchars_utf16(int *len);

struct cs_info * get_csconv();

const char * get_version();

DESCRIPTION
The Hunspell library routines give the user word-level linguistic functions: spell checking and correction,
stemming, morphological generation and analysis in item-and-arrangement style.
The optional C header contains the C interface of the C++ library. It provides Hunspell_create and
Hunspell_destroy in place of the constructor and destructor, and its wrapper functions take an extra
Hunhandle parameter (the allocated object); see the C header file hunspell.h.
The basic spelling functions, spell() and suggest(), can also be used for stemming, morphological generation
and analysis when they are given XML input (see XML API).
Constructor and destructor
Hunspell’s constructor takes the paths of the affix and dictionary files. See the hunspell(4) manual page for
the dictionary format. The optional key parameter is for dictionaries encrypted by the hzip tool of the
Hunspell distribution.
Extra dictionaries
The add_dic() function loads an extra dictionary file. The extra dictionaries use the affix file of the allocated
Hunspell object. The maximum number of extra dictionaries is limited in the source code (20).
Spelling and correction
The spell() function returns non-zero if the input word is recognised by the spell checker, and zero if not.
The optional output parameters return a bit array (info) and the root word of the input word. The info bits,
tested with the SPELL_COMPOUND, SPELL_FORBIDDEN and SPELL_WARN macros, mark compound
words, explicitly forbidden words and probably bad words. From version 1.3, the non-zero return value is 2
for dictionary words with the "WARN" flag (probably bad words).
The suggest() function has two parameters: a reference variable for the output suggestion list, and an
input word. The function returns the number of suggestions. The reference variable will contain the
address of the newly allocated suggestion list, or NULL if the return value of suggest() is zero. The
maximum number of suggestions is limited in the source code.
The spell() and suggest() functions also recognize XML input; see the XML API section.
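The return-value convention of spell() described in this section can be captured in a small helper. The following is only a sketch: SpellVerdict and classify_spell_result are hypothetical names that are not part of Hunspell, and the value 2 is treated as a warning only because the text above says so for version 1.3 and later.

```cpp
// Hypothetical helper, not part of Hunspell: names the three documented
// outcomes of spell() so that callers do not compare against bare numbers.
enum class SpellVerdict { Unknown, Accepted, AcceptedWithWarning };

inline SpellVerdict classify_spell_result(int rc) {
    if (rc == 0) return SpellVerdict::Unknown;              // word not recognised
    if (rc == 2) return SpellVerdict::AcceptedWithWarning;  // "WARN" flag (1.3+)
    return SpellVerdict::Accepted;                          // any other non-zero
}
```

With an allocated Hunspell object named speller (an assumed variable), the call would look like classify_spell_result(speller.spell("word")).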
Morphological functions
The plain stem() and analyze() functions are similar to suggest(), but instead of suggestions they return
stems and the results of the morphological analysis. The plain generate() expects a second word as well.
This extra word and its affixation serve as the model for the morphological generation of the requested
forms of the first word.
The extended stem() and generate() use the results of a morphological analysis:
char ** result, ** result2;
int n1 = analyze(&result, "words");
int n2 = stem(&result2, result, n1);
The morphological annotation of the Hunspell library has fixed field identifiers (two letters and a colon), see
the hunspell(4) manual page.
char ** result;
char affix[] = "is:plural"; // the description depends on the dictionary, too
char * desc = affix; // generate() takes an array of writable strings
int n = generate(&result, "word", &desc, 1);
for (int i = 0; i < n; i++) printf("%s\n", result[i]);
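Because every field identifier is two letters followed by a colon, one analysis line can be split into identifier and value pairs with a few lines of code. The sketch below assumes that fields are separated by whitespace; split_morph_fields is a hypothetical helper that is not part of Hunspell, and which identifiers actually appear depends on the dictionary.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of Hunspell: splits one analysis line into
// (identifier, value) pairs, relying only on the "two letters and a colon"
// field format. Tokens that do not have that shape are skipped.
std::vector<std::pair<std::string, std::string>>
split_morph_fields(const std::string& analysis) {
    std::vector<std::pair<std::string, std::string>> fields;
    std::istringstream in(analysis);
    std::string token;
    while (in >> token) {  // assumption: fields are whitespace-separated
        if (token.size() >= 3 && token[2] == ':')
            fields.emplace_back(token.substr(0, 2), token.substr(3));
    }
    return fields;
}
```

For a made-up input such as " st:word po:noun is:plural", the helper yields three pairs whose identifiers are "st", "po" and "is".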
Memory deallocation
The free_list() function frees the memory allocated by the suggest(), analyze(), generate() and stem() functions.
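One way to keep allocation and deallocation paired is to copy the returned list into a container and call free_list() straight away. The following is a minimal sketch, not part of the library: copy_suggestions is a hypothetical helper, written as a template so that it compiles without the Hunspell headers. It assumes only that the Speller type has the suggest() and free_list() members with the signatures listed in the SYNOPSIS.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of Hunspell. Speller is any type with the
// suggest() and free_list() members from the SYNOPSIS, for example Hunspell.
template <class Speller>
std::vector<std::string> copy_suggestions(Speller& speller, const char* word) {
    char** list = nullptr;
    int n = speller.suggest(&list, word);  // the library allocates the list
    std::vector<std::string> out;
    for (int i = 0; i < n; ++i)
        out.emplace_back(list[i]);         // deep copy of each suggestion
    if (list != nullptr)
        speller.free_list(&list, n);       // release the library's memory
    return out;
}
```

The same pattern would apply to the plain analyze() and stem() calls, since they return their results in the same kind of list.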
Other functions
The add(), add_with_affix() and remove() functions are helpers for implementing a personal dictionary: they
add words to and remove words from the base dictionary at run time. The add_with_affix() function uses a
second word as a model for the affixation allowed for the new word.
The get_dic_encoding() function returns "ISO8859-1" or the character encoding defined in the affix file
with the "SET" keyword.
The get_csconv() function returns the 8-bit character case table of the encoding of the dictionary.
The get_wordchars() and get_wordchars_utf16() functions return the extra word characters defined in the
affix file for tokenization by the "WORDCHARS" keyword.

The get_version() function returns the version string of the library.

XML API
The spell() function returns non-zero for the "<?xml?>" input, indicating XML API support.
The suggest() function stems, analyzes and generates the forms of the input word if it is given in one of
the following "SPELLML" syntaxes:
<?xml?>
<query type="analyze">
<word>dogs</word>
</query>
<?xml?>
<query type="stem">
<word>dogs</word>
</query>
<?xml?>
<query type="generate">
<word>dog</word>
<word>cats</word>
</query>
<?xml?>
<query type="generate">
<word>dog</word>
<code><a>is:pl</a><a>is:poss</a></code>
</query>
The outputs of the type="stem" query and the stem() library function are the same. The output of the
type="analyze" query is a string containing a <code><a>result1</a><a>result2</a>...</code> element. This
element can be used in the second syntax of the type="generate" query.
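The single-word query forms above are plain strings, so they can be assembled before being handed to suggest(). The following is a sketch under stated assumptions: spellml_query is a hypothetical helper that is not part of Hunspell, it reproduces the line layout of the examples above, and it inserts the word verbatim without escaping XML special characters.

```cpp
#include <string>

// Hypothetical helper, not part of Hunspell: builds the one-word SPELLML
// query shown in the manual for type="analyze" or type="stem".
// Assumption: the word contains no characters that need XML escaping.
std::string spellml_query(const std::string& type, const std::string& word) {
    return "<?xml?>\n"
           "<query type=\"" + type + "\">\n"
           "<word>" + word + "</word>\n"
           "</query>";
}
```

The resulting string would then be passed as the word argument of suggest(), for example spellml_query("stem", "dogs").c_str().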
EXAMPLE
See analyze.cxx in the Hunspell distribution.
AUTHORS
Hunspell is based on Ispell’s spell checking algorithms and OpenOffice.org’s MySpell source code.
Author of International Ispell is Geoff Kuenning.
Author of MySpell is Kevin Hendricks.
Author of Hunspell is László Németh.
Author of the original C API is Caolan McNamara.
Author of the Aspell table-driven phonetic transcription algorithm and code is Björn Jacke.
See also the THANKS and Changelog files of the Hunspell distribution.

2011-02-01
