org.getopt.stempel
Class Stemmer

java.lang.Object
  extended by org.getopt.stempel.Stemmer

public class Stemmer
extends java.lang.Object

The Stemmer class is a convenient facade for other stemmer-related classes. The core stemming algorithm and its implementation are taken verbatim from the Egothor project (www.egothor.org).

Even though the stemmer tables supplied in the distribution package are built for the Polish language, nothing in this class is language-specific.

Author:
Andrzej Bialecki <ab@getopt.org>

Field Summary
 int MIN_LENGTH
          Minimum length of input words to be processed.
 
Constructor Summary
Stemmer()
          Create a Stemmer using the stemmer table loaded from the resource path given by the System property org.getopt.stempel.table.
Stemmer(java.lang.String stemmerTable)
          Create a Stemmer using the selected stemmer table.
Stemmer(Trie stemmer)
          Create a Stemmer using a pre-loaded stemmer table.
 
Method Summary
 java.lang.String getTableResPath()
          Return the resource path to the stemmer table, or null if the Stemmer was initialized with a pre-loaded table.
static void main(java.lang.String[] args)
          Testing method.
 java.lang.String stem(java.lang.String word, boolean hideMissing)
          Stem a word.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

MIN_LENGTH

public int MIN_LENGTH
Minimum length of input words to be processed. Shorter words are returned unchanged.

Constructor Detail

Stemmer

public Stemmer()
Create a Stemmer using the stemmer table loaded from the resource path given by the System property org.getopt.stempel.table. If this property is not set, the included stemmer_2000.out table is used.
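The property fallback described above can be sketched as follows. This is a minimal illustration, not the library's code: the class TablePathSketch and its method are hypothetical, and the default is written as the bare file name because the exact resource path of the bundled table is not stated here.

```java
/** Hypothetical helper illustrating the documented fallback; not part of Stempel. */
class TablePathSketch {
    // Property name as documented for the no-argument constructor.
    static final String PROPERTY = "org.getopt.stempel.table";

    // Assumption: only the file name is known from this page, not its full resource path.
    static final String DEFAULT_TABLE = "stemmer_2000.out";

    /** Returns the configured table path, or the default when the property is not set. */
    static String resolveTablePath() {
        return System.getProperty(PROPERTY, DEFAULT_TABLE);
    }

    public static void main(String[] args) {
        System.out.println(resolveTablePath());
    }
}
```

In practice the property would typically be set on the command line with -Dorg.getopt.stempel.table=... before the Stemmer is constructed.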


Stemmer

public Stemmer(java.lang.String stemmerTable)
Create a Stemmer using the selected stemmer table.

Parameters:
stemmerTable - resource path to stemmer table. This resource will be looked up using this class's ClassLoader.
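Because the table is resolved as a classpath resource rather than a file system path, it can help to check that the resource is visible before constructing the Stemmer. The sketch below is an assumption-laden illustration: ResourceLookupSketch is a hypothetical class, and it uses Class.getResourceAsStream, which may resolve relative paths differently from the lookup the Stemmer itself performs.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical helper; shows a generic classpath lookup, not Stempel's own loading code. */
class ResourceLookupSketch {
    /** Returns true if the given resource path can be opened through this class's loader. */
    static boolean resourceExists(String resPath) {
        // getResourceAsStream returns null when the resource is not found.
        try (InputStream in = ResourceLookupSketch.class.getResourceAsStream(resPath)) {
            return in != null;
        } catch (IOException e) {
            // Only close() can fail here; surface it as an unchecked exception.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(resourceExists("/no/such/table.out"));
    }
}
```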

Stemmer

public Stemmer(Trie stemmer)
Create a Stemmer using a pre-loaded stemmer table.

Parameters:
stemmer - pre-loaded stemmer table

Method Detail

getTableResPath

public java.lang.String getTableResPath()
Return the resource path to the stemmer table, or null if the Stemmer was initialized with a pre-loaded table.


stem

public java.lang.String stem(java.lang.String word,
                             boolean hideMissing)
Stem a word. For performance reasons, words shorter than MIN_LENGTH characters are not processed and are returned unchanged.

Parameters:
word - input word to be stemmed.
hideMissing - if true and the stem could not be found, return the input word. If false, return null in that case.
Returns:
the stemmed word. If the stem could not be generated, the input word when hideMissing is true, or null when it is false.
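The return contract of stem can be illustrated with a small stand-in. This is only a sketch of the documented behavior: StemContractSketch is hypothetical, a plain map replaces the real stemmer table, the MIN_LENGTH value of 3 is assumed for illustration, and the single table entry is a toy example rather than output of the actual algorithm.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in mirroring the documented stem() contract; not the Stempel implementation. */
class StemContractSketch {
    // Assumed value for illustration; the real default is not stated on this page.
    int MIN_LENGTH = 3;

    // Toy lookup table standing in for the real stemmer table.
    private final Map<String, String> table = new HashMap<>();

    StemContractSketch() {
        table.put("domami", "dom"); // toy entry, not taken from the real table
    }

    String stem(String word, boolean hideMissing) {
        // Short words are returned unchanged without any lookup.
        if (word.length() < MIN_LENGTH) {
            return word;
        }
        String stem = table.get(word);
        if (stem != null) {
            return stem;
        }
        // No stem found: echo the input or signal the miss with null.
        return hideMissing ? word : null;
    }
}
```

Callers that pass hideMissing as false need a null check on the result. Passing true gives a value that is always safe to use, but a miss can no longer be told apart from a word that is its own stem.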

main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
Testing method. With two arguments, the first is the stemmer table file name and the second is the word to stem. With one argument, the default table is used and the argument is the word to stem.

Parameters:
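args - either one argument (the word to stem, using the default table) or two arguments (the stemmer table file name, then the word to stem)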
Throws:
java.lang.Exception
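The argument convention of main can be sketched as below. MainArgsSketch and its parse method are hypothetical names used only for illustration. The behavior for zero arguments is not documented, so throwing an exception there is an assumption.

```java
/** Hypothetical illustration of the documented argument convention; not Stempel's main(). */
class MainArgsSketch {
    /**
     * Returns a two-element array {table, word}.
     * The table element is null when the default table should be used.
     */
    static String[] parse(String[] args) {
        if (args.length == 1) {
            // One argument: default table, the argument is the word.
            return new String[] { null, args[0] };
        }
        if (args.length >= 2) {
            // Two arguments: table file name first, then the word.
            return new String[] { args[0], args[1] };
        }
        // Assumption: the zero-argument case is undocumented, so reject it here.
        throw new IllegalArgumentException("usage: [table] word");
    }
}
```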