Diffstat (limited to 'python/PyStemmer/README')
-rw-r--r-- | python/PyStemmer/README | 18 |
1 files changed, 18 insertions, 0 deletions
diff --git a/python/PyStemmer/README b/python/PyStemmer/README
new file mode 100644
index 0000000000..161b13c630
--- /dev/null
+++ b/python/PyStemmer/README
@@ -0,0 +1,18 @@
+Snowball stemming algorithms, for information retrieval
+
+Stemming algorithms
+
+PyStemmer provides access to efficient algorithms for calculating a "stemmed"
+form of a word. This is a form with most of the common morphological endings
+removed; hopefully representing a common linguistic base form. This is most
+useful in building search engines and information retrieval software;
+for example, a search with stemming enabled should be able to find a document
+containing "cycling" given the query "cycles".
+
+PyStemmer provides algorithms for several (mainly european) languages, by
+wrapping the libstemmer library from the Snowball project in a Python module.
+
+It also provides access to the classic Porter stemming algorithm for english:
+although this has been superceded by an improved algorithm, the original
+algorithm may be of interest to information retrieval researchers wishing
+to reproduce results of earlier experiments.
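The README's "cycling"/"cycles" example can be illustrated with a minimal sketch. This is a toy suffix stripper written for illustration only, not the Snowball or Porter algorithm and not PyStemmer's API; the names `toy_stem`, `search`, and `SUFFIXES` are hypothetical. It shows only the idea the README describes: query and document terms are reduced to a shared base form before they are compared.

```python
# Toy illustration only: a crude suffix stripper, NOT the Snowball or Porter
# algorithm. All names here (SUFFIXES, toy_stem, search) are hypothetical.

SUFFIXES = ("ing", "es", "s")  # tried in this order, longest first


def toy_stem(word):
    """Lowercase the word and strip the first matching suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words stay intact.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def search(query, documents):
    """Return the documents sharing at least one stemmed term with the query."""
    query_stems = {toy_stem(term) for term in query.split()}
    return [
        doc
        for doc in documents
        if query_stems & {toy_stem(term) for term in doc.split()}
    ]


documents = ["cycling in the rain", "cooking at home"]
matches = search("cycles", documents)
print(matches)
```

Both "cycles" and "cycling" reduce to the same stem here, so the query matches the first document even though the exact word never appears in it. Real stemmers apply far more careful, language-specific rules than this fixed suffix list.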