Abstract
The automatic removal of suffixes from words in English is of particular interest in the field of information retrieval. An algorithm for suffix stripping is described, which has been implemented as a short, fast program in BCPL. Although simple, it performs slightly better than a much more elaborate system with which it has been compared. It effectively works by treating complex suffixes as compounds made up of simple suffixes, and removing the simple suffixes in a number of steps. In each step the removal of the suffix is made to depend upon the form of the remaining stem, which usually involves a measure of its syllable length.
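The stepwise idea in the abstract can be sketched in Python. This is a minimal illustration under stated assumptions, not the BCPL program the abstract mentions: `measure` takes the "measure" of a stem to be the number of vowel-consonant sequences it contains, and the rule table `STEPS` is a tiny hypothetical subset chosen for the example, not the full rule set. The names `is_consonant`, `measure`, `apply_step`, `strip_suffixes` and `STEPS` are all introduced here for illustration.

```python
VOWELS = set("aeiou")


def is_consonant(word, i):
    """Return True if word[i] acts as a consonant.

    Assumption: 'y' counts as a vowel when it follows a consonant,
    and as a consonant otherwise.
    """
    ch = word[i]
    if ch in VOWELS:
        return False
    if ch == "y":
        return i == 0 or not is_consonant(word, i - 1)
    return True


def measure(stem):
    """Count vowel-to-consonant transitions, a rough syllable measure."""
    m = 0
    prev_was_vowel = False
    for i in range(len(stem)):
        cons = is_consonant(stem, i)
        if cons and prev_was_vowel:
            m += 1
        prev_was_vowel = not cons
    return m


# Hypothetical, heavily reduced rule table: each step is a list of
# (suffix, replacement, minimum measure the remaining stem must exceed).
STEPS = [
    [("ization", "ize", 0), ("fulness", "ful", 0)],
    [("ful", "", 0)],
    [("ize", "", 1)],
]


def apply_step(word, rules):
    """Apply the first rule whose suffix matches, if the stem condition holds."""
    for suffix, replacement, min_measure in rules:
        if word.endswith(suffix):
            stem = word[: -len(suffix)]
            if measure(stem) > min_measure:
                return stem + replacement
            return word  # suffix matched but the stem is too short
    return word


def strip_suffixes(word):
    """Remove a compound suffix one simple suffix at a time."""
    for rules in STEPS:
        word = apply_step(word, rules)
    return word
```

Under these assumptions, `strip_suffixes("hopefulness")` passes through "hopeful" before reaching "hope", while "size" is left unchanged because the remaining stem "s" has measure 0.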
Publication Info
- Year: 1980
- Type: article
- Volume: 14
- Issue: 3
- Pages: 130-137
- Citations: 8045
- Access: Closed
Identifiers
- DOI: 10.1108/eb046814