An algorithm for suffix stripping

Program: electronic library and information systems, 1980 (8,045 citations)

Abstract

The automatic removal of suffixes from words in English is of particular interest in the field of information retrieval. An algorithm for suffix stripping is described, which has been implemented as a short, fast program in BCPL. Although simple, it performs slightly better than a much more elaborate system with which it has been compared. It effectively works by treating complex suffixes as compounds made up of simple suffixes, and removing the simple suffixes in a number of steps. In each step the removal of the suffix is made to depend upon the form of the remaining stem, which usually involves a measure of its syllable length.
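The stepwise scheme in the abstract can be illustrated with a minimal Python sketch. It is not the original BCPL program. It assumes the "measure" counts vowel-to-consonant transitions in the stem, which is how the algorithm is usually described, and the rule tables in `STEPS` are hypothetical examples chosen for illustration rather than the paper's full rule set. The names `measure`, `strip_once`, and `strip_in_steps` are likewise invented here.

```python
VOWELS = set("aeiou")


def is_consonant(word: str, i: int) -> bool:
    """Return True if word[i] acts as a consonant.

    Assumption: 'y' is a vowel when it follows a consonant,
    and a consonant otherwise.
    """
    ch = word[i]
    if ch in VOWELS:
        return False
    if ch == "y":
        return i == 0 or not is_consonant(word, i - 1)
    return True


def measure(stem: str) -> int:
    """Count vowel-to-consonant transitions, a rough syllable-length measure."""
    m = 0
    prev_vowel = False
    for i in range(len(stem)):
        cons = is_consonant(stem, i)
        if cons and prev_vowel:
            m += 1
        prev_vowel = not cons
    return m


def strip_once(word: str, rules) -> str:
    """Apply the first rule whose suffix matches, if the stem is long enough."""
    for suffix, replacement, min_measure in rules:
        if word.endswith(suffix):
            stem = word[: -len(suffix)]
            if measure(stem) >= min_measure:
                return stem + replacement
            return word  # suffix matched but the stem condition failed
    return word


def strip_in_steps(word: str, steps) -> str:
    """Run each step in order, so compound suffixes come off piece by piece."""
    for rules in steps:
        word = strip_once(word, rules)
    return word


# Hypothetical rule tables: (suffix, replacement, minimum measure of the stem).
STEPS = [
    [("ness", "", 1)],
    [("ful", "", 1)],
    [("ement", "", 2)],
]

if __name__ == "__main__":
    for w in ("hopefulness", "replacement", "cement"):
        print(w, "->", strip_in_steps(w, STEPS))
```

With these assumed rules, "hopefulness" loses "ness" in the first step and "ful" in the second, while "cement" is left alone because the stem "c" that would remain is too short by the measure.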

Keywords

Suffix, Stripping (fiber), Simple (philosophy), Computer science, Suffix array, Generalized suffix tree, Compressed suffix array, Algorithm, Field (mathematics), Measure (data warehouse), SIMPLE algorithm, Natural language processing, Suffix tree, Artificial intelligence, Mathematics, Data mining, Linguistics, Physics, Pure mathematics, Engineering

Publication Info

Year: 1980
Type: article
Volume: 14
Issue: 3
Pages: 130-137
Citations: 8045
Access: Closed

Citation Metrics

OpenAlex: 8045
Influential: 293
CrossRef: 4437

Cite This

Martin Porter (1980). An algorithm for suffix stripping. Program: electronic library and information systems, 14(3), 130-137. https://doi.org/10.1108/eb046814

Identifiers

DOI
10.1108/eb046814
