Recent Studies in Automatic Text Analysis and Document Retrieval

1973, Journal of the ACM, 44 citations

Abstract

Many experts in mechanized text processing now agree that useful automatic language analysis procedures are largely unavailable and that the existing linguistic methodologies generally produce disappointing results. An attempt is made in the present study to identify those automatic procedures which appear most effective as a replacement for the missing language analysis. A series of computer experiments is described, designed to simulate a conventional document retrieval environment. It is found that a simple duplication, by automatic means, of the standard, manual document indexing and retrieval operations will not produce acceptable output results. New mechanized approaches to document handling are proposed, including document ranking methods, automatic dictionary and word list generation, and user feedback searches. It is shown that the fully automatic methodology is superior in effectiveness to the conventional procedures in normal use.

Keywords

Computer science, Automatic indexing, Information retrieval, Search engine indexing, Ranking (information retrieval), Word (group theory), Document retrieval, Natural language processing, Document processing, Simple (philosophy), Data mining, Artificial intelligence

Publication Info

Year: 1973
Type: article
Volume: 20
Issue: 2
Pages: 258-278
Citations: 44
Access: Closed

Citation Metrics

44 (OpenAlex)

Cite This

Gerard Salton (1973). Recent Studies in Automatic Text Analysis and Document Retrieval. Journal of the ACM, 20(2), 258-278. https://doi.org/10.1145/321752.321757

Identifiers

DOI: 10.1145/321752.321757