Abstract
A simple method for categorizing texts into predetermined genre categories using discriminant analysis, a standard statistical technique, is demonstrated with application to the Brown corpus. Discriminant analysis makes it possible to use a large number of parameters that may be specific to a certain corpus or information stream and to combine them into a small number of functions, with the parameters weighted on the basis of how useful they are for discriminating text genres. An application to information retrieval is discussed.
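As a rough illustration of how discriminant analysis weights parameters and combines them into a single function, the sketch below fits a two-class Fisher linear discriminant in plain Python. It is not the paper's implementation: the two features (average sentence length and first-person pronouns per 1,000 words), the toy values, the class labels, and the helper names `fisher_discriminant` and `classify` are all assumptions made for this example.

```python
from statistics import mean


def fisher_discriminant(class_a, class_b):
    """Fit a two-class Fisher linear discriminant over two features.

    Returns (weights, threshold). Hypothetical helper for illustration only.
    """
    mean_a = [mean(col) for col in zip(*class_a)]
    mean_b = [mean(col) for col in zip(*class_b)]

    # Pooled within-class scatter matrix (2x2).
    scatter = [[0.0, 0.0], [0.0, 0.0]]
    for rows, mu in ((class_a, mean_a), (class_b, mean_b)):
        for x in rows:
            d = [x[0] - mu[0], x[1] - mu[1]]
            for i in range(2):
                for j in range(2):
                    scatter[i][j] += d[i] * d[j]

    det = scatter[0][0] * scatter[1][1] - scatter[0][1] * scatter[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")

    # Inverse of the 2x2 scatter matrix.
    inv = [
        [scatter[1][1] / det, -scatter[0][1] / det],
        [-scatter[1][0] / det, scatter[0][0] / det],
    ]

    # Weights: inverse scatter times the difference of class means.
    diff = [mean_a[0] - mean_b[0], mean_a[1] - mean_b[1]]
    weights = [
        inv[0][0] * diff[0] + inv[0][1] * diff[1],
        inv[1][0] * diff[0] + inv[1][1] * diff[1],
    ]

    # Threshold: midpoint between the two projected class means.
    proj_a = weights[0] * mean_a[0] + weights[1] * mean_a[1]
    proj_b = weights[0] * mean_b[0] + weights[1] * mean_b[1]
    return weights, 0.5 * (proj_a + proj_b)


def classify(x, weights, threshold, labels=("informative", "imaginative")):
    """Assign x to the first label if its discriminant score exceeds the threshold."""
    score = weights[0] * x[0] + weights[1] * x[1]
    return labels[0] if score > threshold else labels[1]


# Toy data (invented): (average sentence length, first-person pronouns per 1,000 words).
informative = [(24.0, 2.0), (22.0, 4.0), (26.0, 3.0), (23.0, 5.0)]
imaginative = [(14.0, 20.0), (16.0, 25.0), (13.0, 22.0), (15.0, 18.0)]

weights, threshold = fisher_discriminant(informative, imaginative)
print(classify((25.0, 3.0), weights, threshold))
print(classify((14.0, 21.0), weights, threshold))
```

The size and sign of each weight indicate how strongly, and in which direction, a feature contributes to separating the two classes. With more than two genres or many features, a general linear-algebra routine would replace the hand-written 2x2 inverse.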
Publication Info
- Year: 1994
- Type: article
- Volume: 2
- Pages: 1071-1071
- Citations: 293
- Access: Closed
Identifiers
- DOI: 10.3115/991250.991324
- arXiv: cmp-lg/9410008