Abstract

Incorporating feature selection into a classification or regression method often carries several advantages. In this paper we formalize feature selection specifically from a discriminative perspective of improving classification/regression accuracy. The feature selection method is developed as an extension to the recently proposed maximum entropy discrimination (MED) framework. We describe MED as a flexible (Bayesian) regularization approach that subsumes, e.g., support vector classification, regression and exponential family models. For brevity, we restrict ourselves primarily to feature selection in the context of linear classification/regression methods and demonstrate that the proposed approach yields substantial improvements in practice. Moreover, we discuss and develop various extensions of feature selection, including the problem of dealing with example-specific but unobserved degrees of freedom -- alignments or invariants.
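As context for the abstract, the MED formulation the paper builds on (introduced in the authors' earlier maximum entropy discrimination work) can be sketched as follows; the notation here (prior $P_0$, discriminant function $\mathcal{L}$, margin variables $\gamma_t$, labels $y_t \in \{-1,+1\}$) is our own shorthand rather than quoted from the paper:

$$
\min_{P(\Theta,\gamma)} \; \mathrm{KL}\big(P \,\|\, P_0\big)
\quad \text{subject to} \quad
\int P(\Theta,\gamma)\,\big[\, y_t\,\mathcal{L}(X_t;\Theta) - \gamma_t \,\big]\, d\Theta\, d\gamma \;\ge\; 0,
\qquad t = 1,\ldots,T.
$$

A new input $X$ is then classified via $\hat{y} = \operatorname{sign} \int P(\Theta)\,\mathcal{L}(X;\Theta)\, d\Theta$. With a linear discriminant and a Gaussian prior this recovers support vector classification (one of the special cases the abstract mentions), and feature selection is obtained, roughly, by augmenting $\Theta$ with per-feature selection variables whose prior favors sparse solutions.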

Keywords

Feature selection, Discriminative model, Artificial intelligence, Computer science, Principle of maximum entropy, Pattern recognition, Machine learning, Entropy, Regression, Feature, Support vector machine, Mathematics, Statistics

Publication Info

Year: 2013
Type: article
Pages: 291-300
Citations: 76 (OpenAlex)
Access: Closed

Cite This

Tony Jebara, Tommi Jaakkola (2013). Feature Selection and Dualities in Maximum Entropy Discrimination. arXiv (Cornell University), 291-300. https://doi.org/10.48550/arxiv.1301.3865

Identifiers

DOI: 10.48550/arxiv.1301.3865