Abstract
Information filtering (IF) systems usually filter data items by correlating a set of terms representing the user's interest (a user profile) with similar sets of terms representing the data items. Many techniques can be employed for constructing user profiles automatically, but they usually yield large sets of terms. Various dimensionality-reduction techniques can be applied to reduce the number of terms in a user profile. We describe a new term-selection technique, including a dimensionality-reduction mechanism, that is based on the analysis of a trained artificial neural network (ANN) model. Its novel feature is the identification of an optimal set of terms that can correctly classify data items that are relevant to a user. The proposed technique was compared with the classical Rocchio algorithm. We found that when all the distinct terms in the training set are used to train an ANN, the Rocchio algorithm outperforms the ANN-based filtering system. After the new dimensionality-reduction technique is applied, leaving only an optimal set of terms, the improved ANN technique outperforms both the original ANN and the Rocchio algorithm.
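As a rough illustration of the profile-based filtering and the Rocchio baseline mentioned in the abstract, the following minimal sketch builds a Rocchio-style user profile from example documents and scores a new item by cosine similarity. It is not the authors' implementation: the function names, the whitespace tokenisation, and the `beta` and `gamma` weights are assumptions chosen for illustration. The `top_terms` helper is a simple weight-based truncation used only as a stand-in for dimensionality reduction; it is not the ANN-based term selection the paper proposes.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term frequencies (lowercased, whitespace-tokenised)."""
    return Counter(text.lower().split())


def rocchio_profile(relevant, non_relevant, beta=0.75, gamma=0.15):
    """Rocchio-style profile: weighted mean of relevant document vectors
    minus weighted mean of non-relevant ones. beta and gamma are
    illustrative values, not taken from the paper."""
    profile = Counter()
    for doc in relevant:
        for term, tf in term_vector(doc).items():
            profile[term] += beta * tf / len(relevant)
    for doc in non_relevant:
        for term, tf in term_vector(doc).items():
            profile[term] -= gamma * tf / len(non_relevant)
    # Keep only positively weighted terms in the profile.
    return {t: w for t, w in profile.items() if w > 0}


def cosine(profile, doc_vec):
    """Cosine similarity between a profile and a document term vector."""
    dot = sum(w * doc_vec.get(t, 0) for t, w in profile.items())
    norm_p = math.sqrt(sum(w * w for w in profile.values()))
    norm_d = math.sqrt(sum(v * v for v in doc_vec.values()))
    return dot / (norm_p * norm_d) if norm_p and norm_d else 0.0


def top_terms(profile, k):
    """Hypothetical stand-in for dimensionality reduction: keep the k
    highest-weighted terms. NOT the paper's ANN-based selection."""
    ranked = sorted(profile.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:k])


relevant = ["neural network filtering", "neural network profile"]
non_relevant = ["football match results"]
profile = rocchio_profile(relevant, non_relevant)
reduced = top_terms(profile, 2)
score = cosine(reduced, term_vector("neural network model"))
```

An item would be passed to the user when its score exceeds a chosen threshold; the threshold, like the weights above, would need to be tuned on training data.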
Original language | English |
---|---|
Pages (from-to) | 469-483 |
Number of pages | 15 |
Journal | Information Processing and Management |
Volume | 42 |
Issue number | 2 |
DOIs | |
State | Published - 1 Mar 2006 |
Keywords
- Artificial neural network
- Feature selection
- Information filtering
- User profile
ASJC Scopus subject areas
- Information Systems
- Media Technology
- Computer Science Applications
- Management Science and Operations Research
- Library and Information Sciences