A new approach for fuzzy clustering of web documents

Menahem Friedman, Moti Schneider, Mark Last, Omer Zaafrany, Abraham Kandel

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Scopus citations

Abstract

Most existing methods of document clustering are based on the classical vector-space model, which represents each document by a fixed-size vector of key terms or key phrases. In large and diverse document collections such as the World Wide Web, this approach suffers from a tremendous computational overload, since the constant size of the term vector equals the total number of key terms in all documents. We propose a new fuzzy-based approach to clustering documents that are represented by vectors of variable size. Each entry in a vector consists of two fields: the first is the name of a key phrase in the document, and the second is an importance weight associated with this key phrase within that particular document. We describe the proposed approach in detail and show how it is implemented in a real-world application from the area of web monitoring.
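
The contrast the abstract draws between fixed-size and variable-size vectors can be illustrated with a minimal sketch. It is not the authors' implementation: the names `to_fixed_vector` and `overlap_similarity` are hypothetical, the sample weights are made up, and the min/max similarity measure is an assumption chosen only to show that two variable-size documents can be compared without expanding them to the full vocabulary.

```python
from typing import Dict, List

# Variable-size representation: each entry pairs a key phrase with its
# importance weight in this document (the two fields named in the abstract).
Doc = Dict[str, float]


def to_fixed_vector(doc: Doc, vocabulary: List[str]) -> List[float]:
    """Classical vector-space form: one slot per term in the whole collection."""
    return [doc.get(term, 0.0) for term in vocabulary]


def overlap_similarity(a: Doc, b: Doc) -> float:
    """Illustrative similarity (an assumption, not taken from the paper):
    sum of minimum weights divided by sum of maximum weights, computed
    only over phrases that occur in at least one of the two documents."""
    phrases = set(a) | set(b)
    numerator = sum(min(a.get(p, 0.0), b.get(p, 0.0)) for p in phrases)
    denominator = sum(max(a.get(p, 0.0), b.get(p, 0.0)) for p in phrases)
    return numerator / denominator if denominator else 0.0


# Hypothetical documents with made-up weights.
doc1: Doc = {"fuzzy clustering": 0.9, "web monitoring": 0.4}
doc2: Doc = {"fuzzy clustering": 0.6, "vector space": 0.5}

# The fixed-size form needs a slot for every term in the collection,
# even though each document uses only a few of them.
vocabulary = ["fuzzy clustering", "vector space", "web monitoring"]
fixed1 = to_fixed_vector(doc1, vocabulary)

similarity = overlap_similarity(doc1, doc2)
```

In this sketch the cost of a comparison depends on the number of key phrases in the two documents involved, not on the size of the collection-wide vocabulary.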

Original language: English
Title of host publication: 2004 IEEE International Conference on Fuzzy Systems - Proceedings
Pages: 377-381
Number of pages: 5
State: Published - 1 Dec 2004
Event: 2004 IEEE International Conference on Fuzzy Systems - Proceedings - Budapest, Hungary
Duration: 25 Jul 2004 to 29 Jul 2004

Publication series

Name: IEEE International Conference on Fuzzy Systems
Volume: 1
ISSN (Print): 1098-7584

Conference

Conference: 2004 IEEE International Conference on Fuzzy Systems - Proceedings
Country/Territory: Hungary
City: Budapest
Period: 25/07/04 to 29/07/04

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Artificial Intelligence
  • Applied Mathematics
