KACTUS 2: Privacy preserving in classification tasks using k-anonymity

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

10 Scopus citations


k-anonymity is a method for masking sensitive data that addresses the problem of re-linking released data with an external source, making it difficult to re-identify an individual. k-anonymity works on a set of quasi-identifiers (public sensitive attributes), whose availability in external datasets and linkage to them is anticipated, and requires that the released dataset contain at least k records for every quasi-identifier value that occurs in it. Another advantage of k-anonymity is that, unlike other existing methods, it maintains the truthfulness of the released data. This is achieved by generalization, a primary technique in k-anonymity. Generalization consists of substituting attribute values with semantically consistent but less precise values. When the substituted value does not preserve semantic validity, the technique is called suppression, which is a special case of generalization. We present a hybrid approach called compensation, which is based on suppression and swapping, for achieving privacy. Since swapping decreases the truthfulness of attribute values, our algorithm incorporates a tradeoff between the level of swapping (information truthfulness) and suppression (information loss). We use k-anonymity to explore the issue of anonymity preservation. Since we do not use generalization, we do not need a priori knowledge of attribute semantics. We investigate data anonymization in the context of classification and use tree properties to satisfy k-anonymization. Our work improves on previous approaches by increasing classification accuracy.
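To make the k-anonymity requirement in the abstract concrete, the following is a minimal sketch of a check that every quasi-identifier combination in a released table occurs at least k times. It is not the KACTUS 2 algorithm or its compensation technique: the function names, the sample records, and the naive repair step (dropping records that fall in groups smaller than k) are all assumptions made purely for illustration.

```python
from collections import Counter


def qi_counts(records, quasi_identifiers):
    """Count how many records share each quasi-identifier combination."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination present occurs in >= k records."""
    return all(c >= k for c in qi_counts(records, quasi_identifiers).values())


def drop_small_groups(records, quasi_identifiers, k):
    """Naive record-level suppression (illustrative assumption, not the paper's
    method): remove every record whose quasi-identifier group is smaller than k."""
    counts = qi_counts(records, quasi_identifiers)
    return [
        r for r in records
        if counts[tuple(r[q] for q in quasi_identifiers)] >= k
    ]


# Hypothetical sample data: "zip" and "age" act as quasi-identifiers,
# "diagnosis" is the sensitive attribute.
records = [
    {"zip": "08901", "age": 34, "diagnosis": "flu"},
    {"zip": "08901", "age": 34, "diagnosis": "asthma"},
    {"zip": "08902", "age": 51, "diagnosis": "flu"},
]
qi = ["zip", "age"]

released = drop_small_groups(records, qi, k=2)
```

In this sketch the original table is not 2-anonymous because the ("08902", 51) combination occurs only once; after the single-record group is dropped, the two remaining records share one quasi-identifier combination, so the released table passes the check. Dropping whole records illustrates the information loss that suppression incurs, which is the cost the abstract weighs against the loss of truthfulness caused by swapping.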

Original language: English
Title of host publication: Protecting Persons While Protecting the People - Second Annual Workshop on Information Privacy and National Security, ISIPS 2008, Revised Selected Papers
Number of pages: 19
State: Published - 1 Dec 2009
Event: 2nd Annual Workshop on Privacy and Security, ISIPS 2008 - New Brunswick, NJ, United States
Duration: 12 May 2008 - 12 May 2008

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5661 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 2nd Annual Workshop on Privacy and Security, ISIPS 2008
Country/Territory: United States
City: New Brunswick, NJ


  • Anonymity
  • Data mining
  • Generalization
  • Privacy preserving
  • Suppression

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

