New multi-dimensional sorting based K-anonymity microaggregation for statistical disclosure control

Abdun Naser Mahmood, Md Enamul Kabir

Research output: A Conference proceeding or a Chapter in Book › Conference contribution

3 Citations (Scopus)

Abstract

In recent years, there has been an alarming increase in online identity theft and attacks using personally identifiable information. The goal of privacy preservation is to de-associate individuals from the sensitive information held in microdata. Microaggregation techniques seek to protect microdata in such a way that the data can be published and mined without revealing any private information that can be linked to specific individuals. Microaggregation works by partitioning the microdata into groups of at least k records and then replacing the records in each group with the centroid of the group. An optimal microaggregation method must minimize the information loss resulting from this replacement, and doing so is the central challenge. This paper presents a new microaggregation technique for Statistical Disclosure Control (SDC). It consists of two stages. In the first stage, the algorithm sorts all the records in the data set in a particular way to ensure that very dissimilar observations are never placed in the same cluster during microaggregation. In the second stage, an optimal microaggregation method is used to create k-anonymous clusters while minimizing the information loss. It works by taking the sorted data and simultaneously creating two distant clusters, using the two extreme sorted values as seeds. The performance of the proposed technique is compared against the most recent microaggregation methods. Experimental results on benchmark datasets show that the proposed algorithm has the lowest information loss among a basket of techniques from the literature.
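The two stages described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not specify the sort order or the loss measure, so the sketch assumes a simple sort key (the sum of each record's attributes) and the SSE/SST ratio commonly used in the microaggregation literature. The function names `microaggregate` and `information_loss` are hypothetical.

```python
from math import fsum


def centroid(rows):
    """Component-wise mean of a non-empty list of equal-length numeric tuples."""
    return tuple(fsum(col) / len(rows) for col in zip(*rows))


def sq_dist(a, b):
    """Squared Euclidean distance between two records."""
    return fsum((x - y) ** 2 for x, y in zip(a, b))


def microaggregate(records, k):
    """Partition records into groups of at least k and replace each by its centroid.

    Returns (groups, anonymized), where groups holds lists of record indices.
    """
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")

    # Stage 1 (ASSUMED sort key): order records by the sum of their attributes.
    order = sorted(range(len(records)), key=lambda i: (sum(records[i]), i))

    # Stage 2: build two clusters at a time, one from each extreme of the
    # sorted order, as long as at least k records would remain afterwards.
    groups, lo, hi = [], 0, len(order)
    while hi - lo >= 3 * k:
        groups.append(order[lo:lo + k])
        groups.append(order[hi - k:hi])
        lo, hi = lo + k, hi - k

    # Between k and 3k - 1 records remain: form one or two final groups.
    if hi - lo >= 2 * k:
        groups.append(order[lo:lo + k])
        lo += k
    groups.append(order[lo:hi])

    # Replace every record with the centroid of its group.
    anonymized = list(records)
    for group in groups:
        c = centroid([records[i] for i in group])
        for i in group:
            anonymized[i] = c
    return groups, anonymized


def information_loss(records, anonymized):
    """ASSUMED measure: 100 * SSE / SST (within-group over total squared error)."""
    overall = centroid(records)
    sse = fsum(sq_dist(r, a) for r, a in zip(records, anonymized))
    sst = fsum(sq_dist(r, overall) for r in records)
    return 100.0 * sse / sst if sst else 0.0


if __name__ == "__main__":
    data = [(1, 2), (2, 1), (10, 10), (11, 9), (5, 5), (6, 4)]
    groups, anonymized = microaggregate(data, k=2)
    print(groups)  # [[0, 1], [2, 3], [4, 5]]
    print(information_loss(data, anonymized))
```

Every group produced this way holds between k and 2k - 1 records, so each published centroid stands for at least k individuals. Seeding from both ends of the sorted order keeps the most dissimilar records apart, which is the intuition the abstract gives for the first stage.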
Original language: English
Title of host publication: International Conference on Security and Privacy in Communication Systems (SecureComm 2012)
Subtitle of host publication: Security and Privacy in Communication Networks
Editors: A. D. Keromytis, R. Di Pietro
Place of Publication: Germany
Publisher: Springer
Pages: 256-272
Number of pages: 17
Volume: 106
ISBN (Electronic): 9783642368837
ISBN (Print): 9783642368820
DOIs: https://doi.org/10.1007/978-3-642-36883-7_16
Publication status: Published - 2013
Externally published: Yes
Event: 8th International Conference on Security and Privacy in Communication Networks - Padua, Italy
Duration: 3 Sep 2012 to 5 Sep 2012

Publication series

Name: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
Publisher: Springer
Volume: 106
ISSN (Print): 1867-8211

Conference

Conference: 8th International Conference on Security and Privacy in Communication Networks
Country: Italy
City: Padua
Period: 3/09/12 to 5/09/12

Cite this

Mahmood, A. N., & Kabir, M. E. (2013). New multi-dimensional sorting based K-anonymity microaggregation for statistical disclosure control. In A. D. Keromytis, & R. Di Pietro (Eds.), International Conference on Security and Privacy in Communication Systems (SecureComm 2012): Security and Privacy in Communication Networks (Vol. 106, pp. 256-272). (Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering; Vol. 106). Germany: Springer. https://doi.org/10.1007/978-3-642-36883-7_16
@inproceedings{adc5054a48f64dfbad7488beacb98a83,
title = "New multi-dimensional sorting based K-anonymity microaggregation for statistical disclosure control",
keywords = "Disclosure control, Microaggregation, Microdata protection, Privacy, k-anonymity",
author = "Mahmood, {Abdun Naser} and Kabir, {Md Enamul}",
year = "2013",
doi = "10.1007/978-3-642-36883-7_16",
language = "English",
isbn = "9783642368820",
volume = "106",
series = "Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering",
publisher = "Springer",
pages = "256--272",
editor = "Keromytis, {A. D.} and {Di Pietro}, R.",
booktitle = "International Conference on Security and Privacy in Communication Systems (SecureComm 2012)",
address = "Germany",

}


