Multiple distribution data description learning algorithm for novelty detection

Research output: A Conference proceeding or a Chapter in Book › Conference contribution

3 Citations (Scopus)

Abstract

Current data description learning methods for novelty detection, such as support vector data description and small sphere with large margin, construct a spherically shaped boundary around a normal data set to separate it from abnormal data. The volume of this sphere is minimized to reduce the chance of accepting abnormal data. However, these methods do not guarantee that a single spherically shaped boundary best describes the normal data set when the set contains several distinct data distributions. In this paper we propose a new data description learning method that constructs a set of spherically shaped boundaries to describe the normal data set more accurately. We formulate an optimisation problem whose solution yields an iterative learning algorithm for determining the set of spherically shaped boundaries. We prove that the classification error is reduced after each iteration of the learning method. Experimental results on 28 well-known data sets show that the proposed method provides lower classification error rates.
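The intuition behind using several spheres instead of one can be illustrated with a small sketch. This is not the optimisation problem or the algorithm from the paper, which is not reproduced on this page. It swaps in a plain k-means-style alternation with hard enclosing spheres in input space, without kernels, slack variables, or margins. The function names `fit_spheres` and `is_normal`, the farthest-point initialisation, and the toy data are all assumptions made for illustration.

```python
import math


def _assign(points, centres):
    """Group every point with the index of its nearest centre."""
    groups = [[] for _ in centres]
    for p in points:
        nearest = min(range(len(centres)), key=lambda i: math.dist(p, centres[i]))
        groups[nearest].append(p)
    return groups


def fit_spheres(points, k, max_iters=50):
    """Illustrative stand-in, NOT the paper's method: alternate between
    assigning normal points to spheres and re-centring each sphere."""
    # Deterministic farthest-point initialisation (an assumption of this sketch).
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(math.dist(p, c) for c in centres)))

    for _ in range(max_iters):
        groups = _assign(points, centres)
        # Move each centre to the mean of its group; keep it if the group is empty.
        updated = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else c
            for g, c in zip(groups, centres)
        ]
        if updated == centres:
            break
        centres = updated

    # Each radius is the smallest one that encloses the sphere's own points.
    groups = _assign(points, centres)
    radii = [
        max((math.dist(p, c) for p in g), default=0.0)
        for g, c in zip(groups, centres)
    ]
    return centres, radii


def is_normal(x, centres, radii):
    """Accept x as normal if it falls inside at least one sphere."""
    return any(math.dist(x, c) <= r for c, r in zip(centres, radii))


# Toy normal data drawn from two well-separated groups (made-up values).
group_a = [(0, 0), (1, 0), (0, 1), (1, 1)]
group_b = [(10, 10), (11, 10), (10, 11), (11, 11)]
normal = group_a + group_b

one_centre, one_radius = fit_spheres(normal, k=1)
two_centres, two_radii = fit_spheres(normal, k=2)

gap_point = (5.5, 5.5)  # lies in the empty region between the two groups
print(is_normal(gap_point, one_centre, one_radius))  # single sphere
print(is_normal(gap_point, two_centres, two_radii))  # two spheres
```

On this toy data the single sphere has to span both groups and therefore also covers the empty region between them, while the two smaller spheres do not. That is the failure mode of a single boundary that the abstract describes.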

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining
Subtitle of host publication: 15th Pacific-Asia Conference, PAKDD 2011, Proceedings, Part II
Editors: Joshua Zhexue Huang, Longbing Cao, Jaideep Srivastava
Place of Publication: Berlin, Germany
Publisher: Springer
Pages: 246-257
Number of pages: 12
ISBN (Electronic): 9783642208478
ISBN (Print): 9783642208461
DOI: https://doi.org/10.1007/978-3-642-20847-8_21
Publication status: Published - 2011
Event: 15th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2011 - Shenzhen, China
Duration: 24 May 2011 → 27 May 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 2
Volume: 6635 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 15th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2011
Country: China
City: Shenzhen
Period: 24/05/11 → 27/05/11

Fingerprint

Novelty Detection
Data description
Data Distribution
Learning algorithms
Support Vector Data Description
Margin
Iterative Algorithm
Error Rate
Optimization Problem
Iteration

Cite this

Le, T., Tran, D., Ma, W., & Sharma, D. (2011). Multiple distribution data description learning algorithm for novelty detection. In J. Z. Huang, L. Cao, & J. Srivastava (Eds.), Advances in Knowledge Discovery and Data Mining: 15th Pacific-Asia Conference, PAKDD 2011, Proceedings, Part II (pp. 246-257). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6635 LNAI, No. PART 2). Berlin, Germany: Springer. https://doi.org/10.1007/978-3-642-20847-8_21
Le, Trung ; Tran, Dat ; Ma, Wanli ; Sharma, Dharmendra. / Multiple distribution data description learning algorithm for novelty detection. Advances in Knowledge Discovery and Data Mining: 15th Pacific-Asia Conference, PAKDD 2011, Proceedings, Part II. editor / Joshua Zhexue Huang ; Longbing Cao ; Jaideep Srivastava. Berlin, Germany : Springer, 2011. pp. 246-257 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); PART 2).
@inproceedings{fb03834b936041968e97d2fa19180a24,
title = "Multiple distribution data description learning algorithm for novelty detection",
keywords = "Novelty detection, one-class classification, spherically shaped boundary, support vector data description",
author = "Trung Le and Dat Tran and Wanli Ma and Dharmendra Sharma",
year = "2011",
doi = "10.1007/978-3-642-20847-8_21",
language = "English",
isbn = "9783642208461",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer",
number = "PART 2",
pages = "246--257",
editor = "Huang, {Joshua Zhexue} and Cao, {Longbing} and Srivastava, {Jaideep}",
booktitle = "Advances in Knowledge Discovery and Data Mining",
address = "Berlin, Germany",
}

Le, T, Tran, D, Ma, W & Sharma, D 2011, Multiple distribution data description learning algorithm for novelty detection. in JZ Huang, L Cao & J Srivastava (eds), Advances in Knowledge Discovery and Data Mining: 15th Pacific-Asia Conference, PAKDD 2011, Proceedings, Part II. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), no. PART 2, vol. 6635 LNAI, Springer, Berlin, Germany, pp. 246-257, 15th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2011, Shenzhen, China, 24/05/11. https://doi.org/10.1007/978-3-642-20847-8_21

Multiple distribution data description learning algorithm for novelty detection. / Le, Trung; Tran, Dat; Ma, Wanli; Sharma, Dharmendra.

Advances in Knowledge Discovery and Data Mining: 15th Pacific-Asia Conference, PAKDD 2011, Proceedings, Part II. ed. / Joshua Zhexue Huang; Longbing Cao; Jaideep Srivastava. Berlin, Germany : Springer, 2011. p. 246-257 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6635 LNAI, No. PART 2).

TY - GEN

T1 - Multiple distribution data description learning algorithm for novelty detection

AU - Le, Trung

AU - Tran, Dat

AU - Ma, Wanli

AU - Sharma, Dharmendra

PY - 2011

Y1 - 2011

KW - Novelty detection

KW - one-class classification

KW - spherically shaped boundary

KW - support vector data description

UR - http://www.scopus.com/inward/record.url?scp=79957956807&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-20847-8_21

DO - 10.1007/978-3-642-20847-8_21

M3 - Conference contribution

SN - 9783642208461

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 246

EP - 257

BT - Advances in Knowledge Discovery and Data Mining

A2 - Huang, Joshua Zhexue

A2 - Cao, Longbing

A2 - Srivastava, Jaideep

PB - Springer

CY - Berlin, Germany

ER -

Le T, Tran D, Ma W, Sharma D. Multiple distribution data description learning algorithm for novelty detection. In Huang JZ, Cao L, Srivastava J, editors, Advances in Knowledge Discovery and Data Mining: 15th Pacific-Asia Conference, PAKDD 2011, Proceedings, Part II. Berlin, Germany: Springer. 2011. p. 246-257. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); PART 2). https://doi.org/10.1007/978-3-642-20847-8_21