Abstract
The three most important issues for anomaly-based intrusion detection systems that use data mining methods are feature selection, data value normalization, and the choice of data mining algorithm. In this paper, we primarily study the selection of network traffic features and its impact on detection rates. We use the KDD CUP 1999 dataset as the sample for the study. We divide the features of the dataset into four groups: Group I contains the basic network traffic features; Group II contains features that are not network traffic related but are collected from hosts; Groups III and IV contain temporally aggregated features. We demonstrate how detection rates differ across combinations of these groups, and we show where looking at network data alone is effective and where it is ineffective for finding anomalies. We also briefly investigate the effectiveness of data normalization. To validate our findings, we conducted the same experiments with three different clustering algorithms: K-means clustering, fuzzy C-means clustering (FCM), and fuzzy entropy clustering (FE).
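The pipeline described in the abstract (pick a combination of feature groups, normalize the values, then cluster) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the column ranges in `FEATURE_GROUPS` follow the commonly documented 41-feature KDD CUP 1999 column order, symbolic fields are assumed to be numerically encoded already, min-max scaling stands in for whatever normalization the authors used, and only plain K-means is shown (FCM and FE are omitted). The helper names are hypothetical and the input is synthetic.

```python
import random

# Assumed 0-based, end-exclusive column ranges for the four groups, following
# the commonly documented 41-feature KDD CUP 1999 column order. The paper's
# exact assignment is not given in the abstract.
FEATURE_GROUPS = {
    "I": range(0, 9),      # basic connection features
    "II": range(9, 22),    # host-collected (content) features
    "III": range(22, 31),  # time-window aggregates
    "IV": range(31, 41),   # destination-host aggregates
}


def select_groups(rows, groups):
    """Keep only the columns that belong to the named feature groups."""
    cols = [c for g in groups for c in FEATURE_GROUPS[g]]
    return [[row[c] for c in cols] for row in rows]


def min_max_normalize(rows):
    """Rescale every column to [0, 1]; constant columns become 0.0."""
    lows = [min(col) for col in zip(*rows)]
    highs = [max(col) for col in zip(*rows)]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def assign(rows, centers):
    """Label each row with the index of its nearest center (squared Euclidean)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(centers)), key=lambda j: dist(row, centers[j]))
            for row in rows]


def kmeans(rows, k, iterations=20, seed=0):
    """Plain Lloyd-style K-means with centers seeded from random rows."""
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]
    for _ in range(iterations):
        labels = assign(rows, centers)
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # leave an empty cluster's center where it was
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign(rows, centers), centers


# Synthetic stand-in for numerically encoded connection records (not KDD data).
rows = ([[i * 0.01] * 41 for i in range(10)]
        + [[10 + i * 0.01] * 41 for i in range(10)])
features = min_max_normalize(select_groups(rows, ["I", "III"]))
labels, _ = kmeans(features, k=2)
```

Changing the list passed to `select_groups` (for example `["I"]` versus `["I", "III", "IV"]`) is the kind of comparison the abstract describes; how cluster labels are turned into a detection rate is not specified there and is left out of the sketch.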
Original language | English |
---|---|
Title of host publication | Proceedings of the IEEE Intelligence and Security Informatics Conference 2008 (ISI 2008) |
Editors | C Yang, W Chen, W Hsu, T Wu |
Place of Publication | United States |
Publisher | IEEE, Institute of Electrical and Electronics Engineers |
Pages | 245-247 |
Number of pages | 3 |
ISBN (Print) | 9781424424146 |
Publication status | Published - 2008 |
Event | IEEE Intelligence and Security Informatics Conference 2008 (ISI 2008) - Taipei, Taiwan, Province of China; Duration: 17 Jun 2008 → 20 Jun 2008 |
Conference
Conference | IEEE Intelligence and Security Informatics Conference 2008 (ISI 2008) |
---|---|
Country/Territory | Taiwan, Province of China |
City | Taipei |
Period | 17/06/08 → 20/06/08 |