A Study on the Feature Selection of Network Traffic for Intrusion Detection Purpose

    Research output: A Conference proceeding or a Chapter in Book › Conference contribution

    12 Citations (Scopus)

    Abstract

    The three most important issues for anomaly-based intrusion detection systems that use data mining methods are feature selection, data value normalization, and the choice of data mining algorithm. In this paper, we primarily study the feature selection of network traffic and its impact on detection rates. We use the KDD CUP 1999 dataset as the sample for the study. We divide the features of the dataset into four groups: Group I contains the basic network traffic features; Group II is not network traffic related, but consists of features collected from hosts; Groups III and IV are temporally aggregated features. We demonstrate how detection rates differ across combinations of these groups. We also demonstrate the effectiveness and the ineffectiveness of finding anomalies by looking at the network data alone. In addition, we briefly investigate the effectiveness of data normalization. To validate our findings, we conducted the same experiments with three different clustering algorithms: K-means clustering, fuzzy C-means clustering (FCM), and fuzzy entropy clustering (FE).
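The pipeline the abstract describes (pick a combination of feature groups, normalize the values, then cluster) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the column ranges in `FEATURE_GROUPS` are an assumption based on the commonly cited 41-feature layout of KDD CUP 1999 records, min-max scaling is only one possible normalization, all helper names are hypothetical, and symbolic fields are assumed to be numerically encoded already. Only plain K-means is shown; FCM and FE are omitted.

```python
import random

# ASSUMPTION: 0-based column ranges for the four groups in a 41-feature
# KDD CUP 1999 record. The abstract does not give the exact assignment.
FEATURE_GROUPS = {
    "I": range(0, 9),      # basic network traffic features
    "II": range(9, 22),    # features collected from hosts
    "III": range(22, 31),  # temporally aggregated features (assumed split)
    "IV": range(31, 41),   # temporally aggregated features (assumed split)
}


def select_columns(rows, groups):
    """Keep only the columns that belong to the named feature groups."""
    cols = [c for g in groups for c in FEATURE_GROUPS[g]]
    return [[row[c] for c in cols] for row in rows]


def min_max_normalize(rows):
    """Scale every column to [0, 1]; constant columns become 0.0."""
    lows = [min(col) for col in zip(*rows)]
    highs = [max(col) for col in zip(*rows)]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iterations=20, seed=0):
    """Plain K-means: returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every row.
        labels = [
            min(range(k), key=lambda j: squared_distance(row, centroids[j]))
            for row in rows
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Comparing group combinations would then amount to calling `select_columns(rows, ["I"])`, `select_columns(rows, ["I", "III"])`, and so on, normalizing each result, clustering it, and comparing the resulting detection rates against the dataset labels.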
    Original language: English
    Title of host publication: Proceedings of the IEEE Intelligence and Security Informatics Conference 2008 (ISI 2008)
    Editors: C Yang, W Chen, W Hsu, T Wu
    Place of Publication: United States
    Publisher: IEEE, Institute of Electrical and Electronics Engineers
    Pages: 245-247
    Number of pages: 3
    ISBN (Print): 9781424424146
    DOIs: https://doi.org/10.1109/ISI.2008.4565069
    Publication status: Published - 2008
    Event: IEEE Intelligence and Security Informatics Conference 2008 (ISI 2008) - Taipei, Taiwan, Province of China
    Duration: 17 Jun 2008 to 20 Jun 2008

    Conference

    Conference: IEEE Intelligence and Security Informatics Conference 2008 (ISI 2008)
    Country: Taiwan, Province of China
    City: Taipei
    Period: 17/06/08 to 20/06/08

    Fingerprint

    Intrusion detection
    Data mining
    Feature extraction
    Fuzzy clustering
    Clustering algorithms
    Entropy
    Experiments

    Cite this

    Ma, W., Tran, D., & Sharma, D. (2008). A Study on the Feature Selection of Network Traffic for Intrusion Detection Purpose. In C. Yang, W. Chen, W. Hsu, & T. Wu (Eds.), Proceedings of the IEEE Intelligence and Security Informatics Conference 2008 (ISI 2008) (pp. 245-247). United States: IEEE, Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/ISI.2008.4565069
    @inproceedings{07f94fc32e664d00bb07912da08867e6,
    title = "A Study on the Feature Selection of Network Traffic for Intrusion Detection Purpose",
    author = "Wanli Ma and Dat Tran and Dharmendra Sharma",
    year = "2008",
    doi = "10.1109/ISI.2008.4565069",
    language = "English",
    isbn = "9781424424146",
    pages = "245--247",
    editor = "C Yang and W Chen and W Hsu and T Wu",
    booktitle = "Proceedings of the IEEE Intelligence and Security Informatics Conference 2008 (ISI 2008)",
    publisher = "IEEE, Institute of Electrical and Electronics Engineers",
    address = "United States",

    }
