Spam Recognition using Linear Regression and Radial Basis Function Neural Network

Tich Phuoc Tran, Min Li, Dat Tran, Dam Duong Ton

    Research output: A Conference proceeding or a Chapter in Book › Chapter

    Abstract

    Spamming is the abuse of electronic messaging systems to send unsolicited bulk messages. It is becoming a serious problem for organizations and individual email users because of the growing popularity and low cost of electronic mail. Unlike other web threats such as hacking and Internet worms, which directly damage information assets, spam harms computer networks indirectly, with effects ranging from network problems such as increased server load, decreased network performance, and viruses to personnel issues such as lost employee time, phishing scams, and offensive content. Although a large amount of research has been conducted to prevent spamming from undermining the usability of email, existing filtering methods still suffer from extensive computation (given the large volume of emails received) and unreliable predictive capability (due to the highly dynamic nature of emails). In this chapter, we discuss the challenging problems of Spam Recognition and then propose an anti-spam filtering framework in which appropriate dimension reduction schemes and powerful classification models are employed. In particular, Principal Component Analysis transforms the data to a lower-dimensional space, which is subsequently used to train an Artificial Neural Network based classifier. A cost-sensitive empirical analysis with a publicly available email corpus, namely Ling-Spam, suggests that our spam recognition framework outperforms other state-of-the-art learning methods in terms of spam detection capability. In the case of extremely high misclassification cost, while the performance of other methods deteriorates significantly as the cost factor increases, our model maintains stable accuracy with low computation cost.
    Original language: English
    Title of host publication: Pattern Recognition
    Editors: Peng-Yeng Yin
    Place of Publication: India
    Publisher: In-Tech
    Pages: 513-532
    Number of pages: 20
    Edition: 1
    ISBN (Print): 9789533070148
    DOIs: https://doi.org/10.5772/7529
    Publication status: Published - 2009

    Cite this

    Tran, T. P., Li, M., Tran, D., & Ton, D. D. (2009). Spam Recognition using Linear Regression and Radial Basis Function Neural Network. In P-Y. Yin (Ed.), Pattern Recognition (1 ed., pp. 513-532). India: In-Tech. https://doi.org/10.5772/7529