Adaptable Term Weighting Framework for Text Classification

Research output: A Conference proceeding or a Chapter in Book (Conference contribution)

1 Citation (Scopus)

Abstract

In text classification, term frequency and term co-occurrence factors are predominantly used to weight term features. Category relevance factors have recently been used to propose term weighting approaches. However, these approaches mainly rely on purpose-built text classifiers to adapt to category information, ignoring the advantages of popular text classifiers. This paper proposes a term weighting framework for text classification tasks. First, the framework uses the provided category information to estimate feature weights. Second, based on feedback information, it continuously adjusts feature weights to find the best representations for documents. Third, the framework can work with different text classifiers to classify the text representations, based on category information. Experiments on several corpora with an SVM classifier show that, given predictions from the TFxIDF method as the initial state, the proposed approach improves accuracy and outperforms current text classification approaches.
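For readers unfamiliar with the TFxIDF baseline that the abstract names as the starting point, the following is a minimal sketch of one common variant (raw term frequency multiplied by log(N / df)). It is not the paper's framework and does not reproduce its feedback step; the function name `tfidf_weights`, the choice of variant, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter

def tfidf_weights(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: raw term frequency times log(N / df).
    Other TFxIDF variants (smoothing, normalization) exist.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)  # raw count of each term in this document
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights

# Hypothetical toy corpus, already tokenized.
docs = [
    "cheap flights cheap hotels".split(),
    "football match report".split(),
    "cheap football tickets".split(),
]
weights = tfidf_weights(docs)
```

In this toy corpus, "flights" occurs in only one document and so receives a higher weight in the first document than "cheap", even though "cheap" occurs twice there. Weights of this kind are purely frequency-based; the abstract's point is that category information and classifier feedback can be used to adjust them further.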
Original language: English
Title of host publication: International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2011)
Subtitle of host publication: Lecture Notes in Computer Science
Editors: Alexander F Gelbukh
Place of Publication: Berlin Heidelberg
Publisher: Springer Verlag
Pages: 254-265
Number of pages: 12
Volume: 6609
ISBN (Electronic): 9783642194375
ISBN (Print): 9783642194368
DOIs: https://doi.org/10.1007/978-3-642-19437-5_21
Publication status: Published - 2011
Event: International Conference on Intelligent Text Processing and Computational Linguistics - Tokyo, Japan
Duration: 20 Feb 2011 to 26 Feb 2011
https://www.cicling.org/2011/

Conference

Conference: International Conference on Intelligent Text Processing and Computational Linguistics
Abbreviated title: CICLING
Country: Japan
City: Tokyo
Period: 20/02/11 to 26/02/11
Internet address: https://www.cicling.org/2011/

Fingerprint

Classifiers
Feedback
Experiments

Cite this

Hyunh, D., Tran, D., Ma, W., & Sharma, D. (2011). Adaptable Term Weighting Framework for Text Classification. In A. F. Gelbukh (Ed.), International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2011): Lecture Notes in Computer Science (Vol. 6609, pp. 254-265). Berlin Heidelberg: Springer Verlag. https://doi.org/10.1007/978-3-642-19437-5_21
@inproceedings{df7262671e7c427c84caae8a691376d2,
title = "Adaptable Term Weighting Framework for Text Classification",
abstract = "In text classification, term frequency and term co-occurrence factors are dominantly used in weighting term features. Category relevance factors have recently been used to propose term weighting approaches. However, these approaches are mainly based on their own-designed text classifiers to adapt to category information, where the advantages of popular text classifiers have been ignored. This paper proposes a term weighting framework for text classification tasks. The framework firstly inherits the benefits of provided category information to estimate the weighting of features. Secondly, based on the feedback information, it is able to continuously adjust feature weightings to find the best representations for documents. Thirdly, the framework robustly makes it possible to work with different text classifiers on classifying the text representations, based on category information. On several corpora with SVM classifier, experiments show that given predicted information from TFxIDF method as initial status, the proposed approach leverages accuracy results and outperforms current text classification approaches.",
keywords = "Machine Learning, Text Classification",
author = "Dat Hyunh and Dat Tran and Wanli Ma and Dharmendra Sharma",
year = "2011",
doi = "10.1007/978-3-642-19437-5_21",
language = "English",
isbn = "9783642194368",
volume = "6609",
pages = "254--265",
editor = "Gelbukh, {Alexander F}",
booktitle = "International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2011)",
publisher = "Springer Verlag",
address = "Germany",
}

TY - GEN
T1 - Adaptable Term Weighting Framework for Text Classification
AU - Hyunh, Dat
AU - Tran, Dat
AU - Ma, Wanli
AU - Sharma, Dharmendra
PY - 2011
Y1 - 2011
AB - In text classification, term frequency and term co-occurrence factors are dominantly used in weighting term features. Category relevance factors have recently been used to propose term weighting approaches. However, these approaches are mainly based on their own-designed text classifiers to adapt to category information, where the advantages of popular text classifiers have been ignored. This paper proposes a term weighting framework for text classification tasks. The framework firstly inherits the benefits of provided category information to estimate the weighting of features. Secondly, based on the feedback information, it is able to continuously adjust feature weightings to find the best representations for documents. Thirdly, the framework robustly makes it possible to work with different text classifiers on classifying the text representations, based on category information. On several corpora with SVM classifier, experiments show that given predicted information from TFxIDF method as initial status, the proposed approach leverages accuracy results and outperforms current text classification approaches.
KW - Machine Learning
KW - Text Classification
UR - https://link.springer.com/chapter/10.1007%2F978-3-642-19437-5_21#citeas
U2 - 10.1007/978-3-642-19437-5_21
DO - 10.1007/978-3-642-19437-5_21
M3 - Conference contribution
SN - 9783642194368
VL - 6609
SP - 254
EP - 265
BT - International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2011)
A2 - Gelbukh, Alexander F
PB - Springer Verlag
CY - Berlin Heidelberg
ER -
