Abstract
Term frequency and document frequency are the measures currently used to gauge term significance in text classification. However, these measures do not provide enough information to differentiate important terms. This research therefore proposes a new term ranking (weighting) approach for text classification.
First, the approach uses relations among terms to estimate the importance of each term in a document. Second, it provides a considerable representation for text documents. Experimental results show that, on the same data from the Wikipedia corpus, the proposed term weighting approach achieves higher accuracy than popular approaches based on term frequency.
Original language | English |
---|---|
Title of host publication | ACSC '11 Proceedings of the Thirty-Fourth Australasian Computer Science Conference |
Editors | Mark Reynold |
Place of Publication | Darlinghurst, Australia |
Publisher | Australian Computer Society |
Pages | 145-152 |
Number of pages | 7 |
Volume | 113 |
ISBN (Print) | 9781920682934 |
Publication status | Published - 17 Jan 2011 |
Event | 34th Australasian Computer Science Conference (ACSC 2011), Perth, Australia. Duration: 17 Jan 2011 → 20 Jan 2011. https://50years.acs.org.au/content/dam/acs/50-years/journals/crpit/Vol113.pdf |
Conference
Conference | 34th Australasian Computer Science Conference (ACSC 2011) |
---|---|
Abbreviated title | ACSC 2011 |
Country/Territory | Australia |
City | Perth |
Period | 17/01/11 → 20/01/11 |
Internet address |