A New Approach to Compressed File Fragment Identification

Research output: A Conference proceeding or a Chapter in Book › Conference contribution

1 Citation (Scopus)

Abstract

Identifying the underlying type of a file given only a file fragment is a major challenge in digital forensics. Many methods have been applied to file type identification; however, identification accuracy for most file types remains very low, especially for files with complex structures, because their contents are compound data built from different data types. In this paper, we propose a new approach that combines deflate-encoded data detection, entropy-based clustering, and machine learning techniques to identify deflate-encoded file fragments. Experiments on a popular compound file type showed high identification accuracy for the proposed method.
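The record's keywords name Shannon entropy as one ingredient of the approach. As a rough illustration only, the sketch below computes the byte-level Shannon entropy of a fragment in Python. The function name `shannon_entropy` is hypothetical, and this is not the authors' implementation: the paper's deflate detection, clustering, and SVM steps are not reproduced here.

```python
import math
from collections import Counter


def shannon_entropy(fragment: bytes) -> float:
    """Return the Shannon entropy of a byte string in bits per byte (0.0 to 8.0).

    Illustrative helper (hypothetical name); not taken from the paper.
    """
    if not fragment:
        return 0.0
    total = len(fragment)
    counts = Counter(fragment)  # frequency of each byte value present
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# A fragment in which all 256 byte values are equally frequent reaches the
# maximum of 8 bits per byte; compressed data tends to sit close to that.
uniform_fragment = bytes(range(256)) * 16
print(shannon_entropy(uniform_fragment))  # 8.0

# Two equally frequent byte values carry exactly 1 bit per byte.
print(shannon_entropy(b"ab" * 10))  # 1.0
```

Because compressed data tends toward high entropy, a per-fragment value like this is often used as one feature among several rather than as a classifier on its own.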
Original language: English
Title of host publication: International Joint Conference CISIS 2015 and ICEUTE 2015
Subtitle of host publication: CISIS'15 and ICEUTE'15
Editors: Alvaro Herrero, Bruno Baruque, Javier Sedano, Hector Quintian, Emilio Corchado
Place of Publication: Cham, Switzerland
Publisher: Springer
Pages: 377-387
Number of pages: 11
Volume: 369
ISBN (Electronic): 9783319197135
ISBN (Print): 9783319197128
DOIs: https://doi.org/10.1007/978-3-319-19713-5_32
Publication status: Published - 2015
Event: The 8th International Conference on Computational Intelligence in Security for Information Systems, Burgos, Spain
Duration: 15 Jun 2015 to 17 Jun 2015
http://cisis.usal.es

Publication series

Name: Advances in Intelligent Systems and Computing
Publisher: Springer
Volume: 369
ISSN (Print): 2194-5357
ISSN (Electronic): 2194-5356

Conference

Conference: The 8th International Conference on Computational Intelligence in Security for Information Systems
Abbreviated title: CISIS 2015
Country: Spain
City: Burgos
Period: 15/06/15 to 17/06/15
Internet address: http://cisis.usal.es

Fingerprint

Learning systems
Entropy
Experiments
Digital forensics

Cite this

NGUYEN, K., TRAN, D., MA, W., & SHARMA, D. (2015). A New Approach to Compressed File Fragment Identification. In A. Herrero, B. Baruque, J. Sedano, H. Quintian, & E. Corchado (Eds.), International Joint Conference CISIS 2015 and ICEUTE 2015: CISIS'15 and ICEUTE'15 (Vol. 369, pp. 377-387). (Advances in Intelligent Systems and Computing; Vol. 369). Cham, Switzerland: Springer. https://doi.org/10.1007/978-3-319-19713-5_32
NGUYEN, Khoa ; TRAN, Dat ; MA, Wanli ; SHARMA, Dharmendra. / A New Approach to Compressed File Fragment Identification. International Joint Conference CISIS 2015 and ICEUTE 2015: CISIS'15 and ICEUTE'15. editor / Alvaro Herrero ; Bruno Baruque ; Javier Sedano ; Hector Quintian ; Emilio Corchado. Vol. 369 Cham, Switzerland : Springer, 2015. pp. 377-387 (Advances in Intelligent Systems and Computing).
@inproceedings{0a27d735d1aa438bbeb47a8203041de4,
title = "A New Approach to Compressed File Fragment Identification",
abstract = "Identifying the underlying type of a file given only a file fragment is a big challenge in digital forensics. Many methods have been applied to file type identification; however the identification accuracies of most of file types are still very low, especially for files having complex structures because their contents are compound data built from different data types. In this paper, we propose a new approach based on the deflate-encoded data detection, entropy-based clustering, and the use of machine learning techniques to identify deflate-encoded file fragments. Experiments on the popular compound file type showed high identification accuracy for the proposed method.",
keywords = "Compressed file fragment classification, File fragment classification, Shannon entropy, SVM",
author = "Khoa NGUYEN and Dat TRAN and Wanli MA and Dharmendra SHARMA",
year = "2015",
doi = "10.1007/978-3-319-19713-5_32",
language = "English",
isbn = "9783319197128",
volume = "369",
series = "Advances in Intelligent Systems and Computing",
publisher = "Springer",
pages = "377--387",
editor = "Alvaro Herrero and Bruno Baruque and Javier Sedano and Hector Quintian and Emilio Corchado",
booktitle = "International Joint Conference CISIS 2015 and ICEUTE 2015",
address = "Cham, Switzerland",

}

NGUYEN, K, TRAN, D, MA, W & SHARMA, D 2015, A New Approach to Compressed File Fragment Identification. in A Herrero, B Baruque, J Sedano, H Quintian & E Corchado (eds), International Joint Conference CISIS 2015 and ICEUTE 2015: CISIS'15 and ICEUTE'15. vol. 369, Advances in Intelligent Systems and Computing, vol. 369, Springer, Cham, Switzerland, pp. 377-387, The 8th International Conference on Computational Intelligence in Security for Information Systems, Burgos, Spain, 15/06/15. https://doi.org/10.1007/978-3-319-19713-5_32

A New Approach to Compressed File Fragment Identification. / NGUYEN, Khoa; TRAN, Dat; MA, Wanli; SHARMA, Dharmendra.

International Joint Conference CISIS 2015 and ICEUTE 2015: CISIS'15 and ICEUTE'15. ed. / Alvaro Herrero; Bruno Baruque; Javier Sedano; Hector Quintian; Emilio Corchado. Vol. 369 Cham, Switzerland : Springer, 2015. p. 377-387 (Advances in Intelligent Systems and Computing; Vol. 369).

TY - GEN

T1 - A New Approach to Compressed File Fragment Identification

AU - NGUYEN, Khoa

AU - TRAN, Dat

AU - MA, Wanli

AU - SHARMA, Dharmendra

PY - 2015

Y1 - 2015

AB - Identifying the underlying type of a file given only a file fragment is a big challenge in digital forensics. Many methods have been applied to file type identification; however the identification accuracies of most of file types are still very low, especially for files having complex structures because their contents are compound data built from different data types. In this paper, we propose a new approach based on the deflate-encoded data detection, entropy-based clustering, and the use of machine learning techniques to identify deflate-encoded file fragments. Experiments on the popular compound file type showed high identification accuracy for the proposed method.

KW - Compressed file fragment classification

KW - File fragment classification

KW - Shannon entropy

KW - SVM

UR - http://www.scopus.com/inward/record.url?scp=84946729929&partnerID=8YFLogxK

UR - http://www.mendeley.com/research/new-approach-compressed-file-fragment-identification

U2 - 10.1007/978-3-319-19713-5_32

DO - 10.1007/978-3-319-19713-5_32

M3 - Conference contribution

SN - 9783319197128

VL - 369

T3 - Advances in Intelligent Systems and Computing

SP - 377

EP - 387

BT - International Joint Conference CISIS 2015 and ICEUTE 2015

A2 - Herrero, Alvaro

A2 - Baruque, Bruno

A2 - Sedano, Javier

A2 - Quintian, Hector

A2 - Corchado, Emilio

PB - Springer

CY - Cham, Switzerland

ER -

NGUYEN K, TRAN D, MA W, SHARMA D. A New Approach to Compressed File Fragment Identification. In Herrero A, Baruque B, Sedano J, Quintian H, Corchado E, editors, International Joint Conference CISIS 2015 and ICEUTE 2015: CISIS'15 and ICEUTE'15. Vol. 369. Cham, Switzerland: Springer. 2015. p. 377-387. (Advances in Intelligent Systems and Computing). https://doi.org/10.1007/978-3-319-19713-5_32