TY - JOUR
T1 - ToN_IoT
T2 - The Role of Heterogeneity and the Need for Standardization of Features and Attack Types in IoT Network Intrusion Data Sets
AU - Booij, Tim M.
AU - Chiscop, Irina
AU - Meeuwissen, Erik
AU - Moustafa, Nour
AU - Den Hartog, Frank T.H.
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2022/1/1
Y1 - 2022/1/1
N2 - The Internet of Things (IoT) is reshaping our connected world as the number of lightweight devices connected to the Internet is rapidly growing. Therefore, high-quality research on intrusion detection in the IoT domain is essential. To this end, network intrusion data sets are fundamental, as many attack detection strategies have to be trained and evaluated using such data sets. In this article, we introduce the description, statistical analysis, and machine learning evaluation of the novel ToN_IoT data set. A comparison to other recent IoT data sets shows the importance of heterogeneity within these data sets, and how differences between data sets may have a huge impact on detection performance. In a cross-training experiment, we show that the inclusion of different data collection methods and a large diversity of the monitored features are of crucial importance for IoT network intrusion data sets to be useful for the industry. We also explain that the practical application of IoT data sets in operational environments requires the standardization of feature descriptions and cyberattack classes. This can only be achieved with a joint effort from the research community.
KW - Internet of Things (IoT)
KW - intrusion detection
KW - machine learning algorithms
KW - network security
KW - statistical analysis
UR - http://www.scopus.com/inward/record.url?scp=85107333911&partnerID=8YFLogxK
U2 - 10.1109/JIOT.2021.3085194
DO - 10.1109/JIOT.2021.3085194
M3 - Article
AN - SCOPUS:85107333911
SN - 2327-4662
VL - 9
SP - 485
EP - 496
JO - IEEE Internet of Things Journal
JF - IEEE Internet of Things Journal
IS - 1
ER -