TY - GEN
T1 - Performance analysis of hard clustering techniques for big IoT data analytics
AU - Ahmed, Mohiuddin
AU - Barkat, Abu
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/5/1
Y1 - 2019/5/1
AB - Data analytics for the Internet of Things (IoT) is an important task in today's connected environment. In particular, identifying infrequent patterns in a huge amount of data is challenging. Clustering is a well-established technique for revealing the patterns in a given dataset. However, one impediment is that most clustering algorithms require the number of clusters as input; for example, the well-known k-means requires the value of k (the number of clusters to be produced). Unlike other hard clustering algorithms, GenClust++ and x-means can identify the number of clusters automatically. In this paper, we investigate the effectiveness of these two algorithms at identifying infrequent patterns, or anomalous clusters. We experimented with seven benchmark IoT datasets, and the results show that x-means performs better than GenClust++ in terms of true positive rate (TPR) and false positive rate (FPR). x-means also outperforms GenClust++ in computational efficiency.
KW - Anomaly Detection
KW - Big Data
KW - Clustering
KW - IoT
UR - http://www.scopus.com/inward/record.url?scp=85073880629&partnerID=8YFLogxK
UR - https://www.mendeley.com/catalogue/8935112b-c07e-35ef-a40a-4efda7809a26/
U2 - 10.1109/CCC.2019.000-8
DO - 10.1109/CCC.2019.000-8
M3 - Conference contribution
SN - 9781728126005
T3 - Proceedings - 2019 Cybersecurity and Cyberforensics Conference, CCC 2019
SP - 62
EP - 66
BT - Proceedings - 2019 Cybersecurity and Cyberforensics Conference, CCC 2019
PB - IEEE, Institute of Electrical and Electronics Engineers
CY - United States
T2 - 2019 Cybersecurity and Cyberforensics Conference, CCC 2019
Y2 - 7 May 2019 through 8 May 2019
ER -