A saddle point-guided clustering algorithm for data with complex structure

Xiaogang Huang, Jun Jin, Dan Zhuang, Tiefeng Ma, Michel van de Velden, Shuangzhe Liu

    Research output: Contribution to journalArticlepeer-review

    Abstract

    Density-based clustering is a fundamental unsupervised learning technique with broad applications across diverse domains. However, most existing density-based clustering algorithms have difficulty in discovering clusters of different densities and sizes, especially in the case of significant variations in the density of points within the same cluster. In this paper, we propose a novel clustering algorithm that addresses this limitation by integrating a mode-seeking phase with a fragmented cluster fusion phase. In the mode-seeking phase, the dataset is partitioned into initial clusters, where each cluster center represents a point whose density is a local maximum, and the remaining points are assigned through an innovative majority voting mechanism. Natural clusters containing multiple local maxima of the underlying density function are thus partitioned into multiple initial clusters accordingly. To accurately capture the intrinsic cluster structure of the data, we introduce a saddle point-guided cluster similarity measure that quantifies the homogeneity between adjacent initial clusters; here, a saddle point is defined as the point of maximum density located on the boundary between the two clusters. With the cluster similarity, the obtained initial clusters are processed for merging using a fragmented cluster fusion technique based on minimum spanning forest (MSF). Extensive experiments on 31 benchmark datasets—including 20 synthetic and 11 real-world datasets—demonstrate the superior performance of the proposed algorithm. Specifically, it achieves the highest NMI scores on 23 datasets and the highest ARI scores on 21 datasets.

    Original languageEnglish
    Article number114726
    Pages (from-to)1-16
    Number of pages16
    JournalKnowledge-Based Systems
    Volume331
    DOIs
    Publication statusPublished - 3 Dec 2025

    Fingerprint

    Dive into the research topics of 'A saddle point-guided clustering algorithm for data with complex structure'. Together they form a unique fingerprint.

    Cite this