Abstract
Outliers pose a significant threat to the reliability of regression analysis. Unlike traditional robust methods that primarily rely on numerical optimization, this paper introduces Tri-SEM, a shape-aware robust regression framework that leverages the geometric and morphological structure of data through a flexible three-stage architecture: Split, Extraction, and Merge. In the Split stage, data are partitioned into chain-like segments using the Anderson-Darling test, projection analysis, and convex hull detection to isolate potential outliers, with clustering performed in a 2-D projected space for computational efficiency. In the Extraction stage, a subset of clean segments is selected by jointly considering their size and median squared residuals. In the Merge stage, reliable inliers are integrated using a histogram transition detector on 1-D residuals, capturing residual distribution patterns to construct the final regression estimate. Comprehensive experiments on diverse datasets demonstrate Tri-SEM's clear superiority in both prediction accuracy and estimation bias: it achieved the best overall rank and the highest prediction accuracy on 30 of the 35 datasets, while consistently outperforming the second-ranked method (MM-estimator) in estimation bias, achieving a relative improvement exceeding 90% on more than half (54.3%) of the datasets. Extensive ablation, sensitivity, convergence, and runtime analyses confirm the method's robustness, efficiency, and adaptability across a wide range of data scenarios.
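The Extraction and Merge stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; every modeling choice in it is an assumption made for illustration. The Split stage is replaced by a fixed-width chunking stand-in rather than the Anderson-Darling, projection, and convex hull procedure. The hypothetical `extract_seed` applies a minimum segment size and then picks the segment whose fit gives the smallest median squared residual over all points. The hypothetical `merge_inliers` treats the first empty bin of the absolute-residual histogram as the transition between inliers and outliers. Only a single predictor is handled.

```python
import numpy as np

def fit_line(x, y):
    """Ordinary least-squares fit of y = a*x + b; returns the array [a, b]."""
    design = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def residuals(coef, x, y):
    """Signed residuals of the line `coef` on the points (x, y)."""
    return y - (coef[0] * x + coef[1])

def extract_seed(x, y, labels, min_size=5):
    """Assumed extraction rule: among segments with at least `min_size`
    points, keep the fit with the smallest median squared residual
    measured over the whole data set."""
    best_coef, best_score = None, np.inf
    for seg in np.unique(labels):
        mask = labels == seg
        if mask.sum() < min_size:
            continue
        coef = fit_line(x[mask], y[mask])
        score = np.median(residuals(coef, x, y) ** 2)
        if score < best_score:
            best_coef, best_score = coef, score
    if best_coef is None:
        raise ValueError("no segment has at least min_size points")
    return best_coef

def merge_inliers(coef, x, y, bins=20):
    """Assumed transition detector: the left edge of the first empty bin
    in the histogram of absolute residuals is used as the inlier cutoff."""
    abs_res = np.abs(residuals(coef, x, y))
    counts, edges = np.histogram(abs_res, bins=bins)
    empty = np.flatnonzero(counts == 0)
    if empty.size == 0:
        # No gap in the histogram: treat every point as an inlier.
        return np.ones_like(abs_res, dtype=bool)
    return abs_res < edges[empty[0]]

# Synthetic data: y = 2x + 1 with a small deterministic perturbation;
# the last 20 points are shifted upward to act as outliers.
x = np.linspace(0.0, 10.0, 100)
y = 2.0 * x + 1.0 + 0.01 * np.sin(7.0 * x)
y[80:] += 30.0

# Stand-in for the Split stage: five fixed-width chunks of 20 points.
labels = np.arange(100) // 20

seed = extract_seed(x, y, labels)
inliers = merge_inliers(seed, x, y)
coef = fit_line(x[inliers], y[inliers])  # final estimate on merged inliers
```

On this toy data the shifted points form their own chunk, so a clean chunk is chosen as the seed and the final fit uses only the unshifted points. Real data would need an actual segmentation step and a less naive transition rule.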
| Original language | English |
|---|---|
| Article number | 113092 |
| Pages (from-to) | 1-13 |
| Number of pages | 13 |
| Journal | Pattern Recognition |
| Volume | 175 |
| Publication status | Published - 2026 |
| Title | Tri-SEM: A shape-aware robust regression method via chain-like segmentation and residual analysis |