Measuring and mitigating debugging effectiveness decay in code language models

Muntasir Adnan, Carlos C N Kuhn

    Research output: Contribution to journal › Article › peer-review

    Abstract

    The effectiveness of AI debugging follows a predictable exponential decay pattern: most models lose 60-80% of their debugging capability within just 2-3 attempts, even though iterative debugging is a critical capability for practical code generation systems. We introduce the Debugging Decay Index (DDI), a mathematical framework that quantifies when debugging becomes ineffective and predicts intervention points. Our strategic fresh start approach shifts from exploitation to exploration at strategic points in the debugging process, demonstrating that well-timed interventions can rescue debugging effectiveness. The DDI reveals a fundamental limitation in current AI self-debugging and provides the first systematic metric for gauging debugging effectiveness in LLM-based code generation.
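    The abstract's exponential decay pattern can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation: the decay model E(t) = E0·exp(−λt), the log-linear fitting step, and the `floor` threshold for triggering a fresh start are all assumptions made for the example.

    ```python
    import math

    def fit_decay_rate(effectiveness):
        """Estimate decay rate lambda from per-attempt success rates.

        Assumes E(t) = E0 * exp(-lambda * t) and fits lambda by least
        squares on log-effectiveness (illustrative, not the DDI itself).
        """
        e0 = effectiveness[0]
        num = den = 0.0
        for t, e in enumerate(effectiveness):
            num += t * (math.log(e0) - math.log(e))
            den += t * t
        return num / den if den else 0.0

    def intervention_point(lam, floor=0.2):
        """First attempt t at which predicted effectiveness drops below
        `floor` of its initial value -- a candidate 'fresh start' point."""
        if lam <= 0:
            return None
        return math.ceil(math.log(1.0 / floor) / lam)
    ```

    For example, success rates halving each attempt (0.50, 0.25, 0.125) yield λ = ln 2, and with a 20% floor the sketch suggests restarting after the third attempt.
    
    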

    Original language: English
    Article number: 44120
    Pages (from-to): 1-11
    Number of pages: 11
    Journal: Scientific Reports
    Volume: 15
    Issue number: 1
    DOIs
    Publication status: Published - 18 Dec 2025
