AI Coding Assistants Struggle to Pinpoint Bugs at Line Level, New Study Shows
AI Coding Tools Face Precision Challenge
When your coding assistant suggests a fix, how confident can you be that it's targeting the right line? A new study reveals that today's AI programming tools, while impressive at scanning files, often fail when it comes to precise line-level bug detection.
The SWE-Explore Benchmark
Researchers from Shanghai Jiao Tong University and collaborating institutions have developed SWE-Explore, a benchmark that evaluates code search separately from repair execution. This separation exposes a previously hidden limitation: for AI assistants such as Claude Code and OpenHands, accuracy falls to just 14-19% when the task moves from file-level to line-level bug identification.
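To make the file-level versus line-level distinction concrete, here is a minimal Python sketch. It is not SWE-Explore's actual scoring code: the `Location` type, the function names, and the sample path are all hypothetical, and it assumes a prediction counts as a hit when it overlaps any known buggy location.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """A (file, line) pair; hypothetical type used only for this illustration."""
    path: str
    line: int


def file_level_hit(predicted: set[Location], gold: set[Location]) -> bool:
    """True if any predicted location is in a file that contains a buggy line."""
    gold_files = {loc.path for loc in gold}
    return any(loc.path in gold_files for loc in predicted)


def line_level_hit(predicted: set[Location], gold: set[Location]) -> bool:
    """True only if a predicted location matches a buggy line exactly."""
    return bool(predicted & gold)


# Assumed example: the bug is on line 42, but the tool points at line 57 of the same file.
gold = {Location("src/parser.py", 42)}
predicted = {Location("src/parser.py", 57)}

print(file_level_hit(predicted, gold))  # True: right file
print(line_level_hit(predicted, gold))  # False: wrong line
```

A tool can score well on the first check while failing the second, which is the kind of gap the benchmark is reported to expose.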

What the Numbers Reveal
The team analyzed successful solutions from top models, including GPT-5.4 and Gemini 3 Pro, across 10 programming languages. Their findings point to a "minimum context threshold": when AI tools see less than 50% of the critical code area, repair attempts typically fail. Raise that visibility to 50-75%, however, and success rates climb sharply.
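One way to picture the threshold is as the fraction of bug-relevant lines a tool actually read before attempting a fix. The sketch below is an illustration under assumptions, not the study's definition: `context_coverage` and the sample line ranges are hypothetical, and the 50% cutoff is simply the figure reported above.

```python
def context_coverage(
    viewed_ranges: list[tuple[int, int]], critical_lines: set[int]
) -> float:
    """Fraction of critical lines inside any viewed (start, end) range, inclusive."""
    if not critical_lines:
        return 0.0
    seen = {
        line
        for line in critical_lines
        if any(start <= line <= end for start, end in viewed_ranges)
    }
    return len(seen) / len(critical_lines)


# Assumed example: the critical region is lines 40-49;
# the tool read lines 30-43 and 48-60 of the file.
critical = set(range(40, 50))
viewed = [(30, 43), (48, 60)]

coverage = context_coverage(viewed, critical)
print(f"{coverage:.0%}")  # 60%
print("below threshold" if coverage < 0.5 else "at or above threshold")
```

In this made-up case the tool saw six of the ten critical lines, which would place it in the 50-75% band where the study reports sharply higher repair success.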
"It's not that these models can't write good patches," explains the research, "but rather that they struggle to identify exactly where to apply them." This insight comes as many development teams remain hesitant about fully adopting AI coding tools.
The Path Forward
The study suggests a shift from brute-force code generation to smarter search capabilities. By focusing on "less filtering, more reading," next-generation systems could dramatically improve their precision. This approach might finally bridge the gap between AI's potential and its practical utility in professional software development.
Key Points
- Precision problem: Accuracy of AI coding tools falls to 14-19% when moving from file-level to line-level bug identification
- New benchmark: SWE-Explore separates search from repair to better evaluate AI capabilities
- Context threshold: Repairs typically fail when models see less than 50% of the critical code area; success rates climb sharply at 50-75% visibility
- Industry impact: Findings could accelerate development of more precise AI coding assistants