15 AI Projects Get Slingshot Funding to Tackle Evaluation Challenges
Laude Institute Backs 15 Teams to Revolutionize AI Evaluation
The Laude Institute this week launched its Slingshot AI funding program, selecting 15 promising projects that aim to crack one of artificial intelligence's most persistent challenges: how do we really know whether an AI system works well?

Beyond Traditional Benchmarks
Unlike typical academic grants, Slingshot provides researchers with a rare combination of funding, computing power, and engineering support, resources that could help turn theoretical concepts into practical solutions faster. In return, teams must deliver concrete results, whether that means launching startups, releasing open-source tools, or publishing research.
"We're seeing brilliant ideas stuck in academic papers because researchers lack the infrastructure to test them at scale," explained a Laud Institute spokesperson. "Slingshot removes those barriers."
The Projects Making Waves
Among the standout selections:
- Terminal Bench: A command-line coding benchmark already gaining traction among developers
- ARC-AGI: The latest iteration of a respected framework for evaluating general AI capabilities
- Formula Code: A Caltech/UT Austin collaboration testing how well AIs optimize existing code
- BizBench: Columbia University's ambitious attempt to create business decision-making standards for "white-collar" AI agents
John Boda Yang, co-creator of the influential SWE-Bench coding evaluation system, is leading a new project called CodeClash that takes inspiration from competitive programming. "Dynamic competition reveals different strengths than static benchmarks," Yang told TechCrunch. "But we need to ensure evaluation remains open - proprietary standards controlled by single companies could slow progress."
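To make the contrast between dynamic competition and static benchmarks concrete, here is a rough Python sketch of what a tournament-style evaluation loop could look like. It is an illustration only, assuming a round-robin format; the names (Submission, play_round, round_robin) are hypothetical and do not describe CodeClash's actual design.

```python
# Toy sketch of head-to-head "arena" evaluation: instead of grading each
# model against a fixed test suite, submissions play rounds against one
# another and rankings emerge from accumulated wins.
import itertools
import random
from dataclasses import dataclass

@dataclass
class Submission:
    model_name: str
    strength: float  # stand-in for how a real submitted program would perform
    wins: int = 0

def play_round(a: Submission, b: Submission, rng: random.Random) -> Submission:
    """Decide one round; a real harness would run both programs on a shared task."""
    total = a.strength + b.strength
    return a if rng.random() < a.strength / total else b

def round_robin(entries: list[Submission], rounds: int = 100, seed: int = 0) -> None:
    """Every pair of submissions plays a fixed number of rounds."""
    rng = random.Random(seed)
    for a, b in itertools.combinations(entries, 2):
        for _ in range(rounds):
            play_round(a, b, rng).wins += 1

entries = [Submission("model-a", 0.7), Submission("model-b", 0.5), Submission("model-c", 0.6)]
round_robin(entries)
for s in sorted(entries, key=lambda s: s.wins, reverse=True):
    print(f"{s.model_name}: {s.wins} wins")
```

The property this toy captures is that a submission's score depends on who else enters the arena rather than on a fixed answer key, which is why dynamic competition can keep surfacing new strengths as models improve.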
Why Evaluation Matters Now
As AI systems take on more complex tasks, traditional testing methods often fall short. Can an AI that aces coding challenges also make sound business decisions? How do we compare specialized models against general-purpose ones? The Slingshot projects explore a range of approaches, including:
- Reinforcement learning frameworks
- Model compression techniques
- Real-world performance metrics
The initiative represents a significant investment in creating evaluation standards that keep pace with AI's rapid advancement, something many see as crucial for both technological progress and responsible development.
Key Points:
- 15 projects selected for inaugural Slingshot AI funding program
- Focused on developing better ways to evaluate AI systems across domains
- Mix of established benchmarks and novel approaches
- Combines academic research with industry-grade resources
- Aims to prevent proprietary standards from dominating the field