HKUST and Kuaishou's EvoSearch Tech Boosts Small AI Models for Art
A collaboration between the Hong Kong University of Science and Technology (HKUST) and Kuaishou Technology has produced EvoSearch, an evolutionary search method that challenges the assumption that better AI-generated art requires ever-larger models. The results indicate that smaller models can match or beat much larger ones when extra compute is spent at generation time rather than on training.
The reported numbers are striking: an 865M-parameter Stable Diffusion 2.1 model equipped with EvoSearch is reported to produce higher-quality images than GPT-4o, while the 1.3B-parameter Wan model matches or surpasses models roughly ten times its size.
Rethinking AI Generation Strategies
Current AI generation models typically fall into two categories: diffusion models, which gradually refine an image by removing noise, and flow models, which learn a continuous transformation from random noise to the target image. The industry has traditionally improved performance in one of two ways: scaling up model size, at enormous resource cost, or spending more compute at inference time with methods such as Best-of-N sampling, which generates several candidates and keeps the best one. Both approaches have significant limitations.
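To make the Best-of-N baseline concrete, here is a minimal Python sketch. The generate and reward functions are hypothetical placeholders standing in for a real generative model and a real quality scorer; they are assumptions for illustration, not part of EvoSearch or any library.

```python
import random

def generate(seed):
    """Hypothetical stand-in for a generative model: maps a seed to a 'sample'."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(4)]

def reward(sample):
    """Hypothetical stand-in for a quality scorer: higher is better."""
    return -sum(x * x for x in sample)

def best_of_n(n, base_seed=0):
    """Draw n independent samples and keep the highest-scoring one."""
    candidates = [generate(base_seed + i) for i in range(n)]
    return max(candidates, key=reward)

best = best_of_n(8)
```

Every candidate here is drawn independently, so a promising sample never informs the next draw. That is one plausible reading of why this kind of search stops improving as N grows.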
EvoSearch takes a different route, inspired by Darwin's theory of evolution. It treats image generation as an evolutionary process in which:
- Initial "populations" of random noise are generated
- Partially denoised intermediate results are scored through "fitness assessment"
- Superior candidates are selected
- New solutions emerge through specialized "mutation" operations
The mutation mechanism is EvoSearch's key contribution. For initial noise, it adds controlled Gaussian noise; for intermediate denoising states, it introduces perturbations based on stochastic differential equation (SDE) sampling. This keeps the search exploring new candidates while preserving the valuable characteristics of the parents that were selected.
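The loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' implementation: denoise_step and fitness are hypothetical stand-ins for a real sampler and reward model, the variance-preserving blend used for initial noise is one common way to keep mutated noise Gaussian, and the small additive perturbation only loosely mimics the noise injected by an SDE sampler.

```python
import math
import random

def denoise_step(x):
    """Toy stand-in for one denoising step (assumption): shrink the state."""
    return [0.7 * v for v in x]

def fitness(x):
    """Toy stand-in for a reward model (assumption): higher is better."""
    return sum(x)

def mutate_initial_noise(x, beta, rng):
    """Blend the parent with fresh Gaussian noise. The sqrt(1 - beta^2) and
    beta weights keep unit variance when the parent has unit variance."""
    a = math.sqrt(1.0 - beta * beta)
    return [a * v + beta * rng.gauss(0.0, 1.0) for v in x]

def mutate_intermediate(x, sigma, rng):
    """Add a small Gaussian perturbation of scale sigma to a partially
    denoised state (a loose analogue of an SDE sampling step)."""
    return [v + sigma * rng.gauss(0.0, 1.0) for v in x]

def evolve(pop_size=8, dim=4, steps=5, keep=2, beta=0.3, sigma=0.1, seed=0):
    rng = random.Random(seed)
    # Initial population of random noise vectors.
    population = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                  for _ in range(pop_size)]
    for t in range(steps):
        # Fitness assessment and selection of the best candidates.
        parents = sorted(population, key=fitness, reverse=True)[:keep]
        # Mutation: refill the population with perturbed copies of parents.
        children = []
        for _ in range(pop_size - keep):
            parent = rng.choice(parents)
            if t == 0:
                children.append(mutate_initial_noise(parent, beta, rng))
            else:
                children.append(mutate_intermediate(parent, sigma, rng))
        # Parents survive unchanged (elitism); everyone advances one step.
        population = [denoise_step(x) for x in parents + children]
    return max(population, key=fitness)

result = evolve()
```

Unlike Best-of-N, each generation here starts from the previous generation's best candidates, so compute is concentrated around promising regions instead of being spread over independent draws. In a real system the toy functions would be replaced by an actual diffusion or flow sampler and a learned reward model.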
Performance That Redefines Expectations
Testing across image and video generation tasks shows EvoSearch outperforming existing methods. In image generation, quality and text alignment keep improving as more inference compute is spent, while other methods plateau quickly. For complex prompts, EvoSearch follows the prompt more faithfully and produces more diverse outputs.
The video generation results are even stronger. With both the Wan 1.3B and HunyuanVideo 13B models, EvoSearch consistently outperforms baseline methods. Most notably, when given an equivalent inference time budget, the smaller Wan 1.3B model with EvoSearch matches or surpasses the output quality of the larger Wan 14B model.
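A rough back-of-the-envelope calculation shows why an equal time budget leaves a small model room to search. The sketch below assumes, purely for illustration, that the cost of producing one sample scales linearly with parameter count and that both models use the same number of denoising steps; the article does not state how the budgets were actually matched, and the function name is hypothetical.

```python
import math

def candidates_within_budget(large_params_b, small_params_b, large_samples=1):
    """Number of small-model samples that fit in the compute used by
    `large_samples` large-model samples, assuming cost is proportional
    to parameter count (an assumption, not a measured figure)."""
    return math.floor(large_samples * large_params_b / small_params_b)

print(candidates_within_budget(14.0, 1.3))  # 10
```

Under that assumption, one sample from a 14B model costs about as much as ten candidates from a 1.3B model, which is the budget an evolutionary search can spend on scoring, selecting, and mutating.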
Human evaluations confirm these technical advantages, with EvoSearch-generated videos receiving higher ratings for visual quality, action fidelity, text alignment, and overall appeal.
Implications for the Future of AI Art
EvoSearch suggests several important directions for AI development:
- Investing in inference-phase optimization may yield better returns than endlessly scaling training resources
- Biological evolutionary principles can effectively enhance AI creative processes
- Understanding model architectures enables smarter performance enhancements
The research team acknowledges opportunities for further refinement, particularly in developing more sophisticated mutation strategies and balancing exploration with computational efficiency.
The research team has made the project resources publicly available.
Key Points
- EvoSearch enables smaller AI models to outperform larger competitors in art generation
- The technology applies evolutionary principles to image creation processes
- Testing shows superior results in both image and video generation tasks
- The approach could reduce reliance on massive computing resources for quality output