Tencent's AI Painting Breakthrough Boosts Image Quality 300%

Tencent has developed fine-tuning techniques that substantially improve the quality of AI-generated images: in human evaluation, the share of outputs rated "excellent" for realism and aesthetics rose by more than 300%. The new methods address two persistent problems in reward-based fine-tuning of diffusion models and let users steer output aesthetics through text prompts.

The Challenge with Current Models

Existing diffusion models can be fine-tuned against reward models that score image quality, but this approach faces two critical limitations:

  1. Reward cheating (often called reward hacking): Models learn to generate low-quality images that still achieve high reward scores
  2. Inflexible adjustment: Reward models are adapted offline, so the target aesthetic cannot be changed in real time

Tencent's Innovative Solutions

The research team introduced two novel approaches:

Direct-Align Technology

This method injects a known noise sample in advance, which allows the original image to be recovered from any point in the generation process. Key benefits include:

  • Reduces gradient explosion during backpropagation
  • Enables optimization throughout the entire diffusion process (not just final steps)
  • Improves training stability
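
The recovery idea can be illustrated with a minimal sketch. It assumes a linear blend between the clean image and the pre-chosen noise, x_t = (1 - sigma) * x_0 + sigma * noise, which the article does not state explicitly; the function names are hypothetical, and a toy pixel list stands in for a real image. Because the noise is fixed in advance, the blend can be inverted at any noise level below 1.

```python
import random

def inject_noise(x0, noise, sigma):
    """Blend a clean image with a known noise sample (assumed linear schedule)."""
    return [(1.0 - sigma) * a + sigma * n for a, n in zip(x0, noise)]

def recover_image(xt, noise, sigma):
    """Invert the blend. Works only because the injected noise is known."""
    if not 0.0 <= sigma < 1.0:
        raise ValueError("sigma must be in [0, 1)")
    return [(v - sigma * n) / (1.0 - sigma) for v, n in zip(xt, noise)]

rng = random.Random(0)
x0 = [rng.random() for _ in range(8)]             # toy "image" as a flat pixel list
noise = [rng.gauss(0.0, 1.0) for _ in range(8)]   # noise fixed before training

# Check recovery at an early, middle, and late point of the noising process.
max_error = 0.0
for sigma in (0.1, 0.5, 0.9):
    xt = inject_noise(x0, noise, sigma)
    x0_hat = recover_image(xt, noise, sigma)
    max_error = max(max_error, max(abs(a - b) for a, b in zip(x0, x0_hat)))

print(f"max reconstruction error: {max_error:.2e}")
```

In this toy setting the reconstruction error stays at floating-point rounding level even for heavily noised inputs, which is the property that would let a reward be evaluated on early generation steps rather than only on final ones.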

Semantic Relative Preference Optimization (SRPO)

SRPO makes reward signals text-conditioned, so the preference being optimized can be steered through the prompt. This allows:

  • Style adjustments through simple prompt modifications (e.g., adding "bright" or "dark" prefixes)
  • No requirement for additional training data
  • Real-time customization of output characteristics
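
One way to picture a text-controlled, relative reward is the sketch below. It is an illustration under stated assumptions, not the paper's implementation: `toy_reward` is a hypothetical stand-in for a learned image-text reward model, and taking the difference between a "preferred" and a "rejected" control word is an assumed reading of "relative preference". Only the idea of adding words such as "bright" or "dark" as prompt prefixes comes from the article.

```python
def toy_reward(image_features, text):
    """Hypothetical text-conditioned reward: sum of per-word scores for the image."""
    return sum(image_features.get(word, 0.0) for word in text.lower().split())

def relative_reward(image_features, prompt, positive, negative):
    """Score an image by how much it favors one control word over another."""
    preferred = toy_reward(image_features, f"{positive} {prompt}")
    rejected = toy_reward(image_features, f"{negative} {prompt}")
    # Terms shared by both prompts cancel, leaving only the style preference.
    return preferred - rejected

# Toy "images" described by made-up attribute scores.
sunny = {"bright": 0.9, "dark": 0.1, "cat": 0.8}
night = {"bright": 0.2, "dark": 0.7, "cat": 0.8}
prompt = "a cat on a sofa"

sunny_score = relative_reward(sunny, prompt, "bright", "dark")
night_score = relative_reward(night, prompt, "bright", "dark")
print(f"bright vs dark: sunny={sunny_score:+.2f}, night={night_score:+.2f}")
```

Swapping the two control words flips the preference without retraining the toy reward, which mirrors the claim that style can be adjusted through prompt changes alone.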

Performance Results

In human evaluation, the FLUX.1-dev model fine-tuned with SRPO showed large improvements:

  • Share of images rated "excellent" for realism rose from 8.2% to 38.9%
  • Share of images rated "excellent" for aesthetic quality rose from 9.8% to 40.5%
  • Outputs showed natural textures while maintaining high visual appeal
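
The headline "300%" figure can be checked against these rates with a few lines of arithmetic. The helper name below is hypothetical; the input numbers are the ones quoted above.

```python
def relative_gain(before, after):
    """Return (ratio, percent increase) between two rates given in percent."""
    ratio = after / before
    return ratio, (ratio - 1.0) * 100.0

# "Excellent" rates before and after fine-tuning, as reported in the article.
results = {
    "realism": (8.2, 38.9),
    "aesthetics": (9.8, 40.5),
}

gains = {name: relative_gain(b, a) for name, (b, a) in results.items()}
for name, (ratio, pct) in gains.items():
    print(f"{name}: {ratio:.2f}x, +{pct:.0f}%")
```

Both rates rise by more than 300% in relative terms (about 374% for realism and 313% for aesthetics), while the absolute gain is 30.7 percentage points in each case.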

Training is also efficient: the model converges in about 10 minutes on 32 H20 GPUs, or roughly 5.3 GPU-hours.

Future Implications

If the results hold up in practice, the technique could benefit:

  • Professional digital art creation tools
  • Marketing and advertising content generation
  • Game asset development pipelines

The research paper is available at: https://arxiv.org/pdf/2509.06942

Key Points:

  • Tencent's new methods raise the share of images rated "excellent" by human evaluators by more than 300%
  • Direct-Align enables full-process optimization
  • SRPO allows text-based style control without extra data
  • Significant improvements in realism and aesthetics demonstrated
  • Training converges in about 10 minutes on 32 H20 GPUs
