Tilde AI Releases Open-Source Language Model for European Linguistic Diversity
Latvian language technology company Tilde has launched TildeOpen LLM, an open-source foundational large language model specifically designed to support European languages, with particular focus on underrepresented regional tongues. Released on September 3, 2025, this initiative represents a significant advancement in the EU's efforts toward language equity and digital sovereignty.

Technical Specifications and Training
The 30-billion-parameter dense decoder-only model is released under the permissive CC-BY-4.0 license and supports languages ranging from Latvian and Lithuanian to Ukrainian and Turkish. It was trained on the European supercomputers LUMI (Finland) and JUPITER (Germany), using 2 million GPU hours awarded through the European Commission's Large AI Grand Challenge.
Training used scripts adapted from EleutherAI's GPT-NeoX framework, with:
- 450,000 updates
- ~2 trillion processed tokens
- A three-stage sampling curriculum:
  - Uniform sampling across languages
  - Sampling from the natural data distribution, which gives more weight to high-volume languages
  - A final uniform pass to restore balance
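The three-stage curriculum can be illustrated with a minimal sketch. This is not Tilde's training code: the stage boundaries (`STAGE_BOUNDARIES`), the corpus sizes, and all function names are hypothetical, and the middle stage is simplified to sampling purely in proportion to corpus size.

```python
from typing import Dict, Tuple

# Hypothetical stage boundaries, as fractions of total updates.
# The article does not say where each stage starts or ends.
STAGE_BOUNDARIES: Tuple[float, float] = (0.4, 0.8)


def uniform_weights(token_counts: Dict[str, int]) -> Dict[str, float]:
    """Every language gets the same sampling probability."""
    n = len(token_counts)
    return {lang: 1.0 / n for lang in token_counts}


def natural_weights(token_counts: Dict[str, int]) -> Dict[str, float]:
    """Each language is sampled in proportion to its corpus size."""
    total = sum(token_counts.values())
    return {lang: count / total for lang, count in token_counts.items()}


def weights_for_step(
    step: int,
    total_steps: int,
    token_counts: Dict[str, int],
    boundaries: Tuple[float, float] = STAGE_BOUNDARIES,
) -> Dict[str, float]:
    """Return per-language sampling weights for a given update step."""
    first_end, second_end = boundaries
    progress = step / total_steps
    if first_end <= progress < second_end:
        return natural_weights(token_counts)  # stage 2: natural distribution
    return uniform_weights(token_counts)      # stages 1 and 3: uniform


# Made-up corpus sizes (in arbitrary units), for illustration only.
corpus = {"en": 900, "de": 600, "lv": 30, "lt": 40}

for step in (0, 225_000, 449_999):
    print(step, weights_for_step(step, 450_000, corpus))
```

In the uniform stages a small language such as Latvian is sampled as often as English, while the middle stage lets larger corpora contribute in proportion to their size.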
Key architectural features include:
- 60 layers with 6144 embedding dimension
- 48 attention heads
- 8192-token context window
- SwiGLU activation functions
- RoPE positional encoding
- RMSNorm layer normalization
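As a sanity check, the listed dimensions can be turned into a rough parameter estimate and compared with the headline model size. The sketch below relies on values the article does not state, which are assumptions here: a feed-forward hidden size of 21,504, grouped-query attention with 8 key/value heads, a vocabulary of 131,072 tokens, and untied input and output embeddings. Normalization weights and biases are ignored.

```python
def estimate_parameters(
    layers: int,
    d_model: int,
    n_heads: int,
    n_kv_heads: int,
    ffn_hidden: int,
    vocab_size: int,
    tied_embeddings: bool = False,
) -> int:
    """Approximate weight count of a decoder-only transformer."""
    head_dim = d_model // n_heads
    kv_dim = head_dim * n_kv_heads
    # Attention: query and output projections are d_model x d_model;
    # key and value projections are d_model x kv_dim.
    attention = 2 * d_model * d_model + 2 * d_model * kv_dim
    # SwiGLU feed-forward uses three matrices: gate, up, and down.
    feed_forward = 3 * d_model * ffn_hidden
    embedding = vocab_size * d_model
    embedding_total = embedding if tied_embeddings else 2 * embedding
    return layers * (attention + feed_forward) + embedding_total


total = estimate_parameters(
    layers=60,           # from the article
    d_model=6144,        # from the article
    n_heads=48,          # from the article
    n_kv_heads=8,        # assumption, not stated in the article
    ffn_hidden=21504,    # assumption, not stated in the article
    vocab_size=131072,   # assumption, not stated in the article
)
print(f"Estimated parameters: {total / 1e9:.2f} billion")
```

Changing the assumed values shifts the estimate, so the result is best read as an order-of-magnitude check rather than an exact count.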
Addressing Language Equity Challenges
Traditional LLMs often underperform with Baltic, Slavic, and other smaller European languages, producing grammatical errors and unnatural phrasing. TildeOpen introduces a "fair tokenizer" that:
- Represents all languages similarly in token space
- Reduces token count for efficiency gains
- Improves reasoning performance for less-represented languages
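One common way to quantify how evenly a tokenizer treats languages is "fertility", the average number of tokens per word. The sketch below is a toy illustration, not Tilde's tokenizer: `greedy_tokenize`, both vocabularies, and the sample sentences are hypothetical and chosen only to show the effect.

```python
from typing import List, Set


def greedy_tokenize(text: str, vocab: Set[str]) -> List[str]:
    """Toy tokenizer: greedy longest match per word, falling back to single characters."""
    tokens: List[str] = []
    for word in text.lower().split():
        i = 0
        while i < len(word):
            # Try the longest remaining piece first, shrinking to one character.
            for j in range(len(word), i, -1):
                piece = word[i:j]
                if piece in vocab or j == i + 1:
                    tokens.append(piece)
                    i = j
                    break
    return tokens


def fertility(text: str, vocab: Set[str]) -> float:
    """Average number of tokens produced per whitespace-separated word."""
    words = text.split()
    return len(greedy_tokenize(text, vocab)) / len(words)


# Hypothetical vocabularies: one built mostly from English, one balanced.
english_centric = {"the", "model", "reads", "text"}
balanced = english_centric | {"modelis", "lasa", "tekstu"}

english_sample = "the model reads text"
latvian_sample = "modelis lasa tekstu"

for name, vocab in (("english-centric", english_centric), ("balanced", balanced)):
    print(
        name,
        "en:", round(fertility(english_sample, vocab), 2),
        "lv:", round(fertility(latvian_sample, vocab), 2),
    )
```

With the English-centric vocabulary, the Latvian sentence splits into many more tokens per word than the English one. With the balanced vocabulary, the two come out equal, which is the kind of parity a fair tokenizer aims for.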
The model also enables organizations to self-host in local data centers or EU-compliant secure clouds, ensuring adherence to GDPR and other data protection regulations while addressing sovereignty concerns related to foreign hosting locations.
Future Development Roadmap
As a foundational model, TildeOpen will serve as the basis for specialized versions, including:
- Instruction-tuned variants
- Enhanced translation models
The project positions Latvia as an emerging player in global AI development while championing linguistic diversity preservation.
Key Points
- 🌍 Multilingual Support: Specialized focus on underrepresented European languages
- 💻 EU-Based Training: Leveraged European supercomputers and advanced sampling techniques
- 🔒 Sovereignty Compliance: GDPR-aligned deployment options for organizations




