
Large language models (LLMs) have taken center stage in Europe’s digital sovereignty agenda with the launch of OpenEuroLLM, a new initiative aimed at developing truly open-source LLMs covering all European Union languages. This ambitious project aligns with Europe’s broader push to ensure greater control over critical digital infrastructure and artificial intelligence (AI) tools.
The OpenEuroLLM Initiative
OpenEuroLLM is a collaborative effort involving around 20 organizations, co-led by Jan Hajič, a computational linguist at Charles University in Prague, and Peter Sarlin, CEO of Finnish AI lab Silo AI, which was acquired by AMD for $665 million last year. The initiative seeks to create multilingual foundation models that support not only the 24 official EU languages but also the languages of countries negotiating EU accession, such as Albania.
The project fits into a wider European strategy of fostering digital autonomy. Tech giants are already expanding local infrastructure to comply with EU data regulations, and AI players like OpenAI have introduced services that allow data processing and storage within Europe. The EU also recently allocated $11 billion for a sovereign satellite constellation to rival SpaceX’s Starlink, reinforcing the bloc’s commitment to technological self-reliance.
Funding Challenges and Compute Infrastructure
The OpenEuroLLM initiative has secured an initial €37.4 million for model development, with €20 million coming from the EU’s Digital Europe Programme. While this funding is significant, it pales in comparison to the billions that major AI corporations invest in model training and infrastructure. However, additional funding has been earmarked for related projects, and the initiative benefits from access to the EuroHPC supercomputing network, which has a broader budget of €7 billion.
Despite these resources, some experts question whether a decentralized consortium of roughly 20 organizations can maintain the streamlined focus needed to deliver competitive models. Anastasia Stasenko, co-founder of LLM startup Pleias, pointed out that Europe’s recent AI breakthroughs have come from smaller, highly focused teams like Mistral AI and LightOn, which have complete ownership over their projects and strategic direction.
Building on Existing AI Efforts
The OpenEuroLLM project does not start entirely from scratch. It builds on the High Performance Language Technologies (HPLT) initiative, which since 2022 has been developing datasets, models, and workflows using high-performance computing (HPC). Many of HPLT’s partners have transitioned into OpenEuroLLM, leveraging their previous research and infrastructure.
Hajič anticipates that the first OpenEuroLLM models will be released by mid-2026, with final iterations expected by 2028. However, given the scale and scope of the project, some remain skeptical about its ambitious timeline.
Defining “True” Open Source AI
A major point of contention in AI development is the definition of open source. While traditional software has clear guidelines under the Open Source Initiative (OSI), AI models introduce complexities regarding training data, pre-trained models, and access to computational resources. The OSI has recently attempted to define “open-source AI,” though debates persist.
The OpenEuroLLM team aims to be as transparent as possible but acknowledges potential limitations. While model weights and outputs will be open, some training data may remain proprietary due to European copyright laws. Regulators will likely require some datasets to be stored for audit purposes rather than freely distributed.
Parallel Projects and Future Collaboration
One challenge for OpenEuroLLM is its overlap with EuroLLM, another EU-backed initiative launched just months prior. EuroLLM shares many of OpenEuroLLM’s goals, including developing open-source multilingual AI models. Some observers have criticized the fragmentation of European AI efforts, arguing that better coordination is needed to avoid redundant projects and maximize efficiency.
Hajič has expressed a willingness to collaborate with EuroLLM but notes that EU funding rules limit participation to organizations within member states, excluding entities from the U.K. and Switzerland. The possibility of merging or aligning efforts remains uncertain.
Digital Sovereignty and the Bigger Picture
Ultimately, OpenEuroLLM is not just about competing with AI giants; it is about ensuring that Europe maintains control over foundational AI models. As Sarlin emphasized, OpenEuroLLM’s primary goal is to build AI infrastructure that European companies and institutions can trust and rely upon, rather than simply developing a consumer-facing chatbot like ChatGPT.
By leveraging the expertise of both academia and industry, OpenEuroLLM aims to balance innovation with transparency, ensuring that its LLMs reflect Europe’s linguistic and cultural diversity. Whether this initiative will set a new standard for open-source AI or struggle under its ambitious scope remains to be seen. However, one thing is clear: Europe is making a concerted effort to carve out its own space in the global AI race.