2 December 2024

Ministry of Digital Affairs grants a total of DKK 30.7 million to ambitious Danish language model project

Language Models

The University of Copenhagen, together with the University of Southern Denmark, Aarhus University, and the Alexandra Institute, has received a total of DKK 30.7 million from the Danish Ministry of Digital Affairs to establish an ambitious R&D platform, Danish Foundation Models (DFM), for developing and applying language models and language technology in Denmark.

The Department of Computer Science at the University of Copenhagen is one of the partners involved in developing language models and language technology in Denmark through the new platform Danish Foundation Models (DFM).

Rather than competing directly with global technology giants like Google and OpenAI, Denmark has chosen to pool its resources to solve well-defined tasks. A unified national effort can, on that basis, make a significant difference.

This approach ensures that Danish investments in AI are directed towards solutions that meet critical, specific needs in Danish society, promote sustainable development, and support a fair digital economy.

The Danish Ministry of Digital Affairs has therefore just allocated a total of DKK 30.7 million to the project, which will test a range of use cases involving public administration, the education and health sectors, and small and medium-sized enterprises.

Data integrity and safe AI usage

The main objective of the project is to establish a secure R&D (research & development) platform for training, fine-tuning, evaluating, and maintaining foundation models for use in Danish-language contexts. This platform will meet the highest standards for data integrity and documentation of safe AI usage.

– I’m excited that DIKU will contribute to developing Large Language Models that comply with the EU AI Act and the GDPR. I see a great deal of work ahead in the areas of data curation, model development, and culturally relevant evaluation. It is important for the DFM consortium to focus on developing fully documented open-source models, as this can drive local innovation and development in Denmark, explains Associate Professor Desmond Elliott, co-PI of the project.

Bolette Sandford Pedersen, Professor and Deputy Head of Department at the Department of Nordic Studies and Linguistics at the University of Copenhagen, adds:

– In the consortium, we will also ensure that the language models are evaluated based on the cultural and societal context in which they will be used. This is ensured through a series of Danish-funded benchmarks that examine the models' general language understanding and incorporate knowledge of the Danish language and culture, she explains.

An interactive sandpit

At the same time, an innovative, open 'sandpit' is being established for ongoing collaboration on fine-tuning and adapting the foundation models. In the sandpit, domain experts across national projects will collaborate to design and improve specific use cases in a flexible and secure environment.

– DFM stems from a vision that community and inclusion must be the guiding principles for the development of Danish language technology. Our interactive sandpit will bring together researchers, developers, and users to quickly and flexibly create prototypes, and it will also provide a framework for collaboration to fine-tune solutions to a wide range of societal needs. By drawing on Danish cultural heritage and making culturally attuned adjustments to existing models, we aim to create language technology that reflects and respects the complexity of Danish society, past and present. Underlying these ambitions is a commitment to openness and transparency in research and development work. In other words, we aim to contribute not only the best models but also tools and documentation that enable further development and reproduction of our work. DFM thus strives to bridge the digital language barrier and mature the norms for how we develop AI in a culturally responsible way, says Kristoffer Nielbo, Professor at Aarhus University, Center for Humanities Computing.

The entire platform is being developed using existing collaborations, supercomputing infrastructure, and data delivery agreements, which significantly reduces establishment costs. The intention is to release the foundation models and fine-tuned models as open source via the R&D platform, so that they can also be used for commercial purposes. With this approach, both public and private actors can benefit from the models and the platform and contribute to their further development.

Read more about the DFM project in the official press release.
