An Icelandic language technology project led by the University of Iceland in collaboration with Almannarómur (the Icelandic Centre for Language Technology) is one of eleven European projects to receive a Microsoft AI for Good LINGUA grant, awarded on 20 January this year. The grant aims to support languages that lack technical infrastructure (so-called low-resource languages) and to ensure that they are not left behind in the rapid development of artificial intelligence.
The project received a grant of approximately six million ISK. The evaluation panel considered it particularly interesting, noting that it was not only technically strong but also had significant potential to promote greater equality in language technology for smaller languages.
The project, entitled “Icelandic AI Safety Benchmarks,” focuses on developing and localising well-established safety benchmarks so that language models can be evaluated in Icelandic and in the Icelandic cultural context. Such safety evaluations have been lacking for Icelandic, and in some cases this has hindered the adoption of Icelandic in technological solutions developed by foreign companies, which have found it difficult to verify whether their systems’ Icelandic output is safe and compliant with their internal safety standards.
“These tests function as a standardised quality control for artificial intelligence in Icelandic. They measure whether language models respond appropriately to harmful prompts or produce toxic outputs,” says Hafsteinn Einarsson, associate professor of computer science at the University of Iceland and lead investigator of the project. According to him, research shows that the safety mechanisms of large models often perform worse in languages other than English. “By developing tests that take the Icelandic language and culture into account, we enable developers to assess the safety of their systems before they are deployed.”
As part of the project, three well-known safety benchmarks will be localised for the Icelandic language and Icelandic conditions: RTP-LX, Aya Red-Teaming, and XSafety. All three will be released under open licences. The benchmarks will then be run on all major language models, and the results will be published on a dedicated dashboard.
Projects such as this are expected to have practical applications in language competence, public services, industry, and culture, and to contribute to better technology that serves all languages, not only the largest ones.
About the grants
Microsoft’s AI for Good Lab announced last autumn, in connection with the European Day of Languages, that the company intended to fund projects supporting the digital future of European languages. Particular emphasis was placed on projects that would increase the availability of high-quality data for smaller language communities in order to strengthen the presence of those languages in artificial intelligence and technology.
On this occasion, eleven projects from ten countries received LINGUA grants. A list of the projects and further information about LINGUA can be found on Microsoft’s website.