Language resources and technology

Our research group focuses on how language works in specific fields such as economics, business, the media and law.

We study how language is used in the real world. To this end, we create and apply tools and resources to make language and communication research more effective. These tools – such as terminology databases and large text collections (corpora) – are available to the public and help people in many fields, from translators to policymakers to journalists.

We have built a unique hub for the study of specialised communication, not just in Norway, but across the Nordic region and Europe.

 

These are some of the key areas we work on:

  • Strategic and political communication: From political campaigns to corporate strategies, we study how communication is used to shape public messages and influence people.
  • Analysing data from texts: Using cutting-edge tools and methods to uncover patterns in large amounts of spoken and written discourse.
  • Lexical innovation: We study how new words take shape, acquire meanings and are borrowed from other languages.
  • Specialised corpora: Exploring how language is used to discuss issues such as monetary policy, sustainability, migration and poverty.
  • Making research accessible: Researching how to make science and news more accessible to the public through popular science and news media.
  • The Terminology Portal: A national resource that we contribute to, helping people find the right words for the right context.
  • Artificial intelligence: We have initiated research cooperation with external partners to explore, for example, how customer data from AI-based chatbots can help improve user applications.

Ultimately, our work is about making language tools and knowledge about language and communication accessible to everyone, whether you are a student, a professional, or just someone curious about how language shapes the world around us.

 

Ongoing research projects

  • Clarin

    CLARIN (Common Language Resources and Technology Infrastructure) is a Europe-wide research infrastructure that aims to provide easy and sustainable access, for scholars in the humanities and social sciences, to digital language data and advanced tools, independent of where they are located. We are a member of the CLARIN network through our active involvement in the CLARA and CLARINO projects.

    For more information:

    CLARIN

    Contact person: Gisle Andersen

  • Clarino and Termportalen

    CLARINO (Common Language Resources and Technology Infrastructure: Norway) is a national infrastructure for language resources in Norway. The project is funded by the Research Council of Norway and linked to the Europe-wide CLARIN project. Our main contribution has been the development of Termportalen, a national portal for terminology resources. This effort responds to the need to coordinate terminology work at national and international levels.

    Termportalen at Uib.no

    Clarino

    Contact person: Gisle Andersen

  • GLAD - GLOBAL ANGLICISM DATABASE NETWORK

    GLAD (Global Anglicism Database Network) is an international network of scholars working on linguistic and cultural Anglicization within and across language communities worldwide. Its specific objectives include producing an online global database of Anglicisms and investigating theoretical issues related to language contact with English and related phenomena.

    For more information:

    GLAD

    Contact person: Gisle Andersen

  • ISO/TC 37 and SN/K 144 LANGUAGE AND TERMINOLOGY

    We are a member of ISO Technical Committee 37 “Language and Terminology” and its Norwegian mirror committee SN/K 144 “Språk og terminologi”. The committees contribute to the standardisation of descriptions, resources, technologies and services related to terminology, translation, interpreting and other language-based activities in the multilingual information society.

    For more information:

    ISO/TC 37

    Contact person: Gisle Andersen

  • Lexical innovation: Word of the Year

    NHH contributes each year to awarding Norway’s official Word of the Year. This is a popular science project that uses digital technology and a large corpus to extract a long list of candidates based on frequency information. These are then studied manually, and a final list of ten words and one winner is produced with definitions and usage descriptions. This work is done in cooperation with the Language Council of Norway and attracts considerable media interest each year.

    For more information: https://sprakradet.no/nyord-og-rettskrivingsendringer/arets-ord/

    Contact person: Gisle Andersen

  • Metaphors on climate change

    This project investigates climate change metaphors through a corpus-assisted discourse study. It focuses on identifying the variables that correlate with different ways of framing climate change and exploring what these patterns can reveal about societal attitudes and priorities. A key part of the project is the comparison of academic and non-academic discourse, examining how different communities talk about climate change and how that shapes public understanding. By analysing language across contexts, the project aims to shed light on the dynamics of climate communication and contribute to more effective and informed discourse.

    Contact person: Alida Røvik Langås

  • Natural Language Processing for Market Research

    Natural language processing (NLP) is a field of artificial intelligence that combines computer science and linguistics to interpret and generate text and speech automatically. This project explores how NLP methods can help businesses stay up to date, follow the latest online trends, and detect and understand the needs of their customers.

    Contact person: Rashid Mustafin

  • NHH Termbase for economic-administrative domains

    NHH’s termbase is a large terminological database covering economics, business administration and related fields. It consists of freely available, regularly updated terminology in Norwegian and English and is a useful resource for anyone in need of current terminology and concept definitions, from students and researchers to translators, professionals and the general public.

    For more information: https://www.nhh.no/forskning/nhh-termbase-for-okonomisk-administrative-fag/

    Contact person: Gisle Andersen

  • NO-JU TERMBASE

    The NO-JU TERMBASE is a termbase covering key legal terms in Norwegian Bokmål and German, drawn from the Norwegian and German legal systems. This freely accessible and regularly updated tool supports legal translation between Norwegian and German by providing definitions, relevant context, synonyms and near-synonyms, organised through domain-related classification.

    For more information: Norsk-tysk juridisk termbase (NOJU) | Termportalen

    Contact person: Ingrid Simonnæs

  • Popular science: The Forskning.no corpus

    The Forskning.no corpus is a large Norwegian corpus of popular science from all fields of science collected from the news service Forskning.no. It is a valuable resource for research on popular science as a genre and is used for terminology and lexicography, as well as in applications such as term extraction.

    For more information: THE FORSKNING.NO CORPUS

    Contact person: Gisle Andersen

  • Sustainability communication in multinational corporations

    This corpus-based study examines how multinational corporations that engage in environmentally harmful operations in Colombia – through forms of resource extraction and community engagement that would be unlikely in their home countries – frame their sustainability discourses in the Colombian context compared to their domestic settings. Corpus-assisted discourse analysis and the automated retrieval of linguistic features enable a systematic analysis of the semantic and pragmatic dimensions of these companies’ green discourse, including the deployment of buzzwords, legitimation strategies, euphemisms, and the balance between abstract and concrete language. A key aim of the project is to assess whether discrepancies emerge in their persuasive strategies, claims and objectives when operating in a country with more flexible regulations on corporate social responsibility and human rights, as compared to their practices and discourses at home.

    Contact person: Margrete Dyvik Cardona

  • THE FOMC CORPUS

    The Federal Open Market Committee (FOMC) Corpus is a text collection consisting of annotated transcripts of all FOMC meetings from 1987 to 2018. The FOMC is the committee of the US Federal Reserve (the Fed) that determines the monetary policy of the United States. This resource enables research into the deliberations on monetary policy, e.g. from an economics or communication/discourse-analytic perspective.

    Contact person: Christian Langerfeld

    For more information: FOMC corpus in Sketch Engine

Completed research projects

  • A selection of previous research projects

    • CLARA

      CLARA (Common Language Resources and their Applications) was an EU-funded project that carried out research on theoretical, methodological and technical topics relating to the task of harmonising language resources and terminology for professional domains.

      Professional knowledge domains, such as economy, energy and medicine, present special challenges to correct understanding, especially across languages.

      In this project a large corpus of English and Spanish Free Trade Agreements was compiled and experiments were performed with semi-automatic extraction of term candidates and specialised collocation candidates.

      The work focused specifically on parallel corpora, computational terminology and phraseology.

       

      KB-N KNOWLEDGE BANK OF NORWAY

      The KB-N project developed a concept-oriented, text- and term-based knowledge management system for economic-administrative domains.

      The project included language technology applications for use primarily within translation, documentation and publishing. It incorporated a 30-million-word parallel corpus of Norwegian and English economic-administrative texts from about 30 subdomains, as well as a bilingual termbase of some 8,400 term records. As part of the project, an automatic term extractor module for Norwegian was developed.

      The 3-year KB-N project (completed 2006) was funded by the Research Council of Norway within its KUNSTI programme for language technology.

       

      Maritim Ordbok

      MARITIM ORDBOK is an externally funded project that develops a comprehensive terminology for maritime domains. The resource contains a wide range of concepts in Norwegian, English and other languages and covers all maritime areas including fauna, flora, marine industries, products, tools and equipment.

      For more information: 

      MARITIM ORDBOK

      Contact person: Gisle Andersen

       

      Migration and the Media

      MIGRATION AND THE MEDIA is a corpus-based research project which studies how media outlets, particularly newspapers, linguistically frame migrants and migration. The automated retrieval of linguistic features allows for a reliable analysis of semantic and pragmatic properties assigned to migrants across different newspaper articles. One of the main aims of this project is to ascertain whether migrants are presented in a way that influences their process of integration into their host societies. 

      Contact person: Margrete Dyvik Cardona

      Mikroøkonomen

      The Mikroøkonomen terminology project developed a termbase consisting of 800 term records in the field of microeconomics (English-Norwegian).

      This has become a useful resource for students, lecturers and researchers of economics to bridge the gap between the literature written in English and the users’ need to acquire equal competence in Norwegian. Mikroøkonomen has since been further developed in the project NHH Termbase.

      The Mikroøkonomen project (completed 2008) received funding from the Research Council of Norway and NHH.

      For more information:

      Mikroøkonomen