Detecting Fraud through Textual Analysis

BAN439 Detecting Fraud through Textual Analysis

Spring 2024

Autumn 2024
  • Topics

    Recent years have seen many large data leaks, such as LuxLeaks, the Panama Papers, the Paradise Papers, and the Pandora Papers. These leaks contain enormous quantities of data, most of it in textual form, which makes it difficult for the average analyst to extract relevant information. This course teaches skills for reviewing text data in order to uncover evidence of illegal activity. We first focus on how to obtain raw data through web scraping. We then format the raw data into a final dataset that can be used to answer relevant real-world questions.

    This course could be useful for:

    • Students who want advanced knowledge of applied textual analysis
    • Fraud analysts who want to find evidence of links between different users
    • Journalists who want to learn how to find the story in a large data source

    This course is an extension of:

    BAN432 Applied Textual Data Analysis for Business and Finance

    BUS465 Detecting Corporate Crime

  • Learning outcome

    KNOWLEDGE - The candidate will…

    • know how to apply tools for obtaining relevant information from textual data

    SKILLS - The candidate will be able to…

    • employ different techniques in order to obtain textual data, e.g. web scraping
    • prepare textual data for analysis by pre-processing it
    • apply appropriate tools from Natural Language Processing with the aim of identifying corporate crime
    • write a concise, focused report on the findings

    COMPETENCE - The candidate will be able to…

    • investigate fraud using textual analysis
    • present evidence of fraud
    • understand the uses and limits of detection strategies
    • identify reliable information for building a case during an investigation

  • Teaching

    In this course, lectures are combined with applied examples in R. While central concepts of textual analysis and crime detection are presented in the lectures, the practical part will focus on their implementation in R.

  • Recommended prerequisites

    Previous knowledge of R.

    This course sits at the intersection of BAN432 Applied Textual Data Analysis for Business and Finance and BUS465 Detecting Corporate Crime. It is an advantage to have completed at least one of these courses. Students' prior knowledge in both areas is taken into account when forming the project groups, so that each group has the necessary knowledge.

  • Credit reduction due to overlap


  • Compulsory Activity


  • Assessment

    Group project in groups of 2-4 students. The project is developed during the course and submitted one week after the course ends.

  • Grading Scale


  • Computer tools

    R, RStudio

  • Literature

    Literature will be made available on Canvas.


ECTS Credits
Teaching language

Autumn. Will be offered Autumn 2023.