BAN439 Detecting Fraud through Textual Analysis
In recent years we have observed many data leaks, such as LuxLeaks, the Panama Papers, the Paradise Papers, and the Pandora Papers. These leaks represent enormous quantities of data. However, most of this data is in textual form, which makes it difficult for the average analyst to extract relevant information. This course teaches skills for reviewing text data with the purpose of uncovering evidence of illegal activity. We will first focus on how to obtain raw data through web scraping. Then we will format the raw data into a final dataset that can be used to answer relevant real-world questions.
This course could be useful for:
- Students who are interested in obtaining advanced knowledge of applications of textual analysis
- Fraud analysts who want to find evidence of links between different users
- Journalists who are interested in learning how to find the story in a large data source
This course is an extension to:
BAN432 Applied Textual Data Analysis for Business and Finance
BUS465 Detecting Corporate Crime
KNOWLEDGE - The candidate will…
- know how to apply tools for obtaining relevant information from textual data
SKILLS - The candidate will be able to…
- employ different techniques in order to obtain textual data, e.g. web scraping
- prepare textual data for analysis by pre-processing it
- apply appropriate tools from Natural Language Processing with the aim of identifying corporate crime
- write an on-point report on the findings
COMPETENCE - The candidate will be able to…
- investigate fraud using textual analysis
- present evidence of fraud
- understand the uses and limits of detection strategies
- identify reliable information for building a case during an investigation
In this course, lectures are combined with applied examples in R. While central concepts of textual analysis and crime detection are presented in the lectures, the practical part will focus on their implementation in R.
Previous knowledge of R.
This course sits at the intersection of BAN432 Applied Textual Data Analysis for Business and Finance and BUS465 Fraud Detection. It is an advantage if the student has completed at least one of these courses. Students' previous knowledge in both areas is taken into account when the groups for the group work are put together, so that the necessary knowledge is available in each group.
Credit reduction due to overlap
Group project, with group size 2-4 people. The group project is developed during the course and submitted one week afterwards.
R, RStudio
Literature will be made available on Canvas
- ECTS Credits
- Teaching language
Autumn. Will be offered Autumn 2023.