Insurance Analytics

BAN427 Insurance Analytics

Autumn 2020

  • Topics

    Across all industries, the ability to use data and data science methods is essential for gaining a competitive advantage. Insurance is of particular interest because of its abundance of data and its long tradition of advanced risk modelling. This course gives you an introduction to insurance economics and to how data science is typically applied to solve real business problems. We will work hands-on with insurance data using Python and Jupyter Notebook. The topics we focus on will be:

    1. Introduction to Non-Life Insurance. Adverse selection and moral hazard, and how to measure them empirically.
    2. Big data in insurance/finance. How to establish a Customer TimeLine. The importance of event data and how to use event data in real-life predictions.
    3. Prediction methods with applications to insurance. Introduction to the standard "ML tool set" (logit regression, regression trees, random forest, ensemble methods and more).
    4. Prediction versus causation. Causal models, combining ML and causal methods.
    5. How to use randomized experiments in order to improve business processes.
    6. From predictive modelling to production. How to deploy and maintain many prediction models in a business environment? Keywords: Microservices, Streaming data, on-the-fly scoring.

  • Learning outcome

    After completing the course students:

    Knowledge


    • Know how big data and machine learning techniques are used in the insurance industry
    • Know how to build, deploy and test models and treatments using randomized experiments

    Skills


    • Can formulate an insurance problem as a statistical model
    • Can analyze and predict important insurance outcomes using machine learning techniques

    General Competence

    • Have general knowledge about measuring adverse selection and moral hazard from insurance data
    • Know how domain knowledge can be used to extract "causal" knowledge from observational data
    • Have knowledge about correlation vs causality - and how to empirically address causality using domain knowledge

  • Teaching

    7 lectures of 2 x 45 minutes. Anonymized data will be provided for applying ML methods to insurance business problems.

  • Recommended prerequisites

    Econometrics - for example ECN402, BUS444 or BAN431.

    Familiarity with Python and Jupyter Notebook is useful, but not a requirement. The assignment given at the end of the course will involve data and prediction modelling. Feel free to use whichever programming language you prefer for the assignment (R, Python, Stata, SAS).

  • Requirements for course approval

    Mandatory participation in all lectures.

  • Assessment

    Assignment (group work, 2-3 students in each group).

  • Grading Scale


  • Computer tools

    Python combined with Jupyter Notebook; R is optional.

  • Literature

    Suggested general background/supportive literature (more details given during the course):

    Einav & Finkelstein: "Selection in Insurance Markets"

    Aarbu (2015): "Asymmetric Information in the Home Insurance Market"

    Zhang, Bradlow & Small (2013): "New Measures of Clumpiness for Incidence Data"

    Varian (2014): "Big Data: New Tricks for Econometrics"

    Mullainathan & Spiess (2017): "Machine Learning: An Applied Econometric Approach"

    Breiman (2001): "Statistical Modeling: The Two Cultures" http://

    Sutton & Barto: "Reinforcement Learning: An Introduction" (chapters 1 and 2)


ECTS Credits
Teaching language

Autumn. First week of the Autumn semester. Offered Autumn 2020.

Please note: Due to the present corona situation, please expect parts of this course description to be changed before the autumn semester starts. Particularly, but not exclusively, this relates to teaching methods, mandatory requirements and assessment.

Course responsible

Adjunct Associate Professor Karl Ove Aarbu, Department of Business and Management Science.