MBM434 Marketing Analytics - A Machine Learning Perspective (CANCELLED)

Spring 2024

  • Topics

    Many businesses and agencies are eager to make use of the data they increasingly accumulate from their own business processes. These databases, however, are large not only in the number of observations but also in their degree of detail, i.e., the number of attributes for each observation. This creates computational challenges and also makes dimension reduction and model selection a delicate but essential task. Many machine learning techniques incorporate and automate at least some aspects of this, and thus greatly support the analyst in "making sense of the data". This course is designed to give students a solid base of knowledge and skills to apply these methods competently.

    The topics of this course are:

    • Basic concepts of data analysis by means of machine learning: Prediction and inference; supervised and unsupervised learning; classification and regression; and other topics.
    • Supervised learning 1: The linear model

      • High-dimensional regression: Subset selection and shrinkage
      • Classification: Logistic regression

    • Model assessment: Accuracy and errors; cross-validation
    • Supervised learning 2: Regression trees, random forests, and gradient boosting
    • Unsupervised learning: Principal component analysis and clustering
    • Advanced topics: Alternative methods; Kaggle and Amazon EC2; and other topics  

  • Learning outcome

    This course forms an introduction to data analysis by means of computational methods and machine learning, with a particular emphasis on marketing applications. It is designed to provide both a sound methodological underpinning ("know what you do, and why") and a practical dimension that enables students to run their own analyses ("know how to do it").

    Knowledge: The students know and understand

    • The key concepts of machine learning for data analysis (such as supervised vs. unsupervised learning, classification vs. regression, etc.).
    • How the modern machine learning approaches covered in the course relate to "classical" statistics, and why dimensionality reduction and (partially) automated model selection are indispensable.
    • The various forms of accuracy and error metrics that need to be considered when assessing the quality and validity of a model.

    Skills: The students can

    • Apply a selection of machine learning methods to realistically sized data sets, using the R environment.
    • Interpret, present, and communicate the output of trained machine learning models.
    • Analyze the prediction quality using various error definitions and cross-validation.

    General Competence:

    • The students can judge the potential, and the limits, of currently available machine learning techniques, and their implementations, for data analysis.
    • When confronted with a managerial/marketing problem and a database, the students can assess the feasibility of addressing the problem using a data-driven approach.
    • The students can weigh the pros and cons of various machine learning approaches, given a problem and a database.

  • Teaching

    In-class lectures and discussions; lab sessions; assignment work with peer feedback and in-class student presentations.

  • Recommended prerequisites

    Basic competence in statistics/econometrics. Basic skills in computer programming, preferably using R or Python (with pandas).

  • Required prerequisites


  • Credit reduction due to overlap


  • Compulsory Activity

    Peer feedback on assignments (approved/not approved)

  • Assessment

    Three assignments and one final term paper, to be worked on in groups of 2-4 students. Overall grade: 50% assignments, 50% term paper; one grade per group.

    The assignments and final term paper must be written in English.

  • Grading Scale


  • Computer tools

    R, RStudio, Jupyter. While the implementations used in the lab sessions and in the main text are in R, Python with pandas is also accepted for the student group work.

  • Literature

    The main text for the course is: James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning. Springer Science & Business Media.

    Selected chapters and further reading: Hastie, T., Tibshirani, R., & Friedman, J. (2013). The Elements of Statistical Learning. New York, NY: Springer Science & Business Media.

    A list of scientific papers and optional readings will be distributed at the beginning of the semester.


ECTS Credits
Teaching language

Not offered.

Course responsible

Associate Professor Gregor Reich, Department of Strategy and Management