BAN436 Data analysis in Python
Python has in recent years become one of the most popular programming languages, with many applications in both business and scientific research. Unlike many other programming languages used in scientific research, Python was not developed specifically for statistical analysis. Instead, it is a general-purpose programming language.
This one-week intensive seminar will focus on getting you started with using Python for data analysis. Data analysis is an important task for both businesses and researchers. However, the data that we need to analyze is often organized in a way that is unsuitable for analysis. The course will focus on how to use Python to convert raw data into tidy data sets that we can use for data analysis. You will also learn how to use Python to summarize and communicate the information in tidy data sets through basic data analysis and visualization.
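To give a sense of what converting raw data into a tidy data set can look like, here is a minimal sketch using the pandas package. The firm names, years and revenue figures are made up for illustration and are not taken from the course material. The raw table has one column per year, and the tidy version has one row per firm-year observation.

```python
import pandas as pd

# Hypothetical raw data: one row per firm, one column per year (wide format).
raw = pd.DataFrame({
    "firm": ["A", "B"],
    "2019": [10.0, 7.5],
    "2020": [12.0, 8.0],
})

# Reshape to tidy (long) format: each row is one firm-year observation.
tidy = raw.melt(id_vars="firm", var_name="year", value_name="revenue")

# The year labels were column names (strings), so convert them to integers.
tidy["year"] = tidy["year"].astype(int)

# Sort for readability and reset the row index.
tidy = tidy.sort_values(["firm", "year"]).reset_index(drop=True)
print(tidy)
```

In the tidy version, every variable (firm, year, revenue) has its own column, which is the layout most pandas and plotting functions expect.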
The course will start with a general introduction to Python. The rest of the course will focus on how to use Python for cleaning and analyzing data. The course is intended both for students without any prior knowledge of Python and for students with some prior knowledge who wish to learn how to use Python for data analysis.
The course consists of three modules:
- Getting started - introduction to Python and Jupyter Notebook
- Importing and cleaning data
- Analyzing and visualizing data
After successful completion of the course, you will be able to perform data analysis in Python. The course will also give you the foundation you need to continue learning Python and to use it to solve a wide variety of problems encountered in your academic and professional life.
During the course, students will learn how to do basic programming in Python, and how to tidy and analyze data in Python. After successful completion of the course, students will have:
the practical skills to:
- write, modify and execute Python code in Jupyter Notebook.
- distinguish between the different data types and structures in Python (e.g. list, dictionary, array, data frame).
- write functions and loops.
- load, manipulate and save data using the pandas package.
- perform simple data analysis (e.g. descriptive statistics, correlation analysis).
- visualize data using the matplotlib package.
and the general knowledge to:
- identify the appropriate format of data sets with regard to data analysis (i.e. tidy data).
- conduct reproducible research using Jupyter Notebook.
- search package documentation and online sources for help with coding.
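As a rough illustration of how several of these skills fit together, the sketch below writes a small function, uses a loop, builds a data frame with pandas, computes descriptive statistics and a correlation, and draws a scatter plot with matplotlib. The data, the column names and the `to_celsius` helper are all hypothetical and only meant to show the general workflow, not the actual course exercises.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this also runs outside Jupyter
import matplotlib.pyplot as plt
import pandas as pd


def to_celsius(fahrenheit):
    """Convert a temperature from Fahrenheit to Celsius."""
    return (fahrenheit - 32) * 5 / 9


# Made-up data stored in a dictionary of lists.
records = {"temp_f": [50.0, 68.0, 86.0, 104.0], "sales": [12, 20, 31, 39]}
df = pd.DataFrame(records)

# Loop over a list and apply the function to each element.
celsius = []
for value in records["temp_f"]:
    celsius.append(to_celsius(value))
df["temp_c"] = celsius

# Descriptive statistics and a simple correlation analysis.
summary = df.describe()
corr = df["temp_c"].corr(df["sales"])
print(summary)
print("Correlation between temperature and sales:", round(corr, 3))

# Scatter plot of the two variables.
fig, ax = plt.subplots()
ax.scatter(df["temp_c"], df["sales"])
ax.set_xlabel("Temperature (Celsius)")
ax.set_ylabel("Sales")
ax.set_title("Sales by temperature (made-up data)")
```

In Jupyter Notebook the `matplotlib.use("Agg")` line can be dropped, since figures are then displayed inline below the cell.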
This is a one-week intensive course that consists of daily lectures and in-class assignments to be solved in groups.
It will be possible to follow the course digitally.
The course introduces the students to Python, and therefore it requires no previous knowledge of Python or programming.
However, basic statistical knowledge as provided by MET2 is helpful.
Credit reduction due to overlap
Requirements for course approval
Group term paper (2-4 students in each group) in which the group will demonstrate the tools and concepts learned in the course by tidying and analyzing a data set of their choosing.
The group assignment will be given at the end of the course, and the students will have two weeks to complete the assignment.
Python, Jupyter Notebook. I recommend downloading the Anaconda distribution for Python. More details regarding the required software will be provided at the beginning of the course.
To be announced in Canvas.
- ECTS Credits
- Teaching language
Spring. Offered Spring 2021, last week of the semester (first time).
Assistant Professor Isabel Hovdahl, Department of Business and Management Science