

Data Mining and Machine Learning

  • Teacher(s):  
  • Course given in: English
  • ECTS Credits:
  • Schedule: Autumn Semester 2021-2022, 4.0h. course (weekly average)
  •  sessions
  • Course website
  • Related programme: Master of Science (MSc) in Information Systems
  • Permalink:




Today, enterprises collect troves of data about their clients: historical purchases, responses to marketing events, web search logs, etc. In a data-driven economy, this data can help us understand our customers better and make more informed business decisions. Some of the questions that we will answer in this class are:

  1. How do we represent different types of data?
  2. How do we perform exploratory data analysis and how can we effectively visualize data?
  3. How do we extract useful and actionable information from this data to add value to our business?

Goal: Understand the basic terminology of data science and machine learning (regression, classification, visualization, clustering, text analytics), recognize the potential pitfalls, and gain a general understanding of how to address real-world problems using Python code.


Topics that we will cover in the course include:

  • Introduction: Data Mining and Machine Learning, Concepts and Terminology. Applications: Targeted Marketing and Customer Modeling
  • Data Preparation and Cleaning for Knowledge Discovery
  • Exploratory Data Analysis
  • Data Visualization
  • Predicting numerical values with Linear Regression
  • Predicting categorical values with Classification: Decision Trees, Nearest Neighbor Classification, Logistic Regression
  • Evaluation of a predictive model
  • Feature Engineering and Dimensionality Reduction
  • Unsupervised Learning (Clustering: k-Means, Hierarchical Clustering)
  • Text Analytics


These are recommended but not required textbooks.

- Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking, Foster Provost, Tom Fawcett, ISBN-13: 978-1449361327

- Data Mining: Practical Machine Learning Tools and Techniques, Ian H. Witten, Eibe Frank, Second Edition, 2005, ISBN: 0-12-088407-0


- Good knowledge of Python and of object-oriented programming (OOP) in Python


First attempt

Without exam (cf. terms)  

Your grade will be based on work that you do during the semester and depends on the following components:

  • Personal Work (Quizzes + Assignments): 70%
  • Group project in Python: 30%

For this course, class participation is important. You are expected to share your thoughts, help your colleagues and participate in the discussions in class and on the class forums.


Written, 2 hours

The retake exam (on Moodle) replaces the personal work and class participation and counts for 70% of the class grade; the remaining 30% comes from your semester group project. If there are only a few participants, the retake may be held as an oral exam (20 minutes) instead, with the same 70%/30% weighting.

