
Data Mining and Machine Learning

  • Instructor(s): M. Vlachos
  • French title: Data mining et méthodes d'apprentissage
  • Taught in: English
  • ECTS credits: 6
  • Schedule: Autumn semester 2020-2021, 4.0 hours of lectures (weekly average)
  • Programme: Master of Science (MSc) in Information Systems



Enterprises today collect troves of data about their clients: historical purchases, responses to marketing events, web search logs, and so on. In a data-driven economy, this data helps us understand our customers better and make more informed business decisions. Some of the questions that we will answer in this class are:

  1. How do we represent different types of data?
  2. How do we perform exploratory data analysis and how can we effectively visualize data?
  3. How do we extract useful and actionable information from this data to add value to our business?

Goal: Understand the basic terminology of data science and machine learning (regression, classification, visualization, clustering, text analytics, recommender systems), comprehend the potential pitfalls, and gain a general understanding of how to address real-world problems using Python code.


Topics that we will cover in the course include:

  • Introduction: Data Mining and Machine Learning, Concepts and Terminology. Applications: Targeted Marketing, and Customer Modeling
  • Data Preparation and cleaning for knowledge discovery
  • Exploratory Data Analysis
  • Data Visualization
  • Predicting numerical values with Linear Regression
  • Predicting categorical values (Classification): Decision Trees, Nearest Neighbor Classification, Logistic Regression
  • Evaluation of a predictive model
  • Feature Engineering and Dimensionality reduction
  • Unsupervised Learning (Clustering: k-means, Hierarchical Clustering)
  • Text Analytics
  • Recommender Systems
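
To give a flavour of two of the topics above (Nearest Neighbor Classification and the evaluation of a predictive model), here is a minimal sketch in plain Python. It is not taken from the course material: the customer data, the feature choice, the labels "occasional" and "loyal", and the function names `knn_predict` and `accuracy` are all illustrative assumptions.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest training points."""
    neighbours = sorted(
        zip(train_X, train_y),
        key=lambda pair: euclidean(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up toy data: (visits per month, average basket value) -> customer type.
train_X = [(1, 10), (2, 12), (3, 9), (10, 80), (12, 95), (11, 70)]
train_y = ["occasional", "occasional", "occasional", "loyal", "loyal", "loyal"]

# Held-out examples used only for evaluation, never for prediction votes.
test_X = [(2, 11), (11, 85), (4, 20)]
test_y = ["occasional", "loyal", "occasional"]

predictions = [knn_predict(train_X, train_y, x, k=3) for x in test_X]
print(predictions)
print(accuracy(test_y, predictions))
```

In practice the features would usually be rescaled first, since a feature with a large numeric range otherwise dominates the distance, and a library implementation would typically replace the hand-written functions.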


These are recommended but not required textbooks.

- Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking, Foster Provost, Tom Fawcett, ISBN-13: 978-1449361327

- Data Mining: Practical Machine Learning Tools and Techniques, Ian H. Witten, Eibe Frank, Second Edition, 2005, ISBN: 0-12-088407-0


Prerequisite: good knowledge of Python and of object-oriented programming (OOP) in Python


First attempt

No exam (see the terms below)

Your grade will be based on work that you do during the semester and depends on the following components:

  • Personal Work (Quizzes + Assignments): 60%
  • Group project in Python: 30%
  • Class participation: 10%

For this course, class participation is important. You are expected to share your thoughts, help your colleagues and participate in the discussions in class and on the class forums.


Written exam, 2 hours

The retake exam (on Moodle) replaces the personal work and class participation components and counts for 70% of the class grade; the remaining 30% comes from your semester group project.

If there are only a few participants, the retake will be held as an oral exam (20 minutes).
