
Big-Scale Analytics

  • Teacher(s):   M.Vlachos  
  • Course given in: English
  • ECTS Credits: 6 credits
  • Schedule: Spring Semester 2020-2021, 4.0h. course (weekly average)
  • Course website
  • Related programme: Master of Science (MSc) in Information Systems




Many of the world's biggest discoveries and decisions in science, technology, business, medicine, politics, and society as a whole are now based on the analysis of large data sets. This course provides a practical introduction to big data and its analysis techniques, including databases, data mining, machine learning, text analytics, and data visualization. In class we will mostly use Python and cloud services from Google and Microsoft.

A large part of the evaluation is a semester-long group project in which you will collect data, annotate it, build a predictive model, and deploy your algorithm.


This class continues the “Data Mining and Machine Learning” course given last quarter. We will keep exploring how to use algorithms and tools to analyze data and extract insights.

We will explore Big Data frameworks, SQL, NoSQL, text analytics, visualization, search, association rules, and other topics related to Big Data analytics.

Each weekly 4-hour block consists of a 2-hour lecture and a 2-hour lab session. In the lab sessions you will work through hands-on exercises in Python and use cloud-based services.

Some of the questions that we will answer in this class are:

  1. How do we use cloud-based services?
  2. How do we query large amounts of text data?
  3. What are some of the popular algorithms and techniques for processing data?
  4. How can we use big data to extract insights for our business?

Goal: Understand the basic terminology of big data, recognize the potential pitfalls, and get a general understanding of how to address real-world problems using Python code. You will also learn how to create and deploy APIs for the tools that you build, using Flask, Docker, and cloud-based services from Google and Microsoft, as well as SQL querying with BigQuery and text search with Elasticsearch.


The material for this course consists of slides provided by the instructor and articles from various sources that you will have to read and discuss.

Books (recommended reading but not required)

  • Mining of Massive Datasets (by J. Leskovec, A. Rajaraman, J. Ullman)
  • Data Mining: The Textbook (by Charu C. Aggarwal)


  • You should have attended and passed the MScIS “Data Mining and Machine Learning” course.
  • Knowledge of Python.
  • Knowledge of calculus and statistics.
Students from other departments who wish to attend should show proof of other Machine Learning classes they have attended, because many notions will be assumed to be known (contact the Professor to ask for permission to attend).


First attempt

Without exam (cf. terms)  
  • Group project in multiple milestones during the semester (50%)
  • In-class quiz(es) (Moodle) (20%)
  • Personal programming assignment(s) (30%)


Oral exam, 20 minutes

If a retake (rattrapage) is needed, the retake exam (on Moodle) will replace the in-class quizzes and personal assignments; the remaining 50% comes from your semester group project. If there are only a few participants, the retake may be held as an oral exam (20 minutes).

