Big-Scale Analytics

  • Instructor(s): M. Vlachos
  • French title: Analyse de la balance Web
  • Language of instruction: English
  • ECTS credits: 6
  • Schedule: Spring semester 2020-2021, 4.0 hours of class per week (weekly average)
  • Programme: Master of Science (MSc) in Information Systems

Objectives

Many of the world's biggest discoveries and decisions in science, technology, business, medicine, politics, and society as a whole are now made by analyzing large data sets. This course provides a practical introduction to big data and to data analysis techniques, including databases, data mining, machine learning, text analytics, and data visualization. During the class we will mostly use Python and cloud services from Google and Microsoft.

A large part of the evaluation will be a semester-long group project in which you will collect data, annotate it, build a predictive model, and deploy your algorithm.

Content

This class is a continuation of the “Data Mining and Machine Learning” course from the previous term. We will continue our exploration of how to use algorithms and tools for analyzing data and extracting insights.

We will explore Big Data frameworks, SQL, text analytics, NoSQL, visualization, search, association rules, and other topics related to Big Data analytics.

Each weekly 4h block will consist of a 2h lecture and a 2h lab session. In the lab sessions you will work on hands-on exercises in Python and will also be exposed to cloud-based services.

Some of the questions that we will answer in this class are:

  1. How can we use cloud-based services?
  2. How can we query large amounts of text data?
  3. What are some of the popular algorithms and techniques for processing data?
  4. How can we use big data to extract insights for our business?

Goal: Understand the basic terminology of big data, recognize the potential pitfalls, and gain a general understanding of how to address real-world problems using Python code. You will also learn how to create and deploy APIs for the tools you build, using Flask, Docker, and cloud-based services from Google and Microsoft, as well as SQL querying with BigQuery and text search with Elasticsearch.

References

The material for this course will consist of slides provided by the instructor and articles from various sources that you will be expected to read and discuss.

Books (recommended reading but not required)

  • Mining of Massive Datasets (by J. Leskovec, A. Rajaraman, J. Ullman)
  • Data Mining: The Textbook (by Charu C. Aggarwal)

Prerequisites

  • You should have attended and passed the MScIS “Data Mining and Machine Learning” course.
  • Knowledge of Python.
  • Knowledge of calculus and statistics.

Students from other departments who wish to attend should show proof of other Machine Learning classes they have attended, because many notions will be assumed to be known (contact the professor to ask for permission to attend).

Evaluation

First attempt

Exam:
No exam (see the evaluation terms below)
Evaluation:
  • Group project in multiple milestones during the semester (50%)
  • In-class quiz(es) (Moodle) (20%)
  • Personal programming assignment(s) (30%)

Retake

Exam:
Oral, 20 minutes
Documentation:
Allowed
Calculator:
Allowed
Evaluation:

If a retake (rattrapage) is needed, the retake exam (on Moodle) will replace the in-class quizzes and personal assignments; the remaining 50% comes from your semester group project. If there are only a few participants, the retake may be held as an oral exam (20 minutes).
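
The weighting described in this Evaluation section can be made concrete with a short calculation. The following is only a minimal sketch, not an official grading tool: the function name `final_grade`, the example scores, and the assumption that all components are marked on one common scale are illustrative. It also assumes that the retake exam carries the combined weight of the quizzes (20%) and the personal assignments (30%), which is one reading of the retake rule stated here.

```python
from typing import Optional

# First-attempt weights as listed in this syllabus.
WEIGHTS = {"project": 0.50, "quizzes": 0.20, "assignments": 0.30}


def final_grade(project: float, quizzes: float, assignments: float,
                retake: Optional[float] = None) -> float:
    """Weighted course grade (hypothetical helper).

    All inputs are assumed to share one grading scale. If a retake score
    is given, it is assumed to replace the quizzes and the personal
    assignments (together 50%); the group project keeps its 50% weight.
    """
    if retake is None:
        total = (WEIGHTS["project"] * project
                 + WEIGHTS["quizzes"] * quizzes
                 + WEIGHTS["assignments"] * assignments)
    else:
        retake_weight = WEIGHTS["quizzes"] + WEIGHTS["assignments"]
        total = WEIGHTS["project"] * project + retake_weight * retake
    return round(total, 2)


# Hypothetical scores, for illustration only.
print(final_grade(project=5.0, quizzes=3.0, assignments=3.5))              # 4.15
print(final_grade(project=5.0, quizzes=3.0, assignments=3.5, retake=4.5))  # 4.75
```

In this sketch the group project score is carried over unchanged into the retake, so a retake can only change the half of the grade that came from the quizzes and assignments.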


