
Big-Scale Analytics

  • Teacher(s): M. Vlachos
  • Course given in: English
  • ECTS Credits: 6 credits
  • Schedule: Spring Semester 2021-2022, 4.0h. course (weekly average)
  • Related programme: Master of Science (MSc) in Information Systems

Objectives

Keywords: Google Cloud Platform (GCP), AutoML, BigQuery, Elasticsearch, Internet-of-Things (Raspberry Pi, Arduino)

Many of the world's biggest discoveries and decisions in science, technology, business, medicine, politics, and society as a whole are now made by analyzing large data sets. This course provides a practical introduction to big data and to data clouds. We will use Python and cloud services from Google to perform data storage, search, and machine learning. In addition to using data clouds, we will experiment with actual Internet-of-Things devices, learning how to collect measurements and store them in the cloud. A large part of the evaluation will be a semester-long group project.

Contents

This class represents a continuation of the “Data Mining and Machine Learning” course of the last quarter. We will continue our exploration of how to use algorithms and tools for analyzing data and extracting insights.

We will explore Big Data frameworks, SQL, text analytics, NoSQL, search, association rules, Internet-of-Things, and other topics related to Big Data Analytics.

Each weekly 4h block will consist of a 2h lecture and a 2h lab session. In the lab sessions you will work on hands-on exercises in Python and use cloud-based services.

Some of the questions that we will answer in this class are:

  1. How to use cloud-based services.
  2. How to store and retrieve data from the cloud.
  3. How to perform automatic machine learning using cloud services.
  4. How to use entity resolution, recommender systems, IoT, graph analytics, neural networks, and other advanced data analytics concepts.

Goal: Understand the basic terminology of big data, comprehend the potential pitfalls, and get a general understanding of how to address real-world problems using Python code. You will also learn how to create and deploy APIs for the tools that you build, using Flask, Docker, and cloud-based services from Google, as well as SQL querying with BigQuery and text search with Elasticsearch. You will also gain hands-on experience with IoT devices and learn how to program them in Python.

References

The material for this course will consist of slides given by the instructor and articles from various sources that you will have to read and discuss.

Books (recommended reading but not required)

  • Mining of Massive Datasets (by J. Leskovec, A. Rajaraman, J. Ullman)
  • Data Mining (by Charu C. Aggarwal)

Pre-requisites

  • You should have attended and passed the MScIS “Data Mining and Machine Learning” course.
  • Knowledge of Python.
  • Knowledge of calculus and statistics.

Students from other departments who wish to attend should show proof of other Machine Learning classes they have attended, because many notions will be assumed to be known (contact the Professor to ask permission to attend).

Evaluation

First attempt

Exam:
Without exam (cf. terms)  
Evaluation:
  • Group project in multiple milestones during the semester (50%)
  • In-class quiz(es) (Moodle) (20%)
  • Personal programming assignment(s) (30%)

Retake

Exam:
Written, 1h30
Documentation:
Allowed with restrictions
Calculator:
Allowed with restrictions
Evaluation:

If a retake (rattrapage) is needed, the retake exam (on Moodle) will replace the in-class quizzes and personal assignments; the remaining 50% comes from your semester group project. If there are only a few participants, we may hold the retake as an oral exam.


