Teaching – Vincent Leroy

M2 Research
2017-2018

Introduction to Map-Reduce
In-memory processing with Spark
Stream processing
Spark lab session (Tweet analysis) and code/data files
Spark assignment (Flickr analysis), deadline: 15th of December
NoSQL databases
Neo4j lab session with additional dataset
Recommender systems
Pattern mining
Recommender systems lab session, code & data

Previous years

Data management in large scale distributed systems: Introduction
Distributed DBMS Architecture
Distributed Database design: fragmentation & allocation
Distributed Query Evaluation
Background: Transactions
Distributed Transactions
Replication
Introduction to MapReduce
In-Memory Processing with Spark
MapReduce practical work: subject and Eclipse project (rename to .zip)
NoSQL databases
Recommender systems 1 2 3 4
Clustering 1 2 3
Frequent Itemset Mining
Spark practical work and skeleton of Spark project (rename it to .zip to decompress)

Mastère Spécialisé Big Data

Introduction to Map-Reduce
Hadoop Map-Reduce lab subject and project skeleton
Introduction to Spark
Streaming
Spark lab session (streaming) and project skeleton
Recommender systems: subject, code and data

./fileProducer.sh ../1Mtweets_en.txt .1 | bin/kafka-console-producer.sh --broker-list localhost:9092 --topic batch_tweets
M2 MIAGE / M2 PGI
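fileProducer.sh:
#!/bin/sh
# Usage: ./fileProducer.sh FILE DELAY
# Replays FILE line by line on stdout, forever, pausing DELAY seconds between lines.
while true
do
  while IFS= read -r p; do
    printf '%s\n' "$p"
    sleep "$2"
  done < "$1"
done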
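
Before piping into Kafka, the replay loop can be checked on its own. The following is a minimal sketch, not part of the course material: the two-line sample file and the replay function are hypothetical stand-ins for the tweet file and the script above, and head is used to stop the otherwise endless loop.

```shell
#!/bin/sh
# Dry run of the replay loop without Kafka.
# The sample file and the replay function are illustrative, not course files.
sample=$(mktemp)
printf 'tweet one\ntweet two\n' > "$sample"

replay() {
  # $1 = file to replay, $2 = delay in seconds between lines.
  while true; do
    while IFS= read -r line; do
      # Unlike the original script, stop as soon as the reader goes away.
      printf '%s\n' "$line" || return 1
      sleep "$2"
    done < "$1"
  done
}

# Keep only the first 5 lines; the file is replayed from the top once exhausted.
out=$(replay "$sample" 0 | head -n 5)
printf '%s\n' "$out"
rm -f "$sample"
```

Once the output looks right, the same stream can be piped into the console producer as in the command shown earlier.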

Ensimag ISI

Introduction to Map-Reduce
Hadoop Map-Reduce lab subject, project skeleton, and 1M sample
In-memory processing with Spark
Stream processing
Spark lab session (Tweet analysis) and code/data. If possible, complete the “getting started” part before the lab session to save time.
