Exercises
Outline of the exercises. Because the exercises are new this year, it is hard to plan exactly, so this is likely to change a bit!
- Exercise 1: Getting started with Apache Spark and processing tweets with Spark
- Exercise 2: Streaming tweets with the Twitter API
- Exercise 3: Streaming tweets with Kafka and Spark
- Exercise 4:
  - Create a Spark cluster
  - Install HDFS and YARN on the cluster
  - Install Spark on the cluster
  - Install Kafka on the cluster (not completely finished)
- Exercise 5: Cloud management. We will automate upgrading and scaling of clusters using Terraform and Ansible.
I also hope to be able to do something with Docker, Docker Swarm and/or Kubernetes.
