Exercises

From info319
Revision as of 06:49, 7 September 2022 by Sinoa

Outline of the exercises. Because the exercises are new this year, the plan is tentative and likely to change a bit.

  • Exercise 1: Getting started with Apache Spark. We will set up Spark and use it to process tweets.
  • Exercise 2: Streaming tweets. We will try simple Spark streaming and stream tweets with the Twitter API.
  • Exercise 3: Streaming Spark. We will continue to analyse tweets or other types of streaming data.
  • Exercise 4: Spark in the cloud. We will run Apache Spark on a cluster of virtual machines in the OpenStack cloud.
  • Exercise 5: Cloud management. We will automate upgrading and scaling of clusters using Terraform and Ansible.

I also hope to include something on Docker, Docker Swarm and/or Kubernetes.