Exercises: Difference between revisions

Revision as of 09:45, 17 September 2022

Outline of the exercises. Because the exercises are new this year, they are hard to plan exactly, so this outline is likely to change a bit!

Exercise 1: Getting started with Apache Spark and Processing tweets with Spark.
Exercise 2: Streaming tweets with Twitter API
Exercise 3: Streaming tweets with Kafka and Spark
Exercise 4: Spark in the cloud. We will run Apache Spark on a cluster of virtual machines in the OpenStack cloud.
Exercise 5: Cloud management. We will automate upgrading and scaling of clusters using Terraform and Ansible.

I also hope to be able to do something with Docker, Docker Swarm and/or Kubernetes.

@@ Line 2: / Line 2: @@
 * Exercise 1: [[Getting started with Apache Spark]] and [[Processing tweets with Spark]].
 * Exercise 2: [[Streaming tweets with Twitter API]]
-* Exercise 3: '''Streaming Spark.''' We will continue to analyse tweets or other streaming types of information. <!-- Kafka -->
+* Exercise 3: [[Streaming tweets with Kafka and Spark]]
 * Exercise 4: '''Spark in the cloud.''' We will run Apache Spark on a cluster of virtual machines in the OpenStack cloud. <!-- Docker -->
 * Exercise 5: '''Cloud management.''' We will automate upgrading and scaling of clusters using Terraform and Ansible.
 I also hope to be able to do something with Docker, Docker Swarm and/or Kubernetes.