Exercises: Difference between revisions

From info319
(20 intermediate revisions by the same user not shown)
Line 1:
Outline of the exercises. Because the exercises are new this year, it is hard to plan exactly, so this is likely to change a bit!
Earlier revision:
* Exercise 1: '''Getting started with Apache Spark.''' <!-- [[Apache Spark]] [[Spark Streaming Twitter]] [[Sentiment analysis using Spark Streaming]] -->
* Exercise 2: '''Streaming Spark.''' We will continue analysing tweets or other streaming types of information.
* Exercise 3: '''Cloud.''' We will run Apache Spark on a cluster of virtual machines in the OpenStack cloud.
* Exercise 4: '''TBA.''' <!-- Docker -->
* Exercise 5: '''TBA.''' <!-- Cassandra -->

Latest revision as of 13:56, 31 October 2022:
* Exercise 1: [[Getting started with Apache Spark]] and [[Processing tweets with Spark]].
* Exercise 2: [[Streaming tweets with Twitter API]]
* Exercise 3: [[Streaming tweets with Kafka and Spark]]
* Exercise 4:
** [[Create Spark cluster]]
** [[Install HDFS and YARN on the cluster]]
** [[Install Spark on the cluster]]
** [[Install Kafka on the cluster]]
* Exercise 5:
** [[Create Spark cluster using Terraform]]
** [[Configure Spark cluster using Ansible]]