Readings
Books
We will use two textbooks:
- Rob Kitchin. The Data Revolution - Big Data, Open Data, Data Infrastructures & Their Consequences. Sage, 2014.
- At least chapters 1-5 and some later chapters are mandatory.
- Bill Chambers and Matei Zaharia. Spark: The Definitive Guide - Big Data Processing Made Simple. O'Reilly, 2018. File:Spark-TheDefinitiveGuide.pdf
- At least chapters 1-9 and some later chapters are mandatory.
Technical introductions
Selected web pages will become available here, including:
- Spark 3.3.0 Overview and Quick Start (with Python examples)
- NREC and OpenStack, the following sections/pages: Introduction, Project application, Logging in, The dashboard, Create a Linux virtual machine (skip: Windows), Using SSH, Working with Security Groups, Create and manage volumes, Create and manage snapshots (skip: images), Instance console
- Terraform and NREC part I, part II, and part III
- How Ansible Works and the Ansible Community portal
- Kafka Introduction
Additional non-mandatory materials will be made available to support the exercises further.
Papers
Selected papers will become available here, including:
- Gallofré, M., Opdahl, A. L., Stoppel, S., Tessem, B., & Veres, C. (2021). The News Angler Project: Exploring the Next Generation of Journalistic Knowledge Platforms. In Proceedings of Norsk IKT-konferanse for forskning og utdanning. Short Paper [Poster]
Supplementary:
- Section 1 in Opdahl, A. L., & Nunavath, V. (2020). Big Data. Big Data in Emergency Management: Exploitation Techniques for Social and Mobile Data, 15-29. Paper
- Opdahl, A. L., & Tessem, B. (2021). Ontologies for finding journalistic angles. Software and Systems Modeling, 20(1), 71-87. Paper
Lecture Slides
See the Session page for lecture slides.
Readings for each session and exercise
The Session page contains specific readings for each session.
