Readings
Books
Textbooks:
- Rob Kitchin. The Data Revolution - Big Data, Open Data, Data Infrastructures & Their Consequences. Sage, 2014.
- chapters 1, 3-5, 12-19 are mandatory (I will make some of the later chapters supplementary, perhaps 14, 18, and 19 - TBA)
- Bill Chambers and Matei Zaharia. Spark: The Definitive Guide - Big Data Processing Made Simple. O'Reilly, 2018. File:Spark-TheDefinitiveGuide.pdf
- chapters 1-9, 12, 15, 20-21 are mandatory (chapter 10 on SQL is also highly relevant)
Papers
Selected papers will become available here, including:
- Section 1 in Opdahl, A. L., & Nunavath, V. (2020). Big Data. Big Data in Emergency Management: Exploitation Techniques for Social and Mobile Data, 15-29. Book chapter
- Paper on big-data architecture (TBA)
- Gallofré, M., Opdahl, A. L., Stoppel, S., Tessem, B., & Veres, C. (2021). The News Angler Project: Exploring the Next Generation of Journalistic Knowledge Platforms. In Proceedings of Norsk IKT-konferanse for forskning og utdanning. Short paper and poster: File:A1-Poster-NIKT2021.pdf
Supplementary:
- Opdahl, A. L., & Tessem, B. (2021). Ontologies for finding journalistic angles. Software and Systems Modeling, 20(1), 71-87. Paper
- Berven, A., Christensen, O. A., Moldeklev, S., Opdahl, A. L., & Villanger, K. J. (2020). A knowledge-graph platform for newsrooms. Computers in Industry, 123, 103321. Paper
Technical introductions
Selected web pages will become available here, including:
- Spark 3.3.0 Overview and Quick Start (with Python examples)
- Twitter API v2
- Tweepy: Twitter for Python
- Tweepy Documentation
- NREC and OpenStack, the following sections/pages: Introduction, Project application, Logging in, The dashboard, Create a Linux virtual machine (skip: Windows), Using SSH, Working with Security Groups, Create and manage volumes, Create and manage snapshots (skip: images), Instance console
- Terraform and NREC part I, part II, and part III
- How Ansible Works and the Ansible Community portal
- Kafka Introduction
Additional non-mandatory materials will be made available to support the exercises further.
Lecture slides
See the Sessions page for lecture slides after each session.
Readings for each session
The Sessions page will suggest specific readings for each session and its associated exercise.
