Readings
From info319
Revision as of 10:19, 17 August 2022
Books
We will use two textbooks:
- Rob Kitchin. The Data Revolution - Big Data, Open Data, Data Infrastructures & Their Consequences. Sage, 2014.
  - At least chapters 1-5 and some later chapters are mandatory.
- Bill Chambers and Matei Zaharia. Spark: The Definitive Guide - Big Data Processing Made Simple. O'Reilly, 2018. File:Spark-TheDefinitiveGuide.pdf
  - At least chapters 1-9 and some later chapters are mandatory.
Technical introductions
Selected web pages will become available here, including:
- Spark 3.3.0 Overview and Quick Start (with Python examples)
- NREC and OpenStack, the following sections/pages: Introduction, Project application, Logging in, The dashboard, Create a Linux virtual machine (skip: Windows), Using SSH, Working with Security Groups, Create and manage volumes, Create and manage snapshots (skip: images), Instance console
- Terraform and NREC part I, part II, and part III
- How Ansible Works and the Ansible Community portal
- Kafka Introduction
Additional non-mandatory materials will be made available to support the exercises further.
Papers
Selected papers will become available here, including:
- Gallofré, M., Opdahl, A. L., Stoppel, S., Tessem, B., & Veres, C. (2021). The News Angler Project: Exploring the Next Generation of Journalistic Knowledge Platforms. In Proceedings of Norsk IKT-konferanse for forskning og utdanning. Short Paper [Poster]
Supplementary:
- Section 1 in Opdahl, A. L., & Nunavath, V. (2020). Big Data. Big Data in Emergency Management: Exploitation Techniques for Social and Mobile Data, 15-29. Paper
- Opdahl, A. L., & Tessem, B. (2021). Ontologies for finding journalistic angles. Software and Systems Modeling, 20(1), 71-87. Paper
Lecture Slides
See the Session page for lecture slides.
Readings for each session and exercise
The Session page contains specific readings for each session.
