Readings: Difference between revisions

From info319
* Spark 3.3.0 [https://spark.apache.org/docs/latest/index.html Overview] and [https://spark.apache.org/docs/latest/quick-start.html Quick Start (with Python examples)]
<!-- * [https://docs.nrec.no/intro.html NREC Introduction - The Norwegian Research and Education Cloud] -->
* OpenStack
* [https://docs.nrec.no/index.html NREC and OpenStack], the following sections/pages: Introduction, Project application, Logging in, The dashboard, Create a Linux virtual machine (skip: Windows), Using SSH, Working with Security Groups, Create and manage volumes, Create and manage snapshots (skip: images), Instance console
* Terraform
* [https://docs.nrec.no/terraform-part1.html Terraform and NREC part I], [https://docs.nrec.no/terraform-part2.html part II], and [https://docs.nrec.no/terraform-part3.html part III]
* Ansible
* [https://www.ansible.com/overview/how-ansible-works How Ansible Works] and [https://docs.ansible.com/ansible_community.html the Ansible Community portal]
* [https://kafka.apache.org/intro Kafka Introduction]


Additional non-mandatory materials will be made available to support the exercises further.


<!-- Twitter, tweepy. GDELT -->

Revision as of 10:19, 17 August 2022

Books

We will use two textbooks:

  • Rob Kitchin. The Data Revolution - Big Data, Open Data, Data Infrastructures & Their Consequences. Sage, 2014.
    • At least chapters 1-5 and some later chapters are mandatory.
  • Bill Chambers and Matei Zaharia. Spark: The Definitive Guide - Big Data Processing Made Simple. O'Reilly, 2018. File:Spark-TheDefinitiveGuide.pdf
    • At least chapters 1-9 and some later chapters are mandatory.


Technical introductions

Selected web pages will become available here, including:

Additional non-mandatory materials will be made available to support the exercises further.


Papers

Selected papers will become available here, including:

  • Gallofré, M., Opdahl, A. L., Stoppel, S., Tessem, B., & Veres, C. (2021). The News Angler Project: Exploring the Next Generation of Journalistic Knowledge Platforms. In Proceedings of Norsk IKT-konferanse for forskning og utdanning. Short Paper [Poster]

Supplementary:

  • Section 1 in Opdahl, A. L., & Nunavath, V. (2020). Big Data. Big Data in Emergency Management: Exploitation Techniques for Social and Mobile Data, 15-29. Paper
  • Opdahl, A. L., & Tessem, B. (2021). Ontologies for finding journalistic angles. Software and Systems Modeling, 20(1), 71-87. Paper


Lecture Slides

See the Session page for lecture slides.

Readings for each session and exercise

The Session page contains specific readings for each session.