Sessions
Revision as of 16:41, 16 August 2022
Tentative themes for each session
- Thursday August 18th: Introduction meeting
- Thursday September 1st: Session 1 - Introduction to big data. Big-data processing. Spark
- Thursday September 15th: Session 2 - More about Spark
- Thursday September 29th: Session 3 - Streaming Spark. Kafka
- Thursday October 13th: Session 4 - Big data architecture. Cloud, NREC and Openstack
- Thursday October 27th: Session 5 - Cloud management. Terraform and Ansible. Docker
- Thursday November 10th: Session 6 - Societal issues. Privacy. GDPR
- Thursday November 24th: Session 7 - Essay presentations
- Thursday December 8th: Session 8 - Project demonstrations
Session 1 - Introduction to big data. Big-data processing. Spark
- Kitchin, chapters 1, 4-5
- Chambers & Zaharia, chapters 1-3, 12, 15
Supplementary:
- Section 1 in Opdahl, A. L., & Nunavath, V. (2020). Big Data. Big Data in Emergency Management: Exploitation Techniques for Social and Mobile Data, 15-29. Paper
- Spark 3.3.0 Overview and Quick Start (with Python examples)
Session 2 - More about Spark
- Chambers & Zaharia, chapters 4-9
Supplementary:
- Twitter API, tweepy
Session 3 - Streaming Spark. Kafka
- Chambers & Zaharia, chapters 20-21
- Kafka Introduction (https://kafka.apache.org/intro)
Session 4 - Big data architecture. Cloud, NREC and Openstack
- NREC. OpenStack
- Gallofré, M., Opdahl, A. L., Stoppel, S., Tessem, B., & Veres, C. (2021). The News Angler Project: Exploring the Next Generation of Journalistic Knowledge Platforms. In Proceedings of Norsk IKT-konferanse for forskning og utdanning. Short Paper [Poster]
Supplementary:
- Opdahl, A. L., & Tessem, B. (2021). Ontologies for finding journalistic angles. Software and Systems Modeling, 20(1), 71-87. Paper
Session 5 - Cloud management. Terraform and Ansible. Docker
- Terraform, Ansible
- Docker
Session 6 - Societal issues. Privacy. GDPR
- Kitchin, chapters 12-19
- What is GDPR, the EU’s new data protection law? (https://gdpr.eu/what-is-gdpr/)