Readings: Difference between revisions
From info319
| (14 intermediate revisions by the same user not shown) | |||
| Line 1: | Line 1: | ||
== Books == | == Books == | ||
Text books: | Text books: | ||
* Rob Kitchin. ''The Data Revolution - Big Data, Open Data | * Rob Kitchin. ''The Data Revolution - A Critical Analysis of Big Data, Open Data and Data Infrastructures'', 2nd Edition. Sage, 2021. | ||
** chapters 1, 3-5, | ** chapters 1, 3-5, 13-14, 17-19 are mandatory (12 and 15-16 are supplementary) | ||
* Bill Chambers and Matei Zaharia: '' | |||
* Bill Chambers and Matei Zaharia: ''Spark: The Definitive Guide - Big Data Processing Made Simple''. O'Reilly, 2018. [[File:Spark-TheDefinitiveGuide.pdf]] | |||
** chapters 1-9, 12, 15, 20-21 are mandatory (chapter 10 on SQL is also highly relevant) | ** chapters 1-9, 12, 15, 20-21 are mandatory (chapter 10 on SQL is also highly relevant) | ||
** [https://github.com/databricks/Spark-The-Definitive-Guide GitHub repository with code and data examples] | |||
== Papers == | == Papers == | ||
Selected papers will become available here, including: | Selected papers will become available here, including: | ||
* [https://arxiv.org/pdf/2012.09109 Section 1] in Opdahl, A. L., & Nunavath, V. (2020). Big Data. Big Data in Emergency Management: Exploitation Techniques for Social and Mobile Data, 15-29. Book chapter | * [https://arxiv.org/pdf/2012.09109 Section 1] in Opdahl, A. L., & Nunavath, V. (2020). Big Data. Big Data in Emergency Management: Exploitation Techniques for Social and Mobile Data, 15-29. Book chapter | ||
* Gallofré, M., Opdahl, A. L., Stoppel, S., Tessem, B., & Veres, C. (2021). The News Angler Project: Exploring the Next Generation of Journalistic Knowledge Platforms. In Proceedings of Norsk IKT-konferanse for forskning og utdanning. [https://ojs.bibsys.no/index.php/NIK/article/view/939/792 Short Paper] and poster: [[File:A1-Poster-NIKT2021.pdf]] | * Gallofré, M., Opdahl, A. L., Stoppel, S., Tessem, B., & Veres, C. (2021). The News Angler Project: Exploring the Next Generation of Journalistic Knowledge Platforms. In Proceedings of Norsk IKT-konferanse for forskning og utdanning. [https://ojs.bibsys.no/index.php/NIK/article/view/939/792 Short Paper] and poster: [[File:A1-Poster-NIKT2021.pdf]] | ||
<!-- Architecture stuff: | <!-- Architecture stuff: | ||
| Line 17: | Line 18: | ||
* Sigma: Cassavia, N., & Masciari, E. (2021, March). Sigma: a scalable high performance big data architecture. In 2021 29th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP) (pp. 236-239). IEEE. [https://bibsys-almaprimo.hosted.exlibrisgroup.com/primo-explore/openurl?sid=google&auinit=N&aulast=Cassavia&atitle=Sigma:%20a%20scalable%20high%20performance%20big%20data%20architecture&id=doi:10.1109%2FPDP52278.2021.00044&vid=UBB&institution=UBB&url_ctx_val=&url_ctx_fmt=null&isSerivcesPage=true Paper] | * Sigma: Cassavia, N., & Masciari, E. (2021, March). Sigma: a scalable high performance big data architecture. In 2021 29th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP) (pp. 236-239). IEEE. [https://bibsys-almaprimo.hosted.exlibrisgroup.com/primo-explore/openurl?sid=google&auinit=N&aulast=Cassavia&atitle=Sigma:%20a%20scalable%20high%20performance%20big%20data%20architecture&id=doi:10.1109%2FPDP52278.2021.00044&vid=UBB&institution=UBB&url_ctx_val=&url_ctx_fmt=null&isSerivcesPage=true Paper] | ||
* Maamouri, A., Sfaxi, L., & Robbana, R. (2021, December). Phi: A Generic Microservices-Based Big Data Architecture. In European, Mediterranean, and Middle Eastern Conference on Information Systems (pp. 3-16). Springer, Cham. [https://link.springer.com/chapter/10.1007/978-3-030-95947-0_1 Paper] | * Maamouri, A., Sfaxi, L., & Robbana, R. (2021, December). Phi: A Generic Microservices-Based Big Data Architecture. In European, Mediterranean, and Middle Eastern Conference on Information Systems (pp. 3-16). Springer, Cham. [https://link.springer.com/chapter/10.1007/978-3-030-95947-0_1 Paper] | ||
Marc: | |||
You found the other Phi architecture. 😃 The one I meant was: https://ieeexplore.ieee.org/abstract/document/8712381 But both have interesting contributions. The one you found considers the training part which it is not instantiated in the others. | |||
This is the "original publication" of Lambda: http://nathanmarz.com/blog/how-to-beat-the-cap-theorem.html , it is a blog entry. | |||
--> | --> | ||
| Line 30: | Line 36: | ||
* Opdahl, A. L., & Tessem, B. (2021). Ontologies for finding journalistic angles. Software and Systems Modeling, 20(1), 71-87. [https://scholar.google.com/scholar?output=instlink&q=info:pKELE6iBzpAJ:scholar.google.com/&hl=en&as_sdt=0,5&as_ylo=2021&scillfp=4299025271368542631&oi=lle Paper] | * Opdahl, A. L., & Tessem, B. (2021). Ontologies for finding journalistic angles. Software and Systems Modeling, 20(1), 71-87. [https://scholar.google.com/scholar?output=instlink&q=info:pKELE6iBzpAJ:scholar.google.com/&hl=en&as_sdt=0,5&as_ylo=2021&scillfp=4299025271368542631&oi=lle Paper] | ||
* Berven, A., Christensen, O. A., Moldeklev, S., Opdahl, A. L., & Villanger, K. J. (2020). A knowledge-graph platform for newsrooms. Computers in Industry, 123, 103321. [https://scholar.google.com/scholar?output=instlink&q=info:0K5dB1_9nusJ:scholar.google.com/&hl=en&as_sdt=0,5&as_ylo=2018&scillfp=11776208952974186557&oi=lle Paper] | * Berven, A., Christensen, O. A., Moldeklev, S., Opdahl, A. L., & Villanger, K. J. (2020). A knowledge-graph platform for newsrooms. Computers in Industry, 123, 103321. [https://scholar.google.com/scholar?output=instlink&q=info:0K5dB1_9nusJ:scholar.google.com/&hl=en&as_sdt=0,5&as_ylo=2018&scillfp=11776208952974186557&oi=lle Paper] | ||
* [https://www.jstor.org/stable/25148625#metadata_info_tab_contents Design Science in Information Systems Research] by Alan R. Hevner, Salvatore T. March, Jinsoo Park and Sudha Ram. MIS Quarterly 28(1):75-105, March 2004. ''(You need to be on UiB's network to access the link - I have uploaded it under Files in mitt.uib.no, but it may soon be deleted from there...)'' | |||
* Hevner, A. R. (2007). A three cycle view of design science research. Scandinavian Journal of Information Systems, 19(2), 4. [[File:Hevner2007-ThreeCycleView-SJIS.pdf]]
<!-- Architectures: kappa, lambda, phi, Liquid --> | <!-- Architectures: kappa, lambda, phi, Liquid --> | ||
| Line 37: | Line 45: | ||
== Technical introductions == | == Technical introductions == | ||
Selected web pages will become available here, including: | Selected web pages will become available here, including: | ||
* [https://kafka.apache.org/intro Kafka Introduction] | |||
* [https://docs.nrec.no/index.html NREC and OpenStack], the following sections/pages: Introduction, Project application, Logging in, The dashboard, Create a Linux virtual machine (skip: Windows), Using SSH, Working with Security Groups, Create and manage volumes, Create and manage snapshots (skip: images), Instance console | |||
* [https://docs.nrec.no/terraform-part1.html Terraform and NREC part I], [https://docs.nrec.no/terraform-part2.html part II], and [https://docs.nrec.no/terraform-part3.html part III] | |||
* [https://www.ansible.com/overview/how-ansible-works How Ansible Works] and [https://docs.ansible.com/ansible_community.html the Ansible Community portal] | |||
* Docker Docs: [https://docs.docker.com/get-started/overview/ Docker overview] and [https://docs.docker.com/get-started/overview/ Get started] | |||
* [https://kubernetes.io/docs/tutorials/kubernetes-basics/ Learn Kubernetes basics], modules 1-6 | |||
* [https://gdpr.eu/what-is-gdpr/ What is GDPR, the EU’s new data protection law?] | |||
Supplementary: | |||
* Spark 3.3.0 [https://spark.apache.org/docs/latest/index.html Overview] and [https://spark.apache.org/docs/latest/quick-start.html Quick Start (with Python examples)] | * Spark 3.3.0 [https://spark.apache.org/docs/latest/index.html Overview] and [https://spark.apache.org/docs/latest/quick-start.html Quick Start (with Python examples)] | ||
* [https://developer.twitter.com/en/docs/twitter-api Twitter API v2] | * [https://developer.twitter.com/en/docs/twitter-api Twitter API v2] | ||
* [https://github.com/tweepy/tweepy Tweepy: Twitter for Python] | * [https://github.com/tweepy/tweepy Tweepy: Twitter for Python] | ||
* [https://docs.tweepy.org/en/latest/ Tweepy Documentation] | * [https://docs.tweepy.org/en/latest/ Tweepy Documentation] | ||
* [https:// | * [https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html Structured Streaming Spark Programming Guide] | ||
* [https:// | * Apache Spark [https://spark.apache.org/docs/latest/api/python/reference/pyspark.ss/index.html Structured Streaming API] | ||
* [https:// | * [https://kafka-python.readthedocs.io/en/master/ kafka-python API] | ||
* [https:// | * EU's [https://gdpr-info.eu/ General Data Protection Regulation (GDPR)] - the official legal text | ||
<!-- | <!-- GDELT --> | ||
== Lecture slides== | == Lecture slides== | ||
See the [[Sessions|Sessions page]] for lecture slides after each session. | See the [[Sessions|Sessions page]] for lecture slides after each session. | ||
==Readings for each | ==Readings for each session== | ||
The [[Sessions|Sessions page]] will suggest specific readings for each session and its associated exercise. | The [[Sessions|Sessions page]] will suggest specific readings for each session and its associated exercise. | ||
Latest revision as of 13:04, 2 December 2022
Books
Text books:
- Rob Kitchin. The Data Revolution - A Critical Analysis of Big Data, Open Data and Data Infrastructures, 2nd Edition. Sage, 2021.
- chapters 1, 3-5, 13-14, 17-19 are mandatory (12 and 15-16 are supplementary)
- Bill Chambers and Matei Zaharia: Spark: The Definitive Guide - Big Data Processing Made Simple. O'Reilly, 2018. File:Spark-TheDefinitiveGuide.pdf
- chapters 1-9, 12, 15, 20-21 are mandatory (chapter 10 on SQL is also highly relevant)
- GitHub repository with code and data examples
Papers
Selected papers will become available here, including:
- Section 1 in Opdahl, A. L., & Nunavath, V. (2020). Big Data. Big Data in Emergency Management: Exploitation Techniques for Social and Mobile Data, 15-29. Book chapter
- Gallofré, M., Opdahl, A. L., Stoppel, S., Tessem, B., & Veres, C. (2021). The News Angler Project: Exploring the Next Generation of Journalistic Knowledge Platforms. In Proceedings of Norsk IKT-konferanse for forskning og utdanning. Short Paper and poster: File:A1-Poster-NIKT2021.pdf
Supplementary:
- Opdahl, A. L., & Tessem, B. (2021). Ontologies for finding journalistic angles. Software and Systems Modeling, 20(1), 71-87. Paper
- Berven, A., Christensen, O. A., Moldeklev, S., Opdahl, A. L., & Villanger, K. J. (2020). A knowledge-graph platform for newsrooms. Computers in Industry, 123, 103321. Paper
- Design Science in Information Systems Research by Alan R. Hevner, Salvatore T. March, Jinsoo Park and Sudha Ram. MIS Quarterly 28(1):75-105, March 2004. (You need to be on UiB's network to access the link - I have uploaded it under Files in mitt.uib.no, but it may soon be deleted from there...)
- Hevner, A. R. (2007). A three cycle view of design science research. Scandinavian Journal of Information Systems, 19(2), 4. File:Hevner2007-ThreeCycleView-SJIS.pdf
Technical introductions
Selected web pages will become available here, including:
- Kafka Introduction
- NREC and OpenStack, the following sections/pages: Introduction, Project application, Logging in, The dashboard, Create a Linux virtual machine (skip: Windows), Using SSH, Working with Security Groups, Create and manage volumes, Create and manage snapshots (skip: images), Instance console
- Terraform and NREC part I, part II, and part III
- How Ansible Works and the Ansible Community portal
- Docker Docs: Docker overview and Get started
- Learn Kubernetes basics, modules 1-6
- What is GDPR, the EU’s new data protection law?
Supplementary:
- Spark 3.3.0 Overview and Quick Start (with Python examples)
- Twitter API v2
- Tweepy: Twitter for Python
- Tweepy Documentation
- Structured Streaming Spark Programming Guide
- Apache Spark Structured Streaming API
- kafka-python API
- EU's General Data Protection Regulation (GDPR) - the official legal text
Lecture slides
See the Sessions page for lecture slides after each session.
Readings for each session
The Sessions page will suggest specific readings for each session and its associated exercise.
