New pages
- 11:25, 29 October 2022 Configure Spark cluster using Ansible [9,911 bytes] Sinoa (Created page with "== Install and configure Ansible == On your local host, install [https://docs.ansible.com/ansible/latest/index.html Ansible], for example: sudo apt install ansible To prepar...")
- 10:53, 29 October 2022 Create Spark cluster using Terraform [12,156 bytes] Sinoa (Created page with "== Install OpenStack == On your local machine, create an ''exercise-5'' folder, for example a subfolder of ''info319-exercises'', and '''cd''' into it. Install ''openstackcli...")
- 11:05, 12 October 2022 Install Kafka on the cluster [7,469 bytes] Sinoa (Created page with "== Install Zookeeper == On each instance, go to https://zookeeper.apache.org/releases.html#download. Download and unpack a recent binary distribution to ''~/volume''. For exa...")
- 11:04, 12 October 2022 Install Spark on the cluster [3,611 bytes] Sinoa (Created page with "== Install Spark == Go to [https://spark.apache.org/downloads.html Apache Spark Downloads]. Download and unpack a recent Spark binary. For example, on each instance: cd volum...")
- 11:02, 12 October 2022 Install HDFS and YARN on the cluster [7,028 bytes] Sinoa (Created page with "== Install Java == You need Java on each instance. We will use an older, stable version in case some of the tools are not upgraded to more recent versions: sudo apt install...")
- 10:58, 12 October 2022 Create Spark cluster [12,123 bytes] Sinoa (Created page with "== Create Spark cluster == === Create SSH key pair === You need a new SSH key pair for your Spark cluster. Do not use a key pair you already have, because the virtual machines...")
- 09:46, 17 September 2022 Streaming tweets with Kafka and Spark [12,661 bytes] Sinoa (Created page with "== Streaming tweets with Kafka and Spark == === Install Kafka === This exercise will assume you are running Ubuntu Linux, either natively or through WSL 2. See [https://kafk...")
- 06:49, 7 September 2022 Streaming tweets with Twitter API [6,797 bytes] Sinoa (Created page with "== Streaming tweets with Twitter API ==")
- 14:12, 1 September 2022 Processing tweets with Spark [589 bytes] Sinoa (Created page with "== Processing tweets with Spark == Continuing the examples from [Getting started with Apache Spark]: * load the tweets in ‘tweet-id-text-345/’ as JSON objects * collect...")
- 15:36, 22 August 2022 Getting started with Apache Spark [7,850 bytes] Sinoa (Created page with "=Getting started with Apache Spark= ==Purpose== * Getting up and running with Apache Spark * Getting experience with non-trivial Linux installation * Using VS Code (or another...")