Practical session, using Spark for emergency datasources

'''Practical session:'''
 
In every practical session, I expect you to have installed all the needed software beforehand, so that we can focus on the tasks.
===Tasks:===


* '''1) Running Apache Hadoop and MapReduce:'''
**[[Running Hadoop | Getting started with Hadoop]]
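If you want a first end-to-end MapReduce run before working through the Hadoop page, the sketch below is a minimal word-count job for Hadoop Streaming written in Python. The file name <code>wordcount_streaming.py</code> and the HDFS paths in the comments are placeholders, not part of the course material.

<syntaxhighlight lang="python">
#!/usr/bin/env python3
# wordcount_streaming.py -- minimal word-count sketch for Hadoop Streaming.
# Run the same file as mapper and reducer, e.g. (jar path depends on your
# Hadoop install, input/output are HDFS directories of your choosing):
#   hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
#     -files wordcount_streaming.py \
#     -mapper "python3 wordcount_streaming.py map" \
#     -reducer "python3 wordcount_streaming.py reduce" \
#     -input <hdfs-input-dir> -output <hdfs-output-dir>
import sys

def mapper():
    # Emit one "word<TAB>1" line per word read from stdin.
    for line in sys.stdin:
        for word in line.strip().split():
            print(f"{word}\t1")

def reducer():
    # Hadoop sorts mapper output by key, so all counts for a word arrive together.
    current_word, current_count = None, 0
    for line in sys.stdin:
        word, count = line.rstrip("\n").split("\t", 1)
        if word == current_word:
            current_count += int(count)
        else:
            if current_word is not None:
                print(f"{current_word}\t{current_count}")
            current_word, current_count = word, int(count)
    if current_word is not None:
        print(f"{current_word}\t{current_count}")

if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "map":
        mapper()
    else:
        reducer()
</syntaxhighlight>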


* '''2) Apache Spark cluster setup in Azure HDInsight and data processing.'''
** Apache Spark cluster setup [https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-provision-linux-clusters]
** Load data and run queries on sensor data [https://docs.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-load-data-run-query]
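As a rough orientation for the second tutorial link, the following PySpark sketch loads a sensor CSV and queries it with Spark SQL. The storage path and the column names are assumptions based on the tutorial's HVAC sample data and may need to be adjusted on your cluster.

<syntaxhighlight lang="python">
# Sketch: load the sample HVAC sensor CSV from the cluster's default storage
# and query it with Spark SQL. Path and column names are assumptions taken
# from the HDInsight sample-data layout; adjust them to your cluster.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("hvac-sensor-query").getOrCreate()

hvac = (spark.read
        .option("header", "true")
        .option("inferSchema", "true")
        .csv("/HdiSamples/HdiSamples/SensorSampleData/hvac/HVAC.csv"))

hvac.createOrReplaceTempView("hvac")

# Readings where the actual temperature exceeds the target temperature.
spark.sql("""
    SELECT BuildingID, Date, Time, TargetTemp, ActualTemp
    FROM hvac
    WHERE ActualTemp > TargetTemp
""").show(10)
</syntaxhighlight>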
 
 
 


** '''Steps:'''
*** 1) Download the data from https://data.sfgov.org/Public-Safety/Fire-Department-Calls-for-Service/nuek-vuh3, or use the session file: [[:File:data.zip]].
*** 2) Set up an account in Databricks: https://databricks.com/try-databricks.
*** 3) Create a cluster in Databricks.
*** 4) Import the files from the zip folder into your workspace.
*** 5) Open the "Fire incidents exploration - RunMe" notebook in the Databricks workspace in your browser (a short PySpark sketch follows below).

'''Solve Variables error:''' [[:File:Variables.pdf]]
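To get started inside the notebook (step 5), a first exploratory query might look like the sketch below. The DBFS path and the <code>Call Type</code> column are assumptions about how the downloaded CSV was imported and named, so adjust them to your workspace.

<syntaxhighlight lang="python">
# Sketch: load the Fire Department Calls for Service CSV and run a first
# exploratory query in a Databricks notebook. The file path assumes the CSV
# was uploaded to DBFS under /FileStore; the "Call Type" column name is taken
# from the data.sfgov.org dataset and may differ in your import.
from pyspark.sql import SparkSession, functions as F

# Databricks notebooks already provide `spark`; this line only matters when
# running the sketch outside Databricks.
spark = SparkSession.builder.getOrCreate()

calls = (spark.read
         .option("header", "true")
         .option("inferSchema", "true")
         .csv("/FileStore/tables/Fire_Department_Calls_for_Service.csv"))

calls.printSchema()

# Count calls per call type, largest first.
(calls.groupBy("Call Type")
      .count()
      .orderBy(F.desc("count"))
      .show(10, truncate=False))
</syntaxhighlight>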
