Practical session: using Spark for emergency data sources
Revision as of 12:56, 17 September 2018
Practical session:
Tasks:
- 1) Running Apache Hadoop and MapReduce:
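Before running jobs on a real cluster, it can help to see the MapReduce model in miniature. The sketch below is a single-process Python illustration (not Hadoop itself) of the three stages Hadoop distributes across machines: a map phase emitting key/value pairs, a shuffle that groups values by key, and a reduce phase that aggregates each group. The word-count task is the classic example.

```python
from collections import defaultdict

def map_phase(line):
    # Map: emit a (word, 1) pair for every word in an input line.
    for word in line.split():
        yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle: group all values by key, as Hadoop's shuffle/sort stage does
    # between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Reduce: aggregate the values for one key (here, sum the counts).
    return (key, sum(values))

def word_count(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

For example, `word_count(["fire alarm", "fire truck"])` groups the two occurrences of "fire" in the shuffle stage before reducing, yielding counts of 2, 1 and 1.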
- 2) Querying and analyzing an open data source with Apache Spark.
- Steps:
- 1) Download the data from https://data.sfgov.org/Public-Safety/Fire-Department-Calls-for-Service/nuek-vuh3 or use the session file: File:data.zip.
- 2) Set up an account on Databricks: https://databricks.com/try-databricks.
- 3) Create a cluster in Databricks.
- 4) Import the files from the zip archive into your workspace.
- 5) Open the Fire incidents exploration - RunMe file in the Databricks workspace in your browser.
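Once the notebook has loaded the CSV into a DataFrame, the exploration comes down to simple aggregations. The sketch below is a plain-Python stand-in for one such query (calls per call type) on a made-up sample of rows; the "Call Type" column name is an assumption about the downloaded dataset, and the PySpark equivalent is noted in a comment.

```python
from collections import Counter

# Made-up sample rows standing in for the downloaded CSV; the "Call Type"
# column name is an assumption about the Fire Department Calls dataset.
# In PySpark the same aggregation would be roughly:
#   df.groupBy("Call Type").count().orderBy("count", ascending=False)
sample_calls = [
    {"Call Type": "Medical Incident"},
    {"Call Type": "Structure Fire"},
    {"Call Type": "Medical Incident"},
    {"Call Type": "Alarms"},
]

def calls_per_type(rows):
    # Count how many calls fall under each call type.
    return Counter(row["Call Type"] for row in rows)
```

Running `calls_per_type(sample_calls)` tallies two medical incidents, one structure fire and one alarm, mirroring the grouped counts the Spark notebook would show at cluster scale.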
- Steps: