Practical session, using Spark for emergency datasources
Revision as of 13:18, 17 September 2018
Practical session:
Tasks:
- 1) Running Apache Hadoop and MapReduce:
- 2) Querying and analyzing an open data source with Apache Spark.
- Steps:
- 1) Download the data from: https://data.sfgov.org/Public-Safety/Fire-Department-Calls-for-Service/nuek-vuh3 or use the session file: File:data.zip.
- 2) Set up an account in Databricks: https://databricks.com/try-databricks.
- 3) Create a cluster in Databricks.
- 4) Import the files from the zip folder into the Databricks workspace.
- 5) Open the "Fire incidents exploration - RunMe" file in the cloud.databricks browser interface.
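Before uploading the downloaded data to Databricks, it can help to check locally that the CSV is readable and has the expected columns. The following is a minimal sketch using only the Python standard library, not part of the session material. The helper name `count_call_types`, the file name in the usage comment, and the column name `Call Type` are assumptions for illustration; adjust them to match the file you actually downloaded.

```python
import csv
from collections import Counter


def count_call_types(csv_path, column="Call Type"):
    """Count rows per distinct value of `column` in a CSV file with a header row.

    `column` defaults to "Call Type", which is an assumed column name.
    """
    counts = Counter()
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Fail early with a clear message if the assumed column is missing.
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise KeyError(f"column {column!r} not found in {csv_path}")
        for row in reader:
            counts[row[column]] += 1
    return counts


# Example usage (hypothetical file name; use the name of your download):
#
#   counts = count_call_types("Fire_Department_Calls_for_Service.csv")
#   for call_type, n in counts.most_common(10):
#       print(n, call_type)
```

Once the file is available in a Databricks notebook, a comparable count can be expressed in PySpark as `spark.read.csv(path, header=True).groupBy("Call Type").count()`, where `path` is wherever the file was uploaded.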
- Steps: