Wednesday, June 4, 2014

Spark Tutorial

http://mbonaci.github.io/mbo-spark/

Getting started with Spark

This tutorial was written in October 2013.
At the time, the current development version of Spark was 0.9.0.
The tutorial covers Spark setup on Ubuntu 12.04:
  • installation of all Spark prerequisites
  • Spark build and installation
  • basic Spark configuration
  • standalone cluster setup (one master and 4 slaves on a single machine)
  • running the math.PI approximation job on a standalone cluster
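The Pi approximation job in the last item estimates π by random sampling. As a rough illustration of that idea only, here is a minimal sketch in plain shell with awk. It does not use Spark at all, and the function name `estimate_pi`, the sample count, and the seed are assumptions made up for this example.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark): estimate pi by Monte Carlo sampling.
# $1 = number of random points to draw, $2 = seed for awk's random generator.
estimate_pi() {
    awk -v n="$1" -v seed="$2" 'BEGIN {
        srand(seed)
        inside = 0
        for (i = 0; i < n; i++) {
            # Draw a point in the unit square [0,1) x [0,1).
            x = rand(); y = rand()
            # Count it if it falls inside the quarter circle of radius 1.
            if (x * x + y * y <= 1) inside++
        }
        # The quarter circle covers pi/4 of the square, so scale by 4.
        printf "%.4f\n", 4 * inside / n
    }'
}

estimate_pi 100000 42
```

A cluster job would split the sampling loop across the workers and add up the per-worker counts; the sketch above runs everything in one process.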

My setup

Before installing Spark:

  • Ubuntu 12.04 LTS 32-bit
  • OpenJDK 1.6.0_27
  • Scala 2.9.3
  • Maven 3.0.4
  • Python 2.7.3 (you already have this)
  • Git 1.7.9.5 (and this, I presume)
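Before building, it can help to confirm that the prerequisites above are actually on the PATH. The following is a minimal sketch; the helper name `tool_status` is hypothetical, and the command names (`java`, `scala`, `mvn`, `python`, `git`) are the usual ones for these tools but may differ on your system.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark): report whether a command is on PATH.
tool_status() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# Command names assumed for the prerequisites listed above.
for tool in java scala mvn python git; do
    tool_status "$tool"
done
```

This only checks that each tool is present. To compare against the versions in the list, run each tool's own version command by hand.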

To attach the Spark shell to a running standalone master, set MASTER to the master's URL:

[david@david-centos6 spark]$ MASTER=spark://david-centos6:7077 bin/spark-shell



