At a high level, every Spark application consists of a driver program that runs the user's main function and executes various parallel operations on a cluster. The main abstraction Spark provides is a resilient distributed dataset (RDD), which is a collection of elements partitioned across the nodes of the cluster that can be operated on in parallel.
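A minimal sketch of these ideas, assuming a `spark-shell` session where the SparkContext is already available as `sc` (the data and operations here are illustrative, not from the original text):

```scala
// Distribute a local collection across the cluster as an RDD;
// Spark partitions the elements across the nodes.
val data = sc.parallelize(1 to 1000)

// Transformations are lazy: they describe parallel operations
// on each partition but do not run anything yet.
val evens = data.filter(_ % 2 == 0)

// An action (reduce) triggers the actual parallel computation
// on the cluster and returns a result to the driver program.
val total = evens.map(_ * 2).reduce(_ + _)
```

In a standalone application, the driver program would first construct the SparkContext itself before performing these operations.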