How to Install Apache Spark on Ubuntu 16.04

r00t April 26, 2018

In this tutorial we’ll learn how to install Apache Spark on Ubuntu 16.04, along with its prerequisites. Apache Spark is a fast, flexible, open source distributed engine for large-scale data processing. It was originally developed at UC Berkeley's AMPLab and is now maintained by the Apache Software Foundation. Spark can run in its own standalone cluster mode, on Hadoop YARN or Apache Mesos, and on cloud platforms such as Amazon EC2, and it can read data from sources including HDFS, HBase, Cassandra and Hive.

I recommend using a minimal Ubuntu server setup as the basis for this tutorial. That can be a virtual or root server image with an Ubuntu 16.04 minimal install from a web hosting company, or you can use our minimal server tutorial to install a server from scratch.

Install Apache Spark on Ubuntu 16.04

Step 1. First, make sure your apt package lists and installed packages are up to date. The commands in this tutorial are run as root; prefix them with sudo if you are logged in as a regular user:

apt-get update -y
apt-get upgrade -y

Step 2. Installing Java.

Spark runs on the Java Virtual Machine, so we need to install Java on our machine first:

apt-get -y install openjdk-8-jdk-headless
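
Before moving on, it can help to confirm that the java command is actually reachable. The following is a minimal sketch, not part of the original steps; the helper name have_cmd is hypothetical and only used for illustration:

```shell
# Hypothetical helper: succeeds if the named command is available on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd java; then
    # java prints its version banner to stderr, so redirect it to stdout
    java -version 2>&1 | head -n 1
else
    echo "java not found on PATH; repeat the install step above"
fi
```

If the package installed correctly, the first line of the banner should mention a 1.8 version.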

Step 3. Installing Apache Spark on Ubuntu 16.04.

First, download the Apache Spark release (2.3.0 at the time of writing):

wget http://www-us.apache.org/dist/spark/spark-2.3.0/spark-2.3.0-bin-hadoop2.7.tgz
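
It is good practice to check the downloaded archive against the SHA-512 digest that Apache publishes next to each release. The sketch below assumes the sha512sum tool from coreutils is available (it is on a standard Ubuntu install); the function name verify_sha512 is hypothetical, and the expected digest must be copied from the Apache download page yourself:

```shell
# Hypothetical helper: compare a file's SHA-512 digest with an expected hex string.
# Returns 0 on match, non-zero otherwise.
verify_sha512() {
    file=$1
    expected=$2
    actual=$(sha512sum "$file" | awk '{print $1}')
    # Strip whitespace and lower-case the expected value, since published
    # digests are sometimes upper-case and split into groups
    expected=$(printf '%s' "$expected" | tr -d '[:space:]' | tr 'A-F' 'a-f')
    [ "$actual" = "$expected" ]
}
```

Called as verify_sha512 spark-2.3.0-bin-hadoop2.7.tgz "<published digest>", the function succeeds only when the two digests match, so it can be chained with && before extracting the archive.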

Extract the Apache Spark tarball:

tar xvzf spark-2.3.0-bin-hadoop2.7.tgz

Create a symbolic link named spark that points to the extracted directory:

ln -s spark-2.3.0-bin-hadoop2.7 spark

Next, add Spark to the PATH. Open the .bashrc file in an editor:

nano ~/.bashrc

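Add these lines to the end of the .bashrc file so that PATH includes the Spark executables. This assumes the tarball was extracted and the spark link created in your home directory; adjust SPARK_HOME if you used a different location:

export SPARK_HOME=$HOME/spark
export PATH=$SPARK_HOME/bin:$PATH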
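
If you prefer to script this step instead of editing the file by hand, the lines can be appended in a way that is safe to repeat. This is a minimal sketch; append_once is a hypothetical helper name, and the $HOME/spark location is an assumption about where the spark link was created:

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present, so re-running the script adds no duplicates.
append_once() {
    target=$1
    line=$2
    touch "$target"
    # -F: fixed string, -x: match the whole line, -q: quiet
    grep -Fxq -- "$line" "$target" || printf '%s\n' "$line" >> "$target"
}
```

For example, append_once ~/.bashrc 'export SPARK_HOME=$HOME/spark' followed by append_once ~/.bashrc 'export PATH=$SPARK_HOME/bin:$PATH' leaves the file unchanged on a second run.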

To activate these changes, run the following command:

source ~/.bashrc

To verify the installation, close the current terminal and open a new one. Then run the following command from the directory that contains the spark link:

./spark/bin/spark-shell
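
Because spark-shell is interactive, a non-interactive smoke test can also be useful. The sketch below runs the SparkPi example bundled with the binary distribution through the run-example script. It assumes SPARK_HOME is set, falling back to $HOME/spark, and the spark_check variable is only there for illustration:

```shell
# Assumption: SPARK_HOME points at the extracted Spark directory;
# fall back to $HOME/spark if it is unset.
SPARK_HOME=${SPARK_HOME:-$HOME/spark}

if [ -x "$SPARK_HOME/bin/run-example" ]; then
    # SparkPi ships with the distribution; 10 is the number of partitions.
    # Log output goes to stderr, so discard it and keep only the result line.
    "$SPARK_HOME/bin/run-example" SparkPi 10 2>/dev/null | grep "Pi is roughly" \
        || echo "SparkPi did not report a result"
    spark_check="ran"
else
    echo "run-example not found under $SPARK_HOME/bin; check SPARK_HOME"
    spark_check="missing"
fi
```

A line beginning with "Pi is roughly" indicates that Spark could start a local job and run it to completion.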

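We can see in the console that Spark has also started a web UI on port 4040. While the shell is running, visit it in a browser at http://localhost:4040, replacing localhost with the server address if you are connecting remotely.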

Congratulations! You have successfully installed and configured Apache Spark on your Ubuntu 16.04 server. Thanks for using this tutorial to install Apache Spark on your Ubuntu system.
