How To Install Apache Flink on Ubuntu Server

Apache Flink is a big data processing engine that can run in both streaming and batch mode. data Artisans is the company founded by the original creators of Flink. Flink started as a project called Stratosphere, which was forked and became Apache Flink. Flink can be deployed on a local machine, on a cluster (it can run on YARN), or in the cloud. The core of Apache Flink is a distributed streaming dataflow engine written in Java and Scala. Flink natively supports the execution of iterative algorithms. Programs for Flink can be written in Java, Scala, Python, or SQL. Flink has no data storage system of its own; instead it provides data source and sink connectors for Apache Kafka, HDFS, Apache Cassandra, ElasticSearch, Amazon Kinesis, and others. Apache Beam is a shared programming model for which Flink can act as an execution backend.

Apache Flink is considered a strong competitor to Apache Spark. Spark is built around resilient distributed datasets (RDDs), while Flink is optimized for cyclic or iterative processes through iterative transformations on collections. Flink is also a strong tool for batch processing.

However, this article is not a comparison. Let us move on to the steps to install Apache Flink on Ubuntu Server. We say “Ubuntu Server” to mean “no GUI”; you may equally use a local machine, or even Ubuntu Bash on Windows 10, to test.

An Apache Hadoop installation is not mandatory to use Flink. Hadoop is needed only if you plan to run Flink on YARN or to process data stored in HDFS.

---

Steps To Install Apache Flink on Ubuntu Server

Let us update and upgrade the system as the root user :

apt update -y && apt upgrade -y

Wait, and read ahead before running the next commands: there are two ways to install Java here, and you need only one of them. Normally we install the Java runtime (JRE) with this command :

apt install default-jre

And next we install the Java Development Kit (JDK) :

apt install default-jdk

Next we set JAVA_HOME in the current Bash session with the following command (add the same line to ~/.bashrc to keep it across logins) :

export JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:bin/java::")

Then check it with the command below :

echo $JAVA_HOME
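The export above combines two steps: readlink resolves the /usr/bin/java symlink chain to the real binary, and sed cuts the trailing bin/java off that path. A minimal sketch of the same idea wrapped in a function, assuming Bash; the function name `java_home_from` is made up for illustration and is not part of any tool:

```shell
#!/usr/bin/env bash
# Hypothetical helper: strip "bin/java" from a resolved java path,
# using the same sed expression as the export command.
java_home_from() {
    printf '%s\n' "$1" | sed "s:bin/java::"
}

# Only derive JAVA_HOME if /usr/bin/java actually exists on this machine.
if [ -e /usr/bin/java ]; then
    JAVA_HOME=$(java_home_from "$(readlink -f /usr/bin/java)")
    export JAVA_HOME
    echo "JAVA_HOME=$JAVA_HOME"
fi

# To persist the value (assumption: ~/.bashrc is your startup file), uncomment:
# echo "export JAVA_HOME=$JAVA_HOME" >> ~/.bashrc
```

Note that the result keeps its trailing slash, for example /usr/lib/jvm/java-8-openjdk-amd64/jre/ for a binary at /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java.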

If you have already run the above steps on your machine, you need not run the commands below. The alternative, for an empty new machine, is to add the webupd8team PPA and install Oracle Java from it :

apt install software-properties-common
add-apt-repository ppa:webupd8team/java
apt update -y && apt upgrade -y

and then run the installer. Flink 1.5 needs Java 8, so install the Java 8 package rather than Java 7 :

apt install oracle-java8-installer
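The Flink 1.5 documentation lists Java 8.x as its requirement, so it is worth confirming which Java version ended up on the machine before going further. A small sketch, assuming Bash; `java_major` is a hypothetical helper name, and the parsing covers both the old `1.8.0_171` scheme and the newer `11.0.2` scheme:

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a Java version string into its major version.
# "1.8.0_171" becomes 8 (old 1.x scheme), "11.0.2" becomes 11 (new scheme).
java_major() {
    local v="$1"
    v="${v#1.}"                     # drop a leading "1." if present
    printf '%s\n' "${v%%[._-]*}"    # keep everything before the first separator
}

if command -v java >/dev/null 2>&1; then
    # java -version writes to stderr, e.g.: java version "1.8.0_171"
    ver=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    echo "Java major version: $(java_major "$ver")"
else
    echo "java not found on PATH"
fi
```

If the reported major version is below 8, install a newer Java before starting Flink.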

Download the binary distribution of Apache Flink from here :

http://flink.apache.org/downloads.html

Flink has binary releases marked with a Hadoop version, which come bundled with the binaries for that Hadoop version. The binary release without bundled Hadoop can be used without Hadoop, or with a Hadoop version that is already installed in the environment. So read that webpage carefully.

This is an example without Hadoop (notice the file name flink-1.5.0-bin-scala_2.11.tgz) :
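A minimal sketch of that download-and-start step, under stated assumptions: the archive URL layout on archive.apache.org and the use of wget are our choices, not taken from the downloads page, and the function names are made up for illustration. The `bin/start-cluster.sh` script is the one shipped in the 1.5 binary release for starting a local setup.

```shell
#!/usr/bin/env bash
# Assumed values: adjust them to the release you picked on the downloads page.
FLINK_VERSION="1.5.0"
SCALA_VERSION="2.11"

# Build the archive file name, e.g. flink-1.5.0-bin-scala_2.11.tgz
flink_archive() {
    printf 'flink-%s-bin-scala_%s.tgz\n' "$1" "$2"
}

# Assumption: older releases are served from the Apache archive in this layout.
flink_url() {
    printf 'https://archive.apache.org/dist/flink/flink-%s/%s\n' \
        "$1" "$(flink_archive "$1" "$2")"
}

# Download, unpack and start a local Flink. Nothing runs until you call it.
install_flink() {
    local archive
    archive=$(flink_archive "$FLINK_VERSION" "$SCALA_VERSION")
    wget "$(flink_url "$FLINK_VERSION" "$SCALA_VERSION")" &&
        tar xzf "$archive" &&
        cd "flink-$FLINK_VERSION" &&
        ./bin/start-cluster.sh
}
```

Source the file and call `install_flink` from the directory where Flink should be unpacked. To stop it later, run `./bin/stop-cluster.sh` from the same Flink directory.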

Go to http://localhost:8081 and make sure everything is up and running. The web frontend should report a single available TaskManager instance. The version you installed has its own official tutorials with examples :