Install and Configure Apache Flink on Ubuntu
In this Flink tutorial, we will learn how to install Apache Flink on Ubuntu. Apache Flink is a streaming dataflow engine that processes data at lightning-fast speed. To understand what Flink is, see the Flink introduction guide.
In this Flink deployment tutorial, we will see how to install Apache Flink in standalone mode and how to run sample programs.
Apache Flink Installation on Ubuntu
i. Platform
a. Platform Requirements
Operating system: Ubuntu 14.04 or later. Other Linux flavors such as CentOS or Red Hat can also be used.
In this Apache Flink installation on Ubuntu tutorial, we will install Apache Flink 1.x (the commands below use version 1.1.3).
b. Configure & Setup Platform
If you are using Windows or Mac OS, you can create a virtual machine and install Ubuntu using VMware Player or, alternatively, Oracle VirtualBox.
ii. Install Java
Apache Flink requires Java to be installed, as it runs on the JVM. So, let’s begin by installing Java.
a. Install Python Software Properties
$ sudo apt-get install python-software-properties
b. Add Repository
$ sudo add-apt-repository ppa:webupd8team/java
c. Update the source list
$ sudo apt-get update
d. Install Java
$ sudo apt-get install oracle-java7-installer
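Running the above command downloads and installs Java automatically. Note that the webupd8team PPA is no longer maintained, so on recent Ubuntu releases these commands may fail; in that case, an OpenJDK package from the standard Ubuntu repositories is an alternative.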
e. Verify Java Installation
To check whether the installation completed successfully and to see which version of Java is installed, use the command below:
$ java -version
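Since Flink needs a supported Java version, it can help to turn the output of java -version into a single major version number. The following is a minimal sketch; the function names java_major_version and installed_java_version are our own, and it assumes the version appears in double quotes in the first line of the output, as it does for Oracle and OpenJDK builds.

```shell
# Extract the major Java version from a version string.
# Handles both the legacy "1.x" scheme and the newer "x.y.z" scheme.
java_major_version() {
    version=$1
    case "$version" in
        1.*) version=${version#1.} ;;   # legacy scheme: drop the leading "1."
    esac
    echo "${version%%[._-]*}"           # keep everything before the first separator
}

# Read the quoted version string from `java -version`, which prints to stderr.
installed_java_version() {
    java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}
```

With Java installed, java_major_version "$(installed_java_version)" prints the major version, which can then be compared against the versions your Flink release supports.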
iii. Install Apache Flink
a. Download the Apache Flink
You can download Apache Flink from the downloads page of the official Apache Flink website.
b. Untar the setup file
Move the downloaded setup file to the home directory and run the command below to extract Flink:
dataflair@ubuntu:~$ tar xzf flink-1.1.3-bin-hadoop26-scala_2.11.tgz
c. Rename the installation Directory
dataflair@ubuntu:~$ mv flink-1.1.3/ flink
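The extract and rename steps can also be combined. The sketch below assumes GNU tar (the default on Ubuntu) and an archive that contains a single top-level directory such as flink-1.1.3/; the function name extract_flink is our own.

```shell
# Extract a Flink archive directly into a target directory.
# --strip-components=1 drops the archive's top-level folder,
# so no separate rename step is needed afterwards.
extract_flink() {
    archive=$1
    target=$2
    mkdir -p "$target" && tar -xzf "$archive" -C "$target" --strip-components=1
}
```

For example, extract_flink flink-1.1.3-bin-hadoop26-scala_2.11.tgz flink run from the home directory should leave the same ~/flink layout as the two commands above.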
d. Change the working directory to Flink Home
To start the Flink services, run the sample program, and experiment with it, change the directory to flink using the command below:
dataflair@ubuntu:~$ cd flink
e. Start Flink
To start Apache Flink in local mode, use this command:
dataflair@ubuntu:~/flink$ bin/start-local.sh
f. Check status
Check the status of running services
dataflair@ubuntu:~/flink$ jps
The output should look like this (the process IDs will differ):
6740 Jps
6725 JobManager
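If you want to script this check rather than read the listing by eye, a small filter over the jps output is enough. This is a sketch only; the function name has_jobmanager is our own, and it assumes the process is listed as JobManager, as in the output above.

```shell
# Succeed if the jps listing on stdin contains a JobManager process.
# jps prints one "<pid> <name>" pair per line, so we match on the second field.
has_jobmanager() {
    awk '$2 == "JobManager" { found = 1 } END { exit !found }'
}
```

Usage: jps | has_jobmanager && echo "Flink is running"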
g. Apache Flink Web UI
To open the Web UI, use the following URL in a browser:
localhost:8081
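The Web UI may take a few seconds to come up after Flink starts. The sketch below polls the URL from the terminal until it answers; it assumes curl is installed, and the function name wait_for_ui and its default values are our own choices.

```shell
# Poll a URL until it answers or the attempts run out.
# Arguments: URL, number of attempts (default 10), seconds between attempts (default 1).
wait_for_ui() {
    url=$1
    attempts=${2:-10}
    delay=${3:-1}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # --fail makes curl return non-zero on HTTP error responses
        if curl --silent --fail --output /dev/null --max-time 2 "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

Usage: wait_for_ui http://localhost:8081 && echo "Web UI is reachable"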
iv. Run Wordcount example on Flink
Before running the Wordcount example, create an input file with some sample data in the home directory and save it as input.txt.
Then run the example on Flink using the following command:
dataflair@ubuntu:~/flink$ bin/flink run examples/batch/WordCount.jar -input /home/dataflair/input.txt -output /home/dataflair/output.txt
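To sanity-check the job's result, we can compute the same counts locally with standard tools and compare them with the contents of output.txt. This is only a sketch: it assumes the bundled example lowercases the text and splits it on non-word characters, and the function name local_wordcount is our own.

```shell
# Count words in a file using only standard tools.
# Lowercases the text, splits on anything that is not a letter, digit,
# or underscore, and prints "word count" pairs sorted by word.
local_wordcount() {
    tr '[:upper:]' '[:lower:]' < "$1" \
        | tr -cs 'a-z0-9_' '\n' \
        | awk 'NF { counts[$0]++ } END { for (w in counts) print w, counts[w] }' \
        | LC_ALL=C sort
}
```

Usage: local_wordcount /home/dataflair/input.txt, after which the listing can be compared by eye with the job's output file.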
It seems like the “non”-Hadoop version also works on OS X! It just needs Java! (and Python?)
Do I need to have the Hadoop VM installed already? I used Java 8 and the new version of Flink; would that work?
Yes, you can run Apache Flink without a Hadoop installation. Apache Flink also provides native support for interacting with Hadoop, and Flink can use Hadoop input formats as well as output formats. Apache Flink needs Java/Scala to run applications. Flink supports both Java 7 and Java 8, so you can use either version.
Greatly explained how to install apache flink on ubuntu. Thanks for updating me with latest technology!!