Run Apache Flink Wordcount Program in Eclipse

1. Objective

In a previous guide, we discussed how to install Apache Flink on Ubuntu. In this tutorial, we will develop and run an Apache Flink Wordcount program in Java using Eclipse.
The same Wordcount program can also be written in Scala.

2. Platform

  1. Operating system: Windows, Mac, or Linux
  2. Java 7.x or higher
  3. Eclipse – Latest version

3. Steps to make project

  1. Create a new Java project
  2. Add the following JARs to the build path. You can find the JAR files in the lib directory of the Flink home directory:
    flink-dist_2.11-1.0.3
    flink-python_2.11-1.0.3
    log4j-1.2.17
    slf4j-log4j12-1.7.7

4. Apache Flink Wordcount program

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

public class FlinkProgram {

    public static void main(String[] args) throws Exception {
        final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Change this path to the location of your own text file
        DataSet<String> rawdata = env.readTextFile("E:\\readme.txt");

        DataSet<Tuple2<String, Integer>> result = rawdata
                .flatMap(new Splitter()) // emit a (word, 1) pair for every token
                .groupBy(0)              // group the pairs by field 0, the word
                .sum(1);                 // sum field 1, the count, within each group

        // print() triggers execution of the job and writes the result to standard output
        result.print();
    }

    // Splits each line on single spaces and emits (word, 1) for every token
    public static class Splitter implements FlatMapFunction<String, Tuple2<String, Integer>> {
        @Override
        public void flatMap(String line, Collector<Tuple2<String, Integer>> out) {
            for (String wordToken : line.split(" ")) {
                out.collect(new Tuple2<String, Integer>(wordToken, 1));
            }
        }
    }
}
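
To make the result of the flatMap, groupBy(0), sum(1) chain concrete, here is a minimal plain-Java sketch that mimics what the job computes on a small in-memory input. It is not Flink code and does not use the Flink API; `WordCountSketch` and `count` are hypothetical names, and the two sample lines are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of the flatMap -> groupBy(0) -> sum(1) pipeline.
// This is NOT Flink code; it only mimics the result the job produces.
class WordCountSketch {

    static Map<String, Integer> count(List<String> lines) {
        // TreeMap keeps the words sorted so the printed output is deterministic
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            // Same tokenization as Splitter: split on a single space
            for (String wordToken : line.split(" ")) {
                // Emitting (word, 1), grouping by word, and summing the 1s
                // is equivalent to incrementing a per-word counter.
                Integer current = counts.get(wordToken);
                counts.put(wordToken, current == null ? 1 : current + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("to be or not to be", "to do");
        System.out.println(count(lines)); // prints {be=2, do=1, not=1, or=1, to=3}
    }
}
```

The Flink job prints one (word, count) tuple per distinct word rather than a single map, and the order of its output is not guaranteed to be sorted.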

The execution environment provides methods to control job execution and to access data from the outside world, such as the text file read above.
A DataSet represents a collection of elements of a specific type. The element type can be String, Integer, Long, or a tuple such as:

Tuple2<String, Integer>

In this Apache Flink Wordcount program, we use the flatMap transformation, whose function holds our custom business logic. A flatMap function takes one element as input and produces zero, one, or more elements.
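
The "zero, one, or more" behaviour can be illustrated with a small plain-Java sketch. It is not Flink code; `TokenizerSketch` and `tokens` are hypothetical names. Note one deliberate difference from the Splitter in the program: `line.split(" ")` yields an empty-string token for a blank line or for consecutive spaces, whereas this sketch assumes such empty tokens are unwanted, so it splits on runs of whitespace and drops them.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration of flatMap semantics: one input, zero or more outputs.
// Hypothetical helper, not part of the Flink API.
class TokenizerSketch {

    static List<String> tokens(String line) {
        List<String> out = new ArrayList<>();
        // Split on runs of whitespace instead of a single space
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) { // a blank line produces no output at all
                out.add(token);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens(""));         // zero outputs: []
        System.out.println(tokens("flink"));    // one output: [flink]
        System.out.println(tokens("a  b   c")); // several outputs: [a, b, c]
    }
}
```

The same filtering could be placed inside a FlatMapFunction by calling `out.collect(...)` only for non-empty tokens.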
We have now seen a practical implementation of the Wordcount program in Apache Flink using the Eclipse IDE. You can run this program directly in Eclipse using the Run option.

5. Conclusion

In this Apache Flink tutorial, we discussed the Apache Flink Wordcount program, along with the platform and the steps needed to make the project. If you have any problem running the Apache Flink Wordcount program, ask in the comments.
