Daily Archives: January 27, 2018


RDD Lineage in Spark: toDebugString Method

1. Objective In Spark, all dependencies between RDDs are recorded in a graph, independently of the actual data. This graph is called the lineage graph. This document covers the concept of RDD lineage in the Spark logical execution plan. It also shows in detail how to get the RDD lineage graph with the toDebugString method. Before that, let's review Spark RDDs. 2. Introduction to Spark RDD Spark RDD is nothing but an acronym […]


R and Hadoop Integration | R Integration with Hadoop

1. R and Hadoop Integration In this blog, we will study R and Hadoop integration. We will also learn when to use the R and Hadoop combination, and how R integration with Hadoop is implemented. I recommend that you go through Hadoop and R Programming. So let's start integrating R and Hadoop for big data analysis. 2. Introduction to R With Hadoop Integration a. Introduction to R Programming Language R is an open source programming language. It is best […]