Define journaling in Apache Spark.
- This topic has 1 reply, 1 voice, and was last updated 5 years, 6 months ago by DataFlair Team.
September 20, 2018 at 10:03 pm #6435 DataFlair Team (Spectator)
What is a write-ahead log (journaling)?
September 20, 2018 at 10:03 pm #6436 DataFlair Team (Spectator)
Write-ahead log (journaling)
If the driver fails, all the data that was received and replicated in memory is lost, which directly affects the results of stateful transformations. To avoid this data loss, write-ahead logs were introduced for Spark Streaming in Apache Spark 1.2. They save received data to fault-tolerant storage: the data is written to the write-ahead log before Spark Streaming processes it.
Write-ahead logs are also used in databases and file systems, where they guarantee the durability of data operations. Internally, the intention of an operation is first written to a durable log, and only afterwards is the operation applied to the data. As a result, even if the system fails in the middle of applying the operation, it can recover by reading the log and reapplying the operations it had intended to do.
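As a rough illustration of how the receiver write-ahead log is switched on, here is a minimal PySpark sketch. It assumes PySpark with the legacy DStream API is installed; the function name `build_streaming_context`, the constant `WAL_SETTINGS`, and the example checkpoint path are made up for this example. The configuration key `spark.streaming.receiver.writeAheadLog.enable` and the `checkpoint` call are the parts that come from Spark itself.

```python
# Spark setting that turns on the receiver write-ahead log.
WAL_SETTINGS = {
    "spark.streaming.receiver.writeAheadLog.enable": "true",
}


def build_streaming_context(app_name, checkpoint_dir, batch_seconds=5):
    """Hypothetical helper: create a StreamingContext with the WAL enabled."""
    # Imported here so the settings above can be inspected without PySpark.
    from pyspark import SparkConf, SparkContext
    from pyspark.streaming import StreamingContext

    conf = SparkConf().setAppName(app_name)
    for key, value in WAL_SETTINGS.items():
        conf.set(key, value)

    sc = SparkContext(conf=conf)
    ssc = StreamingContext(sc, batch_seconds)
    # The log files are stored under the checkpoint directory, so it should
    # point at fault-tolerant storage (for example an HDFS path).
    ssc.checkpoint(checkpoint_dir)
    return ssc
```

A call might look like `build_streaming_context("wal-demo", "hdfs:///tmp/wal-demo-checkpoint")`, where the path is only a placeholder. To actually recover after a driver restart, the context would normally be created through `StreamingContext.getOrCreate(checkpoint_dir, setup_func)` so that an existing checkpoint is reused.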
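The log-then-apply idea can be sketched in a few lines of plain Python. This is a toy key-value store, not how Spark or any particular database implements it; the class name `TinyWalStore` and the JSON-lines log format are assumptions chosen for the example.

```python
import json
import os
import tempfile


class TinyWalStore:
    """Toy key-value store that logs each write before applying it."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()  # recover state from an existing log, if any
        self._log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        """Rebuild the in-memory state by reapplying every logged operation."""
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as f:
            for line in f:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    # Torn final record from a crash mid-write; a real
                    # implementation would also truncate the log here.
                    break
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        # 1. Write the intention to the durable log and force it to disk.
        self._log.write(json.dumps({"key": key, "value": value}) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())
        # 2. Only then apply the operation to the data.
        self.data[key] = value

    def close(self):
        self._log.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "wal.log")
    store = TinyWalStore(path)
    store.put("a", 1)
    store.put("b", 2)
    store.close()  # the in-memory dict is now discarded, as after a crash

    recovered = TinyWalStore(path)  # replays the log on startup
    print(recovered.data)  # {'a': 1, 'b': 2}
    recovered.close()
```

Because the record reaches disk before the in-memory update, a crash between the two steps loses nothing: the next start replays the log and reapplies the write.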
To learn more about Journaling, follow the link: Spark Streaming write-ahead logs