How is security achieved in Hadoop?

    • #5403
      DataFlair Team
      Spectator

      How is security achieved in Apache Hadoop?
      How do we achieve security in Hadoop?

    • #5407
      DataFlair Team
      Spectator

      Apache Hadoop provides security to its users in the following ways (a minimal configuration sketch covering several of these settings follows the list):

      1) SASL/GSSAPI is used to implement Kerberos and to mutually authenticate users, their processes, and Hadoop services on RPC connections.

      2) Implementers of web applications and web consoles can implement their own authentication mechanism for HTTP connections; this includes HTTP SPNEGO authentication.

      3) Access control to files in HDFS is enforced by the NameNode based on file permissions and Access Control Lists (ACLs) of users and groups.

      4) Delegation tokens are used in communication with the NameNode so that subsequent authenticated access does not require going back to the Kerberos server.

      5) When access to data blocks is needed, the NameNode makes an access control decision based on HDFS file permissions and issues block access tokens (using HMAC-SHA1) that can be sent to the DataNode with block access requests.

      6) Hadoop web consoles can be configured to use HTTP SPNEGO authentication, an implementation of Kerberos for HTTP.

      7) Connections utilizing SASL can be configured to use a Quality of Protection (QoP) of confidential, enforcing encryption at the network level.
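
      As a rough illustration, here is a minimal, hedged configuration sketch covering several of the settings above. The property names are standard Hadoop/HDFS settings, but the exact values, principals, and keytab details depend on the cluster:

      <!-- core-site.xml: enable Kerberos authentication for RPC -->
      <property>
        <name>hadoop.security.authentication</name>
        <value>kerberos</value>
      </property>

      <!-- core-site.xml: protect web consoles with HTTP SPNEGO -->
      <property>
        <name>hadoop.http.authentication.type</name>
        <value>kerberos</value>
      </property>

      <!-- hdfs-site.xml: require block access tokens for block reads/writes on DataNodes -->
      <property>
        <name>dfs.block.access.token.enable</name>
        <value>true</value>
      </property>

      Together with hadoop.rpc.protection set to privacy (shown in the next reply), this gives authenticated RPC, authenticated web consoles, token-protected block access, and encrypted RPC traffic.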

    • #5408
      DataFlair Team
      Spectator

      Apache Hadoop supports HDFS encryption.

      The first step in securing an Apache Hadoop cluster is to enable encryption in transit and at rest.
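
      For the at-rest side, HDFS transparent encryption is backed by a Key Management Server (KMS). As a minimal sketch (the KMS address below is only a placeholder), core-site.xml points the cluster at the KMS, and encryption zones are then created per directory with the hdfs crypto command:

      <!-- core-site.xml: key provider for HDFS transparent encryption (placeholder KMS address) -->
      <property>
        <name>hadoop.security.key.provider.path</name>
        <value>kms://http@kms.example.com:9600/kms</value>
      </property>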

      Authentication and Kerberos rely on secure communications, so before you go down the road of enabling them you must enable encryption of data in transit.

      To achieve secure communications in Hadoop, we need to enable the secure versions of the protocols used.

      1) RPC/SASL
      We need to enable SASL to protect RPC data in transit. SASL is enabled by setting the hadoop.rpc.protection property in the core-site.xml file.

      It looks like this (privacy enables both integrity checking and encryption; the other accepted values are authentication and integrity):

      <property>
        <name>hadoop.rpc.protection</name>
        <value>privacy</value>
      </property>

      2) Access to files in HDFS:
      File permissions and Access Control Lists (ACLs), enforced by the NameNode, control which users and groups can read or write files.
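
      ACL support must be enabled on the NameNode before ACL entries take effect; a minimal sketch for hdfs-site.xml follows (individual entries are then managed with the hdfs dfs -setfacl and hdfs dfs -getfacl commands):

      <property>
        <name>dfs.namenode.acls.enabled</name>
        <value>true</value>
      </property>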
