This topic contains 2 replies, has 1 voice, and was last updated by dfbdteam3 1 year, 2 months ago.

Viewing 3 posts - 1 through 3 (of 3 total)
  • Author
    Posts
  • #5481

    dfbdteam3
    Moderator

    What is the procedure to create users in HDFS, and how do we allocate quotas to them?

    #5482

    dfbdteam3
    Moderator

    HDFS (Hadoop Distributed File System) is a highly reliable storage system. It is the filesystem of Hadoop, designed for storing very large files on a cluster of commodity hardware.

    Steps to create a new user in HDFS are as follows:

    Step 1

    For Ubuntu:

    sudo adduser --ingroup <groupname> <username>

    For RedHat variants:

    useradd -g <groupname> <username>
    passwd <username>

    Then enter the user details and password when prompted.

    Step 2

    We need to change the permissions of the directory in HDFS where Hadoop stores its temporary data.

    Open the core-site.xml file and find the value of hadoop.tmp.dir.

    In my core-site.xml, it is /app/hadoop/tmp. In the following steps, I will use /app/hadoop/tmp as the directory for storing Hadoop data (i.e., the value of hadoop.tmp.dir).
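    If you are unsure of the value, it can be pulled out of core-site.xml from the shell. A minimal sketch, assuming a simple grep/sed approach; the sample file written below is only for illustration, so point CONF at your real core-site.xml (often under /etc/hadoop/conf or $HADOOP_HOME/etc/hadoop):

    ```shell
    # Illustrative sample file; replace with the path to your real core-site.xml.
    CONF=$(mktemp)
    cat > "$CONF" <<'EOF'
    <configuration>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/app/hadoop/tmp</value>
      </property>
    </configuration>
    EOF

    # Take the <value> line that follows the hadoop.tmp.dir <name> line.
    TMP_DIR=$(grep -A1 '<name>hadoop.tmp.dir</name>' "$CONF" \
              | sed -n 's:.*<value>\(.*\)</value>.*:\1:p')
    echo "$TMP_DIR"    # prints /app/hadoop/tmp for the sample above
    rm -f "$CONF"
    ```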

    Then, from the superuser account, run:
    hadoop fs -chmod -R 1777 /app/hadoop/tmp/mapred/staging
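    Mode 1777 is world-writable permissions plus the sticky bit (the leading 1): every user can create entries in the staging directory, but only delete their own. A quick local illustration of the same permission bits, which HDFS applies the same way:

    ```shell
    # Demonstrate mode 1777 on a throwaway local directory.
    DIR=$(mktemp -d)
    chmod 1777 "$DIR"
    stat -c '%a' "$DIR"    # prints 1777 (sticky bit + rwx for all)
    rmdir "$DIR"
    ```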
    Step 3

    The next step is to give write permission to our user group on hadoop.tmp.dir (here /app/hadoop/tmp; check core-site.xml for the path). This should be done only on the machine (node) where the new user is added.

    chmod 777 /app/hadoop/tmp

    Step 4

    The next step is to create a directory structure in HDFS for the new user.

    For that, from the superuser account, create the directory structure.

    E.g.: hadoop fs -mkdir /user/username/

    Step 5

    With this alone, we will not be able to run MapReduce programs, because the newly created directory structure is owned by the superuser. So change the ownership of the newly created directory in HDFS to the new user:
    hadoop fs -chown -R username:groupname
    E.g.: hadoop fs -chown -R username:groupname /user/username/

    Step 6

    Log in as the new user and run Hadoop jobs.

    su - username

    For more details, please follow: Installation of cloudera Hadoop in CDH5 ubuntu

    #5484

    dfbdteam3
    Moderator

    In Apache Hadoop, one can run tasks and store their data in HDFS.
    If several users run tasks under the same user account, it becomes difficult to trace the jobs and track the work done by each user.
    Another issue is data security: if everyone shares a single user account, anyone can read or modify anyone else’s data.
    To overcome this, we create multiple user accounts, so that the directories/files of a particular user cannot be modified or used by other users.
    In short, data stays safe and is accessible only to the assigned user and the superuser.

    Steps for creating a user:

    1) Create a new OS user:

    For Ubuntu,
    sudo adduser --ingroup <groupname> <username>
    2) Then Change the permission of a directory in HDFS where hadoop stores its temporary data.

    Open the core-site.xml file, find the value of hadoop.tmp.dir.

    3) Then, from the superuser account, run:

    hadoop fs -chmod -R 777 <hadoop.tmp.dir>

    4) The next step is to give write permission to our user group on hadoop.tmp.dir. This should be done only on the machine (node) where the new user is added.
    chmod 777 <hadoop.tmp.dir>
    5) The next step is to create a directory structure in HDFS for the new user, which is the user’s home directory.
    For that, from the superuser account, create the directory structure:
    hadoop fs -mkdir /user/<username>/
    6) With this alone, we will not be able to run MapReduce programs, because the newly created directory structure is owned by the superuser.
    So change the ownership of the newly created directory in HDFS to the new user:

    hadoop fs -chown -R <username>:<groupname> /user/<username>/

    7) Log in as the new user and run Hadoop jobs:

    su - username
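    As for the quota part of the question: HDFS supports two per-directory quotas, a name quota (maximum number of files and directories under the tree) and a space quota (maximum raw bytes, with replication counted against it). They are set by the HDFS superuser with hdfs dfsadmin. A hedged sketch; the directory path and limits below are illustrative, and the commands must be run on a node of a live cluster:

    ```shell
    # Name quota: at most 100000 files + directories under the user's home.
    hdfs dfsadmin -setQuota 100000 /user/username

    # Space quota: at most 10 GB of raw disk space (replicas count toward it).
    hdfs dfsadmin -setSpaceQuota 10g /user/username

    # Inspect quotas and current usage; remove with -clrQuota / -clrSpaceQuota.
    hadoop fs -count -q /user/username
    ```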
