Friday, April 22, 2016

List of frequently used Hadoop commands

  1. Print the Hadoop version 
    hadoop version

  2. List the contents of the root directory in HDFS
    hadoop fs -ls /
    hadoop fs -ls hdfs:/
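    # The listing can also be made recursive with -R (the path here is just an example):
    hadoop fs -ls -R /user/training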

  3. Create a new directory named “hadoop” below the /user/training directory in HDFS.
    hadoop fs -mkdir /user/training/hadoop
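    # If the parent directories do not exist yet, -p creates them in one step (path shown is only an example):
    hadoop fs -mkdir -p /user/training/hadoop/data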
  4. Delete a file ‘customers’ from the “retail” directory.
    hadoop fs -rm hadoop/retail/customers
  5. Delete all files from the “retail” directory using a wildcard.
    hadoop fs -rm hadoop/retail/*
  6. Remove the entire retail directory and all of its contents in HDFS.
    hadoop fs -rm -r hadoop/retail
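    # To bypass the trash and delete immediately, the -skipTrash option can be added (use with care):
    hadoop fs -rm -r -skipTrash hadoop/retail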
  7. To empty the trash
    hadoop fs -expunge

  8. Add the purchases.txt file from the local directory named “/home/training/” to the hadoop directory you created in HDFS
    hadoop fs -copyFromLocal /home/training/purchases.txt hadoop/
  9. Copy the purchases.txt file from the “hadoop” directory in HDFS to the local directory “data”
    hadoop fs -copyToLocal hadoop/purchases.txt /home/training/data
  10. The ‘-get’ command can be used as an alternative to the ‘-copyToLocal’ command
    hadoop fs -get hadoop/sample.txt /home/training/
  11. Add a sample text file from the local directory named “data” to the new directory
    hadoop fs -put /home/training/data/sample.txt /user/training/hadoop
  12. Add the entire local directory called “retail” to the hadoop directory under /user/training in HDFS.
    hadoop fs -put /home/training/data/retail /user/training/hadoop
  13. The ‘-cp’ command copies files between directories within HDFS
    hadoop fs -cp /user/training/*.txt /user/training/hadoop
  14. Move (or rename) a directory from one location to another within HDFS
    hadoop fs -mv hadoop apache_hadoop

  15. View the contents of the text file purchases.txt in your hadoop directory.
    hadoop fs -cat hadoop/purchases.txt
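    # For a quick preview of a large file, the output of -cat can be piped into a local command such as head:
    hadoop fs -cat hadoop/purchases.txt | head -n 20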
  16. Display the last kilobyte of the file “purchases.txt” to stdout.
    hadoop fs -tail hadoop/purchases.txt
  17. Default file permissions are 666 in HDFS (modified by the configured umask). Use the ‘-chmod’ command to change the permissions of a file
    hadoop fs -ls hadoop/purchases.txt
    sudo -u hdfs hadoop fs -chmod 600 /user/training/hadoop/purchases.txt
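    # -chmod also accepts -R to apply permissions recursively to a whole directory tree, for example:
    sudo -u hdfs hadoop fs -chmod -R 755 /user/training/hadoop/retail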
  18. The default owner and group names are training, training
    # Use ‘-chown’ to change owner name and group name simultaneously
    hadoop fs -ls hadoop/purchases.txt
    sudo -u hdfs hadoop fs -chown root:root /user/training/hadoop/purchases.txt
  19. The default group name is training
    # Use ‘-chgrp’ command to change group name
    hadoop fs -ls hadoop/purchases.txt
    sudo -u hdfs hadoop fs -chgrp training /user/training/hadoop/purchases.txt
  20. The default replication factor for a file is 3.
    # Use ‘-setrep’ command to change replication factor of a file
    hadoop fs -setrep -w 2 apache_hadoop/sample.txt
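    # The current replication factor of a file can be checked with -stat and the %r format specifier:
    hadoop fs -stat %r apache_hadoop/sample.txt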
  21. Copy a directory from one cluster to another
    # Use the distcp command to copy,
    # the -overwrite option to overwrite existing files,
    # and the -update option to synchronize both directories
    hadoop distcp hdfs://namenodeA/apache_hadoop hdfs://namenodeB/hadoop
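    # For example, a later run that only copies files that have changed could look like this:
    hadoop distcp -update hdfs://namenodeA/apache_hadoop hdfs://namenodeB/hadoop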
  22. Make the namenode leave safe mode
    sudo -u hdfs hdfs dfsadmin -safemode leave
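    # To check whether the namenode is currently in safe mode:
    sudo -u hdfs hdfs dfsadmin -safemode get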
  23. See how much space the “retail” directory occupies in HDFS.
    hadoop fs -du -s -h hadoop/retail
  24. Report the amount of space used and available on the currently mounted filesystem
    hadoop fs -df hdfs:/
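    # Adding -h prints the sizes in a human-readable format:
    hadoop fs -df -h hdfs:/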
  25. Count the number of directories, files, and bytes under the paths that match the specified file pattern
    hadoop fs -count hdfs:/
  26. Run the DFS filesystem checking utility
    hadoop fsck /
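    # fsck can also report block-level details for a specific path, for example:
    hadoop fsck /user/training -files -blocks -locations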
  27. Format the namenode (only when setting up a new cluster; this erases existing HDFS metadata):
    hadoop namenode -format
  28. Start the secondary namenode:
    hadoop secondarynamenode
  29. Run the namenode:
    hadoop namenode
  30. Run a datanode:
    hadoop datanode
  31. Run the cluster balancer:
    hadoop balancer
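    # Note: on a typical installation the HDFS daemons above are started and stopped with the bundled
    # scripts (found under Hadoop's sbin/ directory in recent releases) rather than launched by hand:
    start-dfs.sh
    stop-dfs.sh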
  32. Last but not least, always ask for help!
    hadoop fs -help 
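    # Help for a single command can also be requested by name, for example:
    hadoop fs -help cp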
  33. List all the Hadoop file system shell commands
    hadoop fs