List of frequently used Hadoop commands
- Print the Hadoop version
hadoop version
- List the contents of the root directory in HDFS
hadoop fs -ls /
hadoop fs -ls hdfs:/
- Create a new directory named “hadoop” below the /user/training directory in HDFS.
hadoop fs -mkdir /user/training/hadoop
- Delete a file ‘customers’ from the “retail” directory.
hadoop fs -rm hadoop/retail/customers
- Delete all files from the “retail” directory using a wildcard.
hadoop fs -rm hadoop/retail/*
- Remove the entire retail directory and all of its contents in HDFS.
hadoop fs -rm -r hadoop/retail
- To empty the trash
hadoop fs -expunge
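- Deleted files only go to the trash when trash is enabled on the cluster (fs.trash.interval greater than 0). A minimal sketch of deleting a path without going through the trash, using the -skipTrash option of -rm. The HADOOP variable and the function name rm_skip_trash are assumptions added here so the command line can be previewed with HADOOP=echo on a machine without Hadoop.

```shell
# HADOOP is an assumed override for the hadoop launcher (defaults to "hadoop").
HADOOP="${HADOOP:-hadoop}"

# Delete a path recursively and immediately, bypassing the trash.
# rm_skip_trash is an illustrative helper name, not a Hadoop command.
rm_skip_trash() {
  $HADOOP fs -rm -r -skipTrash "$1"
}

# Usage: rm_skip_trash hadoop/retail
```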
- Add the purchases.txt file from the local directory “/home/training/” to the hadoop directory you created in HDFS
hadoop fs -copyFromLocal /home/training/purchases.txt /user/training/hadoop/
- Copy the purchases.txt file from the “hadoop” directory in HDFS to the “data” directory on your local filesystem
hadoop fs -copyToLocal hadoop/purchases.txt /home/training/data
- The ‘-get’ command can be used as an alternative to the ‘-copyToLocal’ command
hadoop fs -get hadoop/sample.txt /home/training/
- Add a sample text file from the local directory named “data” to the new directory
hadoop fs -put /home/training/data/sample.txt /user/training/hadoop
- Add the entire local directory called “retail” to the hadoop directory under /user/training in HDFS.
hadoop fs -put /home/training/data/retail /user/training/hadoop
- Use ‘-cp’ to copy files between directories in HDFS
hadoop fs -cp /user/training/*.txt /user/training/hadoop
- Move a directory from one location to another within HDFS
hadoop fs -mv hadoop apache_hadoop
- View the contents of the text file purchases.txt in your hadoop directory.
hadoop fs -cat hadoop/purchases.txt
- Display the last kilobyte of the file “purchases.txt” to stdout.
hadoop fs -tail hadoop/purchases.txt
-
- The base permission for new files in HDFS is 666, reduced by the configured umask (022 by default, which gives 644). Use the ‘-chmod’ command to change the permissions of a file
hadoop fs -ls hadoop/purchases.txt
sudo -u hdfs hadoop fs -chmod 600 hadoop/purchases.txt
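- The mode a new file ends up with is the base mode with the umask bits cleared. A small sketch of that arithmetic in plain shell, assuming the common default umask of 022. The function name effective_mode is only illustrative and is not part of Hadoop.

```shell
# Compute the mode a new entry ends up with: base mode AND NOT umask.
# effective_mode is an illustrative helper, not a Hadoop command.
effective_mode() {
  base="$1"   # octal base mode: 666 for files, 777 for directories
  mask="$2"   # octal umask, assumed here to be 022
  printf '%03o\n' $(( 0$base & ~0$mask ))
}

effective_mode 666 022   # prints 644 (files)
effective_mode 777 022   # prints 755 (directories)
```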
- The default owner and group are both training
# Use ‘-chown’ to change the owner and the group simultaneously
hadoop fs -ls hadoop/purchases.txt
sudo -u hdfs hadoop fs -chown root:root hadoop/purchases.txt
- Default name of group is training
# Use ‘-chgrp’ command to change group name
hadoop fs -ls hadoop/purchases.txt
sudo -u hdfs hadoop fs -chgrp training hadoop/purchases.txt
-
- The default replication factor of a file is 3.
# Use ‘-setrep’ command to change replication factor of a file
hadoop fs -setrep -w 2 apache_hadoop/sample.txt
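- To confirm the change, the replication factor recorded for a file can be read back with -stat and its %r format specifier. A minimal sketch. The HADOOP variable and the function name show_replication are assumptions added here so the command line can be previewed with HADOOP=echo on a machine without Hadoop.

```shell
# HADOOP is an assumed override for the hadoop launcher (defaults to "hadoop").
HADOOP="${HADOOP:-hadoop}"

# Print the replication factor of a file (%r in the -stat format string).
# show_replication is an illustrative helper name.
show_replication() {
  $HADOOP fs -stat %r "$1"
}

# Usage: show_replication apache_hadoop/sample.txt
```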
-
- Copy a directory from one cluster to another
# Use the distcp command to copy,
# the -overwrite option to overwrite existing files,
# the -update option to synchronize both directories
hadoop distcp hdfs://namenodeA/apache_hadoop hdfs://namenodeB/hadoop
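- The two options named in the comments can be sketched as follows. Note that distcp is invoked as "hadoop distcp", not through "hadoop fs". The HADOOP variable and the function names distcp_update and distcp_overwrite are assumptions added here so the command lines can be previewed with HADOOP=echo on a machine without Hadoop.

```shell
# HADOOP is an assumed override for the hadoop launcher (defaults to "hadoop").
HADOOP="${HADOOP:-hadoop}"

# Copy only files that are missing or differ at the destination.
distcp_update() {
  $HADOOP distcp -update "$1" "$2"
}

# Replace files that already exist at the destination.
distcp_overwrite() {
  $HADOOP distcp -overwrite "$1" "$2"
}

# Usage:
#   distcp_update hdfs://namenodeA/apache_hadoop hdfs://namenodeB/hadoop
```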
-
- Make the NameNode leave safe mode
sudo -u hdfs hdfs dfsadmin -safemode leave
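- Before forcing the NameNode out of safe mode it is usually worth checking its state, or simply waiting for it to leave on its own. A minimal sketch using the get and wait arguments of -safemode. The HDFS variable and the function names are assumptions added here so the command lines can be previewed with HDFS=echo on a machine without Hadoop.

```shell
# HDFS is an assumed override for the hdfs launcher (defaults to "hdfs").
HDFS="${HDFS:-hdfs}"

# Report whether the NameNode is currently in safe mode.
safemode_status() {
  $HDFS dfsadmin -safemode get
}

# Block until the NameNode has left safe mode.
safemode_wait() {
  $HDFS dfsadmin -safemode wait
}

# Usage: safemode_status
```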
-
- See how much space this directory occupies in HDFS.
hadoop fs -du -s -h hadoop/retail
- Report the amount of space used and available on the currently mounted filesystem
hadoop fs -df hdfs:/
-
- Count the number of directories, files and bytes under the paths that match the specified file pattern
hadoop fs -count hdfs:/
-
- Run a DFS filesystem checking utility
hadoop fsck /
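- For more detail than the summary, fsck can also list the files it checks, their blocks, and the DataNodes that hold each block. A minimal sketch using the -files, -blocks and -locations options. The HDFS variable and the function name fsck_detail are assumptions added here so the command line can be previewed with HDFS=echo on a machine without Hadoop.

```shell
# HDFS is an assumed override for the hdfs launcher (defaults to "hdfs").
HDFS="${HDFS:-hdfs}"

# Check a path and print each file, its blocks, and the block locations.
# fsck_detail is an illustrative helper name.
fsck_detail() {
  $HDFS fsck "$1" -files -blocks -locations
}

# Usage: fsck_detail /user/training/hadoop
```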
-
- Format the namenode:
hadoop namenode -format
- Starting Secondary namenode:
hadoop secondarynamenode
- Run namenode:
hadoop namenode
- Run data node:
hadoop datanode
- Cluster Balancing:
hadoop balancer
-
- Last but not least, always ask for help!
hadoop fs -help
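- Help can also be requested for a single command instead of the whole shell. A minimal sketch: -help takes a command name and prints its full description, and -usage prints only the one-line synopsis. The HADOOP variable and the function names are assumptions added here so the command lines can be previewed with HADOOP=echo on a machine without Hadoop.

```shell
# HADOOP is an assumed override for the hadoop launcher (defaults to "hadoop").
HADOOP="${HADOOP:-hadoop}"

# Full description of one file system shell command, e.g. ls.
fs_help() {
  $HADOOP fs -help "$1"
}

# One-line synopsis of one file system shell command.
fs_usage() {
  $HADOOP fs -usage "$1"
}

# Usage: fs_help ls
```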
-
- List all the hadoop file system shell commands
hadoop fs