This article addresses the problem of locating large files and folders. When a server is slowly running out of storage space, it is important to reclaim some of that space by searching for unused or unimportant files and folders scattered across the server.
The method used here is to run a du/sort command combination from the top-level directory and then re-run it in whichever lower-level directory appears to contain the largest folders or files.
The command used for the sorting process is shown below:
du -h / | sort -nr | grep [0-9]G | head -20

Description:
du : the command used to estimate file space usage.
-h : an option for 'du' that prints sizes in a human-readable format; the 'h' stands for human-readable.
/ : the location to be scanned, in this context the root partition.
| : the pipe sign; the output generated before the pipe is passed as input to the command after it.
sort : the command used to sort lines of text.
-n : an option for 'sort' that compares lines according to their string numerical value.
-r : an option for 'sort' that reverses the result of the comparison.
grep : the command used to filter and print only the lines matching the specified pattern.
[0-9]G : the pattern given to grep; it matches any line containing a digit followed by the character 'G'.
head : the command used to output only the first part of its input.
-20 : an option for 'head' that prints only the first 20 lines.
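As a side note, GNU sort also provides a -h (human-numeric) option that understands the K/M/G suffixes produced by du -h, so a variation like the one below can order mixed unit sizes correctly without the grep filter. This is only a sketch that assumes GNU coreutils is installed; 2>/dev/null simply hides the "cannot access" messages for files under /proc:

# sort by human-readable size (GNU sort -h), hiding /proc access noise
du -h / 2>/dev/null | sort -rh | head -20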
Below is the first execution, on the root partition (/), using the above command pattern:
[root@hostname ~]# du -h / | sort -nr | grep [0-9]G | head -20
du: cannot access ‘/proc/60569/task/60569/fd/4’: No such file or directory
du: cannot access ‘/proc/60569/task/60569/fdinfo/4’: No such file or directory
du: cannot access ‘/proc/60569/fd/4’: No such file or directory
du: cannot access ‘/proc/60569/fdinfo/4’: No such file or directory
76G     /
26G     /home
25G     /opt/alfresco-community
25G     /opt
23G     /var/lib/mysql
23G     /var/lib
23G     /var
23G     /opt/alfresco-community/tomcat
23G     /home/user
22G     /opt/alfresco-community/tomcat/logs
13G     /var/lib/mysql/webapps1
10G     /var/lib/mysql/webapps2
1.6G    /home/dev
1.4G    /home/guest
1.3G    /usr
1.3G    /home/apps/bin
0       /run/udev/links/\x2fdisk\x2fby-id\x2fdm-uuid-LVM-cgvkcZiNjYDj9VbmK3a8UFeZxpJvlSv0VWzCwli6RvphmtK67GRQ9q6KLRTuqQjo
[root@hostname ~]#
As shown in the output above, the root partition (/) as a whole occupies 76G. Reading further down the list, there is one location that consistently shows a large size even as the path goes deeper into the directory tree, as shown below:
25G     /opt/alfresco-community
...
23G     /opt/alfresco-community/tomcat
...
22G     /opt/alfresco-community/tomcat/logs
...
Within /opt/alfresco-community, more specifically in /opt/alfresco-community/tomcat, and in even more detail in /opt/alfresco-community/tomcat/logs, the folder size totals 22G.
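When drilling down like this, it can also help to summarize only one directory level at a time so the listing is not flooded with deeply nested paths. Below is a minimal sketch, assuming GNU du (which supports --max-depth) and GNU sort -h, applied to the suspect directory found above:

# show the size of each immediate subdirectory of the suspect path, largest first
du -h --max-depth=1 /opt/alfresco-community | sort -rh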
To search further, run the same command again, this time inside /opt/alfresco-community/tomcat/logs:
[root@hostname logs]# du -h * | sort -nr | grep [0-9]G | head -20
5.1G    localhost_access_log2016-07-19.txt
3.3G    localhost_access_log2016-07-20.txt
1.9G    localhost_access_log2016-08-02.txt
1.7G    localhost_access_log2016-08-01.txt
1.6G    localhost_access_log2016-07-31.txt
1.5G    localhost_access_log2016-07-30.txt
1.3G    localhost_access_log2016-07-29.txt
1.2G    localhost_access_log2016-07-28.txt
[root@hostname logs]#
Based on the output above, several files are large, some even exceeding 1G. Beyond that, to also view the files smaller than 1G, run the command again with the grep pattern changed to [0-9]M, where M matches sizes reported in megabytes:
du -h * | sort -nr | grep [0-9]M | head -20
[root@hostname logs]# du -h * | sort -nr | grep [0-9]M | head -20
1018M   localhost_access_log2016-07-27.txt
875M    localhost_access_log2016-07-26.txt
788M    localhost_access_log2016-08-03.txt
734M    localhost_access_log2016-07-25.txt
593M    localhost_access_log2016-07-24.txt
451M    localhost_access_log2016-07-23.txt
310M    localhost_access_log2016-07-22.txt
169M    localhost_access_log2016-07-21.txt
62M     catalina.out
[root@hostname logs]#
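If GNU sort's -h option is available, the gigabyte- and megabyte-sized files can also be listed in one correctly ordered pass, since -h understands the unit suffixes while -n only compares the leading number; a sketch under that assumption:

# list G- and M-sized entries together, largest first (requires GNU sort -h)
du -h * | sort -rh | grep -E '[0-9][GM]' | head -20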
As expected, even though their sizes are under 1G, there are several files of several hundred MB each that are worth erasing, since they are only log files.
Remove the log files by executing the following command:
[root@hostname logs]# rm -rf localhost_access_log2016-0*
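As a precaution, it can be worth checking exactly which files the glob matches before running the removal, for example:

# preview the files matched by the glob before deleting them
ls -lh localhost_access_log2016-0*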
The explanation above described how to reclaim space by iteratively checking and searching for large files and folders using this command combination. Hopefully it is useful as a future reference.