Download Recursively using wget via Command Line

This article explains how to download files recursively from the command line using a popular tool called ‘wget’. The tool is normally available as a package or utility in most Unix and Linux operating system distributions.

Before using it, check whether the tool is already installed on the host, workstation or server. Below is the command used to check whether a package is already installed in Ubuntu :

apt --installed list | grep package_name

Description : 

apt : The command used to manage packages in Ubuntu or any Debian variant of Linux operating system distribution
--installed list : The 'list' subcommand displays packages, and the '--installed' option limits the list to installed packages
| : The pipe sign, which passes the output of the command on its left as input to the command on its right
grep package_name : The command used to filter the output, keeping only the lines that contain the text package_name

Below is the execution of the command stated above :

root@hostname:/# apt --installed list | grep wget

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

wget/xenial-updates,now 1.17.1-1ubuntu1.1 amd64 [installed]
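Because grep matches any line containing the text, a plain `grep wget` would also print other packages with ‘wget’ in their name. The sketch below shows the difference on sample text instead of a live package list. The first sample line is copied from the output above; the second package name is hypothetical and exists only for illustration.

```shell
#!/bin/sh
# Sample listing: the first line comes from the article's output,
# the second line is a hypothetical package used only to show the difference.
sample_listing() {
    printf '%s\n' \
        'wget/xenial-updates,now 1.17.1-1ubuntu1.1 amd64 [installed]' \
        'example-wget-helper/xenial,now 0.0-0 all [installed]'
}

# Plain substring match: prints both lines.
sample_listing | grep wget

# Anchored match: prints only the package whose name is exactly "wget".
sample_listing | grep '^wget/'
```

The same anchored pattern can be placed after the pipe in the real command when only the exact package is of interest.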

The output above shows that the ‘wget’ utility is already installed. On the other hand, below is the command which can be used to check whether ‘wget’ is already installed in CentOS :

[user@hostname ~]$ yum list installed | grep wget
Repodata is over 2 weeks old. Install yum-cron? Or run: yum makecache fast
wget.x86_64               1.14-10.el7_0.1                 @base
[user@hostname ~]$
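Both checks above depend on the package manager of the distribution. As an alternative, the minimal sketch below only asks the shell whether a command can be found on the PATH, so it should work the same way on Ubuntu and CentOS. The function names `is_installed` and `check_tool` are made up for this example.

```shell
#!/bin/sh
# Succeed when the given command can be found by the shell.
is_installed() {
    command -v "$1" >/dev/null 2>&1
}

# Print a short status line for the given command name.
check_tool() {
    if is_installed "$1"; then
        echo "$1: installed"
    else
        echo "$1: not installed"
    fi
}

check_tool wget
```

This only tells whether the command is reachable, not which package or version provides it, so the package manager queries above are still useful for that purpose.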

With ‘wget’ already installed on the host, workstation or server, try to execute it in the command line as shown below :

root@soulreaper:/# wget
wget: missing URL
Usage: wget [OPTION]... [URL]...

Try `wget --help' for more options.

To work properly, ‘wget’ must be given a valid URL to download from. But ‘wget’ is not limited to downloading a single file; it can also be used to download files recursively.

In this case, the wget utility is demonstrated by downloading the packages and utilities of a Linux operating system repository located at a certain URL. Suppose a local repository needs to be built temporarily to save internet bandwidth, so that clients running Linux can contact that local repository server directly to install or update certain packages. The first step is to download all the files and packages that exist in the repository.

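wget -r -nH --cut-dirs=2 --no-parent --reject="index.html*" URL

Description : 

wget : The command utility for downloading files 
-r : The option for recursive download, so that wget follows links into the folders it finds 
-nH : The option which prevents wget from creating a local folder named after the host address
--cut-dirs=2 : The option which removes the first 2 levels of directory of the remote path from the local path
--no-parent : The option which prevents wget from going up to the parent directory of the given URL
--reject="index.html*" : The option which discards files whose names match index.html*, such as generated directory listing pages
URL : The URL used for the main base download; replace it with the address of the repository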
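The reject pattern is a shell-style wildcard, so `index.html*` matches more than the plain index.html file. The following sketch imitates that matching with a `case` statement, as an illustration only and not wget's own code. The file names in the loop are hypothetical examples of what a repository directory listing might contain.

```shell
#!/bin/sh
# Succeed when a file name matches the reject pattern index.html*
is_rejected() {
    case $1 in
        index.html*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical file names, including a sorted-listing link of a web server.
for name in index.html 'index.html?C=N;O=D' repodata.xml pkg.rpm; do
    if is_rejected "$name"; then
        echo "skip $name"
    else
        echo "keep $name"
    fi
done
```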

The command above downloads all files and folders found under the given URL recursively, without creating a local folder named after the host or the two leading directory levels ‘/el/7’.
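To see how -nH and --cut-dirs shape the local folder layout, the sketch below approximates the local path for one remote file. It is a simplified model written for this article, not wget's actual logic, and the URL `http://repo.example.com/el/7/x86_64/pkg.rpm` is a hypothetical stand-in for a real repository address.

```shell
#!/bin/sh
# Approximate the local path wget would use for a downloaded file.
# usage: local_path URL CUT_DIRS NO_HOST(yes|no)
local_path() {
    url=$1; cut=$2; no_host=$3
    rest=${url#*://}    # strip the scheme, e.g. "http://"
    host=${rest%%/*}    # text before the first slash
    path=${rest#*/}     # text after the first slash
    # Drop leading directory levels, but never the file name itself.
    while [ "$cut" -gt 0 ]; do
        case $path in
            */*) path=${path#*/} ;;
            *) break ;;
        esac
        cut=$((cut - 1))
    done
    if [ "$no_host" = yes ]; then
        printf '%s\n' "$path"
    else
        printf '%s/%s\n' "$host" "$path"
    fi
}

url='http://repo.example.com/el/7/x86_64/pkg.rpm'   # hypothetical URL
local_path "$url" 0 no     # no options: repo.example.com/el/7/x86_64/pkg.rpm
local_path "$url" 0 yes    # -nH only:   el/7/x86_64/pkg.rpm
local_path "$url" 2 yes    # -nH --cut-dirs=2: x86_64/pkg.rpm
```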
