The power of wget

So you need to download huge files on a slow connection. Your browser isn’t doing a very good job. You need partial file support and resume support. You hate fancy GUIs that eat memory. You need to mirror an entire blog or site. You need a powerful download manager which does all this and more. *Phew!*

If any of the above sounds like you, then wget is definitely for you.

First, the syntax:

wget [OPTION]... [URL]...

The URL is the address of the file that you will be downloading. wget has a large number of options, which go in the [OPTION] part of the syntax. So let’s start with the things that wget can do for you:

Tip 1 : Direct download a file

To download a file using wget we need the address of the file (of course). The file will be saved in the current directory, that is, the directory the command was invoked from. The example below downloads a file called sample.avi. So issue the following command:

wget http://www.mysite.com/sample.avi

As soon as the command is issued the download starts and the file gets downloaded into the current directory.

Tip 2 : Resume large file downloads

Now suppose that the file we used in the previous example (sample.avi) is a huge file and we have a slow connection. So we need to resume the download during each session. How do we do this? No, we do not crap our pants 🙂 Instead, we issue the following command:

wget -c --output-document=mysample.avi "http://www.mysite.com/sample.avi"

What the above command does is save the file http://www.mysite.com/sample.avi as mysample.avi. The -c option tells wget to continue a partially downloaded file, so running the same command again picks up where the previous session stopped instead of starting over.
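
On a flaky connection, the resume command can be wrapped in a small retry loop. The following is a minimal sketch, not something wget ships with: the function name fetch_resumable, the WGET override variable and the default of 5 attempts are all assumptions made for illustration. wget also has its own --tries option for retries within a single run.

```shell
#!/bin/sh
# Sketch only: fetch_resumable, WGET and MAX_TRIES are illustrative names.
# WGET can be overridden (e.g. WGET=echo) to preview the command without downloading.
WGET="${WGET:-wget}"
MAX_TRIES="${MAX_TRIES:-5}"

# fetch_resumable URL OUTPUT_FILE
# Re-runs "wget -c" until it succeeds or MAX_TRIES attempts have been made.
fetch_resumable() {
    url="$1"
    out="$2"
    try=1
    while [ "$try" -le "$MAX_TRIES" ]; do
        # -c continues from whatever part of $out is already on disk
        if "$WGET" -c --output-document="$out" "$url"; then
            return 0
        fi
        try=$((try + 1))
        # pause briefly, but only if another attempt will follow
        if [ "$try" -le "$MAX_TRIES" ]; then
            sleep 5
        fi
    done
    return 1
}

# Usage (commented out so the sketch does not start a download on its own):
# fetch_resumable "http://www.mysite.com/sample.avi" mysample.avi
```

Setting WGET=echo before calling the function prints the arguments that would be passed to wget, which is a cheap way to check the command before spending bandwidth on it.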

Tip 3 : Mirror or backup an entire site or blog

A lot of people use free services to create a blog. The problem here is that most of these services do not give you the ability to back up your blog. Or there might be a case where you come across a really good site and want to make an offline copy of it. In either case you can issue the following command:

wget -m http://www.mysite.com

What the above command does is save all the pages present on http://www.mysite.com into a folder named after the site (www.mysite.com). Once the site has been mirrored you can compress the folder created by wget using the tar cvfz mysitebkp.tar.gz www.mysite.com command.
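
The mirror and compress steps can be combined into one small function. This is only a sketch under a few assumptions: backup_site and the WGET override variable are names invented here, and the site is assumed to be reachable over plain http as in the example above.

```shell
#!/bin/sh
# Sketch only: backup_site and WGET are illustrative names, not wget features.
WGET="${WGET:-wget}"

# backup_site HOST ARCHIVE
# Mirrors http://HOST into ./HOST, then packs that folder into ARCHIVE.
backup_site() {
    host="$1"
    archive="$2"
    # wget can exit non-zero when only some pages failed, so just warn here
    "$WGET" -m "http://$host" || echo "wget reported errors; archiving what was fetched" >&2
    # wget -m stores the pages in a folder named after the host
    [ -d "$host" ] || return 1
    tar cvfz "$archive" "$host"
}

# Usage (commented out so nothing is downloaded by accident):
# backup_site www.mysite.com mysitebkp.tar.gz
```

The function gives up only when no folder was created at all. A mirror with a few broken links still ends up in the archive.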

Tip 4 : Save all pages on a site + pages of the links that the site carries

If I wanted to save all the pages of a site along with all the pages that the site links to, I’d issue the following command:

wget -H -r --level=1 -k -p http://www.mysite.com

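Note that the above command is dangerous and can quickly eat up valuable disk space if the site has too many links. The -r indicates a recursive save and --level=1 indicates the depth of the save. The -H lets wget follow links to other hosts, -k converts the links in the saved pages so that they work locally, and -p also fetches the images and stylesheets each page needs.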
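
One way to keep the disk usage in check is wget's --quota option, which aborts a recursive download once the total size passes a limit. The sketch below wraps the command with a quota. The function name save_with_links, the WGET override variable and the 100m default are assumptions picked for illustration, not recommendations.

```shell
#!/bin/sh
# Sketch only: save_with_links, WGET and the 100m default are illustrative choices.
# WGET can be overridden (e.g. WGET=echo) to preview the command without downloading.
WGET="${WGET:-wget}"

# save_with_links URL [QUOTA]
# Saves the site plus the pages it links to, one level deep, under a size limit.
save_with_links() {
    url="$1"
    quota="${2:-100m}"   # k and m suffixes mean kilobytes and megabytes
    # --quota makes wget abort the recursive retrieval once the total is exceeded
    "$WGET" -H -r --level=1 -k -p --quota="$quota" "$url"
}

# Usage (commented out so nothing is downloaded by accident):
# save_with_links http://www.mysite.com 20m
```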

To see the rest of the options available to you, try the following command:

wget --help

Summary

Ok, I’m interested, you say? How do I download and install wget, you ask? Well, here’s how:

Most Linux distros have wget by default. If yours doesn’t, you could do an apt-get install wget on the terminal (on Debian-based distros such as Ubuntu). Windows users can get wget from here. Mac users can get it from here.
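
To find out whether wget is already there before installing anything, a quick check along these lines should do. It is a small sketch. The helper name have_command is made up for this example.

```shell
#!/bin/sh
# have_command NAME: succeeds if NAME can be found by the shell
# (the helper name is illustrative)
have_command() {
    command -v "$1" >/dev/null 2>&1
}

if have_command wget; then
    # print just the first line of the version banner
    wget --version | head -n 1
else
    echo "wget not found; on Debian-based distros try: apt-get install wget"
fi
```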

The online manual for wget is available here.

About synapse
Programming, motorcycles and photography. Want to do more, but only have time for so much!

One Response to The power of wget

  1. dkd903 says:

    The mirroring command you have given just mirrors the site with links relative to the server’s location. To make the links work relative to your local copy, you should use the command with the k switch.

    $ wget -mk http://www.mysite.com

    Also, you would want to take pity on the server which is serving the pages for you by adding another switch, w. This sends the mirror requests at an interval, reducing the load on the host’s server 😉

    $ wget -mk -w 20 http://www.mysite.com

    for a 20 sec interval between requests. Cheers!! Nice blog
