Mirror your Blog using ‘wget’

Sometimes you want to have your “How to…” section with you as an offline resource, so that you can view it without internet access.
As wget has been my preferred tool since the late 1990s and early 2000s, I wanted to write the options down and save my brain once again.

The simple way is:

wget --mirror http://your-site.whatever

The more powerful & “I am no longer a greenhorn” way is to do it like this:

wget --mirror --convert-links --adjust-extension --page-requisites --no-parent http://your-site.whatever

Since I always need an explanation of the various options, here it is:

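  • --mirror – turns on the options suitable for mirroring: the download will be recursive with unlimited depth, and time-stamping is enabled (equivalent to -r -N -l inf --no-remove-listing)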
  • --convert-links – The links to files that have been downloaded by wget will be changed to refer to the file they point to as a relative link
  • --adjust-extension – If a file of type application/xhtml+xml or text/html is downloaded and the URL does not end with the regexp \.[Hh][Tt][Mm][Ll]?, this option will cause the suffix .html to be appended to the local filename. Likewise, files of type text/css get the suffix .css
  • --page-requisites – This option causes wget to download all the files that are necessary to properly display a given HTML page. This includes such things as inlined images, sounds, and referenced stylesheets
  • --no-parent – wget never ascends to the parent directory during the recursive download, which guarantees that only the files below a certain hierarchy will be downloaded.
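
To avoid retyping all of this, the command can be wrapped in a small shell function. This is only a sketch: the names mirror_site and WGET and the default directory name "mirror" are my own assumptions and not part of wget, while --directory-prefix is wget's option for choosing the target directory.

```shell
#!/bin/sh
# Sketch: reusable wrapper around the mirror command shown above.
# mirror_site and WGET are hypothetical names, not part of wget.
mirror_site() {
    url="${1:-}"
    dest="${2:-mirror}"   # target directory, defaults to ./mirror

    if [ -z "$url" ]; then
        echo "usage: mirror_site URL [DIRECTORY]" >&2
        return 1
    fi

    # WGET can be overridden (e.g. WGET=echo) to preview the arguments
    # without downloading anything.
    "${WGET:-wget}" \
        --mirror \
        --convert-links \
        --adjust-extension \
        --page-requisites \
        --no-parent \
        --directory-prefix="$dest" \
        "$url"
}
```

Calling mirror_site http://your-site.whatever offline should put the copy under ./offline. Setting WGET=echo beforehand only prints the arguments, which is handy for checking the call before hitting the server.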

