mirror single page with httrack


I am trying to use httrack (http://www.httrack.com/) to download a single page, not the entire site. So, for example, when using httrack to download www.google.com, it should only download the HTML found under www.google.com along with all stylesheets, images and JavaScript, and not follow any links to images.google.com, labs.google.com, www.google.com/subdir/ etc.

I tried the -w option but that did not make any difference.

What would be the right command?

EDIT

I tried using httrack "http://www.google.com/" -O "./www.google.com" "http://www.google.com/" -v -s0 --depth=1, but then it won't copy any images.

What I basically want is to download just the index file of that domain along with all its assets, but not the content of any external or internal links.


There are 5 answers

Kevin Reid (best answer)

Could you use wget instead of httrack? wget -p will download a single page and all of its “prerequisites” (images, stylesheets).
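As a sketch, a fuller invocation might look like this (the -k and -E flags are my additions, not part of the answer):

wget -p -k -E "http://www.google.com/"

-p (--page-requisites) fetches the images, stylesheets and scripts the page needs, -k (--convert-links) rewrites links so the local copy renders properly, and -E (--adjust-extension) saves the page with an .html extension.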

Gregory Pakosz

The purpose of HTTrack is to follow links. Try setting --ext-depth=0.
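Applied to the command from the question, that might look like this (the combination is a sketch, not taken from the answer):

httrack "http://www.google.com/" -O "./www.google.com" --ext-depth=0 -v

Note that --ext-depth only governs links to other sites; links within www.google.com are still controlled by --depth.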

torger

Looking at the example:

httrack "http://www.all.net/" -O "/tmp/www.all.net" "+*.all.net/*" -v

The last part is a filter (an HTTrack scan rule, which uses wildcards rather than full regular expressions). Just write a pattern that matches everything you want to keep.

httrack "http://www.google.com.au/" -O "/tmp/www.google.com.au" "+*.google.com.au/*" -v ---depth=2 --ext-depth=2

I had to localise the URL, otherwise I got a redirect page. You should localise to whichever Google domain you get redirected to.

Sourav Ghosh
httrack "http://www.google.com/" -O "./www.google.com" "http://www.google.com/" -v -s0  --depth=1 -n

The -n option (or --near) will download images on a web page no matter where they are located.

Say an image is located at google.com/foo/bar/logo.png. As you are using -s0 (stay in the same directory), it will not be downloaded unless you specify --near.

Lucas Bustamante
  • Click on "Set Options"
  • Go to the tab "Limits"
  • Set "Maximum external depth" to 0
