httrack

Changing the browser ID (User-Agent)

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
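
On the command line, the User-Agent is set with the -F option (long form --user-agent). A minimal sketch follows; the function name, the start URL and the output directory are placeholders for illustration, not part of the original notes:

```shell
# Hypothetical wrapper: mirror a site while sending a custom User-Agent.
# $1 = start URL, $2 = output directory (both are placeholders).
mirror_with_user_agent() {
  ua='Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)'
  # -O sets the output path, -F sets the User-Agent header value.
  httrack "$1" -O "$2" -F "$ua"
}
```

It would be called as mirror_with_user_agent followed by a URL and a directory. The string is quoted because it contains spaces, parentheses and semicolons that the shell would otherwise interpret.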

 

Pausing and resuming a task

Subject: Re: Pause, continue and stop via command line
Author: Xavier Roche
Date: 10/22/2005 10:35
 
> You do it by pressing the ctrl + c keys. This pauses
> HTTrack. You also get the options to quit,
> interrupt, run as background task, run as blind
> background task or to cancel the mirroring. 

And if you want to control a background job:

To pause, you can create a file named hts-stop.lock in the project directory
(at the same level of hts-log.txt). HTTrack will erase it, wait for pending
transfers to finish, create a new file named hts-paused.lock, and will wait
until this file is erased. The mirror will continue from this point.
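
The lock-file procedure described above can be sketched as three small shell helpers. The function names are made up for illustration; only the two file names, hts-stop.lock and hts-paused.lock, come from the forum reply:

```shell
# Ask a running background mirror to pause.
# $1 = project directory (the one that contains hts-log.txt).
pause_mirror() {
  touch "$1/hts-stop.lock"
}

# Succeeds once HTTrack has acknowledged the pause, i.e. it has
# replaced hts-stop.lock with hts-paused.lock.
is_paused() {
  [ -f "$1/hts-paused.lock" ]
}

# Let a paused mirror continue by deleting the marker file.
resume_mirror() {
  rm -f "$1/hts-paused.lock"
}
```

A typical sequence would be to call pause_mirror on the project directory, poll is_paused until pending transfers have finished, and call resume_mirror later to let the mirror continue.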

 

https://forum.httrack.com/readmsg/12607/12604/index.html

 

Web UI

Command: webhttrack

 

Maximum number of links

Subject: Re: Too many URLs, giving up..(>100000)
Author: Christian
Date: 04/18/2015 09:44
 
Soimosan and Russ:

I have the same questions you two had. The devs here are doing this AMAZING
project as a "charity" to average people like us, and have limited time to
answer questions. So I am going to tell you how to fix this easily and finish
your project:

I know this is a kind of old thread and despite having been using HTTrack for
almost 7 years, I hit the frustrating "Panic: too many links" error that
prevents pages from being mirrored all the way this week (April 2015)!

But I KNOW there are average users like me that still have this problem and I
am here to help. :)

***I am going to post here the fix to this problem, easy step-by-step for
n00bs and experts alike. :) ***

I confess, I almost went crazy with frustration. HTTrack works so dang well, I
NEVER look at the error log! So for the first time I hit the "Too many URLs,
giving up" error. 

I tried everything: adding that line (example: -#L1000000) to "Rules", to no
avail. I tried the line in "add url" and "web addresses", to no avail. I
tried deleting the dang .tmp files; that did not work either. I spent hours
creating +* Rules for each major link artery... nothing. 

Well... After banging my head against the wall over 3 days, I checked error
logs carefully. Ta-da!

So here is the easy way to fix this. The whole "use #L option for more links
(example: -#L1000000)" was not making sense. 

So here is what I did:

1) Do NOT delete hts-cache or .tmp files like I did! Otherwise, the download
will take a long time again from scratch. Go back to the project and open the
existing project. Select action: "Continue interrupted download";

2) Click on "Set Options" --> Limits (tab) --> "Maximum Number of links"
(bottom field) --> Just pick the highest available value or write
999999999.

But be warned! You need to constantly check the In Progress/action download
screen to be sure you are not copying the entire Internet! lol  One way to do
that is to put "0" under the "Maximum external depth" field (Limits tab) and
checkmark "No external pages" under "Build" tab. I know this can cause you to
miss some files, but that can be fixed manually under "Rules".
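
For a command-line project, the same two steps might look like the sketch below. It assumes that options given alongside --continue are applied on top of the ones saved in the project's cache, which is worth verifying on your own setup; the function name is hypothetical:

```shell
# Hypothetical helper: resume the mirror stored in project directory $1
# with a higher link limit and no external depth.
continue_with_more_links() {
  (
    cd "$1" || exit 1
    # --continue : resume an interrupted mirror using the existing cache
    # -#L1000000 : raise the maximum number of links
    # -%e0       : set the external links depth to 0
    httrack --continue '-#L1000000' '-%e0'
  )
}
```

The options are quoted so the shell passes the # and % characters through unchanged, and the subshell keeps the directory change from leaking into the caller.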

Hope this helps! :) Be blessed mirroring!

THANK YOU to the crew at Httrack for this awesome project! :) I cannot write
code, but I am happy to donate yearly to keep the site/host going. :) Thanks
all. :)

Christian
 
 

https://forum.httrack.com/readmsg/33944/310/index.html

 

httrack -s0 --mirrorlinks --go-everywhere https://www.ffff.com -O "/save/dir" \
  "-*" "+*www.ffff.com*tutorials*" \
  "+*.bmp" "+*.jpg" "+*.png" "+*.tif" "+*.gif" "+*.pcx" "+*.tga" "+*.exif" "+*.fpx" "+*.svg" \
  "+*.psd" "+*.cdr" "+*.pcd" "+*.dxf" "+*.ufo" "+*.eps" "+*.ai" "+*.raw" "+*.WMF" "+*.webp" \
  "-*facebook.com*" "-*mix.com*" "-*.microsoft.com" "-*.amazon.com" "-*google.com*" \
  "-*twitter.com*" "-*linkedin.com*" "-*.reddit.com*" "-*tumblr.com*" \
  --can-go-up-and-down -%v
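
Long filter lists like the one above are easy to mistype. As a rough sketch, they could be generated from plain lists of extensions and domains; the two helper names below are invented for illustration and are not part of HTTrack:

```shell
# Print one "+*.ext" allow filter per extension given as an argument.
allow_filters() {
  for ext in "$@"; do
    printf '%s\n' "+*.${ext}"
  done
}

# Print one "-*domain*" deny filter per domain given as an argument.
deny_filters() {
  for domain in "$@"; do
    printf '%s\n' "-*${domain}*"
  done
}
```

Calling allow_filters jpg png gif prints the three matching filters, one per line. When the output is handed to httrack, the filters should stay quoted so the shell does not expand the asterisks against local file names.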

 

 

"C:\Program Files\WinHTTrack\httrack.exe" -s0 -Y -e https://www.x.com -O d:\out\dir -* +*https://www.x.com/reference* +*.bmp +*.jpg +*.png +*.tif +*.gif +*.pcx +*.tga +*.exif +*.fpx +*.svg +*.psd +*.cdr +*.pcd +*.dxf +*.ufo +*.eps +*.ai +*.raw +*.WMF +*.webp +*.css +*.js -*facebook.com* -*mix.com* -*.microsoft.com -*.amazon.com -*google.com* -*twitter.com* -*linkedin.com* -*.reddit.com* -*tumblr.com* -B -%v

 

Posted 2019-10-25 17:18 by fndefbwefsowpvqfx