Changing the browser ID (User-Agent)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
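On the command line, HTTrack takes the browser identity through the -F option (long form --user-agent). A minimal sketch of building such a command follows; the helper name build_httrack_args, the URL and the output directory are placeholders invented for illustration, not part of these notes.

```python
# The identity string recorded in these notes.
USER_AGENT = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"


def build_httrack_args(url, out_dir, user_agent=USER_AGENT):
    """Return an argv list for httrack with a custom User-Agent.

    Hypothetical helper: it only assembles arguments, it does not run httrack.
    """
    # -O sets the output path; -F sets the user-agent field sent in HTTP headers.
    return ["httrack", url, "-O", out_dir, "-F", user_agent]


if __name__ == "__main__":
    # Placeholder URL and directory, shown only to illustrate the call.
    print(build_httrack_args("https://www.example.com", "/save/dir"))
```

Passing the resulting list to subprocess.run (without shell=True) keeps the parentheses and semicolons in the identity string from being interpreted by a shell.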
Pausing and resuming a task
Subject: Re: Pause, continue and stop via command line
Author: Xavier Roche
Date: 10/22/2005 10:35

> You do it by pressing the ctrl + c keys. This pauses
> HTTrack. You also get the options to quit,
> interrupt, run as background task, run as blind
> background task or to cancel the mirroring.
And if you want to control a background job:
To pause, create a file named hts-stop.lock in the project directory
(at the same level as hts-log.txt). HTTrack will erase it, wait for pending
transfers to finish, create a new file named hts-paused.lock, and wait
until this file is erased. The mirror then continues from that point.
https://forum.httrack.com/readmsg/12607/12604/index.html
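The lock-file handshake described in the post can be scripted. The sketch below is an assumption-laden illustration: the two file names come from the post, while the function names and the idea of wrapping them in Python are mine. It only touches files in the project directory and does not talk to HTTrack directly.

```python
from pathlib import Path

# File names as given in the forum post above.
STOP_LOCK = "hts-stop.lock"
PAUSED_LOCK = "hts-paused.lock"


def request_pause(project_dir):
    """Create hts-stop.lock next to hts-log.txt to ask HTTrack to pause."""
    lock = Path(project_dir) / STOP_LOCK
    lock.touch()
    return lock


def is_paused(project_dir):
    """True once HTTrack has created hts-paused.lock (pending transfers done)."""
    return (Path(project_dir) / PAUSED_LOCK).exists()


def resume(project_dir):
    """Erase hts-paused.lock so the mirror continues; False if it was not paused."""
    paused = Path(project_dir) / PAUSED_LOCK
    if paused.exists():
        paused.unlink()
        return True
    return False
```

A typical sequence would be request_pause(project_dir), then polling is_paused(project_dir) until it returns True, and later resume(project_dir).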
Web UI
Command: webhttrack
Maximum number of links
Subject: Re: Too many URLs, giving up..(>100000)
Author: Christian
Date: 04/18/2015 09:44

Soimosan and Russ:
I have the same questions you two had. The devs here are doing this AMAZING
project as a "charity" for average people like us, and have limited time to
answer questions. So I am going to tell you how to fix this easily and finish
your project:
I know this is kind of an old thread, but despite having used HTTrack for
almost 7 years, this week (April 2015) I hit the frustrating "Panic: too many
links" error that prevents pages from being mirrored all the way!
But I KNOW there are average users like me that still have this problem and I
am here to help. :)
***I am going to post here the fix to this problem, easy step-by-step for
n00bs and experts alike. :) ***
I confess, I almost went crazy with frustration. HTTrack works so dang well, I
NEVER look at the error log! So for the first time I hit the "Too many URLs,
giving up" error.
I tried everything: adding that line (example: -#L1000000) to "Rules", to no
avail. I tried the line in "add url" and "web addresses", to no avail. I
tried deleting the dang .tmp files; that did not work either. I spent hours
creating +* Rules for each major link artery... nothing.
Well... After banging my head against the wall over 3 days, I checked error
logs carefully. Ta-da!
So here is the easy way to fix this. The whole "use #L option for more links
(example: -#L1000000)" was not making sense.
So here is what I did:
1) Do NOT delete hts-cache or .tmp files like I did! Otherwise, the download
will start again from scratch and take a long time. Go back to the project and
open the existing project. Select action: "Continue interrupted download";
2) Click on "Set Options" --> Limits (tab) --> "Maximum number of links"
(bottom field) --> just pick the highest available value or type
999999999.
But be warned! You need to constantly check the In Progress/action download
screen to be sure you are not copying the entire Internet! lol One way to do
that is to put "0" under the "Maximum external depth" field (Limits tab) and
checkmark "No external pages" under "Build" tab. I know this can cause you to
miss some files, but that can be fixed manually under "Rules".
Hope this helps! :) Be blessed mirroring!
THANK YOU to the crew at Httrack for this awesome project! :) I cannot write
code, but I am happy to donate yearly to keep the site/host going. :) Thanks
all. :)
Christian
https://forum.httrack.com/readmsg/33944/310/index.html
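For command-line use, the same limits can be expressed as options: -#L<n> is the form quoted in the post, and -%e<n> is, to my understanding, the external links depth option. The sketch below just formats those two options; limit_flags is a hypothetical helper name, and the default values are taken from the example in the post.

```python
def limit_flags(max_links=1000000, external_depth=0):
    """Return httrack limit options as a list of arguments (hypothetical helper).

    -#L<n> raises the maximum number of links, as quoted in the post.
    -%e<n> sets the external links depth (assumed equivalent of the
    "Maximum external depth" field on the Limits tab).
    """
    if max_links <= 0:
        raise ValueError("max_links must be positive")
    if external_depth < 0:
        raise ValueError("external_depth must not be negative")
    return [f"-#L{max_links}", f"-%e{external_depth}"]


if __name__ == "__main__":
    print(limit_flags())
```

These arguments would be appended to an httrack command line; the post found that typing -#L... into the "Rules" field had no effect, which fits with it being an option rather than a scan rule.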
httrack -s0 --mirrorlinks --go-everywhere https://www.ffff.com -O "/save/dir" -* +*www.ffff.com*tutorials* +*.bmp +*.jpg +*.png +*.tif +*.gif +*.pcx +*.tga +*.exif +*.fpx +*.svg +*.psd +*.cdr +*.pcd +*.dxf +*.ufo +*.eps +*.ai +*.raw +*.WMF +*.webp -*facebook.com* -*mix.com* -*.microsoft.com -*.amazon.com -*google.com* -*twitter.com* -*linkedin.com* -*.reddit.com* -*tumblr.com* --can-go-up-and-down -%v
"C:\Program Files\WinHTTrack\httrack.exe" -s0 -Y -e https://www.x.com -O d:\out\dir -* +*https://www.x.com/reference* +*.bmp +*.jpg +*.png +*.tif +*.gif +*.pcx +*.tga +*.exif +*.fpx +*.svg +*.psd +*.cdr +*.pcd +*.dxf +*.ufo +*.eps +*.ai +*.raw +*.WMF +*.webp +*.css +*.js -*facebook.com* -*mix.com* -*.microsoft.com -*.amazon.com -*google.com* -*twitter.com* -*linkedin.com* -*.reddit.com* -*tumblr.com* -B -%v
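The long filter lists in the two commands above can be generated instead of typed by hand. This is a sketch under my own assumptions: build_filters and its parameters are hypothetical names, and the rule order simply mirrors the commands (exclude everything, re-include the wanted path, add file types, then exclude unwanted domains).

```python
def build_filters(include_path, extensions, excluded_domains):
    """Return HTTrack scan rules in the same order as the commands above.

    Hypothetical helper; all names are illustrative only.
    """
    rules = ["-*"]                                # start by excluding everything
    rules.append(f"+*{include_path}*")            # re-include the wanted path
    for ext in extensions:
        rules.append(f"+*.{ext.lstrip('.')}")     # one +*.ext rule per file type
    for domain in excluded_domains:
        rules.append(f"-*{domain}*")              # block unwanted domains
    return rules


if __name__ == "__main__":
    rules = build_filters(
        "www.x.com/reference",
        ["bmp", "jpg", "png", "gif", "svg", "webp", "css", "js"],
        ["facebook.com", "twitter.com", "linkedin.com"],
    )
    print(" ".join(rules))
```

If the resulting list is passed to subprocess.run as separate arguments (without shell=True), the * characters reach httrack unchanged instead of being expanded by a Unix shell.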