I'm trying to download more than 500 pages from a website that disappeared over 10 years ago. FlashGot strips the web.archive.org part of the URL.
This is what I want: http://web.archive.org/web/20020206062707/http://www.herbweb.com/herbage/1-A.htm
This is what I get: http://www.herbweb.com/herbage/1-A.htm
Is there any way to get this working?
web.archive.org files not downloading correctly
Last edited by msjs on Mon Feb 03, 2014 11:52 am, edited 1 time in total.
Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0
Re: web.archive.org files not downloading correctly
I don't think that
Code: Select all
http://web.archive.org/web/20020206062707/http://www.herbweb.com/herbage/1-A.htm
is a valid URL.
======
Thrawn
------------
Religion is not the opium of the masses. Daily life is the opium of the masses.
True religion, which dares to acknowledge death and challenge the way we live, is an attempt to wake up.
Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:26.0) Gecko/20100101 Firefox/26.0
Re: web.archive.org files not downloading correctly
Click on it, you will see it is!
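For anyone who wants to check this without clicking, here is a minimal sketch using the standard `URL` class (available in browsers and Node.js). It only shows that the archived address parses as a single URL whose path happens to contain the original address; the variable names are made up for illustration.

```javascript
// Parse the archived address from the first post with the standard URL class.
const archivedAddress =
  "http://web.archive.org/web/20020206062707/http://www.herbweb.com/herbage/1-A.htm";
const parsed = new URL(archivedAddress); // would throw a TypeError if invalid

// The host is the archive; the original site's address is just part of the path.
console.log(parsed.hostname);
console.log(parsed.pathname);
```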
Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0
Re: web.archive.org files not downloading correctly
Look at these two prefs in about:config:
flashgot.redir.generic.enabled
&
flashgot.redir.generic.exceptions
Toggling the first will make things work, but that is likely overly broad.
Setting an exception, the preferred method, should also work, I would think.
I'm just not sure what to set it to at this time.
(I'm sure barbaz will come up with the correct string.)
---
I'll note that it saved the HTML "page", but not the associated picture...
I would expect a File | Save As (outside of FlashGot) to save "everything".
Likewise, a "spider" type program might do something similar in an automated fashion.
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.19) Gecko/20110420 SeaMonkey/2.0.14 Pinball NoScript FlashGot AdblockPlus
Mozilla/5.0 (Windows NT 5.1; rv:26.0) Gecko/20100101 SeaMonkey/2.23
Re: web.archive.org files not downloading correctly
therube wrote: (I'm sure barbaz will come up with the correct string.)
Not without documentation... is this like NoScript's AddressMatcher, is it one big regexp, or what?
Probably worth adding web.archive.org as a default exception anyway.
therube wrote: Would expect a File | Save As (outside of FlashGot) to save "everything".
In my case it didn't. Standalone Wget was also useless.
Then again, I was trying to download an archive of a partly script-generated page, so some resources weren't called directly by the HTML source.
*Always* check the changelogs BEFORE updating that important software!
Mozilla/5.0 (X11; Linux i686; rv:30.0) Gecko/20100101 Firefox/30.0 SeaMonkey/2.27a1
Re: web.archive.org files not downloading correctly
OK, I looked at the source and the pref is a list of space-separated regexes. This addition should work:
Code: Select all
^https?://web\.archive\.org/web/\d+
However, there appears to be a bug in FlashGot such that this pref, if set, controls where *to* apply the redirect fixing...
The bug can be fixed by changing line 1123 of RedirectContext.js to
Code: Select all
var m = (!context.genericExceptionsRx || !context.genericExceptionsRx.test(url)) &&
but I did only basic testing so I'm not sure that doesn't screw something else up...
EDIT: Decided to do further testing, and it looks like the patch I had originally posted actually disables the whole feature.
Corrected above, sorry about that.
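To illustrate what the corrected condition is meant to do, here is a rough standalone sketch. It is not FlashGot's code: `buildExceptionsRx` and `mayFixRedirect` are hypothetical names, and joining the space-separated patterns with `|` is only an assumption about how such a pref could be compiled. `mayFixRedirect` mirrors just the first half of the patched line.

```javascript
// Hypothetical helper: turns a space-separated list of regexes into one RegExp.
// How FlashGot really compiles the pref is an assumption here.
function buildExceptionsRx(prefValue) {
  const parts = prefValue.trim().split(/\s+/).filter(Boolean);
  if (parts.length === 0) return null; // empty pref: no exceptions at all
  return new RegExp(parts.map((p) => "(?:" + p + ")").join("|"));
}

// Mirrors the first half of the corrected condition: redirect fixing is only
// considered when there are no exceptions or none of them matches the URL.
function mayFixRedirect(exceptionsRx, url) {
  return !exceptionsRx || !exceptionsRx.test(url);
}

const exceptionsRx = buildExceptionsRx("^https?://web\\.archive\\.org/web/\\d+");
const archived =
  "http://web.archive.org/web/20020206062707/http://www.herbweb.com/herbage/1-A.htm";
const plain = "http://www.herbweb.com/herbage/1-A.htm";

console.log(mayFixRedirect(exceptionsRx, archived)); // false: left untouched
console.log(mayFixRedirect(exceptionsRx, plain));    // true: still eligible
```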
*Always* check the changelogs BEFORE updating that important software!
Mozilla/5.0 (X11; Linux i686; rv:30.0) Gecko/20100101 Firefox/30.0 SeaMonkey/2.27a1
Re: web.archive.org files not downloading correctly
And you can temporarily use this value for flashgot.redir.generic.exceptions until this bug gets fixed:
Code: Select all
^https?://(?!web\.archive\.org/web/\d+)
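A quick way to see why this works as a stopgap: while the bug makes the pref select where redirect fixing *is* applied, a pattern with a negative lookahead matches every http(s) URL except the archive ones. The sketch below only exercises the regex itself against the two URLs from the first post; the variable names are made up for illustration.

```javascript
// The workaround value, compiled as a plain JavaScript regular expression.
const workaroundRx = new RegExp("^https?://(?!web\\.archive\\.org/web/\\d+)");

const archivedUrl =
  "http://web.archive.org/web/20020206062707/http://www.herbweb.com/herbage/1-A.htm";
const plainUrl = "http://www.herbweb.com/herbage/1-A.htm";

// The lookahead rejects archive snapshots, so they are not matched.
console.log(workaroundRx.test(archivedUrl)); // false
// Any other http(s) URL still matches.
console.log(workaroundRx.test(plainUrl));    // true
```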
*Always* check the changelogs BEFORE updating that important software!
Mozilla/5.0 (X11; Linux i686; rv:30.0) Gecko/20100101 Firefox/30.0 SeaMonkey/2.27a1
Re: web.archive.org files not downloading correctly
Haven't had time to get back to this til now. Thanks for the help.
Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0
- Giorgio Maone
- Site Admin
Re: web.archive.org files not downloading correctly
Please check the latest development build, 1.5.5.97rc2. Thank you.
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) Gecko/20100101 Firefox/27.0