How can I access hidden dates on an archived website?
To preface this, I have absolutely zero knowledge of programming. What I want to do is access the 17 dates that are not shown on this archived website: https://web.archive.org/web/20031002051647/http://www.avengedsevenfold.com:80/tourdates/tourdates.php (and the hidden dates on older/newer versions of this website, of course)
When I click on "Next" or "Show All Dates", it simply reloads the page. Is there a way to access the dates? I have skimmed through the source code, but didn't find anything. But the dates have to be somewhere, right?
website
asked Nov 26 at 2:45
Seelentau
1 Answer
accepted
They're not archived.
The Internet Archive does not have access to the server-side logic of a website and cannot fully replicate the behavior of dynamic pages (such as PHP in this case); the best it can do is to follow links and download each known URL as an independent, static page.
The crawler can follow and archive straightforward links such as <a href="news.php?page=2">. However, your website's "next"/"show all" buttons are not regular links. They are an unholy combination of JavaScript actions and POST-based forms, either of which alone would already have kept the crawler from recognizing them as links:
<a href="#" onclick="JavaScript:nextPage()"><img src=...></a>
Although the archiver can store a copy of the client-side JavaScript code, it does not interpret that code nor otherwise understand what nextPage() does here, and so must skip these JS-based buttons entirely. (You can see that IA only has this one URL archived.)
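To make the difference concrete, here is a minimal sketch of how a simple crawler that does not run JavaScript might pick links out of a page. It is not the Internet Archive's actual crawler code; the class name LinkExtractor and the sample markup (including the "Older news" and "Next" link texts) are made up for illustration.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Hypothetical helper: collects href targets like a crawler with no JS engine."""

    def __init__(self):
        super().__init__()
        self.crawlable = []  # plain URLs that can be fetched with a GET request
        self.skipped = []    # anchors whose real target only exists inside script

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = (attr_map.get("href") or "").strip()
        # "#" and "javascript:" targets lead nowhere without executing the script
        if not href or href.startswith("#") or href.lower().startswith("javascript:"):
            self.skipped.append(attr_map.get("onclick") or href)
        else:
            self.crawlable.append(href)


# Sample markup: one ordinary link and one JS-driven button (link texts are invented)
page = '''
<a href="news.php?page=2">Older news</a>
<a href="#" onclick="JavaScript:nextPage()">Next</a>
'''

extractor = LinkExtractor()
extractor.feed(page)
print("crawlable:", extractor.crawlable)
print("skipped:", extractor.skipped)
```

The first link yields a URL the crawler can queue and download. The second yields only "#", which points back at the same page, so there is nothing new to fetch.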
(Even if the archiver could discover what the JS code does, it wouldn't be allowed to touch this specific form anyway – the usage of POST implies that each request may cause some changes on the server. Only GET requests are safe to crawl automatically.)
So when you click the "next" button, the browser still runs nextPage() and sends a request with page=2 or such, but there is no corresponding server-side code to process that request anymore – the Archive can only respond with the same static data as before.
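If you want to check for yourself which URLs under the tourdates path were captured, the Wayback Machine offers a public CDX query endpoint. The snippet below is only a sketch that builds such a query URL, assuming the endpoint and its url, matchType, output, fl and limit parameters behave as publicly documented; the function name build_cdx_query and the chosen field list are my own.

```python
from urllib.parse import urlencode

# Public Wayback Machine CDX endpoint (assumed to behave as documented)
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(url_prefix, limit=50):
    """Hypothetical helper: build a query listing captures whose URL starts with url_prefix."""
    params = {
        "url": url_prefix,
        "matchType": "prefix",                   # match every URL under this prefix
        "output": "json",                        # ask for JSON rows instead of plain text
        "fl": "timestamp,original,statuscode",   # fields to return per capture
        "limit": str(limit),
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


query = build_cdx_query("avengedsevenfold.com/tourdates/")
print(query)
```

Opening the printed URL in a browser should list the captures that exist. Each row is a separate URL fetched with GET, so the results of the POST-based "next" button would not show up there.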
edited Nov 26 at 7:02
answered Nov 26 at 5:28
grawity