Has this situation ever happened to you? You enter search keywords in Google for a very specific topic. On the results page, you see the title of that perfect article with exactly what you were seeking. Hopeful, you click the link and receive a 404 error message saying that the page does not exist. This scenario sadly happens to everyone countless times. Fortunately, there are two ways to view these once-accessible pages.
One of the features that set Google apart from other search engines is the Google Cache. As the Googlebot indexes web pages into the central database, it also saves the HTML portion. The HTML portion is basically the text and layout without the pictures. When searching in Google, you've probably noticed the "Cached" link.
If you haven't tried clicking on that link, give it a try. You will be taken to the copy of that specific web page that the Googlebot saved the last time it crawled it. This is the first method to try when you can't load the actual page.
Google Cache Hacks
Some people like to "hack" the Google cache to display any page from the past. This is relatively easy to do if you look at the URL of a Google cached page. This is the URL of my website's cache:
It’s pretty easy to decipher the URL. The "126.96.36.199" is just the IP address for "google.com." The "search?" means that it is passing some commands to the search application. The "q" is the variable for query, or request. The "cache" tells the search application that it is looking for the cached version of the web page. The rest of the text after "cache" is the URL of the original page in a strange encoded format.
If we take the information from the original URL above, we can make our own customized URL for any page. Use this:
Just replace "URL" with the URL of the page that you want to view in its cached version. You can even create your own Google Cache Generator like this:
Enter the URL of the Page that You Want to See Cached:
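The URL pattern described above can be sketched as a small script. This is only an illustration: the host name `www.google.com`, the function name `cache_url`, and the choice to percent-encode the page address are assumptions of mine, since the article's own example URL is not shown here, and nothing guarantees that the search application still answers a request of this form.

```python
from urllib.parse import quote

# Assumed base address: the article only says the request goes to
# Google's "search" application with a "q" variable.
CACHE_BASE = "https://www.google.com/search"


def cache_url(page_url: str) -> str:
    """Build a lookup URL of the form search?q=cache:URL.

    The page address is percent-encoded so that its own "?", "&" and "="
    characters are not mistaken for parts of the outer query string.
    """
    return f"{CACHE_BASE}?q=cache:{quote(page_url, safe='')}"


if __name__ == "__main__":
    # Plays the role of the "generator" form: read a URL, print the link.
    page = input("Enter the URL of the page that you want to see cached: ")
    print(cache_url(page.strip()))
```

Pasting the printed link into a browser is the equivalent of submitting the form.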
Though most pages are cached, it is practically impossible for every page on the Internet to be included. Google only saves the pages that it crawls. If a page does not appear in Google's search results, it will not be in the Google cache.
The Internet Archive
An alternative to the Google Cache is The Internet Archive. The Internet Archive is a more extensive database of old web pages. With the Google Cache, newer copies of a page overwrite older ones. With The Internet Archive, however, the crawler keeps every version that it archives. Sometimes it even retains the pictures along with the text. The only drawback is that the crawler archives fewer pages than the Google Cache does. The Googlebot saves many pages, while The Internet Archive generally saves the main pages of noteworthy websites. Take a look at the archived websites of today's blue-chip companies. It's interesting to see the evolution of each one. Compare the first Pizza Hut homepage with the one today. The 1996 version is pretty scary!
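For readers who prefer to build an archive link directly, here is a minimal sketch. It assumes the Wayback Machine address pattern `https://web.archive.org/web/<timestamp>/<url>`, where the timestamp is a run of digits in year, month, day, hour, minute, second order and, as far as I know, a shorter prefix such as a bare year sends you to the nearest saved copy. The function name `wayback_url` is hypothetical, and the article itself does not describe this pattern.

```python
def wayback_url(page_url: str, timestamp: str) -> str:
    """Build a Wayback Machine link for page_url near the given timestamp.

    timestamp is 1 to 14 digits (YYYYMMDDhhmmss or a prefix of it).
    The address pattern is an assumption stated in the text above.
    """
    if not (timestamp.isdigit() and len(timestamp) <= 14):
        raise ValueError("timestamp must be 1 to 14 digits")
    return f"https://web.archive.org/web/{timestamp}/{page_url}"


if __name__ == "__main__":
    # The article's example: the Pizza Hut homepage as saved around 1996.
    print(wayback_url("pizzahut.com", "1996"))
```

Whether a copy exists for a given year depends on what the crawler happened to archive.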