Web Memory

Laziness has kept me from implementing or installing a web-related feature that I've wanted for some time. I want to record and save every bit of HTTP traffic I send or receive. I don't care whether the data was sent securely or not; everything should be saved. I assume that the storage mechanism will provide safeguards against unauthorized access. Unlike a browser's cache, which eventually fills up and discards old entries, nothing in my store would ever be removed. I'd also want to save all of the metadata associated with the data, such as the raw HTTP headers. Everything, including those pesky ads, should be saved.
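
A rough sketch of what such a never-evicting store might look like, assuming SQLite as the backing file. The table layout and the function names (`open_store`, `record_exchange`, `count_exchanges`) are invented here for illustration and are not taken from any existing tool. Access control is left to the file system, as the paragraph above assumes.

```python
import json
import sqlite3
import time


def open_store(path=":memory:"):
    """Open (or create) the archive. The schema is an assumption for this sketch."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS exchanges (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               fetched_at REAL NOT NULL,
               method TEXT NOT NULL,
               url TEXT NOT NULL,
               request_headers TEXT NOT NULL,
               status INTEGER NOT NULL,
               response_headers TEXT NOT NULL,
               body BLOB NOT NULL
           )"""
    )
    return conn


def record_exchange(conn, method, url, request_headers, status,
                    response_headers, body):
    """Append one request/response pair. Rows are only ever inserted,
    never updated or deleted, so repeat visits to a URL pile up.

    Headers are passed as lists of [name, value] pairs so that order
    and duplicate header names survive the round trip through JSON.
    """
    cur = conn.execute(
        "INSERT INTO exchanges (fetched_at, method, url, request_headers,"
        " status, response_headers, body) VALUES (?, ?, ?, ?, ?, ?, ?)",
        (time.time(), method, url, json.dumps(request_headers),
         status, json.dumps(response_headers), body),
    )
    conn.commit()
    return cur.lastrowid


def count_exchanges(conn):
    """Number of exchanges saved so far."""
    return conn.execute("SELECT COUNT(*) FROM exchanges").fetchone()[0]
```

Keeping one row per exchange rather than one row per URL is the design choice that separates this from a cache: fetching the same page twice yields two rows, ads and all.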

Why would I want such a thing? Despite lovely features such as bookmarks and history in Firefox, combined with del.icio.us and other such sites, I don't always capture what I want in a way that makes it easy to find again. Case in point: I was recently involved in a discussion of higher education, its rising costs, and its benefits. I know that over the past few months I read a series of articles in the Economist on that topic, which I had previously forwarded to someone (while I still had the physical magazine to easily look up the issue number), along with a couple of NY Times articles (which I only read online).

Trying to go back to either of those web sites and use their search engines usually leaves me frustrated. The primary reason is that their search engines don't incorporate one of the most important filters: the content that I've read on their site. I'll call this a remembrance search. If I'm doing an exploratory search, I want the search to consider all of the content at its disposal (and hopefully return just the best matches). If instead I'm doing a remembrance search, I want the search to consider only content that I've looked at, and to apply my remembrance key (aka search term) against that subset.

I consider getting every site I visit to implement a remembrance search next to impossible. However, since my computer receives all of the content from these sites (exactly the subset of information that I've looked at), I have an easy way to do that remembrance search myself. For example, I could set up Google Desktop or Apple's Spotlight to search the collection of content I've viewed.
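
To make the idea concrete, here is a toy remembrance search over pages that have already been viewed. It is only a sketch: the function names `tokenize` and `remembrance_search` are made up for this example, the pages are assumed to be available as plain `(url, text)` pairs, and a desktop search tool would do real ranking where this does a simple all-terms match.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def remembrance_search(viewed_pages, query):
    """Return URLs of viewed pages that contain every term in the query.

    viewed_pages is a list of (url, text) pairs for content that was
    actually viewed; nothing outside that list is ever considered,
    which is the whole point of a remembrance search.
    """
    terms = tokenize(query)
    if not terms:
        return []
    return [url for url, text in viewed_pages if terms <= tokenize(text)]


# Hypothetical viewing history, for illustration only.
viewed = [
    ("https://example.com/a", "The rising costs of higher education"),
    ("https://example.com/b", "A review of science films on screen"),
]
print(remembrance_search(viewed, "education costs"))
```

The search key is applied only against the viewed subset, so a vague key that would drown in results on a site-wide search can still pick out the one article that was actually read.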

Given the ubiquity of proxy servers, I think implementing such a system would be easy, but as I mentioned at the start of this post, my own laziness has prevented me from doing it.