21 years aweb

I completely failed to notice that the 20th anniversary of my web presence passed last year.

I don’t have my early sites’ source; I only put them under version control in January 2000. All I have is what I find in online archives. The only public archive around at the time was archive.org, aka the wayback machine (I’m grateful to them for that). The first archive.org scan of one of my websites was taken in January 1997. That site refers to posts made in 1996.

Since I have that version control database, I can compare it with the wayback machine’s later copies, and so get a feel for their accuracy and completeness. That will allow me to judge how far to trust their earlier archives.
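For anyone wanting to try the same check, here is a minimal sketch of such a comparison. It assumes a version-controlled checkout and a downloaded archive copy sit in two local directories; the function names and the choice of SHA-256 hashing are my own illustration, not anything archive.org or the downloader provides.

```python
import hashlib
from pathlib import Path
from typing import Dict, Set, Tuple


def tree_hashes(root: Path) -> Dict[str, str]:
    """Map each file's path, relative to root, to the SHA-256 of its contents."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in root.rglob("*")
        if path.is_file()
    }


def compare_trees(reference: Path, archived: Path) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (missing from archive, extra in archive, in both but different)."""
    ref = tree_hashes(reference)
    arc = tree_hashes(archived)
    missing = set(ref) - set(arc)   # completeness: files the archive never caught
    extra = set(arc) - set(ref)     # files only the archive has
    changed = {name for name in set(ref) & set(arc) if ref[name] != arc[name]}
    return missing, extra, changed
```

A byte-for-byte comparison is strict: any rewriting done along the way by the archive or the download tool would show up as a difference, so the "changed" set is a list of things to look at rather than a verdict.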

I can’t resist grabbing copies of my old websites, archiving the archive, as it were. Most of the tools for digging into archive.org have the special property called ‘not sodding working for me’. Fortunately, there’s a ruby tool, wayback_machine_downloader, which gives the dog its breakfast. Well, at least it downloads files, which is more than the other tools did.

(What I did was note the dates the wayback machine says it archived a home page, then grabbed all files archived between two such dates, e.g., “wayback_machine_downloader -f 19990204 -t 19990330 -d 19990330 http://dylanharris.org”. I got a mash of everything it had noticed, which was rarely complete.)
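The date-pair approach above could be scripted roughly as follows. This is a sketch only: the helper name is hypothetical, it assumes the snapshot dates have already been noted by hand, and it builds command lines in the same shape as the one quoted above, leaving it to you to run them.

```python
from typing import List


def download_commands(snapshot_dates: List[str], url: str) -> List[str]:
    """Build one wayback_machine_downloader command per pair of consecutive dates.

    Dates are YYYYMMDD strings, so sorting them as text sorts them by date.
    Each command writes into a directory named after the later date of the pair.
    """
    dates = sorted(snapshot_dates)
    commands = []
    for start, end in zip(dates, dates[1:]):
        commands.append(
            f"wayback_machine_downloader -f {start} -t {end} -d {end} {url}"
        )
    return commands


if __name__ == "__main__":
    # The two dates from the example above; add more as the archive lists them.
    for command in download_commands(["19990204", "19990330"], "http://dylanharris.org"):
        print(command)
```

With just the two dates shown, this prints the single command quoted above; a longer list of dates gives one command per gap between snapshots.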

Having grabbed the archive, I have to post it! All one persons who follow this site can be excited by following this site even more! Yay! (That one person is me, by the way.)

Actually, please, don’t go visiting. My old sites were ever so slightly desperately ugly. They were of their time: intended for 640x480 CRT screens with hugely variable contrast and limited colours which prevented pretty layout, written by a programmer who hadn’t got the hang of HTML layout, built using the simple tools of the time. Furthermore, what you can find below is edited (archive.org has the originals): I’ve corrected spelling, I’ve disabled old email addresses, I’ve pointed all legal stuff to the current page, etc. Worse, some things are completely lost, so I’ve disabled links to them.

It doesn’t help that my earliest website is blocked from archive retrieval. It was http://www.demon.co.uk/csl/, which, as you see, was hosted at demon. My site address was based on the general demon address, and I presume the modern demon block that site on archive.org to prevent access to those member sites that really do want to be blocked, but in consequence they block everything. I wish they, almost certainly the demon now sold out to the devil, were slightly less clodhoppery about it.

UPDATE: In a continued spirit of bad taste, I’ve posted screenshots of my websites’ old front pages.