Digital Conservation

I’m capping off a fairly substantial project today. When I transitioned this blog off WordPress, the old posts vanished. Part of the reason I left WordPress was the pain of backing up posts. WordPress provides a nice XML export, but things get tough when you start worrying about artifacts. Unfortunately, over the years, a few of the transitions between WordPress versions and hosts didn’t preserve all the data. It was a lot of work, but I’ve restored the majority of 17 years of blog posts.

Let’s start with a couple big things I’ve done on the site:

A few big notes on how I managed this all:

I worked through years of posts, updating images, updating metadata, and comparing against prior versions. It’s not perfect – 17 years of posts are a LOT to go through. While I used some Python code to help search and replace image references, I did the work myself. I didn’t want to risk the magic AI box randomly swapping posts around. Spelling errors, typos, and factual issues are all, hopefully, carried over in this restoration. Markdown is slightly less expressive, so some posts are missing formatting and details of the earlier versions.
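For a rough idea of what that kind of image search/replace can look like, here is a minimal sketch. It is not the script actually used for this site: the function name, the example domain, and the `/images/` target path are all made up for illustration, and it assumes the posts use standard Markdown image syntax.

```python
import re

# Matches the start of a Markdown image: ![alt text](url
# Group 1 is everything up to the opening parenthesis, group 2 is the URL.
IMAGE_RE = re.compile(r'(!\[[^\]]*\]\()([^)\s]+)')


def rewrite_image_urls(markdown, old_prefix, new_prefix):
    """Swap old_prefix for new_prefix in image URLs only (hypothetical helper)."""
    def swap(match):
        opener, url = match.group(1), match.group(2)
        if url.startswith(old_prefix):
            url = new_prefix + url[len(old_prefix):]
        return opener + url

    return IMAGE_RE.sub(swap, markdown)


# Example with a made-up old upload path; ordinary links are left untouched.
post = "![cat](https://old.example.com/wp-content/uploads/2009/cat.jpg)"
print(rewrite_image_urls(post, "https://old.example.com/wp-content/uploads/", "/images/"))
```

Limiting the replacement to image syntax, rather than doing a blanket string replace, leaves regular links in the post body alone, which makes it easier to review each change by hand afterwards.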

While old deep links might result in 404 errors, most everything visible before can be found again. I have some ideas for next steps – there’s a lot possible now that the site is cleaned up and the data restored.

There’s a bit of an ongoing task to go through and update/repair/build more meta-information on posts. Expect some more refinement, but I promise to keep the content itself true to what was previously published.