Thursday, December 12, 2013

'Scarcity in abundance'

The Mount website captured by the Wayback Machine
at one minute after midnight on June 1, 2005.
WE CAME ACROSS A COUPLE OF EXAMPLES recently of disappearing web content, a common problem for archivists and one that is going to leave a big hole in history. It's sometimes known as the conundrum of "scarcity in abundance": overwhelming amounts of information exist in electronic form, but very little of it is being retained and preserved, or will be discoverable later.

A staff member recently needed access to the important "Diversity Statement" written by Dr. Jacqueline Powers Doud during the first years of her presidency (2000-2011). It was posted on the Mount website, taken down at some point during a web update or redesign, and not archived.

In a similar case, a UCLA doctoral student recently wanted to see our "Diversity Timeline," which was also posted for many years on the MSMC website. It too was removed and not archived.

Our preservation fallback.
Thank goodness for the Internet Archive and its Wayback Machine. The computers of this nonprofit organization periodically take snapshots of websites and preserve them in a meaningful way -- they can be located and even searched up to a point, and although the pages are often incomplete or missing some content, at least some material can be found.

The Diversity Timeline shows up, along with the Diversity Statement, in snapshots from the mid-2000s forward (an MSMC home page snapshot is shown at top). Their final snapshot is dated December 16, 2008, nearly five years ago. By the snapshot of January 31, 2009, the "Leadership in Diversity" pages have disappeared from the Wayback Machine.

You can see them by accessing this link, shown in its entirety:
http://web.archive.org/web/20081216073240/http://www.msmc.la.edu/pages/404.asp.
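
For anyone curious how to read a link like this one: a Wayback Machine snapshot address is the archive's own address, then a 14-digit capture time (year, month, day, hour, minute, second), then the original web address. The short Python sketch below pulls those pieces apart. The function name parse_wayback_url is our own invention for illustration, and the sketch assumes a plain 14-digit timestamp with no extra suffix, so treat it as a rough aid rather than an official tool.

```python
from datetime import datetime

# Hypothetical helper (not part of any Internet Archive library):
# split a Wayback Machine snapshot link into its capture time and
# the original address that was archived.
def parse_wayback_url(snapshot_url):
    marker = "web.archive.org/web/"
    start = snapshot_url.find(marker)
    if start == -1:
        raise ValueError("not a Wayback Machine snapshot link")
    rest = snapshot_url[start + len(marker):]
    # Everything before the first slash is the timestamp;
    # everything after it is the original address.
    timestamp, _, original = rest.partition("/")
    # Assumption: the timestamp is exactly YYYYMMDDhhmmss.
    captured = datetime.strptime(timestamp, "%Y%m%d%H%M%S")
    return captured, original


captured, original = parse_wayback_url(
    "http://web.archive.org/web/20081216073240/http://www.msmc.la.edu/pages/404.asp"
)
print(captured)  # 2008-12-16 07:32:40
print(original)  # http://www.msmc.la.edu/pages/404.asp
```

Read this way, the link above points to a capture made on December 16, 2008, which matches the final snapshot mentioned earlier.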

We're lucky with this content. As the frame capture below shows, some content doesn't make it into the Wayback Machine. Another limitation of Wayback is that you have to know pretty much exactly what you are looking for, or have a complete link; you can't count on ordinary searching.

Handing responsibility for keeping our web pages to a set of algorithms owned by a third party isn't a long-term model. The Internet Archive is always in need of more funding; what would happen if it went under? We're both glad it's there and worried that we're all starting to count on it. Unlike paper, once this stuff is gone, it's really gone.

A blank rectangle frames some white space where a photo gallery
on the MSMC home page was not captured by the web snapshot.