Learning from the Web’s Past: Five Lessons Gleaned from the Internet Archive’s Wayback Machine

Preserving an ever-changing digital world is a serious challenge. While traditionally published books can sit on shelves for years, and be physically preserved for centuries, digital works can be altered constantly or deleted in a flash. For those who believe in the value of history, preserving these works is a challenge worth taking on. One organization doing just that is the Internet Archive.

The Internet Archive is a 501(c)(3) non-profit that was founded to build an Internet library. Its purposes include offering permanent access for researchers, historians, scholars, people with disabilities, and the general public to historical collections that exist in digital format.

One of the cool things about the Internet Archive is its very handy Wayback Machine. Type in a website link, click a button, and go back to the internet’s days of yore. Here are five lessons I’ve learned after playing with the site for a few days.
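If you would rather skip the typing and clicking, the same trip back in time can be sketched in a few lines of Python. This is a minimal sketch, not an official client: it assumes the Wayback Machine's public link pattern of `web.archive.org/web/<timestamp>/<site>`, which sends you to the capture closest to the date you give. The function name `wayback_url` is my own invention for illustration.

```python
from datetime import date

# Assumed public link pattern: https://web.archive.org/web/<YYYYMMDD>/<site>
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(site: str, when: date) -> str:
    """Build a link to the archived capture of `site` closest to `when`.

    `wayback_url` is a hypothetical helper, not part of any library.
    It only builds the link; it does not check that a capture exists.
    """
    timestamp = when.strftime("%Y%m%d")
    return f"{WAYBACK_PREFIX}/{timestamp}/{site}"


if __name__ == "__main__":
    # Build one link per year for a site you want to flip through.
    for year in (1999, 2005, 2012):
        print(wayback_url("census.gov", date(year, 1, 1)))
```

Paste any of the printed links into a browser and the archive should land you on the nearest snapshot it has for that date.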

Fresh Spectrum cartoon showing a museum tour of old web page screen shots


The government moves slowly, so when it moves forward, pay attention, the case of the Census

The Feds are often criticized for being slow to pick up on new technology, which I think is a little unfair. The truth is, it takes time for tech to become tried and true. Government agencies often cannot afford to be innovators because too much is at stake.

If you take a look at the screenshots I put together below, you’ll see a website that undergoes very little change from 1999 – 2011. In 2012, the home page came to life. There are many techies out there waiting on the next big thing. But if you pay attention, you’ll start to notice the last big thing finally hitting the mainstream. My hunch: expect to see other slow and steady movers livening up their web presence.

Iterations of the Census Home Page from 1996 to 2012

It’s time to bring associations to life, the cases of AEA and ASA

While both AEA and ASA have jumped into the social realm (AEA much more so than ASA), both websites are still caught up in the billboard-esque Middle Ages of design. Just compare these two sites with the Census example above. Where do you think they belong? Integrating the social web into the main page does more than invite conversation and breathe life into a design. It offers the opportunity to provide focus while preserving broad access to tools and other member benefits.

Showing screen shots from the American Evaluation Association homepage in 1998 and 2012 and American Sociological Association in 1996 and 2012

We are putting too much faith in corporate information providers, the case of GeoCities

In 1999, Yahoo purchased the third most popular site on the web, GeoCities, for three billion dollars. A mere ten years later, the site was shut down. I’m a huge advocate for the use of free blogging platforms like WordPress and Blogger, but what happens if one of them decides to close up shop? Without the work of sites like the Internet Archive, a lot of our digital history can be wiped away completely as a result of a business decision.

We need to put more effort into archiving academic works in our own domain. Our tendency has been to link to, not reproduce, the works of others online. But there is an inherent problem with that approach: it often leaves only a single copy. In the digital domain, republication helps with preservation. I’ve taken this issue to heart, so expect it to come up in discussions surrounding upcoming improvements to Eval Central.

Screenshot of Yahoo Geocities from 1999


Start with a realistic goal, the case of Wikipedia

Wikipedia started in January of 2001 with the idea of producing a complete encyclopedia from scratch. Their original goal was to make over 100,000 articles (in evaluation we call this a measurable outcome). After the first six months they had around 6,000 articles. Over the next year they added another 30,000 articles. Then over the next six months, or two years after they began, they hit that original 100,000-article target.
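To see how lopsided that climb was, here is a quick back-of-the-envelope sketch. It uses only the rounded figures above, and it assumes the site started from zero articles and that the counts are exact, neither of which is strictly true.

```python
# Milestones from the paragraph above: (months since launch, approximate articles).
# Assumption: the count starts at zero and the rounded figures are treated as exact.
milestones = [(0, 0), (6, 6_000), (18, 36_000), (24, 100_000)]


def monthly_rates(points):
    """Average articles added per month between consecutive milestones."""
    rates = []
    for (m0, a0), (m1, a1) in zip(points, points[1:]):
        rates.append((a1 - a0) / (m1 - m0))
    return rates


rates = monthly_rates(milestones)
for (start, _), (end, _), rate in zip(milestones, milestones[1:], rates):
    print(f"months {start}-{end}: about {rate:,.0f} articles per month")
```

On these numbers the pace goes from roughly 1,000 articles a month in the first stretch to more than 10,000 a month in the last one, which is what hitting your stride looks like.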

I love that Wikipedia started with a target. The 100,000 figure is tangible and provides direction during the initial slog that accompanies the beginning of most legitimate projects. Notice in the excerpt timeline below that they dropped the target from the front page before September 2002, when they were really starting to hit their stride. The presence of an achievable goal doesn’t stop the ball from rolling, and when things start moving, nobody says you have to keep the target.

As of today Wikipedia has close to 4 million articles in English alone.

Showing the welcome text from Wikipedia's home page from 2001 through 2003 and 2012.


Companies that are successful can paint a rosy picture of their past, the case of Tableau

I plugged Tableau into the Wayback Machine because of something written on Tableau’s “Who We Are” page.

Put together an Academy-Award winning professor, a brilliant computer scientist at the world’s most prestigious university, and a savvy business leader with a passion for data. Add in one of the most challenging problems in software – making databases and spreadsheets understandable to ordinary people. You have just recreated the fundamental ingredients for Tableau.

I like Tableau a lot, but they seemed to be laying it on a little thick. After looking at the history, it appears the recipe for Tableau also included instructions to simmer for years. It’s comforting to know that successful things still move slowly.

Who We Are and even Who We Were change with time. Today we may see ourselves as leading a ho-hum life, but depending on what we accomplish in the coming years, maybe we can redefine our past image as well. Well, as long as it’s not archived.

Picture showing changes in the way Tableau describes its roots from 2004 to 2011