Fascinating Interplay About Discovering Content in Web Archives

Web archives have arrived, at least in the pages of high-profile publications such as the Washington Post and the New Yorker.

An especially fascinating exchange took place in mid-February. Gareth Millward, a research fellow in the Centre for History in Public Health at the London School of Hygiene and Tropical Medicine, published “I tried to use the Internet to do historical research. It was nearly impossible” with the Washington Post. In it, he explained the difficulties of navigating extremely large web archives: search queries returned useless results that were not sorted in an ideal fashion (or at all). He suggested that historians may instead need to find smaller, circumscribed corpora or explore metadata.

The response by Andy Jackson, Web Archiving Technical Lead at the British Library, on the British Library’s Web Archive blog was equally illuminating. His piece, “Building a ‘Historical Search Engine’ is No Easy Thing,” is a must-read. He points out that historians have different use cases: simply replicating Google (which excels at telling us what we need to know in an extremely contemporary context) won’t make sense when querying large bodies of web-archived material. He walks us through the various steps of a search engine, and concludes by arguing that we need to think of macroscopes rather than search engines (sidenote: having just finished copyedits on a co-authored book subtitled The Historian’s Macroscope, I’m inclined to agree with this metaphor!).

These two pieces join a third high-profile piece, “The Cobweb: Can the Internet be Archived?” by Harvard historian Jill Lepore. This was a fascinating exploration of the current state and recent history of web archiving, and is well worth your time.

Religion, social media and the web archive

peterwebster:

Peter reblogs here a post on how his own study of contemporary religious history needs to come to terms with the ways in which social media content is (and is not) captured by traditional web archiving. As historians, we will need to understand how social media content is being archived, and how different archives of web-delivered content will need to be interrogated *together* to reconstruct the communication of individuals and organisations.

Originally posted on Webstory: Peter Webster's blog:

Late last year I was delighted to be invited to be one of four keynote speakers at a workshop on religion and social media at the International AAAI Conference on Web and Social Media in Oxford in May. Here are some initial thoughts on what I intend to say.

There has been an interesting upswing recently in scholarly interest in the ways in which religious people, and the organisations in which they gather together, represent themselves and communicate with others on social media. However, this work has been conducted relatively independently from the emerging body of scholarship on the archived web.

Image by https://www.flickr.com/photos/smemon/, CC BY 2.0

There are some reasons for this. First is the fact that much of the scholarship on social media tends to be focussed very firmly on the present. As such, data tends to be gathered directly from social media platforms “to order”, to match the…


Milligan Presentation: “The Promise of WebARChive Files”

This paper was given at the American Historical Association’s annual meeting in New York City on January 5th, 2015. It was part of the Text Analysis, Visualization, and Historical Interpretation panel. My thanks to my co-presenters and especially Micki Kaufman who organized the panel.

The text that follows may not be exactly what I said, but is based on my speaking notes with a bit of memory filling in here and there.


Hello everybody, I’d like to begin with a somewhat provocative opening:

I believe that historians are unprepared to engage with the quantity of digital sources that will fundamentally transform their trade. Web archives are going to transform the work we do for a few main reasons:

Welcome to our Blog!

By Peter Webster and Ian Milligan

The first stage of our project, Web Archives for Historians, has concluded. In just under a year, we’ve amassed a healthy bibliography of about twenty works that fall within the project’s scope: works written by historians covering topics such as (a) reflections on the need for web preservation, and its current state in different countries and globally; (b) how historians could, should or should not use web archives; and (c) examples of actual uses of web archives as primary sources.

We’ve probably reached the ceiling on this front, however! There aren’t that many historians who are actively working in this area (yet, we dare say). And so we now want Web Archives for Historians to transition into an active blog that will:

  • aggregate content by historians or for historians on web archives (similar to the Web Archiving Roundtable) – some of this will come from our own blogs (Peter and Ian), but also from a list of blogs that we’ll be following;
  • draw attention to talks and slides that we spot at scholarly conferences or in publication venues; and
  • carry commissioned posts (eventually).

Our mandate will be to include:

  1. examples of work done using web archives;
  2. historical method in the web archive;
  3. news of significant new projects, tools, data or web services;
  4. contemporary history using the live web (as core source material, rather than just incidentally).

We hope that you will join us by following along in your RSS reader, on Twitter, or just by popping by now and again.