Monthly Archives: November 2016

Interdisciplinary Event Documentation: “The Great WARC Adventure”

Nick Ruest, Anna St-Onge, and I wrote a piece in the open-access journal Digital Studies / Le champ numérique, “The Great WARC Adventure: Using SIPS, AIPS, and DIPS to Document SLAPPS.” Behind the deliberately acronym-heavy title is a piece that does the following:

  • takes readers through the process of creating a web archive using open-source tools;
  • preserves and provides access to the web archive;
  • and enables some basic analysis of the collection from the perspective of a historian.

While the long publishing time meant that some of our more recent approaches to analyzing web archives – warcbase, for example – didn’t make it in, the article hopefully provides a useful conceptual approach to working with web archives.

You can find the article here, and the abstract below. We hope that you enjoy it!

From Dataverse to Gephi: Network Analysis on our Data, A Step-by-Step Walkthrough

We thought that this post from December 2015 was still relevant today. In short, it shows how you can take web archive network files generated by our research team and analyze them yourselves using the open-source Gephi package.

Even more excitingly, there are many more Gephi files available today for your analysis. To find them, visit our network data page here. It grows on a regular basis!

Ian Milligan

Do you want to make this link graph yourself from our data? Read on.

As part of our commitment to open data – in keeping with the spirit and principles of our funding agency, as well as our core scholarly principles – our research team is beginning to release derivative data. The first dataset that we are releasing is the Canadian Political Parties and Interest Groups link graph data, available in our Scholars Portal Dataverse space.

The first file is all-links-cpp-link.graphml, a file consisting of all of the links between websites found in our collection. It was generated using warcbase’s function that extracts links and writes them to a graph network file, documented here. The exact script used can be found here.

However, releasing data is only useful if we show people how they can use it. So! Here you go.
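If you would like a quick sanity check on a GraphML file before opening it in Gephi, something like the following rough sketch may help. It uses only Python's standard library to count nodes and edges and to tally inbound links per node. Two assumptions to flag: the function name summarize_graphml is our own invention for illustration, and the tiny SAMPLE graph below is made up; it is not taken from all-links-cpp-link.graphml, whose node ids and attributes may well be laid out differently.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Standard GraphML namespace, in ElementTree's {uri}tag notation.
GRAPHML_NS = "{http://graphml.graphdrawing.org/xmlns}"


def summarize_graphml(source):
    """Return (node_count, edge_count, in_degree) for a GraphML document.

    `source` may be a file path or a binary file-like object.
    Hypothetical helper for illustration; not part of warcbase or Gephi.
    """
    root = ET.parse(source).getroot()
    nodes = root.findall(f".//{GRAPHML_NS}node")
    edges = root.findall(f".//{GRAPHML_NS}edge")
    # Tally how many edges point at each node id (inbound links).
    in_degree = Counter(edge.get("target") for edge in edges)
    return len(nodes), len(edges), in_degree


# Made-up sample graph (an assumption, not the released dataset).
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="directed">
    <node id="a.example"/>
    <node id="b.example"/>
    <node id="c.example"/>
    <edge source="a.example" target="b.example"/>
    <edge source="c.example" target="b.example"/>
    <edge source="b.example" target="a.example"/>
  </graph>
</graphml>
"""

if __name__ == "__main__":
    n_nodes, n_edges, in_degree = summarize_graphml(
        io.BytesIO(SAMPLE.encode("utf-8"))
    )
    print(f"{n_nodes} nodes, {n_edges} edges")
    for node_id, count in in_degree.most_common(5):
        print(node_id, count)
```

To try it on a downloaded file, pass the file's path to summarize_graphml instead of the in-memory sample. If the real file stores site names in data attributes rather than in node ids, the tally will list ids rather than readable names.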

Video Walkthrough

This video…
