Presentation at CASBS 2010: Muninn Project

  • Posted on: 1 June 2010
  • By: warren


The Muninn Project
Tracking, Transcribing, and Tagging Government: Building Digital Records for Computational Social Science
Tuesday June 22, 2010, 14:15-15:15
Center for Advanced Study in the Behavioral Sciences
 
Abstract:
 
The Muninn Project is a multidisciplinary, 
multinational, academic research project investigating millions of records pertaining to 
the First World War in archives around the world.

In this talk I will review some of the methods being used in the Muninn Project to 
extract information from the scanned documents of historical archives. Previous data 
extraction efforts for historical research relied on human review of documents, one at 
a time. We instead use computing power to collate similar document types and extract 
the information from them.

The Great War era produced a mix of hand-written and type-written documents that require 
computer extraction methods assisted by the manual review of specific cases by human 
volunteers. I will contrast this with previous methods that have been used to digitize 
documents, such as reCAPTCHA, and close with some observations about managing archival 
data in a high-volume setting.


Paper at MEM2010: Canopener: Recycling Old and New Data

  • Posted on: 16 April 2010
  • By: warren


MEM2010, Monday, April 26th, 2010, 11:00am-11:30am
Presented by Cosmin Basca
 
Abstract:
 
The advent of social markup languages and lightweight public data access methods has created an opportunity to share the social, documentary and system information locked in most servers as a mashup. Whereas solutions already exist for creating and managing mashups from network sources, we propose here a mashup framework whose primary information sources are the applications and user files of a server. This enables us to use legacy server data sources that are already maintained as part of basic administration to semantically link user documents and accounts using social web constructs.
