Presentation at Museums on the Web 2015

  • Posted on: 9 March 2015
  • By: warren
Palmer House Hilton, Chicago, IL, USA
April 8-11, 2015, 10:30am - 12:00pm
Grand Ballroom (4F) 
Joint work with David Evans, Minsi Chen, Mark Farrell and Daniel Mayles.
We review the possibilities, pitfalls, and promises of recreating lost heritage sites and historical events using augmented reality and "Big Data" archival databases. We define augmented reality as any means of adding context or content, via audio/visual means, to the current physical space of a visitor to a museum or outdoor site. Examples range from simple prerecorded audio to graphics rendered in real time and displayed using a smartphone.
Previous work has focused on complex multimedia museum guides, whose utility remains to be evaluated as enabling or distracting. We propose the use of a data-driven approach where the exhibits' augmentation is not static but dynamically generated from the totality of the data known about the location, artifacts, or event. For example, at Bletchley Park, reenacted audio conversations are played within rooms as visitors walk through them. These can be called "virtual contents," as the audio recordings are manufactured. Given that a number of documentary sources, such as meeting minutes, are available concerning the events that occurred within the site, a dynamic computer-generated script could add to the exhibits.
Visitors' experiences can therefore react to their movements, provide a different experience each time, and be factually correct without requiring any expensive redesign. Furthermore, the use of a data-driven approach allows for the updating of exhibits on the fly as researchers create or curate new data sources within the museum. If artifacts need to be removed from an exhibit, pictures, descriptions, or three-dimensional printed copies can be substituted, and the augmented reality of visitor experience can adapt accordingly.

Workshop: Computational Linguistics for Libraries, Archives and Museums at CODE4LIB

  • Posted on: 12 March 2014
  • By: warren


CLLAM Workshop (Computational Linguistics for Libraries, Archives and Museums)
Code4Lib Conference 2014, Raleigh, NC, USA
Monday, March 24
Joint presentations with Corey Harper, Amalia Levi, Douglas W. Oard and Robert Warren.
We will hack at the intersection of diverse content from Libraries, Archives and Museums and bleeding-edge tools from computational linguistics for slicing and dicing that content. Did you just acquire the email archives of a start-up company? Maybe you can automatically build an org chart. Have you got metadata in a slew of languages? Perhaps you can search it all using one query. Is name authority control for e-resources getting too costly? Let's see if entity linking techniques can help. These are just a few teasers.

There will be plenty of content and tools supplied, but please bring your own [data] too -- you'll hack with it in new ways throughout the day. We'll get started with some lightning talks on what we've brought, then we'll break up into groups to experiment and work on the ideas that appeal. Three guaranteed outcomes: you'll walk away with new ideas, new tools, and new people you'll have met.


Paper at VLDB 2006: Multi-column substring matching for database schema translation

  • Posted on: 15 August 2006
  • By: warren


Multi-column substring matching for database schema translation,
Wednesday, September 13, 2006, 12:00pm-12:30pm
Abstract: We describe a method for discovering complex schema translations involving substrings from multiple database columns. The method does not require a training set of instances linked across databases and it is capable of dealing with both fixed- and variable-length field columns. We propose an iterative algorithm that deduces the correct sequence of concatenations of column substrings in order to translate from one database to another. We introduce the algorithm along with examples on common database data values and examine its performance on real-world and synthetic datasets.
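
To give a feel for the problem the abstract describes, here is a minimal sketch in Python. It is not the iterative algorithm from the paper: it swaps in a plain brute-force search over short "recipes" (concatenations of whole columns or fixed-length prefixes) and scores each recipe by how many source rows produce a value that appears somewhere in the target column. The function names, the `max_pieces` and `max_prefix` limits, and the toy data are all assumptions made for illustration. As in the abstract, the target values are not linked to the source rows.

```python
from itertools import product


def candidate_pieces(columns, max_prefix=3):
    """List (column, length) pieces; length None means the whole value."""
    pieces = []
    for col in columns:
        pieces.append((col, None))
        for n in range(1, max_prefix + 1):
            pieces.append((col, n))
    return pieces


def apply_recipe(recipe, row):
    """Concatenate the chosen substrings of a single source row."""
    parts = []
    for col, n in recipe:
        value = row[col]
        parts.append(value if n is None else value[:n])
    return "".join(parts)


def best_recipe(source_rows, target_values, max_pieces=2, max_prefix=3):
    """Brute-force search (an assumption, not the paper's method) for the
    recipe whose output matches the most values in the target column."""
    columns = sorted(source_rows[0])
    pieces = candidate_pieces(columns, max_prefix)
    targets = set(target_values)
    best, best_score = None, -1
    for k in range(1, max_pieces + 1):
        for recipe in product(pieces, repeat=k):
            score = sum(apply_recipe(recipe, row) in targets
                        for row in source_rows)
            if score > best_score:
                best, best_score = recipe, score
    return best, best_score


# Hypothetical toy data: a source table with two columns and an
# unordered, unlinked target column of login names.
source_rows = [
    {"first": "ada", "last": "lovelace"},
    {"first": "alan", "last": "turing"},
    {"first": "grace", "last": "hopper"},
]
target_values = ["ghopper", "alovelace", "aturing"]

recipe, score = best_recipe(source_rows, target_values)
print(recipe, score)
```

On this toy data the search settles on the first letter of `first` followed by the whole of `last`. The exhaustive search grows quickly with the number of columns and pieces, which is one reason a smarter iterative approach is attractive for real schemas.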