Baking bibliographic references directly into RDF / OWL ontologies

  • Posted on: 21 July 2024
  • By: warren

Not every part of an ontology is novel. Often enough we reuse existing works and standards to populate it. At other times the ontology implements policies and processes that are outlined in a narrative document but were never before formalised into a data structure. To give credit where credit is due, and to ensure policy compliance, it can be worthwhile to integrate bibliographic citations and references directly within the ontology. Ontologies already exist to record bibliographic data. BIBO is one, with the added benefit that it provides a direct mapping for BibTeX-style bibliography managers. BIBFRAME provides a comprehensive solution if you have a corporate document repository, and Dublin Core has a bibliographicCitation term for recording a citation as a plain string.

Bibliography as ontological objects, but why?

When the bibliographic references for the documents used to create the ontology are themselves available as ontological terms, they can be referenced by other terms or queried directly. This isn’t an exercise in minutiae; it is a change management tool and an additional asset for application developers.

When a term represents a concept defined by a standard, you can link it directly to the specific edition of the document that defines it authoritatively. Besides avoiding the need for those pesky double quotes in the definition property, this also tracks which edition of the standard the term represents. This can easily be done at various levels of verbosity, either with the PROV ontology (for example hadPrimarySource) or with a simple Dublin Core source annotation.
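As a minimal sketch of that link, the snippet below models triples as plain Python tuples so that it runs without any RDF library; a real project would more likely load the ontology with a dedicated RDF toolkit. The example.org namespace, the InterestPeriod term and the ExampleStandard_2019 document are hypothetical; prov:hadPrimarySource, dcterms:source, bibo:Standard and bibo:edition are existing vocabulary terms.

```python
# Triples are plain (subject, predicate, object) tuples: an illustration
# only, not a replacement for a real RDF library.
PROV = "http://www.w3.org/ns/prov#"
DCTERMS = "http://purl.org/dc/terms/"
BIBO = "http://purl.org/ontology/bibo/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "https://example.org/onto#"  # hypothetical namespace

STANDARD_2019 = EX + "ExampleStandard_2019"  # hypothetical document
INTEREST_PERIOD = EX + "InterestPeriod"      # hypothetical ontology term

graph = {
    # Bibliographic record for one specific edition of the standard.
    (STANDARD_2019, RDF_TYPE, BIBO + "Standard"),
    (STANDARD_2019, DCTERMS + "title", "Example Standard"),
    (STANDARD_2019, BIBO + "edition", "2019"),
    # Verbose option: PROV names the edition as the term's primary source.
    (INTEREST_PERIOD, PROV + "hadPrimarySource", STANDARD_2019),
    # Lightweight option: a simple Dublin Core source annotation.
    (INTEREST_PERIOD, DCTERMS + "source", STANDARD_2019),
}


def sources_of(term, triples):
    """Return every document linked to `term` by either source property."""
    link_props = {PROV + "hadPrimarySource", DCTERMS + "source"}
    return {o for s, p, o in triples if s == term and p in link_props}


print(sources_of(INTEREST_PERIOD, graph))
```

Either property would do on its own; both are shown here only to contrast the two levels of verbosity.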

This is important not only for change management: it helps to differentiate terms that have the same labels and similar descriptions but whose context differs from one edition to another. Thus, datasets that record events using older specifications retain their integrity and newly generated data is kept up to date with current standards.

Keeping up with the Joneses

Things change and so does the paperwork. With citations as terms it becomes possible to link them to document management systems and to query them directly within the ontology or the knowledge graph. This allows for the monitoring of which documents are currently referenced by any ontology and whether these will need to be updated upon the publication of a new edition of a reference document.
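One way to picture this monitoring is sketched below, again with triples as plain Python tuples so it runs on the standard library alone. It assumes that a new edition is recorded with the Dublin Core isReplacedBy property and that terms cite their documents with dcterms:source; the namespace, terms and documents are hypothetical.

```python
DCTERMS = "http://purl.org/dc/terms/"
EX = "https://example.org/onto#"  # hypothetical namespace

graph = {
    # The 2019 edition has been superseded by the 2024 edition.
    (EX + "Standard_2019", DCTERMS + "isReplacedBy", EX + "Standard_2024"),
    # One term still cites the old edition, another cites the new one.
    (EX + "InterestPeriod", DCTERMS + "source", EX + "Standard_2019"),
    (EX + "SettlementDate", DCTERMS + "source", EX + "Standard_2024"),
}


def terms_needing_review(triples):
    """List (term, cited document, replacement) for superseded citations."""
    replaced = {s: o for s, p, o in triples if p == DCTERMS + "isReplacedBy"}
    return sorted(
        (s, o, replaced[o])
        for s, p, o in triples
        if p == DCTERMS + "source" and o in replaced
    )


stale = terms_needing_review(graph)
for term, old, new in stale:
    print(f"{term} cites {old}, which is superseded by {new}")
```

In a knowledge graph the same check would normally be a single query against the store; the point is only that it becomes a query rather than a manual audit.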

From an application developer or end user perspective, this allows for greater transparency of process and communication. Design decisions can be linked back to the narrative of the original standards with minimal clerical effort, since that narrative is integrated into the ontology graph.

We often talk about AI explainability as if it were an algorithmic design problem, but this isn’t completely true. There is no point explaining the workings of an algorithm if the underlying rules, facts and definitions that communicate with the outside world aren’t available. No amount of abstract logical proof has value if you are unable to explain whether the coded output means bi-annual interest or monthly interest.

Document, Document, Document

Writing documentation is a thankless task: everyone demands it in their preferred style, few will ever actually read it and, more often than not, the select few individuals who do write it have limited resources to do it properly. An ontology-based bibliography lets you automate some of the documentation generation, referencing and cross-referencing in whatever citation format is required.
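As a rough sketch of that automation, the snippet below renders a BibTeX entry from a bibliographic record held as triples. The triples are plain Python tuples so the example runs on the standard library alone; the record, the citation key and the property-to-field mapping are assumptions made for illustration, not an official BIBO-to-BibTeX mapping.

```python
DCTERMS = "http://purl.org/dc/terms/"
BIBO = "http://purl.org/ontology/bibo/"
EX = "https://example.org/onto#"  # hypothetical namespace

DOC = EX + "Standard_2019"  # hypothetical bibliographic record

graph = {
    (DOC, DCTERMS + "title", "Example Interest Calculation Standard"),
    (DOC, DCTERMS + "publisher", "Example Standards Body"),
    (DOC, DCTERMS + "issued", "2019"),
    (DOC, BIBO + "edition", "2nd"),
}

# Assumed mapping from ontology properties to BibTeX field names.
FIELD_MAP = [
    (DCTERMS + "title", "title"),
    (DCTERMS + "publisher", "organization"),
    (DCTERMS + "issued", "year"),
    (BIBO + "edition", "edition"),
]


def to_bibtex(doc, triples, key):
    """Render the properties recorded for `doc` as a BibTeX @manual entry."""
    values = {p: o for s, p, o in triples if s == doc}
    lines = [f"@manual{{{key},"]
    for prop, field in FIELD_MAP:
        if prop in values:
            lines.append(f"  {field} = {{{values[prop]}}},")
    lines.append("}")
    return "\n".join(lines)


entry = to_bibtex(DOC, graph, "std2019")
print(entry)
```

Swapping FIELD_MAP and the template is enough to target another citation format, which is where keeping the bibliography as data rather than prose pays off.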

Ontologies are primarily objects of communication and interoperability. Some schools of thought, such as "The Ontology Is The Documentation", fail to consider that no matter how logically coherent it is, an ontology is part of a bigger world. The ability to reference outside documents and terms ensures that the ontology functions within an operational context, adds to its value and saves valuable time for its users.
