Provenance records provide detailed accounts of workflow execution episodes, which facilitate the sharing and reuse of workflows as well as their data products. By analyzing provenance records, a scientist can understand the assumptions others made in their reported results and can attempt to reproduce those results with reasonable fidelity. Standard representations of workflow provenance would therefore do much to promote workflow sharing.

We have collaborated with a group of researchers on developing a provenance model that can be shared across workflow systems. The Open Provenance Model (OPM) is a model of provenance designed to meet the following requirements: (1) to allow provenance information to be exchanged between systems, by means of a compatibility layer based on a shared provenance model; (2) to allow developers to build and share tools that operate on such a provenance model; (3) to define provenance in a precise, technology-agnostic manner; (4) to support a digital representation of provenance for any “thing”, whether produced by computer systems or not; (5) to allow multiple levels of description to coexist; and (6) to define a core set of rules that identify the valid inferences that can be made on provenance representations. OPM is the result of a series of Provenance Challenges held as part of the International Provenance and Annotation Workshops, and represents the effort of a broad community of workflow researchers. The core concepts of OPM are Process (an action that is executed), Artifact (any object used or produced by a process), and Agent (an entity that controls processes). OPM represents the provenance of objects (whether digital or not) as an annotated causality graph: a directed acyclic graph enriched with annotations that capture further information pertaining to execution.
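The causality graph described above can be illustrated with a minimal Python sketch. It is not an OPM implementation or an official serialization. The class names `Node` and `ProvenanceGraph`, the method names, and the example workflow (`raw_data.csv`, `clean`, `alice`) are hypothetical and chosen only for illustration. The edge labels follow the dependency names used in the OPM specification (`used`, `wasGeneratedBy`, `wasControlledBy`, `wasDerivedFrom`), and each edge points from an effect to its cause.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """A graph node; `kind` is 'artifact', 'process', or 'agent'."""
    kind: str
    name: str


@dataclass
class ProvenanceGraph:
    """Illustrative annotated causality graph (hypothetical API)."""
    # Each edge is (effect, relation, cause, annotations).
    edges: list = field(default_factory=list)

    def add_edge(self, effect, relation, cause, **annotations):
        self.edges.append((effect, relation, cause, annotations))

    def nodes(self):
        """Every node that appears at either end of an edge."""
        found = set()
        for effect, _, cause, _ in self.edges:
            found.update((effect, cause))
        return found

    def causes(self, node):
        """Direct causes of `node` (one step back in the graph)."""
        return [c for (e, _, c, _) in self.edges if e == node]

    def lineage(self, node):
        """All nodes that `node` depends on, directly or transitively."""
        seen = set()
        stack = [node]
        while stack:
            current = stack.pop()
            for cause in self.causes(current):
                if cause not in seen:
                    seen.add(cause)
                    stack.append(cause)
        return seen

    def is_acyclic(self):
        """True if no node appears in its own lineage."""
        return all(n not in self.lineage(n) for n in self.nodes())


# A one-step workflow: agent 'alice' runs process 'clean' on a raw file.
raw = Node("artifact", "raw_data.csv")
cleaned = Node("artifact", "cleaned.csv")
clean = Node("process", "clean")
alice = Node("agent", "alice")

graph = ProvenanceGraph()
graph.add_edge(clean, "used", raw, role="input")
graph.add_edge(cleaned, "wasGeneratedBy", clean, role="output")
graph.add_edge(clean, "wasControlledBy", alice)
graph.add_edge(cleaned, "wasDerivedFrom", raw)

# Everything the cleaned file depends on: the process, its input, its agent.
cleaned_lineage = graph.lineage(cleaned)
```

Storing edges from effect to cause means that asking where a data product came from is a plain graph traversal, and the acyclicity check mirrors the requirement that a causality graph contain no cycles.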

We have also collaborated with the broader provenance community to develop general representations of provenance records. We participated in the World Wide Web Consortium (W3C) Provenance Incubator Group. The W3C is an international standards body for Web architecture that promotes the establishment of community-driven activities that may lead to standardization efforts. The Incubator Group on Provenance charted a path toward possible standardization proposals in this area. The group developed more than 30 use cases and derived more than 200 requirements from them. A joint report on requirements for provenance on the Web was made widely available. The group also designed mappings across provenance models and vocabularies, using OPM as the reference model. The Final Report of the W3C Provenance Incubator Group includes details on the use cases, requirements, and provenance vocabulary mappings. It also proposed the creation of a Working Group to develop a provenance standard based on 17 core terms that were found to be common in existing vocabularies and necessary to support a broad range of the use cases collected.

Based on that recommendation, the W3C Provenance Working Group was established in April 2011 to develop a provenance standard for the Web. The group has released several documents, including a Primer document and a Provenance Model document in December 2011. The W3C standardization work is ongoing and could change how trust, licensing, and information integration are handled on the Web.

