Lecture: Professor Luc Moreau, "Provenance: An Open Approach to Support Workflow Inter-Operability"

Past Event

Date
03 March 2008, 3:00pm - 4:00pm

Location
Oxford Internet Institute (OII)
1 St Giles, Oxford OX1 3JS

Over the last few years, e-Science and e-Business have emphasized the need to expose existing and new procedures as services, so that they can be composed into sophisticated functionality for end users. In particular, workflows have emerged as a paradigm for representing and managing complex distributed scientific computations. To some extent, with workflow technology, e-scientists are today provided with the means to express and run their experiments. However, while workflow technology is a crucial breakthrough, it is only one of the tools required to support the scientific methodology. Equally important to domain scientists (and very often ignored by computer scientists!) is the ability to describe past experiments, to reproduce and verify them, and to understand differences between executions. The problem is further compounded by the fact that workflow systems will inevitably be heterogeneous, and multiple workflow technologies are bound to co-exist (e.g. Taverna, Triana, Pegasus, Swift, Kepler). Provenance (also known as lineage, pedigree, or audit trail) is crucial to allow scientists to implement their scientific methodology fully in silico. The provenance of a data product is defined as the process that led to that data product. While provenance technology has traditionally been embedded in execution environments (workflow system, operating system, specific application), we have taken a radically different view by seeing a provenance management system as a distinct first-class component of any computational environment where past executions should be inspectable. Applications of our approach include not only e-science but also business, where past processes have to be audited. By taking this view, and separating provenance from workflow, we were able to identify the essence of provenance and to propose an architecture for provenance management systems, which allows past processes to be described even when multiple execution technologies are involved.
In this talk, Professor Moreau presents the principles of provenance, its architectural design, its implementation, and its integration with several workflow technologies. The approach has been successfully deployed in multiple application domains, including astronomy, aerospace engineering, and medicine.