Design of shared workflow catalogs

Our research focuses on the use of shared workflow catalogs for retrieval of appropriate workflows for a user's task.

In prior work, we investigated the use of data-centered queries for workflow matching and retrieval. Work in other groups has focused on workflow matching based on workflow structure and social tagging. However, because these techniques focus on retrieving exact matches, they are of limited use in practical situations where users cannot formulate exact queries that reflect what they are looking for. Typically, users look for workflows with a set of characteristics, but would be happy to find workflows that share some subset of those characteristics. Therefore, in designing shared workflow catalogs, we wanted to investigate retrieval techniques from case-based reasoning that address these issues. Building on general similarity metrics from that community, we developed algorithms that support: 1) workflow discovery, as the metrics allow partial matches against the user’s query; and 2) workflow adaptation, as the metrics reflect the aspects of a user’s query that are not met by the workflow and therefore indicate how the workflow could be changed to satisfy them.
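The idea of partial matching can be illustrated with a minimal sketch. It is not the similarity metric from the publications listed on this page. It assumes, purely for illustration, that a query is a set of weighted characteristics and that each workflow in the catalog is described by a flat set of characteristics. The names `Match` and `rank_workflows`, the feature strings, and the weights are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    name: str
    score: float      # weighted fraction of the query that the workflow satisfies
    unmet: frozenset  # query characteristics the workflow lacks (adaptation hints)


def rank_workflows(query, catalog):
    """Rank workflows by partial match against a weighted query.

    query:   dict mapping a characteristic to its weight (hypothetical format)
    catalog: dict mapping a workflow name to its set of characteristics
    """
    total = sum(query.values())
    results = []
    for name, features in catalog.items():
        met = {f for f in query if f in features}
        score = sum(query[f] for f in met) / total if total else 0.0
        results.append(Match(name, score, frozenset(query) - met))
    # Best score first; ties are broken by name to keep the order deterministic.
    return sorted(results, key=lambda m: (-m.score, m.name))


# Hypothetical query and catalog, for illustration only.
query = {"output:classifier": 2.0, "algorithm:decision-tree": 1.0, "input:csv": 1.0}
catalog = {
    "wf_a": {"output:classifier", "algorithm:decision-tree", "input:arff"},
    "wf_b": {"output:clusters", "input:csv"},
}
ranked = rank_workflows(query, catalog)
```

In this sketch no workflow matches the query exactly, yet both are returned with a score, and the `unmet` field of each result lists what would have to change in the workflow to satisfy the query.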

We developed a new generic model for representing workflows as semantically labeled graphs, together with a related model for knowledge-intensive similarity measures. We developed new algorithms for workflow similarity computation based on A* search. We also developed a new retrieval algorithm that goes beyond traditional sequential retrieval for graphs by interweaving similarity computation with case selection. We evaluated this model with both scientific workflows and business workflows to demonstrate its broad applicability.
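As a rough illustration of how A* search can compute a similarity between two labeled graphs, the following sketch searches for the mapping of query nodes to workflow nodes that maximizes a simple score. It is a simplified stand-in under stated assumptions, not the algorithm from the publications: nodes carry a single label, `label_sim` is a hypothetical local similarity with values in [0, 1], each preserved edge contributes 1, and a query node may remain unmapped. The heuristic is an optimistic upper bound on what the unmapped part can still contribute, so the first complete mapping taken from the queue is optimal.

```python
import heapq
from itertools import count


def astar_similarity(q_nodes, q_edges, c_nodes, c_edges, label_sim):
    """Return (similarity, mapping) for the best mapping of query onto case.

    q_nodes, c_nodes: dict node_id -> label; q_edges, c_edges: set of (src, dst).
    label_sim(a, b) is assumed to return a value in [0, 1].
    """
    order = list(q_nodes)
    total = len(q_nodes) + len(q_edges)
    if total == 0:
        return 1.0, {}
    # Optimistic bound per query node: its best label match, ignoring injectivity.
    best = {q: max((label_sim(q_nodes[q], lab) for lab in c_nodes.values()),
                   default=0.0) for q in order}

    def score(m):  # similarity already achieved by the partial mapping m
        nodes = sum(label_sim(q_nodes[q], c_nodes[c])
                    for q, c in m.items() if c is not None)
        edges = sum(1 for u, v in q_edges
                    if u in m and v in m and (m[u], m[v]) in c_edges)
        return nodes + edges

    def bound(m):  # upper bound on what the unmapped part can still add
        nodes = sum(best[q] for q in order if q not in m)
        edges = sum(1 for u, v in q_edges if u not in m or v not in m)
        return nodes + edges

    tie = count()  # tie-breaker so the heap never compares dicts
    heap = [(-bound({}), next(tie), {})]
    while heap:
        neg_f, _, m = heapq.heappop(heap)
        if len(m) == len(order):  # complete mapping: bound is 0, so f is exact
            return -neg_f / total, m
        q = order[len(m)]
        used = set(m.values())
        for c in list(c_nodes) + [None]:  # None leaves the query node unmapped
            if c is not None and c in used:
                continue
            child = {**m, q: c}
            heapq.heappush(heap, (-(score(child) + bound(child)), next(tie), child))


# Hypothetical example: "normalize" and "filter" are treated as related tasks.
def label_sim(a, b):
    if a == b:
        return 1.0
    return 0.5 if {a, b} == {"normalize", "filter"} else 0.0


query_nodes = {"q1": "load", "q2": "normalize", "q3": "classify"}
query_edges = {("q1", "q2"), ("q2", "q3")}
case_nodes = {"c1": "load", "c2": "filter", "c3": "classify"}
case_edges = {("c1", "c2"), ("c2", "c3")}
sim, mapping = astar_similarity(query_nodes, query_edges,
                                case_nodes, case_edges, label_sim)
```

The search is exponential in the worst case, so this form is only practical for small graphs. Retrieval over a whole catalog would additionally need a way to avoid running the full search for every stored workflow, which is the role of the retrieval algorithm described above.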

This work paves the way for scientists to retrieve and discover workflows from large repositories based on the desired type of result, the type of algorithm used within the workflow, the initial data available, or a combination of these.

This work is reported in the following publications:

* "Retrieval of Semantic Workflows with Knowledge Intensive Similarity Metrics." Ralph Bergmann and Yolanda Gil. Proceedings of the Nineteenth International Conference on Case Based Reasoning (ICCBR), Greenwich, London, September 2011. Available as a preprint.
* "Similarity Assessment and Efficient Retrieval of Semantic Workflows." Ralph Bergmann and Yolanda Gil. Information Systems Journal, Vol. 40, March 2014. Available as a preprint.

