Submission by Individual
Internet-Draft
Intended status: Standards Track
Expires: August 3, 2012

B. Vandervalk, Ed.
E. McCarthy
M. Wilkinson
James Hogg Research Centre, Institute for Heart + Lung Health, University of British Columbia
January 31, 2012

SADI: Semantic Automated Discovery and Integration
draft-bvandervalk-sadi-00

Abstract

This document describes Semantic Automated Discovery and Integration (SADI), a set of best practices for implementing stateless web services that consume RDF data as input and generate RDF data as output. The goal of SADI is to establish conventions that will enable a much higher level of interoperability between web services from independent providers than is currently possible under the widespread use of WSDL/XML and RESTful services. Under SADI, interoperability depends on the shared use of predicate vocabularies, rather than the shared use of particular XML schemas, JSON structures, or ad hoc data formats. Through the use of OWL to describe service input and output datatypes, SADI enables: i) automated discovery of services that provide data or computations of interest, and ii) automated matchmaking between local data and available services. By iterative application of these two capabilities, SADI enables semi-automated construction of arbitrarily complex workflows across independent service providers.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on August 3, 2012.

Copyright Notice

Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1. Executive Summary
2. Introduction
3. Security Considerations
4. IANA Considerations
5. Terminology
6. Motivation and Goals
   6.1. Motivation
   6.2. Design Goals
      6.2.1. Interoperability with the Semantic Web
      6.2.2. Stateless Services
      6.2.3. Batch Processing of Inputs
      6.2.4. Support for Long-running Services
      6.2.5. Explicit Relationship Between Service Input and Output
      6.2.6. Minimal Constraints on Data Modeling
7. Relationship to Other Web Service Standards
   7.1. Web Services Description Language (WSDL)
   7.2. Semantic Annotations for WSDL (SAWSDL)
   7.3. OWL-S and the Web Service Modeling Ontology (WSMO)
8. Running Example: The SADI "Hello, World!" Service
9. Service Metadata
   9.1. Retrieving Service Metadata
   9.2. Representing Service Metadata in RDF
   9.3. Describing Service Interfaces Using OWL
      9.3.1. The Input OWL Class
      9.3.2. The Output OWL Class
   9.4. Describing Service Execution Parameters Using OWL
      9.4.1. The Parameter OWL Class
      9.4.2. The Default Parameter Instance
10. Service Invocation
   10.1. Synchronous Services
   10.2. Asynchronous Services
   10.3. Invoking Services with Execution Parameters
11. Service Registries
12. References
   12.1. Normative References
   12.2. Informative References
Authors' Addresses

1. Executive Summary

SADI is a set of best practices for implementing stateless web services that natively consume RDF data as input and generate RDF data as output. Its primary purpose is to increase the interoperability of services across independent providers. Under SADI, the schemas for the input and output RDF data of a service are defined by the service's _input OWL class_ and _output OWL class_, respectively.
Provider-specified metadata about a service, including the URIs of the input and output OWL classes, is published as an RDF document that is retrievable by an HTTP GET on the service URL. Service invocation is accomplished by issuing an HTTP POST request to the service URL with an appropriate input RDF document as the request body. The input RDF document contains one or more instances of the input OWL class which represent independent inputs to the service, and in response the service returns an output RDF document with a corresponding number of instances of the output OWL class. Each output instance has the same root URI as its corresponding input instance, in order to ensure that the data consumed and generated by the service are explicitly linked.

2. Introduction

The principal benefit of web services is that they enable widespread and convenient reuse of software components, independent of the larger applications or goals being realized by the client software. However, in the current climate of WSDL/XML [WSDL][XML] and RESTful services [REST], the successful implementation of web service clients still depends on detailed human knowledge of the particular services being used.

For WSDL/XML services, software developers must be familiar with the particular XML schemas consumed and generated by a service and must implement transformations of local application data to and from those schemas as necessary. Automatic code generators for WSDL clients assist in this task, but developers must still understand the structure of the input/output data and the meaning of the various components. Likewise, developers of RESTful clients must be familiar with the semantics and permitted values of the named parameters for each service and must implement transformations on the service output data as necessary.

The requirement on software developers to learn and accommodate the particular interfaces and data schemas of each service has a high cost in terms of human labour.
Moreover, variability in the design of schemas and interfaces creates obstacles for the coordinated use of web services across different providers.

SADI addresses the variability of data representation through the use of Semantic Web standards, namely RDF [RDF] and OWL [OWL]. As these standards have been specifically designed to facilitate integration and processing of data across multiple sites, they possess significant advantages over XML and ad hoc data formats for encoding web service input/output. In particular, RDF enables automated merging of data sets and OWL enables automated logical reasoning over data. Meaningful data integration always requires some level of agreement between providers regarding data representations. However, under SADI, providers must only agree at the more granular level of predicate vocabularies, rather than on complete representations of datatypes.

SADI addresses the variability of service interfaces by proposing conventions for retrieving metadata about services and for invoking services. Briefly, metadata about a service is retrievable as an RDF document by issuing an HTTP GET to the service URL, while service invocation is realized by issuing an HTTP POST to the service URL with an input RDF document as the request body. The response to a service invocation is likewise an RDF document.

3. Security Considerations

SADI services and clients are subject to the same security considerations as servers and clients that use the HTTP protocol, as described in Section 15 of [HTTP].

4. IANA Considerations

This document has no actions for IANA.

5. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].
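As a non-normative illustration of the retrieval and invocation conventions summarized in the Introduction, the following Python sketch builds (but does not send) the two HTTP requests a client would issue. The helper names are hypothetical, and the use of an Accept header on both requests and a Content-Type header of application/rdf+xml on the POST are assumptions made for the sake of the example; the service URL is the one used in the running example of this document.

```python
import urllib.request

# Service URL of the running "Hello, World!" example in this document.
SERVICE_URL = "http://sadiframework.org/examples/hello"


def build_metadata_request(service_url, rdf_format="application/rdf+xml"):
    """Build the HTTP GET that retrieves the service metadata document.

    The request is only constructed here, not sent.
    """
    return urllib.request.Request(
        service_url,
        headers={"Accept": rdf_format},
        method="GET",
    )


def build_invocation_request(service_url, input_rdf,
                             rdf_format="application/rdf+xml"):
    """Build the HTTP POST that invokes the service.

    `input_rdf` is the serialized input RDF document (a str). The
    Content-Type header is an assumption of this sketch.
    """
    return urllib.request.Request(
        service_url,
        data=input_rdf.encode("utf-8"),
        headers={"Content-Type": rdf_format, "Accept": rdf_format},
        method="POST",
    )
```

A client would pass either request to urllib.request.urlopen() and parse the response body as RDF. Both requests target the same URL; only the HTTP method and the presence of a request body distinguish metadata retrieval from invocation.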
Readers of this document are expected to have a general familiarity with the HTTP protocol [HTTP], RDF [RDF], and OWL [OWL]. In addition, this document uses the following SADI-specific terminology:

o _input OWL class_ -- an OWL class that defines the required structure (i.e. schema) of a single input to a SADI service. Each SADI service has exactly one input OWL class.

o _input instance_ -- an RDF node that is an instance of the input OWL class for a given SADI service, and thus is a valid input for that service. In general, an RDF node is an instance of an OWL class if it satisfies the property restrictions of the OWL class. Membership of an RDF node in a given OWL class can be determined using an OWL reasoner or can be directly asserted by assigning an rdf:type value that is equal to the OWL class URI.

o _input RDF document_ -- a document containing one or more input instances for a SADI service. A SADI service is invoked by issuing an HTTP POST to the service URL with the input RDF document as the request body.

o _output OWL class_ -- an OWL class that defines the required structure (i.e. schema) of a single output from a SADI service. Each SADI service has exactly one output OWL class.

o _output instance_ -- an RDF node that is an instance of the output OWL class for a given SADI service, and thus is a valid output for that service. In general, an RDF node is an instance of an OWL class if it satisfies the property restrictions of the OWL class. Membership of an RDF node in a given OWL class can be determined using an OWL reasoner or can be directly asserted by assigning an rdf:type value that is equal to the OWL class URI.

o _output RDF document_ -- the result of a SADI service invocation, which contains one or more output instances.
  For a given service invocation, the number of output instances in the output RDF document should be equal to the number of input instances in the input RDF document. Further, the URIs of corresponding input and output instances are always equal.

o _service execution parameter_ -- a value which is separate from the input instances for a service invocation, but affects how the input instances are processed. For example, a "Hello, World!" service that returns a natural language greeting might have a service execution parameter indicating the desired output language for the greeting.

o _parameter OWL class_ -- an OWL class that defines the required structure (i.e. schema) of service execution parameters for a SADI service. Similarly to input and output instances, service execution parameters may have arbitrarily complex representations in RDF. The parameter OWL class describes a graph that contains _all_ service execution parameters. If a service has multiple execution parameters, their RDF representations must be connected in a single RDF graph that is an instance of the parameter OWL class.

o _parameter instance_ -- an RDF node that is an instance of the parameter OWL class for a given SADI service, and thus is a valid set of execution parameters for that service. The input RDF document for a service invocation may contain at most one parameter instance, which affects the processing of all input instances within the input RDF document.

o _default parameter instance_ -- an instance of the parameter OWL class that represents the default values for all service execution parameters. These values are used when an input RDF document does not explicitly specify values for execution parameters.

6. Motivation and Goals

6.1. Motivation

The original motivation for the development of SADI was the complexity of discovering, accessing, and integrating public data and software in the domain of bioinformatics. While there are currently thousands of interrelated bioinformatics databases and software tools freely available on the web, they are published using a plethora of incompatible data models, schemas, and software interfaces that impede their combined use. The authors sought to develop a set of best practices for publishing data and software resources that would simultaneously offer the benefits of Semantic Web standards and technologies, such as the ability to automatically merge data sets and to automatically compute logical inferences from data.

While the development of SADI has been motivated by bioinformatics, there is nothing that prevents its application to other domains. It is applicable in any scenario where integrating data and/or software across multiple sites is required.

6.2. Design Goals

6.2.1. Interoperability with the Semantic Web

One of the primary goals of SADI is to create web services that are compatible with the Semantic Web. In particular, it is desirable that services should be able to exchange data directly with various consumers and producers of RDF data such as triple stores, static RDF documents, OWL reasoners, and RDF browsers. For this reason, SADI services consume a standard RDF document as input and generate a standard RDF document as output.

Another key point of compatibility with the Semantic Web is the use of OWL to define the requirements for the input data, output data, and execution parameters of a service. This permits the use of an OWL reasoner as the main vehicle for data and service matchmaking tasks, such as:

o identifying services that can consume a subset of a given RDF graph as input

o extracting input instances for a service from a given RDF graph

o matching the output interface of one service to the input interface of another service, in order to create service execution chains (workflows)

The intent of SADI is to facilitate the use of web services within Semantic Web applications. For example, the authors have developed a prototype query engine called SHARE [SHARE] that integrates SADI services with a SPARQL query engine, a triple store, and an OWL reasoner in order to answer queries over the data that can be generated by a collection of SADI services.

6.2.2. Stateless Services

The scope of SADI is limited to stateless services so that services and clients can be implemented in a straightforward manner, at the expense of certain types of advanced applications. The set of stateless services includes services that perform any type of data retrieval or data analysis, but excludes services that effect changes in the real world. A common example of the latter type of service is a service that makes a withdrawal from a bank account. Previous Semantic Web Service standards such as WSMO and OWL-S have been developed to model such stateful services. However, the formal description of stateful services is complex, and the design of software agents to coordinate such services is an ongoing research problem.

6.2.3. Batch Processing of Inputs

In order to minimize overhead due to network latency, it should be possible to group independent inputs for a service into a single request, and to receive the corresponding outputs in a single response. In SADI, the input RDF document for a service invocation may contain any number of input instances which represent independent inputs to the service.
Likewise, the output RDF document may contain any number of independent output instances.

6.2.4. Support for Long-running Services

The processing time of a service should not be limited to the lifetime of a TCP connection. Asynchronous SADI services support long-running computations by means of client polling and HTTP redirects, as described in Asynchronous Services (Section 10.2).

6.2.5. Explicit Relationship Between Service Input and Output

It is desirable to ensure that related input and output instances from a service invocation are explicitly linked. This saves a client from the task of tracking input/output relationships on its own, and ensures that the RDF produced by service invocations forms a connected graph that is queryable in a meaningful manner. In SADI, related input and output instances are linked because they share the same URI. This constraint is demonstrated concretely in Running Example: The SADI "Hello, World!" Service (Section 8) and is described more formally in The Output OWL Class (Section 9.3.2).

6.2.6. Minimal Constraints on Data Modeling

Aside from the constraint of the previous section, SADI does not provide any rules about how service input and output data should be modeled in RDF. Service providers are free to encode the data using any OWL or RDFS ontologies deemed suitable. Further, the input and output RDF documents for a service invocation consist only of data that is consumed or generated by the service, respectively. There are no SADI-specific messaging structures required within the input/output RDF documents.

7. Relationship to Other Web Service Standards

7.1. Web Services Description Language (WSDL)

WSDL [WSDL] is an XML schema that is the current de facto standard for machine-readable description of web service interfaces. At the time of writing this document, the most recent version of WSDL is WSDL 2.0.
The most important difference between SADI and WSDL is that SADI uses RDF for message content, whereas WSDL conventionally uses XML. WSDL uses XML Schema [XML Schema] as the default schema language for message structures, but is also extensible to use other schema languages. To date, there have been proposals for extensions that use Document Type Definitions (DTDs) and RelaxNG as alternative schema languages for WSDL. In principle, a similar extension could be created for the use of OWL as a schema language, although none has been put forward to date.

SADI uses only a small, fixed subset of the behaviours that can be described by WSDL. In the terminology of WSDL, SADI services must have:

o _one operation per service_, where an _operation_ is an interaction between the client and the service to accomplish some result. An operation is analogous to a function call in a programming language.

o _one endpoint per service_, where an _endpoint_ is a URL where the client interacts with the service

o _a fixed protocol_, where the _protocol_ is the mechanism for transporting messages between the client and the service. All SADI services use HTTP as the underlying protocol.

o _a fixed message exchange pattern_, where a _message exchange pattern_ is a sequence of messages that are exchanged between a client and a service during an operation. SADI services have one of two possible message exchange patterns, corresponding to synchronous services (Section 10.1) and asynchronous services (Section 10.2).

The only variables of a SADI service interface are the graph representations of the input data, the output data, and the service execution parameters, which are defined by the input OWL class (Section 9.3.1), the output OWL class (Section 9.3.2), and the parameter OWL class (Section 9.4.1), respectively. For SADI, these three OWL classes are the functional analog of a WSDL service description file.

7.2. Semantic Annotations for WSDL (SAWSDL)

SAWSDL [SAWSDL] is a small set of extensions to the WSDL XML schema that facilitates the mapping of XML-based services to a semantic data model (e.g. RDF). Specifically, SAWSDL defines 3 additional XML attributes for WSDL:

o modelReference

o loweringSchemaMapping

o liftingSchemaMapping

_modelReference_ is used to annotate elements of a WSDL interface with entities from a semantic data model, such as class URIs from an OWL ontology. _liftingSchemaMapping_ and _loweringSchemaMapping_ are used to provide mappings of XML datatypes to and from a semantic data model, respectively. The values of _liftingSchemaMapping_ and _loweringSchemaMapping_ are URIs that identify documents that define the transformation; however, SAWSDL is agnostic with respect to the specific mapping language that is used. For example, when translating between XML and RDF, the required mappings might be accomplished with XSLT for the lifting transformation and SPARQL followed by XSLT for the lowering transformation.

SAWSDL can be used as an adaptor layer that maps a WSDL service to the expected behaviour of a SADI service.

7.3. OWL-S and the Web Service Modeling Ontology (WSMO)

OWL-S [OWL-S] and WSMO [WSMO] are two previous Semantic Web Services standards that are similar in their goals and approaches. Both standards define ontologies for describing the _capabilities_ and _choreographies_ of stateful web services, where capabilities are changes to the world that are effected by a service, and choreographies are the sequences of messages exchanged between a client and a service during an interaction. (Choreographies correspond to message exchange patterns in WSDL.) OWL-S and WSMO exceed the descriptive power of WSDL by providing a generic framework for modeling both the internal state of a service and the state of external variables that are affected by the service (e.g.
a credit card balance). Transitions between states are formally described by boolean formulas that express preconditions and postconditions for an event. The principal difference between OWL-S and WSMO is that OWL-S uses OWL as its ontology language, whereas WSMO uses a more expressive language called the Web Service Modeling Language (WSML). The OWL-S and WSMO standards are complex, and the development of software agents to coordinate OWL-S/WSMO services to accomplish higher order tasks is an ongoing area of research.

In comparison to OWL-S and WSMO, SADI is a simpler standard that is limited to the description of stateless services. SADI uses OWL ontologies only for defining the schema of input data, output data, and service execution parameters, and not to define the effects or choreography of a service. The choreography of a SADI service is fixed to one of two possibilities, corresponding to synchronous services (Section 10.1) and asynchronous services (Section 10.2) respectively.

8. Running Example: The SADI "Hello, World!" Service

To illustrate the different aspects of the SADI protocol in a concrete manner, we will frequently make reference to the SADI "Hello, World!" service located at http://sadiframework.org/examples/hello. The purpose of this section is to describe the behaviour of this service and at the same time to provide a brief, non-normative introduction to the key aspects of the SADI protocol.

The SADI "Hello, World!" service consumes one or more input instances representing people with names (e.g. "Guy Incognito") and returns corresponding output instances representing greetings for each person (e.g. "Hello, Guy Incognito!"). The following shows an example input RDF document for the service, in N3:

   @prefix foaf: .
   @prefix hello: .
   @prefix input: .

   input:GuyIncognito a hello:NamedIndividual;
      foaf:name "Guy Incognito" .
   input:HomerSimpson a hello:NamedIndividual;
      foaf:name "Homer Simpson" .

In response, the service generates the following output RDF document:

   @prefix hello: .
   @prefix input: .

   input:GuyIncognito a hello:GreetedIndividual;
      hello:greeting "Hello, Guy Incognito!" .

   input:HomerSimpson a hello:GreetedIndividual;
      hello:greeting "Hello, Homer Simpson!" .

In this example, the input RDF document contains two input instances with the URIs input:GuyIncognito and input:HomerSimpson. Each input instance is processed independently by the service, and so the same result could be generated by invoking the service twice with the input graphs for Guy Incognito and Homer Simpson separately, and afterwards performing an RDF-merge on the two output RDF documents.

The input instances are identified by the service as those URIs having an rdf:type matching the service's _input OWL class_ (hello:NamedIndividual). The purpose of the input OWL class is to describe the expected structure of the input instances. More will be said about the purpose and design of the input OWL class in The Input OWL Class (Section 9.3.1).

Analogously to the case for input instances, each output instance in the output RDF document is assigned an rdf:type equal to the URI of the service's _output OWL class_ (hello:GreetedIndividual). The purpose of the _output OWL class_ is to describe the expected structure of the output instances generated by the service.

Comparing the input and output RDF documents, the reader will observe that the URI of each output instance is equal to the URI of its corresponding input instance (input:GuyIncognito and input:HomerSimpson, respectively). This is a general requirement for SADI services that ensures that related input and output instances are always explicitly linked.
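The effect of the shared-URI requirement can be sketched in a few lines of Python. This is a non-normative illustration only: triples are modeled as plain tuples of prefixed names rather than with an RDF library, the helper names are hypothetical, and the data is copied from the two example documents above.

```python
# Triples from the example input and output RDF documents, written as
# (subject, predicate, object) tuples using the prefixed names as-is.
input_triples = {
    ("input:GuyIncognito", "rdf:type", "hello:NamedIndividual"),
    ("input:GuyIncognito", "foaf:name", "Guy Incognito"),
    ("input:HomerSimpson", "rdf:type", "hello:NamedIndividual"),
    ("input:HomerSimpson", "foaf:name", "Homer Simpson"),
}
output_triples = {
    ("input:GuyIncognito", "rdf:type", "hello:GreetedIndividual"),
    ("input:GuyIncognito", "hello:greeting", "Hello, Guy Incognito!"),
    ("input:HomerSimpson", "rdf:type", "hello:GreetedIndividual"),
    ("input:HomerSimpson", "hello:greeting", "Hello, Homer Simpson!"),
}


def instances_of(triples, owl_class):
    """Return the subjects explicitly typed with the given class."""
    return {s for (s, p, o) in triples if p == "rdf:type" and o == owl_class}


def greetings_for(name, triples):
    """Find the greetings attached to whichever node carries `name`."""
    subjects = {s for (s, p, o) in triples if p == "foaf:name" and o == name}
    return sorted(o for (s, p, o) in triples
                  if p == "hello:greeting" and s in subjects)


# Because input and output instances share URIs, merging the two
# documents is a plain set union with no bookkeeping on the client side.
merged = input_triples | output_triples
```

In this sketch, instances_of(input_triples, "hello:NamedIndividual") and instances_of(output_triples, "hello:GreetedIndividual") yield the same set of URIs, and greetings_for("Homer Simpson", merged) finds the greeting only because the name (from the input document) and the greeting (from the output document) hang off the same node.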
As a result, SADI clients are able to merge the input and output RDF documents from a service invocation into a single, coherent RDF graph that captures the relationships between the input and output data. For example, a client could load the two RDF documents above into a triple store, and then pose queries against the triple store such as "What is the greeting for Homer Simpson?".

9. Service Metadata

The metadata for a SADI service provides information that is potentially helpful to human and/or software clients attempting to use the service. All provider-specified metadata for a SADI service MUST be retrievable as an RDF document by issuing an HTTP GET request to the service URL. Within the RDF document, all metadata items MUST be represented as part of a single, connected RDF graph whose root URI is the URL of the service.

The metadata graph MUST include the following items:

o _input OWL class_ -- an OWL class describing the expected structure of the input RDF graphs consumed by the service

o _output OWL class_ -- an OWL class describing the expected structure of the output RDF graphs generated by the service

If the service has one or more execution parameters, the metadata graph MUST also include:

o _parameter OWL class_ -- an OWL class describing execution parameters for the service

o _default parameter instance_ -- an instance of the parameter OWL class which provides default values for the service execution parameters

The following items are optional:

o _service name_ -- a human readable label for the service

o _description_ -- a human readable description of the service functionality

o _contact e-mail address_ -- an e-mail address where the provider of the service may be contacted

o _service type URI(s)_ -- one or more rdf:type URIs indicating the type of service.
  These URIs may be used to categorize the service by a wide variety of criteria, such as the task performed, the algorithm utilized, or the intended users of the service.

o _unit test(s)_ -- one or more input RDF graph(s) that MUST constitute valid input(s) to the service. The expected output RDF graphs corresponding to these input graphs MAY also be provided.

o _authoritative flag_ -- a boolean value that MUST be true if the person or organization hosting the service is also the author or owner of the data underlying the service, or the author or curator of the software which will generate the service output. If a SADI service acts as an interface to third party data or software, the value of the authoritative flag MUST be false. This is to provide some Quality-of-Service information in cases where multiple services purport to provide the same output data.

In addition, any other items deemed useful to clients of the service MAY be included in the metadata graph.

9.1. Retrieving Service Metadata

An RDF document containing the metadata graph MUST be retrievable by an HTTP GET request to the service URL. The GET request MAY include an Accept header indicating the desired RDF serialization format for the response document. A SADI service MUST support content types of text/rdf+n3 for N3 and application/rdf+xml for RDF/XML, and MAY support additional content types for these formats or for any other RDF serialization formats. In the event of an omitted, unrecognized, or unsupported content type, the default content type used for the response MUST be RDF/XML.

9.2. Representing Service Metadata in RDF

The schema for the metadata graph is beyond the current scope of the SADI specification. At the time of writing, all SADI services and SADI-related tools known to the authors use the myGrid/Moby service ontology [myGrid/Moby] to encode the service metadata graph.
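The content type rules of Section 9.1 can be sketched as a small selection function on the service side. This is a simplified, non-normative illustration: the function name is hypothetical, and the parsing deliberately ignores quality values (q=...) and wildcards in the Accept header, which a production service would need to handle.

```python
# Content types every SADI service must support (Section 9.1).
SUPPORTED_CONTENT_TYPES = {
    "text/rdf+n3": "N3",
    "application/rdf+xml": "RDF/XML",
}

# RDF/XML is the required fallback for omitted, unrecognized, or
# unsupported content types.
DEFAULT_CONTENT_TYPE = "application/rdf+xml"


def choose_content_type(accept_header):
    """Pick the response content type from an Accept header value.

    Simplification of this sketch: the first supported media type
    listed wins; q-values and wildcards are not interpreted.
    """
    if not accept_header:
        return DEFAULT_CONTENT_TYPE
    for item in accept_header.split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type in SUPPORTED_CONTENT_TYPES:
            return media_type
    return DEFAULT_CONTENT_TYPE
```

Falling back to a fixed default rather than returning an error means that a client which sends no Accept header at all still receives a usable RDF/XML document.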
For illustrative purposes, the service metadata graph for a parameterized version of the "Hello, World!" service (http://sadiframework.org/examples/hello-param) is shown below in N3 format.

   @prefix protege-dc: .
   @prefix mygrid: .
   @prefix hello: .
   @prefix test: .
   @prefix foaf: .
   @prefix xsd: .

   a mygrid:serviceDescription ;

   #----------------------------------------
   # Service Name
   #----------------------------------------

   mygrid:hasServiceNameText "ParamaterizedHelloWorld"^^xsd:string ;

   #----------------------------------------
   # Service Description
   #----------------------------------------

   mygrid:hasServiceDescriptionText
      "A \"Hello, world!\" service where the output language is specified in a parameter"^^xsd:string ;

   #----------------------------------------
   # Contact E-mail Address, Authoritative Flag
   #----------------------------------------

   mygrid:providedBy [
      a mygrid:organisation ;
      protege-dc:creator "person@organization.com"^^xsd:string ;
      mygrid:authoritative "false"^^xsd:boolean
   ] ;

   mygrid:hasOperation [

      a mygrid:operation ;

      #----------------------------------------
      # Input OWL Class
      #----------------------------------------

      mygrid:inputParameter [
         a mygrid:parameter ;
         mygrid:objectType hello:NamedIndividual
      ] ;

      #----------------------------------------
      # Parameter OWL Class, Default Parameter Graph
      #----------------------------------------

      mygrid:inputParameter [
         a mygrid:secondaryParameter ;
         mygrid:objectType hello:SecondaryParameters ;
         mygrid:hasDefaultValue [
            a hello:SecondaryParameters ;
            hello:lang "en"^^xsd:string
         ]
      ] ;

      #----------------------------------------
      # Output OWL Class
      #----------------------------------------

      mygrid:outputParameter [
         a mygrid:parameter ;
         mygrid:objectType hello:GreetedIndividual
      ] ;

      #----------------------------------------
      # Unit Test
      # (test input/output RDF included directly)
      #----------------------------------------

      mygrid:hasUnitTest [
         a mygrid:testCase ;
         mygrid:exampleInput [
            a hello:InputClass ;
            foaf:name "Guy Incognito"
         ] ;
         mygrid:exampleOutput [
            a hello:OutputClass ;
            hello:greeting "Hello, Guy Incognito!"
         ]
      ] ;

      #----------------------------------------
      # Unit Test
      # (test input/output RDF in external documents)
      #----------------------------------------

      mygrid:hasUnitTest [
         a mygrid:testCase ;
         mygrid:exampleInput test:hello-param-input.rdf ;
         mygrid:exampleOutput test:hello-param-output.rdf
      ]
   ] .

9.3. Describing Service Interfaces Using OWL

The input and output OWL classes for a service provide a machine-readable representation of the service interface.
This facilitates the automation of various data and service matchmaking tasks, such as:

   o  identifying services that can consume a subset of a given RDF graph as input

   o  extracting input instances for a service from a given RDF graph

   o  matching the output interface of one service to the input interface of another service, in order to create service execution chains (workflows)

Moreover, the use of OWL for describing service interfaces enables the use of an _OWL reasoner_ as the main vehicle for accomplishing these tasks.

9.3.1.  The Input OWL Class

The primary purpose of the input OWL class is to identify and extract valid input instances for a service from a given RDF data set. Each SADI service has exactly one input OWL class, which MUST either be referenced by or directly included in the metadata graph for the service. For illustrative purposes, the following excerpt shows the definition of hello:NamedIndividual, which is the input OWL class for the SADI "Hello, World!" service:

   [RDF/XML listing not preserved; surviving cardinality value: 1]

This class definition states that a URI is an instance of hello:NamedIndividual if and only if it has one or more values for the foaf:name property. As a result, each input instance for the "Hello, World!" service is required to have at least one foaf:name property.

9.3.1.1.  Instance Checking and the Input OWL Class

The identification of instances of an OWL class within an RDF graph is a commonly supported operation of OWL reasoners, which we will refer to here as _instance checking_. The purpose of this section is to provide guidelines for writing an input OWL class that enables an OWL reasoner to perform instance checking in a useful manner.

The most important consideration when authoring an input OWL class is that the conditions for class membership should be defined using necessary and sufficient ('if and only if') conditions.
In OWL, necessary conditions ('if') are defined using the rdfs:subClassOf property, whereas necessary and sufficient conditions ('if and only if') are defined using the owl:equivalentClass property. For example, the following two excerpts show alternate definitions of hello:NamedIndividual which use necessary conditions and necessary and sufficient conditions, respectively:

   [RDF/XML listing not preserved; surviving cardinality value: 1]

   [RDF/XML listing not preserved; surviving cardinality value: 1]

The first definition uses only a necessary condition. It states that a URI has one or more foaf:name properties _if_ it is an instance of hello:NamedIndividual. (A URI is known to be an instance of hello:NamedIndividual if it has an rdf:type value of hello:NamedIndividual.) From this rule, a reasoner cannot deduce that a given URI is a member of hello:NamedIndividual based on its properties.

The second definition uses a necessary and sufficient condition. It states that a URI has one or more foaf:name properties _if and only if_ it is an instance of hello:NamedIndividual. From this rule, the reasoner can deduce that a URI with one or more foaf:name properties is a member of the hello:NamedIndividual class.

A second consideration when authoring an input OWL class is that, at the time of writing this document, the majority of OWL reasoners operate under the Open World Assumption (OWA). The OWA holds that a statement cannot be inferred to be false merely by its absence from a data set. Instead, the truth value of such a statement is simply unknown. For example, consider an RDF data set which provides foaf:name values for various URIs representing people, as might be used for input to the SADI "Hello, World!" service. Under the OWA, the fact that a particular URI (person) only has a single value for the foaf:name property within a particular data set does not imply that the person only has one name.
He or she may have aliases that are represented by other values for foaf:name in other RDF data sets on the web. Similarly, one cannot assume that a person does not have a name simply because there is no foaf:name value for that person in a particular data set.

Under the OWA, certain types of property restrictions cannot be directly tested for truth. For example, consider the following definition of hello:NamedIndividual, which uses an exact cardinality restriction:

   [RDF/XML listing not preserved; surviving cardinality value: 1]

This definition states that a URI is a member of the hello:NamedIndividual class _if and only if_ it has exactly one foaf:name property. However, even if a URI (person) has exactly one value for foaf:name in a particular data set, that same URI may possess any number of additional foaf:name values in other RDF data sets on the web. For this reason, a reasoner using the OWA can never confirm the truth of an exact cardinality restriction by examining the known properties of a URI. It can only prove the falsehood of the cardinality restriction in cases where the URI is known to have a greater number of distinct values for the property than desired.

As the primary purpose of the input OWL class is to automatically identify and extract input graphs for a service from an RDF data set, it is important to define the input OWL class using property restrictions that can be directly tested for truth.
The following types of property restrictions satisfy this criterion:

   o  owl:someValuesFrom (existential quantification)

   o  owl:hasValue (value restriction)

   o  owl:minCardinality (minimum cardinality restriction)

   o  owl:minQualifiedCardinality (qualified minimum cardinality restriction)

On the other hand, the following types of property restrictions do NOT satisfy the directly-testable-for-truth criterion:

   o  owl:allValuesFrom (universal quantification)

   o  owl:cardinality (cardinality restriction)

   o  owl:qualifiedCardinality (qualified cardinality restriction)

   o  owl:maxCardinality (maximum cardinality restriction)

   o  owl:maxQualifiedCardinality (qualified maximum cardinality restriction)

9.3.2.  The Output OWL Class

The output OWL class provides a machine-readable description of the output instances produced by a service, which facilitates the automated identification of services that produce data of interest to either human or software clients. Each SADI service has exactly one output OWL class, which MUST be either referenced by or directly included in the metadata graph for the service.

Each input instance that is sent to a service produces exactly one output instance, and each pair of corresponding input and output instances ALWAYS has the same URI. While the function of the input OWL class is to identify valid input instances for a service within an RDF data set, the function of the output OWL class is to describe the graph structures that are attached to each input instance as a result of invoking the service.

To illustrate, the following shows the definition of hello:GreetedIndividual, the output OWL class for the SADI "Hello, World!" service:

   [RDF/XML listing not preserved; surviving cardinality value: 1]

The definition for hello:GreetedIndividual indicates that one or more hello:greeting properties will be attached to each input instance as a result of invoking the service.

9.4.  Describing Service Execution Parameters Using OWL

Service execution parameters are values which are separate from the input instances for a service invocation, but which affect how those input instances are processed. For example, the parameterized version of the SADI "Hello, World!" service (discussed below) accepts an execution parameter indicating the desired language for the generated greetings.

9.4.1.  The Parameter OWL Class

Analogously to the input and output OWL classes, the _parameter OWL class_ describes the RDF representation of the service execution parameters. A service may have any number of execution parameters, and each parameter may have an arbitrarily complex representation in RDF. However, all parameter representations MUST be contained within a single, connected RDF graph that is described by the parameter OWL class. Any service with execution parameters MUST have exactly one parameter OWL class that is either referenced by or directly included in the metadata graph for the service.

To demonstrate the description of service execution parameters, consider an extended "Hello, World!" service that returns greetings in alternate languages. Such a service has been implemented and is accessible at http://sadiframework.org/examples/hello-param. The parameter OWL class for the service (hello:SecondaryParameters) describes a single execution parameter for the desired language of the output greetings, as shown below:

   [RDF/XML listing not preserved; surviving cardinality value: 1]

Although the intention of the hello:SecondaryParameters class is to describe a graph with a single value for hello:lang, a minimum cardinality restriction is used to facilitate instance checking with OWL reasoners that use the Open World Assumption (OWA). For a detailed discussion of OWL class design as it pertains to instance checking and the OWA, see Instance Checking and the Input OWL Class (Section 9.3.1.1).
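
The effect of the Open World Assumption on the choice of restriction can be sketched in a few lines of Python. This is only an illustration and does not describe how any particular OWL reasoner is implemented: the triple representation, the function names, and the input:params identifier are hypothetical, and the sketch assumes that property values are literals, so that distinct strings denote distinct values. A result of None stands for "unknown".

```python
from typing import Iterable, Optional, Tuple

# A triple is modelled as a (subject, predicate, object) tuple of strings.
# This representation is an assumption made for the sketch.
Triple = Tuple[str, str, str]


def count_known_values(triples: Iterable[Triple], subject: str, predicate: str) -> int:
    """Count the distinct values of `predicate` known for `subject`."""
    return len({o for s, p, o in triples if s == subject and p == predicate})


def check_min_cardinality(triples, subject, predicate, n) -> Optional[bool]:
    """Minimum cardinality: can be confirmed, but never refuted, under the OWA."""
    if count_known_values(triples, subject, predicate) >= n:
        return True   # data found elsewhere can only add values
    return None       # unknown: missing values may exist in other data sets


def check_exact_cardinality(triples, subject, predicate, n) -> Optional[bool]:
    """Exact cardinality: can be refuted, but never confirmed, under the OWA."""
    if count_known_values(triples, subject, predicate) > n:
        return False  # already more distinct values than allowed
    return None       # unknown: further values may exist in other data sets


# Hypothetical parameter instance with a single hello:lang value.
data = [("input:params", "hello:lang", "es")]
print(check_min_cardinality(data, "input:params", "hello:lang", 1))
print(check_exact_cardinality(data, "input:params", "hello:lang", 1))
```

With a single known hello:lang value, the minimum cardinality check succeeds while the exact cardinality check remains undecided, which is why a minimum cardinality restriction is the more useful choice for instance checking.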
When invoking a service, a client indicates the desired values of execution parameters by including a single _parameter instance_ in the input RDF document for the service invocation, where the parameter instance is an RDF node that is an instance of the parameter OWL class. All input instances within the input RDF document are processed according to this single, shared set of parameter values.

9.4.2.  The Default Parameter Instance

The inclusion of a parameter instance in the input RDF document for a service invocation is not required. If no parameter instance is provided by the client, the service MUST use a suitable set of default values for the execution parameters. Further, the service metadata graph MUST either include or reference exactly one _default parameter instance_ that represents these default values for the parameters. The output instances generated by an input RDF document that includes a copy of the default parameter instance MUST be identical to the output instances generated by an input RDF document that does not contain a parameter instance.

For illustrative purposes, the following N3 shows the default parameter instance for the parameterized "Hello, World!" service discussed above:

   @prefix hello: .

   [] a hello:SecondaryParameters;
      hello:lang "en" .

This default parameter instance indicates that, unless otherwise specified, greetings will be returned in English.

10.  Service Invocation

10.1.  Synchronous Services

Communication with a synchronous SADI service occurs according to the following steps:

1.  The client sends an HTTP POST request to the service URL. The body of the request is an RDF document containing one or more instances of the service's input OWL class and, optionally, one instance of the service's parameter OWL class.
The serialization format of the RDF document SHOULD be specified in the Content-type header of the POST request (one of application/rdf+xml or text/rdf+n3). If no Content-type header is sent with the request, the service MUST assume that the input is in RDF/XML format. Similarly, the desired serialization format of the response RDF document MAY be indicated with an Accept header (also one of application/rdf+xml or text/rdf+n3). In the case that the client does not provide an Accept header, or the content type is unsupported, the service MUST return RDF/XML.

   POST /examples/hello HTTP/1.1
   Host: sadiframework.org
   Accept: text/rdf+n3
   Content-type: text/rdf+n3

   @prefix hello: .
   @prefix input: .
   @prefix foaf: .
   @prefix rdf: .

   input:GuyIncognito a hello:NamedIndividual;
      foaf:name "Guy Incognito" .

   input:HomerSimpson a hello:NamedIndividual;
      foaf:name "Homer Simpson" .

2.  The service sends a response containing the output. The body of the response is an RDF document containing one or more instances of the service's output OWL class. Each instance of the output OWL class MUST have the same root URI as exactly one instance of the input OWL class from the input RDF document. The serialization format of the output RDF document MUST be specified in the Content-type header of the response (one of application/rdf+xml or text/rdf+n3).

   HTTP/1.1 200 OK
   Content-type: text/rdf+n3

   @prefix hello: .
   @prefix input: .

   input:GuyIncognito a hello:GreetedIndividual;
      hello:greeting "Hello, Guy Incognito!" .

   input:HomerSimpson a hello:GreetedIndividual;
      hello:greeting "Hello, Homer Simpson!" .

A synchronous service must finish processing its input before the TCP connection between the client and the service times out. SADI services that must run for longer periods of time should be implemented as asynchronous services, as described in the next section.
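
The synchronous exchange can be sketched from the client's side using only the Python standard library, together with the default rules a service applies to the Content-type and Accept headers. This is a minimal sketch under stated assumptions: the function names are hypothetical, the namespace URIs in the usage example are placeholders rather than the real ontology URIs, and an Accept header is treated as a single media type with no quality values.

```python
import urllib.request

RDF_XML = "application/rdf+xml"
N3 = "text/rdf+n3"
SUPPORTED = (RDF_XML, N3)


def build_invocation(service_url, rdf_body, content_type=N3, accept=N3):
    """Build the HTTP POST request for a synchronous service invocation."""
    headers = {"Content-Type": content_type, "Accept": accept}
    return urllib.request.Request(
        service_url, data=rdf_body.encode("utf-8"), headers=headers, method="POST"
    )


def invoke(request):
    """Send the request and return the response body (performs network I/O)."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def request_format(content_type_header):
    """Service-side rule: assume RDF/XML when no Content-type header is sent."""
    return content_type_header if content_type_header else RDF_XML


def response_format(accept_header):
    """Service-side rule: return RDF/XML when Accept is absent or unsupported."""
    return accept_header if accept_header in SUPPORTED else RDF_XML


# Placeholder namespaces (hypothetical); substitute the service's real ones.
body = """
@prefix hello: <http://example.org/hello#> .
@prefix input: <http://example.org/input#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

input:GuyIncognito a hello:NamedIndividual;
    foaf:name "Guy Incognito" .
"""

request = build_invocation("http://sadiframework.org/examples/hello", body)
print(request.get_method(), request.full_url)
```

Calling invoke(request) would send the document and return the output RDF as text. It is kept separate from build_invocation so that the request can be inspected without touching the network.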

10.2.  Asynchronous Services

Asynchronous SADI services use client polling and HTTP redirects to accommodate services that may run for an arbitrarily long period of time. Communication with an asynchronous SADI service occurs according to the following steps:

1.  The client sends an HTTP POST request to the service URL. The body of the request is an RDF document containing one or more instances of the service's input OWL class and, optionally, one instance of the service's parameter OWL class. The serialization format of the RDF document SHOULD be specified in the Content-type header of the POST request (one of application/rdf+xml or text/rdf+n3). If no Content-type header is sent with the request, the service MUST assume that the input is in RDF/XML format. Similarly, the desired serialization format of the response RDF document MAY be indicated with an Accept header (also one of application/rdf+xml or text/rdf+n3). In the case that the client does not provide an Accept header, or the content type is unsupported, the service MUST return the response in RDF/XML.

   POST /examples/hello HTTP/1.1
   Host: sadiframework.org
   Accept: text/rdf+n3
   Content-type: text/rdf+n3

   @prefix hello: .
   @prefix input: .
   @prefix foaf: .
   @prefix rdf: .

   input:GuyIncognito a hello:NamedIndividual;
      foaf:name "Guy Incognito" .

   input:HomerSimpson a hello:NamedIndividual;
      foaf:name "Homer Simpson" .

2.  The service sends a response with the HTTP response code 202 (accepted but incomplete). The body of the response is an RDF document containing statements about the input instances. The serialization format of the RDF document MUST be specified in the Content-type header of the response (one of application/rdf+xml or text/rdf+n3). The existence of additional data is indicated by rdfs:isDefinedBy statements where the object is a URL the client must fetch to receive the complete output. There MAY be multiple such URLs and the initial output MAY contain only rdfs:isDefinedBy statements.

   HTTP/1.1 202 Accepted
   Content-type: text/rdf+n3

   @prefix hello: .
   @prefix input: .
   @prefix rdfs: .
   @prefix rdf: .

   input:GuyIncognito a hello:GreetedIndividual;
      rdfs:isDefinedBy .

   input:HomerSimpson a hello:GreetedIndividual;
      rdfs:isDefinedBy .

3.  The client sends a GET request for each rdfs:isDefinedBy URL in the initial response. The client SHOULD include an Accept header indicating the desired RDF serialization format for the response, which otherwise defaults to RDF/XML.

   Request 1:

   GET /examples/hello?poll=1 HTTP/1.1
   Host: sadiframework.org
   Accept: text/rdf+n3

   Request 2:

   GET /examples/hello?poll=2 HTTP/1.1
   Host: sadiframework.org
   Accept: text/rdf+n3

4.  If the output for a given polling URL is ready, the service sends a response with the output. The body of the response is an RDF document containing statements that should be combined with the initial output document. The serialization format of the RDF document MUST be specified in the Content-type header of the response (one of application/rdf+xml or text/rdf+n3). If the output is not yet ready, the service sends an HTTP redirect with a Retry-after header that contains the number of seconds that the client should wait before trying again. In the example responses below, the response for the first output is ready, but the response for the second output is not yet ready.

   Response for Request 1:

   HTTP/1.1 200 OK
   Content-type: text/rdf+n3

   @prefix hello: .
   @prefix input: .

   input:GuyIncognito a hello:GreetedIndividual;
      hello:greeting "Hello, Guy Incognito!" .

   Response for Request 2:

   HTTP/1.1 302 Moved Temporarily
   Pragma: sadi-please-wait = 5000
   Location: http://sadiframework.org/examples/hello?poll=2

5.  The client waits as suggested for the second output, then follows the redirect.

   GET /examples/hello?poll=2 HTTP/1.1
   Host: sadiframework.org
   Accept: text/rdf+n3

6.  If the second output is still not ready, the service sends another HTTP redirect as above. When the second output is ready, the service returns another RDF document:

   HTTP/1.1 200 OK
   Content-type: text/rdf+n3

   @prefix hello: .
   @prefix input: .

   input:HomerSimpson a hello:GreetedIndividual;
      hello:greeting "Hello, Homer Simpson!" .

7.  The client may perform an RDF-merge on the initial output document and the output documents from each polling URL to create an RDF document that contains a complete representation of all output instances. The client SHOULD remove the rdfs:isDefinedBy statements from the merged document, as the polling URLs MAY expire after returning data or after a fixed period of time, at the discretion of the service provider.

   @prefix hello: .
   @prefix input: .

   input:GuyIncognito a hello:GreetedIndividual;
      hello:greeting "Hello, Guy Incognito!" .

   input:HomerSimpson a hello:GreetedIndividual;
      hello:greeting "Hello, Homer Simpson!" .

10.3.  Invoking Services with Execution Parameters

_Service execution parameters_ are settings that affect the behaviour of a service invocation and which are separate from the input data that is processed by the service. A SADI service MAY have any number of execution parameters that are collectively described by a single _parameter OWL class_, as detailed in The Parameter OWL Class (Section 9.4.1).

A client indicates values for service execution parameters when invoking a synchronous or asynchronous SADI service by including exactly one instance of the parameter OWL class in the input RDF document. All instances of the input OWL class within the input RDF document MUST be processed according to this shared set of execution parameters.
If the client includes more than one instance of the parameter OWL class in the input RDF document, the behaviour of the service is undefined.

If no instance of the parameter OWL class is provided by the client, the service MUST use a suitable set of default values for the execution parameters. In the case where the parameter graph represents multiple independent parameters, the client MAY omit any component of the parameter graph corresponding to a particular parameter, in which case the service will use the corresponding default value instead. The metadata graph for every service that has execution parameters MUST either reference or directly include exactly one _default parameter instance_, as described in The Default Parameter Instance (Section 9.4.2).

The following shows an example invocation of the parameterized "Hello, World!" service at http://sadiframework.org/examples/hello-param, which is a synchronous service. The client indicates that the output greetings should be returned in Spanish by including an appropriately constructed instance of the parameter OWL class in the input RDF document.

   Request:

   POST /examples/hello-param HTTP/1.1
   Host: sadiframework.org
   Accept: text/rdf+n3
   Content-type: text/rdf+n3

   @prefix hello: .
   @prefix input: .
   @prefix foaf: .

   [] a hello:SecondaryParameters;
      hello:lang "es" .

   input:GuyIncognito a hello:NamedIndividual;
      foaf:name "Guy Incognito" .

   input:HomerSimpson a hello:NamedIndividual;
      foaf:name "Homer Simpson" .

   Response:

   HTTP/1.1 200 OK
   Content-type: text/rdf+n3

   @prefix hello: .
   @prefix input: .

   input:GuyIncognito a hello:GreetedIndividual;
      hello:greeting "Hola, Guy Incognito!" .

   input:HomerSimpson a hello:GreetedIndividual;
      hello:greeting "Hola, Homer Simpson!" .

11.  Service Registries

A service registry allows users to query metadata about a collection of services, in order to discover services that accomplish a task of interest. The implementation of service registries for SADI is beyond the current scope of the specification. However, for the benefit of SADI users, the authors maintain a public registry of SADI services that may be accessed at http://sadiframework.org/registry.

12.  References

12.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, 1997.

   [WSDL]  Chinnici, R., Moreau, J., Ryman, A., and S. Weerawarana, "Web Services Description Language (WSDL) Version 2.0 Part 1: Core Language", W3C Recommendation, 2007.

   [XML]  Bray, T., Paoli, J., Sperberg-McQueen, C., Maler, E., and F. Yergeau, "Extensible Markup Language (XML) 1.0 (Fifth Edition)", W3C Recommendation, 2008.

   [REST]  Fielding, R., "Architectural styles and the design of network-based software architectures", University of California, 2000.

   [RDF]  Klyne, G. and J. Carroll, "Resource Description Framework (RDF): Concepts and Abstract Syntax", W3C Recommendation, 2004.

   [OWL]  Motik, B., "OWL 2 Web Ontology Language: Structural Specification and Functional-Style Syntax", W3C Recommendation, 2009.

   [HTTP]  Fielding, R., "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616, 1999.

   [XML_Schema]  Thompson, H., Beech, D., Maloney, M., and N. Mendelsohn, "XML Schema Part 1: Structures Second Edition", W3C Recommendation, 2004.

   [SAWSDL]  Farrell, J. and H. Lausen, "Semantic Annotations for WSDL and XML Schema", W3C Recommendation, 2007.

   [OWL-S]  Martin, D., "OWL-S: Semantic markup for web services", W3C Member Submission, 2004.

   [WSMO]  De Bruijn, J., "Web Service Modeling Ontology (WSMO)", W3C Member Submission, 2005.

12.2.  Informative References

   [myGrid/Moby]  Wolstencroft, K., Alper, P., Hull, D., and C. Wroe, "The myGrid Ontology: Bioinformatics Service Discovery", International Journal of Bioinformatics Research and Applications, 2007.

   [SHARE]  Vandervalk, B., "The SHARE system: a semantic web based approach for evaluating queries across distributed bioinformatics databases and software", University of British Columbia, 2011.

Authors' Addresses

   Ben Vandervalk (editor)
   James Hogg Research Centre, Institute for Heart + Lung Health, University of British Columbia
   Email: ben dot vvalk at gmail

   E. Luke McCarthy
   James Hogg Research Centre, Institute for Heart + Lung Health, University of British Columbia

   Mark D. Wilkinson
   James Hogg Research Centre, Institute for Heart + Lung Health, University of British Columbia