AI Techniques for Biology

8 - OWL Language and Semantic Web

André Lamúrias

Recap - Semantic Web

RDF - Resource Description Framework

Basic Idea

  • Represent information about resources in the World Wide Web
  • Statements using triples containing a subject, a predicate and an object
    • subject - who/what the statement is about
    • predicate - property/characteristic of the subject in the statement
    • object - associated value using the characteristic
  • Use uniform resource identifiers (URIs) to identify them; the object may also be a literal value
  • Example:
http://www.example.org/ hasCreator John Smith
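A triple can be held as a plain three-field record. The following is a minimal sketch in Python, not a standard API: the `Triple` type, the `describe` helper and the URIs chosen for the property and for John Smith are assumptions made for illustration; only http://www.example.org/ comes from the example above.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical URIs for the predicate and the object (not from the slides).
statement = Triple(
    subject="http://www.example.org/",
    predicate="http://www.example.org/terms/hasCreator",
    obj="http://www.example.org/staff/JohnSmith",
)


def describe(t: Triple) -> str:
    """Write the triple with each URI in angle brackets, one statement per line."""
    return f"<{t.subject}> <{t.predicate}> <{t.obj}> ."


print(describe(statement))
```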

RDF formats - RDF/XML

  • Example (modified) from the Enzyme Commission numerical classification scheme for enzymes
  • Taken from UniProt, a resource of protein sequence and annotation data
1.  <?xml version='1.0' encoding='UTF-8'?>
2.  <rdf:RDF xmlns="http://purl.uniprot.org/core/"
3.           xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
4.    <rdf:Description
5.        rdf:about="http://purl.uniprot.org/enzyme/1.14.11.2">
6.      <rdf:type rdf:resource="http://purl.uniprot.org/core/Enzyme"/>
7.      <name>Procollagen-proline dioxygenase</name>
8.      <name>Procollagen-proline 4-dioxygenase</name>
9.      <name>Prolyl 4-hydroxylase</name>
10.     <activity>L-proline-[procollagen] + 2-oxoglutarate + O(2) =
11.           trans-4-hydroxy-L-proline-[procollagen] + succinate + CO(2).
12.     </activity>
13.     <cofactor>Iron</cofactor>
14.     <cofactor>L-ascorbic acid</cofactor>
15.   </rdf:Description>
16. </rdf:RDF>

RDF formats - RDF/XML

  • Line 1: XML declaration
  • Line 2: opens the RDF element and defines the document's default namespace
  • Line 3: declares the RDF namespace, with rdf as its prefix
  • Lines 4-5: open the Description element; rdf:about specifies the subject
  • Line 6: type specification (the subject is an enzyme)
  • Lines 7-9: synonyms, each given with name
  • Lines 10-12: activity (equation for the chemical reaction)
  • Lines 13-14: cofactors
  • Lines 15 and 16: close the Description and RDF elements
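The line-by-line reading above can be reproduced mechanically. The sketch below uses Python's standard `xml.etree.ElementTree` on a shortened copy of the enzyme record and lists the triples it contains. It is not a general RDF/XML parser: it assumes the flat pattern of this example, where every child of `rdf:Description` is one property. The helper name `extract_triples` is made up for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Shortened copy of the enzyme record (fewer names, no activity).
RDF_XML = """<?xml version='1.0' encoding='UTF-8'?>
<rdf:RDF xmlns="http://purl.uniprot.org/core/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://purl.uniprot.org/enzyme/1.14.11.2">
    <rdf:type rdf:resource="http://purl.uniprot.org/core/Enzyme"/>
    <name>Prolyl 4-hydroxylase</name>
    <cofactor>Iron</cofactor>
    <cofactor>L-ascorbic acid</cofactor>
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(xml_text):
    """Return (subject, predicate, object) tuples for flat rdf:Description blocks."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for child in desc:
            # ElementTree stores qualified names as "{namespace}local";
            # dropping the braces yields the predicate URI.
            predicate = child.tag.replace("{", "").replace("}", "")
            resource = child.get(f"{{{RDF_NS}}}resource")
            # The object is either a referenced resource or literal text.
            obj = resource if resource is not None else (child.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples


for triple in extract_triples(RDF_XML):
    print(triple)
```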

RDF formats - N3

  • "Readable" RDF syntax
    • Define namespaces as prefixes
    • Define triples
    • rdf:type expressed by "a"
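    • Use S P1 O1 ; P2 O2 . for S P1 O1 . and S P2 O2 . (shared subject)
    • Use S P O1 , O2 . for S P O1 . and S P O2 . (shared subject and predicate)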
@prefix : <http://purl.uniprot.org/core/> .
@prefix enzyme: <http://purl.uniprot.org/enzyme/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

enzyme:1.14.11.2 a :Enzyme ;
     :name   "Procollagen-proline dioxygenase",
             "Procollagen-proline 4-dioxygenase",
             "Prolyl 4-hydroxylase" ;
   :activity "L-proline-[procollagen] + 2-oxoglutarate + O(2) = trans-4-hydroxy-L-proline-[procollagen] + succinate + CO(2)." ;
   :cofactor "Iron", "L-ascorbic acid" .
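The abbreviations are purely syntactic: each predicate listed for a subject, and each object listed for a predicate, stands for one full triple. The following Python sketch makes that expansion explicit. The function `expand` and the shortened strings are illustrative assumptions, not part of any N3 tool.

```python
def expand(subject, predicate_objects):
    """Expand one abbreviated statement into the full list of triples.

    predicate_objects is a list of (predicate, [objects]) pairs: one pair per
    predicate group of the statement, one list entry per listed object.
    """
    return [(subject, p, o) for p, objects in predicate_objects for o in objects]


triples = expand(
    "enzyme:1.14.11.2",
    [
        ("rdf:type", [":Enzyme"]),  # written as "a" in N3
        (":name", [
            "Procollagen-proline dioxygenase",
            "Procollagen-proline 4-dioxygenase",
            "Prolyl 4-hydroxylase",
        ]),
        (":activity", ["(reaction equation, shortened here)"]),
        (":cofactor", ["Iron", "L-ascorbic acid"]),
    ],
)

for t in triples:
    print(t)
```

The single abbreviated statement therefore corresponds to seven triples: one type, three names, one activity and two cofactors.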

RDF formats - Graphs

  • RDF triples can be represented as a graph
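One way to picture this: every subject and object becomes a node, and every triple becomes a directed edge labelled with its predicate. The sketch below builds such a structure as a Python dictionary. The prefixed names (`enzyme:`, `core:`) are shorthand assumed for readability, and `to_graph` is a hypothetical helper.

```python
from collections import defaultdict

triples = [
    ("enzyme:1.14.11.2", "rdf:type", "core:Enzyme"),
    ("enzyme:1.14.11.2", "core:name", "Prolyl 4-hydroxylase"),
    ("enzyme:1.14.11.2", "core:cofactor", "Iron"),
]


def to_graph(triples):
    """Map each subject node to its outgoing (edge label, target node) pairs."""
    graph = defaultdict(list)
    for s, p, o in triples:
        graph[s].append((p, o))
    return dict(graph)


graph = to_graph(triples)
for label, target in graph["enzyme:1.14.11.2"]:
    print(f"enzyme:1.14.11.2 --{label}--> {target}")
```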

RDF characteristics

  • Basis of RDFS, OWL and Semantic Web
  • Data stored in RDF/triple stores
  • SPARQL as query language
  • Capable of merging data from different triple stores
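Merging works because a set of triples has no fixed schema: two stores that use the same URIs combine by simple set union. The toy sketch below illustrates this in Python, together with a pattern lookup that loosely mimics a single SPARQL triple pattern. It is not SPARQL and not a real triple store API; `match` and the two stores are assumptions for illustration.

```python
# Two hypothetical stores describing the same enzyme.
store_a = {
    ("enzyme:1.14.11.2", "rdf:type", "core:Enzyme"),
    ("enzyme:1.14.11.2", "core:name", "Prolyl 4-hydroxylase"),
}
store_b = {
    ("enzyme:1.14.11.2", "rdf:type", "core:Enzyme"),
    ("enzyme:1.14.11.2", "core:cofactor", "Iron"),
}

# Merging is set union; the shared triple is kept only once.
merged = store_a | store_b


def match(store, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return {
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }


print(match(merged, p="core:cofactor"))
```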

RDF Schema

Schema language

  • Schema - formal definition of the syntax of a language
  • Schema language - language for expressing that definition
  • XML Schema - schema language for XML written in XML
  • Overall a framework for interpreting the meaning of data

RDF Schema (RDFS)

  • Interpret the meaning of data written in RDF
  • Extension of RDF vocabulary
  • Vocabulary identified with "rdfs"
  • Adds additional specifications and keywords
  • Allows for simple inferences

Subclasses

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix x: <http://www.my-example.org/> .
x:Mouse rdf:type rdfs:Class .
x:Rodent rdf:type rdfs:Class .
x:Tom rdf:type x:Mouse .
x:Mouse rdfs:subClassOf x:Rodent .
  • Allows the application of inference rules
  • Example: from
X rdfs:subClassOf Y .
b rdf:type X .
  • infer
b rdf:type Y .
  • We can infer that Tom is an instance of the class Rodent.
  • Note the difference: Tom is a rodent (rdf:type, an instance of the class), whereas a mouse is a rodent (rdfs:subClassOf, one class contained in another).
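The subclass rule can be applied mechanically to a set of triples. The sketch below is a minimal Python illustration of that single rule over the Tom example. It applies the rule once and is not a complete RDFS reasoner; `infer_types` is a hypothetical helper name.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

triples = {
    ("x:Mouse", RDF_TYPE, "rdfs:Class"),
    ("x:Rodent", RDF_TYPE, "rdfs:Class"),
    ("x:Tom", RDF_TYPE, "x:Mouse"),
    ("x:Mouse", SUBCLASS_OF, "x:Rodent"),
}


def infer_types(triples):
    """From 'X subClassOf Y' and 'b type X', derive 'b type Y' (one pass)."""
    inferred = set()
    for b, p1, x in triples:
        if p1 != RDF_TYPE:
            continue
        for x2, p2, y in triples:
            if p2 == SUBCLASS_OF and x2 == x:
                inferred.add((b, RDF_TYPE, y))
    # Report only statements that were not already asserted.
    return inferred - set(triples)


print(infer_types(triples))
```

Only the statement about Tom is derived: x:Mouse itself is related to x:Rodent by rdfs:subClassOf, not by rdf:type, so the rule never makes the class Mouse an instance of Rodent.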

Some RDFS keywords

RDFS keyword          Explanation
rdfs:Resource         All things in RDFS are instances of this class.
rdfs:Class            Class of resources that are (RDF) classes.
rdfs:subClassOf       Subject is a subclass of the object.
rdfs:subPropertyOf    Subject is a subproperty of the object.
rdfs:domain           Domain of the subject property: the class its subjects are inferred to belong to.
rdfs:range            Range of the subject property: the class its objects are inferred to belong to.
rdfs:label            Property providing a human-readable name for the subject.

Other inference rules

  • Transitivity of subClassOf
    • E.g., (assuming defined classes Mouse, Rodent, and Mammal), from
x:Mouse rdfs:subClassOf x:Rodent .
x:Rodent rdfs:subClassOf x:Mammal .
    • infer
x:Mouse rdfs:subClassOf x:Mammal .
  • Transitivity of subPropertyOf
  • Property domains and ranges
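Transitivity has to be applied repeatedly until nothing new appears, since each derived statement can enable further ones. A small Python sketch of that fixpoint computation for subClassOf follows; `subclass_closure` is an illustrative helper, and the naive nested loop is chosen for clarity rather than efficiency.

```python
SUBCLASS_OF = "rdfs:subClassOf"

triples = [
    ("x:Mouse", SUBCLASS_OF, "x:Rodent"),
    ("x:Rodent", SUBCLASS_OF, "x:Mammal"),
]


def subclass_closure(triples):
    """Return all (subclass, superclass) pairs implied by transitivity."""
    closure = {(s, o) for s, p, o in triples if p == SUBCLASS_OF}
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


for sub, sup in sorted(subclass_closure(triples)):
    print(sub, SUBCLASS_OF, sup)
```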

Inference with domain and ranges

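  • Does not work like a database constraint
    • No error is raised if a subject or object is not known to belong to the declared class
  • Instead, the corresponding type statement(s) are inferred
    • E.g., if P rdfs:domain C and a P b, infer a rdf:type C; if P rdfs:range D, infer b rdf:type D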
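The contrast with a database constraint shows up clearly in code: the rule never rejects data, it only adds type statements. The Python sketch below uses a made-up property x:eats with domain x:Animal and range x:Food. These names, like the helper `infer_domain_range`, are assumptions for illustration and do not come from the slides.

```python
triples = {
    ("x:eats", "rdfs:domain", "x:Animal"),  # hypothetical schema statements
    ("x:eats", "rdfs:range", "x:Food"),
    ("x:Tom", "x:eats", "x:Cheese"),        # hypothetical data statement
}


def infer_domain_range(triples):
    """Derive rdf:type statements from rdfs:domain and rdfs:range declarations."""
    inferred = set()
    for prop, kind, cls in triples:
        if kind not in ("rdfs:domain", "rdfs:range"):
            continue
        for s, p, o in triples:
            if p == prop:
                # Domain types the subject of the statement; range types its object.
                resource = s if kind == "rdfs:domain" else o
                inferred.add((resource, "rdf:type", cls))
    return inferred


print(infer_domain_range(triples))
```

Nothing in the data says beforehand that x:Tom is an animal or that x:Cheese is food; both statements are conclusions, not checks.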

OWL Language

OWL

  • Web Ontology Language (abbreviated OWL, not WOL)
  • An extension of RDF Schema? Yes and no
    • Adds expressive means for more inferences
    • Direct RDF extension
      • Allows URIs to be viewed simultaneously as classes and as individuals

Classes

  • Forms of defining classes
    • Indicated by a URI
    • Enumeration of individuals (oneOf - nominals in DLs)
    • Property restriction
    • Intersection of classes
    • Union of classes
    • Complement of a class

RDF-like Syntax

  • Example (simplified from NCI Thesaurus)
1.  <owl:Class rdf:about="#A-Microtubule">
2.    <rdfs:label>A-Microtubule</rdfs:label>
3.    <rdfs:subClassOf rdf:resource="#Cilium Microtubule"/>
4.    <rdfs:subClassOf>
5.      <owl:Restriction>
6.        <owl:onProperty rdf:resource="#is Physical Part of"/>
7.        <owl:someValuesFrom rdf:resource="#Cytoskeleton"/>
8.      </owl:Restriction>
9.    </rdfs:subClassOf>
10.   <rdfs:subClassOf>
11.     <owl:Restriction>
12.       <owl:onProperty rdf:resource="#is Physical Part of"/>
13.       <owl:someValuesFrom rdf:resource="#Cilium"/>
14.     </owl:Restriction>
15.   </rdfs:subClassOf>
16. </owl:Class>

Class Definition

  • Line 1: subject (the class being defined)
  • Line 2: label
  • Line 3: subClass specification (named superclass)
  • Lines 4-9: subClass specification (restriction)
    • someValuesFrom - existential restriction
    • qualified with #Cytoskeleton
  • Lines 10-15: subClass specification (restriction)
    • someValuesFrom - existential restriction
    • qualified with #Cilium
  • Line 16: closes the class definition

Property restrictions for all values

  • Corresponds to universal quantification
<rdfs:subClassOf>
  <owl:Restriction>
    <owl:onProperty rdf:resource="#P"/>
    <owl:allValuesFrom rdf:resource="#V"/>
  </owl:Restriction>
</rdfs:subClassOf>
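The difference between the existential restriction (someValuesFrom) and the universal one (allValuesFrom) can be made concrete by checking both against a handful of known facts. The Python sketch below does so. The individuals mt1, cs1 and cil1 are invented for illustration, as are the helpers `has_some` and `has_only`. Note the strong simplifying assumption: the check treats the listed facts as complete, whereas an OWL reasoner works under the open-world assumption and would not conclude an allValuesFrom membership merely because no counterexample is recorded.

```python
PART_OF = "isPhysicalPartOf"

# Hypothetical individuals and facts (not from the NCI Thesaurus).
facts = {
    ("mt1", PART_OF, "cs1"),
    ("mt1", PART_OF, "cil1"),
}
members = {
    "Cytoskeleton": {"cs1"},
    "Cilium": {"cil1"},
}


def values(ind, prop, facts):
    """All known values of prop for the individual ind."""
    return [o for s, p, o in facts if s == ind and p == prop]


def has_some(ind, prop, cls, facts, members):
    """someValuesFrom: at least one value of prop belongs to cls."""
    return any(v in members.get(cls, set()) for v in values(ind, prop, facts))


def has_only(ind, prop, cls, facts, members):
    """allValuesFrom: every known value of prop belongs to cls.

    Vacuously true when the individual has no value for prop at all.
    """
    return all(v in members.get(cls, set()) for v in values(ind, prop, facts))


print(has_some("mt1", PART_OF, "Cytoskeleton", facts, members))
print(has_only("mt1", PART_OF, "Cytoskeleton", facts, members))
```

In this data mt1 satisfies both existential restrictions (it is part of some cytoskeleton and of some cilium) but not the universal one for Cytoskeleton, because one of its values is a cilium.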

Other Constructors

  • Cardinality constraints
    • min: owl:minCardinality
    • max: owl:maxCardinality
  • Property characteristics
    • Transitivity: owl:TransitiveProperty
    • Symmetry: owl:SymmetricProperty
    • Inverses: owl:inverseOf
    • Functionality: owl:FunctionalProperty
  • Disjoint classes: owl:disjointWith
  • Distinct individuals: owl:differentFrom
  • Identical individuals: owl:sameAs
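Each property characteristic licenses additional statements. The sketch below shows, in Python, what transitivity, symmetry and inverses add to a small set of facts. The properties partOf, hasPart and interactsWith and the function `apply_characteristics` are assumptions for illustration. Cardinality, functionality and the class and individual axioms are not covered here.

```python
def apply_characteristics(facts, transitive=(), symmetric=(), inverse_of=None):
    """Close a set of triples under the declared property characteristics.

    transitive and symmetric are collections of property names; inverse_of
    maps a property to the name of its inverse.
    """
    inverse_of = inverse_of or {}
    closure = set(facts)
    while True:
        new = set()
        for s, p, o in closure:
            if p in symmetric:
                new.add((o, p, s))
            if p in inverse_of:
                new.add((o, inverse_of[p], s))
            if p in transitive:
                for s2, p2, o2 in closure:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= closure:  # nothing new: fixpoint reached
            return closure
        closure |= new


facts = {
    ("a", "partOf", "b"),
    ("b", "partOf", "c"),
    ("x", "interactsWith", "y"),
}

result = apply_characteristics(
    facts,
    transitive={"partOf"},
    symmetric={"interactsWith"},
    inverse_of={"partOf": "hasPart"},
)

for t in sorted(result):
    print(t)
```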

Summary

  • RDF
  • RDF Schema
  • OWL Language
  • OWL Profiles

Further reading:

  • Robinson and Bauer, Introduction to Bio-Ontologies, Chapter 2
  • Antoniou et al., A Semantic Web Primer, Chapters 2 and 4