Biostat 823 - Web Ontology Language

Hilmar Lapp

Duke University, Department of Biostatistics & Bioinformatics

2024-09-24

OWL2: History

  • Predecessors and major influences
    • 2000-2001 DARPA Agent Markup Language (DAML) and Ontology Inference Layer (OIL) (DAML+OIL)
    • 2000-2004 Resource Description Framework Schema (RDFS)
    • 2001-2004 W3C Web-Ontology Working Group
  • Web Ontology Language (OWL):
    • OWL: W3C Recommendation in 2004
    • OWL2: W3C Recommendation in 2009
  • Web-native: all identifiers are IRIs

OWL2: Rich ecosystem

OWL2: Terminology vs DL and FOL

  • OWL and OWL2 are Description Logics (DLs), which are decidable fragments of first-order logic (FOL).

FOL              | DL         | OWL        | Example
constant         | individual | individual | ‘my hand’, ‘Durham, NC’
unary predicate  | concept    | class      | ‘manus (hand)’, ‘city’
binary predicate | role       | property   | ‘part of’, ‘has parent’
  • Individuals, classes, and properties are called entities in OWL/OWL2.
  • Properties hold between individuals, not classes
    • Classes can have property restrictions, using existential or universal quantification, or specific cardinality.

OWL2 constructs vs DL notation (I)

C and D are concepts, R is a role (property). If \(a\ R\ b\), then b is connected to a by R (b is an “R-successor” of a).

DL notation     | Concept                                                   | OWL2 (Functional Syntax)
\(\top\)        | Top concept                                               | owl:Thing
\(\bot\)        | Bottom concept                                            | owl:Nothing
\(C \sqcap D\)  | Conjunction of concepts C and D                           | ObjectIntersectionOf( C D )
\(C \sqcup D\)  | Disjunction of concepts C and D                           | ObjectUnionOf( C D )
\(\neg C\)      | Complement of concept C                                   | ObjectComplementOf( C )
\(\forall R.C\) | All connected by R are in C (universal quantification)    | ObjectAllValuesFrom( R C )
\(\exists R.C\) | Some connected by R are in C (existential quantification) | ObjectSomeValuesFrom( R C )

Non-atomic concept definitions are called class expressions.
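The set-theoretic semantics of these constructs can be sketched with a toy interpretation in Python; the domain, concept extensions, and role below are invented purely for illustration:

```python
# Toy interpretation: a finite domain, concept extensions as sets of
# individuals, and a role as a set of ordered pairs. All names invented.
domain = {"finger1", "hand1", "city1"}
C = {"hand1"}                     # extension of concept C
D = {"finger1", "hand1"}          # extension of concept D
R = {("finger1", "hand1")}        # extension of role R

intersection = C & D              # ObjectIntersectionOf( C D )
union = C | D                     # ObjectUnionOf( C D )
complement = domain - C           # ObjectComplementOf( C )

def successors(R, a):
    """The R-successors of a: all b with (a, b) in R."""
    return {b for (x, b) in R if x == a}

# ObjectSomeValuesFrom( R C ): at least one R-successor is in C
some_R_C = {a for a in domain if successors(R, a) & C}
# ObjectAllValuesFrom( R C ): every R-successor is in C
# (vacuously true for individuals that have no R-successor at all)
all_R_C = {a for a in domain if successors(R, a) <= C}

print(some_R_C)   # {'finger1'}
print(all_R_C)    # the whole domain, due to vacuous satisfaction
```

Note that `all_R_C` contains every individual without R-successors, which is exactly the (often surprising) vacuous-truth behavior of universal quantification.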

OWL2 constructs vs DL notation (II)

C and D are concepts, R is a role (property), a and b are individuals.

DL notation                    | Axiom                | OWL2 (Functional Syntax)          | Semantics
\(C \sqsubseteq D\)            | Concept inclusion    | SubClassOf( C D )                 | \(\forall a: a\in C \rightarrow a\in D\)
\(C \sqcap D \sqsubseteq \bot\)| Concept exclusion    | DisjointClasses( C D )            | \(\not\exists a: a\in C \wedge a\in D\)
\(C \equiv D\)                 | Concept equivalency  | EquivalentClasses( C D )          | \(\forall a: a\in C \leftrightarrow a\in D\)
\(C(a)\)                       | Concept membership   | ClassAssertion( C a )             | \(a \in C\)
\(R(a,b)\)                     | Role assertion       | ObjectPropertyAssertion( R a b )  | \((a,b) \in R\)
\(R(a,\!'val')\)               | Data value assertion | DataPropertyAssertion( R a "val" ) | \((a,\textit{"val"}) \in R\)

C and D can be atomic concepts or class expressions. If the left-hand side of a concept inclusion axiom is a class expression, the axiom is called a General Class Inclusion (GCI) axiom.
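These axiom semantics can be checked directly against a toy interpretation; the class extensions below are invented for illustration:

```python
# Checking class axioms against a toy interpretation in which concept
# extensions are plain sets of individuals. All names are illustrative.
domain = {"f1", "f2", "h1", "c1"}
index_finger = {"f1", "f2"}
finger = {"f1", "f2"}
hand = {"h1"}
city = {"c1"}

def subclass_of(C, D):
    # SubClassOf( C D ):  forall a: a in C -> a in D
    return C <= D

def disjoint(C, D):
    # DisjointClasses( C D ):  no individual is in both C and D
    return not (C & D)

def equivalent(C, D):
    # EquivalentClasses( C D ):  mutual inclusion, i.e. equal extensions
    return C == D

print(subclass_of(index_finger, finger))  # True
print(disjoint(hand, city))               # True
print(equivalent(index_finger, finger))   # True
```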

OWL2 property axioms

C and D are concepts; P, R and S are roles (properties); a and b are individuals.

DL                        | Role …             | OWL2 (Functional Syntax)                             | Semantics
\(R \sqsubseteq S\)       | inclusion          | SubObjectPropertyOf( R S )                           | \(\forall (a,b): R(a,b) \rightarrow S(a,b)\)
\(R \equiv S\)            | equivalency        | EquivalentObjectProperties( R S )                    | \(\forall (a,b): R(a,b) \leftrightarrow S(a,b)\)
\(R \equiv S^-\)          | inverse            | InverseObjectProperties( R S )                       | \(\forall (a,b): R(a,b) \leftrightarrow S(b,a)\)
\(R \circ S \sqsubseteq P\) | chain            | SubObjectPropertyOf( ObjectPropertyChain( R S ) P )  | \(\forall (a,b): (\exists z: R(a,z) \wedge S(z,b)) \rightarrow P(a,b)\)
                          | functional         | FunctionalObjectProperty( R )                        | \(\forall (a,b,c): R(a,b) \wedge R(a,c) \rightarrow b = c\)
                          | inverse functional | InverseFunctionalObjectProperty( R )                 | \(\forall (a,b,c): R(a,b) \wedge R(c,b) \rightarrow a = c\)
  • Properties can also be defined as (ir)reflexive, (a)symmetric, or transitive
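With roles represented as sets of ordered pairs, these axioms become simple checks; the pairs below are invented for illustration:

```python
# Property axioms as checks over roles represented as sets of pairs.
# All role extensions are illustrative toy data.
R = {("a", "b"), ("b", "c")}           # e.g. some role R
S = {("b", "a"), ("c", "b")}           # intended inverse of R
P = {("a", "c")}                       # intended superproperty of R o R

def is_inverse(R, S):
    # R == S^- : R(a,b) holds exactly when S(b,a) holds
    return R == {(b, a) for (a, b) in S}

def chain_included(R, S, P):
    # R o S is included in P: whenever R(a,z) and S(z,b), then P(a,b)
    composed = {(a, b) for (a, z1) in R for (z2, b) in S if z1 == z2}
    return composed <= P

def is_functional(R):
    # FunctionalObjectProperty: each subject has at most one R-successor
    subjects = [a for (a, _) in R]
    return len(subjects) == len(set(subjects))

print(is_inverse(R, S))          # True
print(chain_included(R, R, P))   # True: R o R = {("a","c")} is in P
print(is_functional(R))          # True
```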

Domain and Range constraints

  • Object property domain axiom:

    ObjectPropertyDomain( R C )

    • Semantics: \(\forall a: R(a,\cdot) \rightarrow C(a)\)
  • Object property range axiom:

    ObjectPropertyRange( R D )

    • Semantics: \(\forall a: R(\cdot,a) \rightarrow D(a)\)
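Note that in OWL, domain and range axioms act as inference rules rather than input constraints: asserting \(R(a,b)\) lets a reasoner infer \(C(a)\) and \(D(b)\). A minimal sketch, with invented names:

```python
# Domain/range axioms as inference rules: from each assertion R(a,b),
# infer membership of a in the domain class and b in the range class.
# The property and class names below are illustrative only.
part_of = {("finger1", "hand1")}

anatomical_part = set()    # assumed domain class C of part_of
anatomical_whole = set()   # assumed range class D of part_of

for (a, b) in part_of:
    anatomical_part.add(a)    # forall a: R(a,.) -> C(a)
    anatomical_whole.add(b)   # forall b: R(.,b) -> D(b)

print(anatomical_part)   # {'finger1'}
print(anatomical_whole)  # {'hand1'}
```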

Axioms about individuals vs classes

  • OWL properties apply to individuals, both as subject and object:

    ‘my left index finger’ :part_of ‘my left hand’

  • For classes (“universals”), must use property restriction (\(\sqsubseteq\exists part\_of.hand\)):

    ‘index finger’ SubClassOf :part_of some ‘hand’

    or more specifically (\(\sqsubseteq anatomical\_structure \sqcap\exists part\_of.hand\)):

    ‘index finger’ SubClassOf ‘anatomical structure’ and :part_of some ‘hand’

  • Note that existential quantification is “asymmetric”, and the reverse with the inverse property does not necessarily follow:

    \(index\_finger\sqsubseteq\exists part\_of.hand \;\wedge\; has\_part\equiv part\_of^- \;\not\rightarrow\; hand\sqsubseteq\exists has\_part.index\_finger\)
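A counter-model makes the asymmetry concrete: every index finger is part of some hand, yet some hand may have no index finger as a part. The extensions below are invented for illustration:

```python
# Counter-model for the reverse entailment. h2 is a hand with no index
# finger part (e.g. after amputation). All individuals are illustrative.
index_finger = {"if1"}
hand = {"h1", "h2"}
part_of = {("if1", "h1")}
has_part = {(b, a) for (a, b) in part_of}   # has_part == part_of inverse

def some_values_from(R, C, individuals):
    # members of 'individuals' that have at least one R-successor in C
    return {a for a in individuals if any(x == a and b in C for (x, b) in R)}

# index_finger subClassOf (part_of some hand) holds in this model:
assert index_finger <= some_values_from(part_of, hand, index_finger)
# ...but hand subClassOf (has_part some index_finger) does NOT:
print(hand <= some_values_from(has_part, index_finger, hand))  # False
```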

Satisfiability and consistency

  • In formal logic, a formula is satisfiable iff there is some assignment of values to its variables that makes it true.
  • For ontologies, a class is unsatisfiable if no individual can exist that is a member of the class.
    • C is unsatisfiable iff \(C\sqsubseteq\bot\) (\(\bot\) corresponds to owl:Nothing in OWL)
    • Often this is the result of a class C being (asserted or inferred as) a subclass of another class D and also the complement of D.
  • An ontology is inconsistent if it asserts an individual as a member of an unsatisfiable class.
    • Most reasoners stop when encountering an inconsistency.
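The subclass-plus-complement case can be seen with sets: if \(C\sqsubseteq D\) and \(C\sqsubseteq\neg D\), then in every interpretation \(C\subseteq D\cap(\Delta\setminus D)=\emptyset\). A sketch with an invented domain:

```python
# If C is asserted (or inferred) to be a subclass of both D and the
# complement of D, its extension is forced to be empty in every
# interpretation. The domain and D below are illustrative.
domain = {"a", "b", "c"}
D = {"a", "b"}
complement_D = domain - D

# The largest extension C could possibly have under both axioms:
max_C = D & complement_D
print(max_C)  # set() -- C is unsatisfiable (C subClassOf owl:Nothing)

# Asserting any individual as a member of C would then make the
# ontology as a whole inconsistent.
```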

OWL uses Open World Assumption

  • Closed World Assumption (CWA):
    • Facts that are not known are assumed to be false.

    • Databases and database queries are most common example:

      SELECT Instructor_Name FROM Lesson_Instructors WHERE ...

      The result set is assumed to contain all values that can possibly match (i.e., that exist).

  • Open World Assumption (OWA):
    • Facts that are not known are undefined (neither assumed true nor false).
    • For example, an individual asserted only as a member of class C cannot be assumed as not being a member of class D (unless C and D are asserted as disjoint).
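The difference can be sketched with a three-valued membership check; the classes, assertions, and disjointness below are invented for illustration:

```python
# CWA vs OWA for class membership. Under CWA, an unasserted fact is
# False; under OWA it is unknown, unless disjointness entails False.
memberships = {("ind1", "C")}              # only known fact: C(ind1)
disjoint_pairs = {frozenset({"C", "E"})}   # DisjointClasses( C E )

def member_cwa(ind, cls):
    # Closed world: not known implies False (negation as failure)
    return (ind, cls) in memberships

def member_owa(ind, cls):
    # Open world: True, False (only if entailed), or unknown
    if (ind, cls) in memberships:
        return True
    for (i, c) in memberships:
        if i == ind and frozenset({c, cls}) in disjoint_pairs:
            return False               # disjointness entails non-membership
    return "unknown"

print(member_cwa("ind1", "D"))   # False
print(member_owa("ind1", "D"))   # unknown
print(member_owa("ind1", "E"))   # False (C and E are disjoint)
```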

Tbox and Abox

  • All axioms (statements) about classes (concepts) form the Tbox, the terminological component of an ontology.
  • All axioms about individuals form the Abox, the assertional component of an ontology.
  • An ontology O consists of Tbox and Abox: \(O=\{\mathcal{T},\mathcal{A}\}\)
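The Tbox/Abox split can be sketched by partitioning axioms by type; the axioms below are illustrative:

```python
# Partition a list of (type, ...) axiom tuples into Tbox (about classes
# and properties) and Abox (about individuals). Toy axioms only.
axioms = [
    ("SubClassOf", "index_finger", "finger"),                 # Tbox
    ("EquivalentClasses", "manus", "hand"),                   # Tbox
    ("ClassAssertion", "hand", "my_left_hand"),               # Abox
    ("ObjectPropertyAssertion", "part_of",
     "my_left_index_finger", "my_left_hand"),                 # Abox
]
ABOX_TYPES = {"ClassAssertion", "ObjectPropertyAssertion",
              "DataPropertyAssertion"}

tbox = [ax for ax in axioms if ax[0] not in ABOX_TYPES]
abox = [ax for ax in axioms if ax[0] in ABOX_TYPES]
print(len(tbox), len(abox))  # 2 2
```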

Semantic entailment

  • An ontology semantically entails a statement \(\phi\) if \(\phi\) is true in all models (valid interpretations) of the ontology.
  • Reasoners compute semantic entailments.
  • A reasoner is
    • sound if every semantic entailment it computes is correct;
    • complete if it computes all possible semantic entailments.
  • To be useful, reasoners need to be both sound and complete.

Reasoning services

  • Semantic entailment

  • Satisfiability and consistency

  • Classification

    • Inference of class subsumption hierarchy
    • Inference of individuals’ class membership
  • Conjunctive query answering (“DL queries”)

    'anatomical structure' and 'part of' some 'hand'

    The DL Query tutorial in the OBOOK (OBO Organized Knowledge) collection of training and tutorial materials on ontology development and use provides a good introduction.

  • Explanation
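A DL query such as 'anatomical structure' and 'part of' some 'hand' can be evaluated over a toy Abox by intersecting the class extension with the existential restriction; all individuals below are invented:

```python
# Answering the conjunctive query
#   'anatomical structure' and 'part of' some 'hand'
# over illustrative toy data.
anatomical_structure = {"if1", "h1", "arm1"}
hand = {"h1"}
part_of = {("if1", "h1"), ("arm1", "torso1")}

def some(R, C):
    # individuals with at least one R-successor in C
    return {a for (a, b) in R if b in C}

answers = anatomical_structure & some(part_of, hand)
print(answers)  # {'if1'}
```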

OWL2 profiles and decidability

  • OWL/OWL2 DL:
    • Maximum expressivity while maintaining computational completeness and decidability
    • Allows well-performing reasoners
    • Some restrictions that in practice are rarely relevant (individuals and classes must be distinct; cardinality restrictions cannot be used with transitive properties)
  • OWL2 defines several expressivity profiles: OWL2-EL, OWL2-QL, and OWL2-RL
  • OWL2-EL Profile:
    • Decidable in polynomial time, allows for very effective reasoners
    • Unsupported constructs include universal quantification; cardinality restrictions; disjunction; class negation; inverse, functional, (a)symmetric object properties.
    • Supported constructs are sufficient for most bio-ontologies, including those that are very large (Gene Ontology (GO), UBERON, SNOMED-CT)

Resources: Reasoners

Symbolic and Neural AI (I)

  • AI approaches relying on formal logic knowledge representation and reasoning fall under Symbolic AI.
    • Symbolic AI stands in contrast to neural AI approaches (ANNs, Deep Learning, etc; also called “connectionist”)
  • Neuro-symbolic AI approaches attempt to integrate these to complement each other
Issue                         | Symbolic AI                    | Neural AI
Decisions                     | Self-explanatory               | Black box
Expert knowledge              | Readily utilized               | Difficult to utilize
Trainability                  | Typically not                  | Trainable from raw data
Susceptibility to data errors | High, brittle                  | Low, robust
Speed                         | Slow on large, expressive KBs  | Fast once trained

Neuro-symbolic AI resources

Resources: OWL Ontologies

Resources: Ontologies in Bio

Other resources