# Description logic

Description logics (DL) are a family of knowledge representation languages which can be used to represent the concept definitions of an application domain (known as terminological knowledge) in a structured and formally well-understood way. The name description logic refers, on the one hand, to concept descriptions used to describe a domain and, on the other hand, to the logic-based semantics which can be given by a translation into first-order predicate logic. Description logic was designed as an extension to frames and semantic networks, which were not equipped with formal logic-based semantics. They form a middle ground solution: including some more expressive operations than propositional logic and having decidable or more efficient decision problems than first order predicate logic.

Description logic was given its current name in the 1980s. Previous to this it was called (chronologically): terminological systems, and concept languages. Today description logic has become a cornerstone of the Semantic Web for its use in the design of ontologies. The OWL-DL and OWL-Lite sub-languages of the W3C-endorsed Web Ontology Language (OWL) are based on a description logic.

The first DL-based system was KL-ONE (by Brachman and Schmolze, 1985). Some other DL systems came later. They are LOOM (1987), BACK (1988), KRIS (1991), CLASSIC (1991), FaCT (1998) and lately RACER (2001), CEL (2005), and KAON 2 (2005).

## Syntax

Syntax of description logics consists of

• A set of unary predicate symbols that are used to denote concept names;
• A set of binary relations that are used to denote role names;
• A recursive definition for defining concept terms from concept names and role names using constructors.

In general, a concept denotes the set of individuals that belongs to it, and a role denotes a relationship between concepts.

The syntax of a member of the description logic family is characterized by its recursive definition, in which the constructors that can be used to form concept terms are stated. Some common constructors include logical constructors in first-order logic such as intersection or conjunction of concepts, union or disjunction of concepts, negation or complement of concepts, value restriction (universal restriction), existential restriction, etc. Other constructors may also include restrictions on roles which are usual for binary relations, for example, inverse, transitivity, functionality, etc. Especially for intersection and union, description logics use the symbols $\sqcap$ and $\sqcup$ to distinguish them from the first-order logic and and or.

The following is an example of definition of the syntax of the description logic AL (Attributive Language).

• An atomic concept is an AL-concept;
• The top concept ($\top$) is an AL-concept;
• The bottom concept ($\bot$) is an AL-concept;
• The complement of an atomic AL-concept C is also an AL-concept (denoted by ¬C)
• The intersection (conjunction) of two AL-concepts C and D is also an AL-concept (denoted by $C\sqcap D$);
• If C is an AL-concept and R is a role name, then $\forall R.C$ (value restriction) is also an AL-concept;
• If R is a role name, then $\exists R.\top$ (limited existential restriction) is also an AL-concept.

For example, $A \sqcap B$ is an AL-concept, but $A \sqcup B$ is not. Also, $\exists S.\top$ is an AL-concept, but $\exists S.A$ is not.

## Semantics

The semantics of description logics is defined by interpreting concepts as sets of individuals and roles as sets of pairs of individuals. Those individuals are typically assumed from a given domain. The semantics of non atomic concepts and roles is then defined in terms of atomic concepts and roles. This is done by using a recursive definition similar to the syntax.

For example, given a set as the domain, an interpretation of AL-concepts is defined first over atomic concepts and roles as follows:

• An atomic concept is interpreted as a set of individuals that is a subset of the domain.
• An atomic role is interpreted as a set of pairs of individuals from the domain, i.e., a binary relation over the domain. In this case, if an individual x is related to y via a role R, then y is called an R-successor of x.

Next, this interpretation is extended to non atomic concept and role according to the constructors. This is done in the following.

• The top concept is interpreted as the whole domain.
• The bottom concept is interpreted as the empty set.
• The interpretation of ¬C is the set of all individuals in the domain which does not belong to the interpretation of C.
• Intersection of two concepts C and D is interpreted as set-intersection, i.e., the set of all individuals in the domain that belongs to both the interpretation of C and the interpretation of D.
• The value restriction ∀R.C is interpreted as the set of all individuals in the domain whose R-successors (if any) all belong to the interpretation of C.
• The limited existential restriction $\exists R.\top$ is interpreted as the set of all individuals in the domain that have at least one R-successor.

Example. If P is interpreted as the set of all persons in our domain and F is interpreted as the set of all females, then the set of all persons that are not female can be expressed by the concept

$P\sqcap\neg F$

## Modeling in Description Logics

In DLs, a distinction is drawn between the so-called TBox (terminological box) and the ABox (assertional box). In general, the TBox contains sentences describing concept hierarchies (i.e., relations between concepts) while the ABox contains ground sentences stating where in the hierarchy individuals belong (i.e., relations between individuals and concepts). For example, the statement:

(1) Every employee is a person

belongs in the TBox, while the statement:

(2) Bob is an employee

belongs in the ABox.

Note that the TBox/ABox distinction is not significant, in the same sense that the two "kinds" of sentences are not treated differently in first-order logic (which subsumes most DLs). When translated into first-order logic, a subsumption axiom like (1) is simply a conditional restriction to unary predicates (concepts) with only variables appearing in it. Clearly, a sentence of this form is not privileged or special over sentences in which only constants ("grounded" values) appear like (2).

So why was the distinction introduced? The primary reason is that the separation can be useful when describing and formulating decision-procedures for various DLs. For example, a reasoner might process the TBox and ABox separately, in part because certain key inference problems are tied to one but not the other one ('classification' is related to the TBox, 'instance checking' to the ABox). Another example is that the complexity of the TBox can greatly affect the performance of a given decision-procedure for a certain DL, independently of the ABox. Thus, it is useful to have a way to talk about that specific part of the knowledge base.

The secondary reason is that the distinction can make sense from the knowledge base modeler's perspective. It is plausible to distinguish between our conception of terms/concepts in the world (class axioms in the TBox) and particular manifestations of those terms/concepts (instance assertions in the ABox.)

There are two features of Description Logics that are not shared by most other data description formalisms: DLs do not make the Unique Name Assumption (UNA) or the Closed World Assumption (CWA). Not having UNA means that two concepts with different names may be allowed by some inference to be shown to be equivalent. Not having CWA, or rather having the Open World Assumption (OWA) means that lack of knowledge of a fact does not immediately imply knowledge of the negation of a fact.

## Decision problems

In addition to the ability to describe concepts formally, one also would like to employ the description of a set of concepts to ask questions about the concepts and instances described. The most common decision problems are basic database-query-like questions like instance checking (is a particular instance (member of an A-box) a member of a given concept) and relation checking (does a relation/role hold between two instances, in other words does a have property b), and the more global-database-questions like subsumption (is a concept a subset of another concept), and concept consistency (is there no contradiction among the definitions or chain of definitions). The more operators one includes in a logic and the more complicated the T-box (having cycles, allowing non-atomic concepts to include each other), usually the higher the computational complexity is for each of these problems (see Navigator on Description Logic Complexity for examples).

## Fuzzy description logics

Fuzzy description logic combines fuzzy logic with DLs. Since many concepts that are needed for intelligent systems lack well defined boundaries, or precisely defined criteria of membership, we need fuzzy logic to deal with notions of vagueness and imprecision. This offers a motivation for a generalization of description logics towards dealing with imprecise and vague concepts.

What people should also think about for intelligent systems is multiple viewpoints of the data. This will lead to subjective (as opposed to objective) intelligent systems.

## Differences with OWL

### Terminology

A concept in DL jargon is referred to as a class in OWL. A role in DL jargon is a property in OWL.

## DL operators and naming conventions

There are many varieties of Description Logics and there is an informal naming convention, roughly describing the operators allowed. The expressivity is encoded in the label for a logic using the following letters:

 $\mathcal{F}$ Functional properties. $\mathcal{E}$ Full existential qualification (Existential restrictions that have fillers other than owl:thing). $\mathcal{U}$ Concept union. $\mathcal{C}$ Complex concept negation. $\mathcal{S}$ An abbreviation for $\mathcal{ALC}$ with transitive roles. $\mathcal{H}$ Role hierarchy (subproperties - rdfs:subPropertyOf). $\mathcal{R}$ Limited complex role inclusion axioms; reflexivity and irreflexivity; role disjointness. $\mathcal{O}$ Nominals. (Enumerated classes of object value restrictions - owl:oneOf, owl:hasValue). $\mathcal{I}$ Inverse properties. $\mathcal{N}$ Cardinality restrictions (owl:Cardinality, owl:MaxCardinality). $\mathcal{Q}$ Qualified cardinality restrictions (available in OWL 1.1, cardinality restrictions that have fillers other than owl:thing). $^\mathcal{(D)}$ Use of datatype properties, data values or data types.

Some canonical DLs that do not exactly fit this convention are:

 $\mathcal{AL}$ Attributive language. This is the base language which allows: Atomic negation (negation of concepts that do not appear on the left hand side of axioms) Concept intersection Universal restrictions Limited existential quantification $\mathcal{FL}_o$ A sub-language of $\mathcal{FL^-}$, which is obtained by disallowing limited existential quantification. $\mathcal{EL}$ Intersection and full existential restriction.

As an example, $\mathcal{ALC}$ is a centrally important description logic from which comparisons with other varieties can be made. $\mathcal{ALC}$ is simply $\mathcal{AL}$ with complement of any concept allowed, not just atomic concepts.

A further example, the description logic $\mathcal{SHIQ}$ is the logic $\mathcal{ALC}$ plus extended cardinality restrictions, and transitive and inverse roles. The naming conventions aren't purely systematic so that the logic $\mathcal{ALCOIN}$ might be referred to as $\mathcal{ALCNIO}$ and abbreviations are made where possible, $\mathcal{ALC}$ is used instead of the equivalent $\mathcal{ALUE}$.

The Protégé ontology editor supports $\mathcal{SHOIN}^\mathcal{(D)}$. Three major bioinformatic terminology bases, Snomed, Galen, and GO, are expressible in $\mathcal{EL}$ (with additional role properties).

OWL 2 provides the expressiveness of $\mathcal{SROIQ}^\mathcal{(D)}$, OWL-DL is based on $\mathcal{SHOIN}^\mathcal{(D)}$, and for OWL-Lite it is $\mathcal{SHIF}^\mathcal{(D)}$.

## Description Logic Reasoners

There are some reasoners to deal with OWL and Description Logics. These are some of the most popular:

• CEL is a free (for non-commercial use) LISP-based reasoner
• Cerebra Engine was a commercial C++-based reasoner, acquired in 2006 by webMethods.
• FaCT++ is a free open-source C++-based reasoner.
• KAON2 is a free (free for non-commercial usage) Java reasoner.
• MSPASS is a free open-source C reasoner for numerous description logics.
• Pellet is a dual-licensed (AGPL and proprietary) commercial, Java-based reasoner.
• RacerPro is a commercial (free trials and research licenses are available) lisp-based reasoner.
• SimDL is a free open-source Java-based reasoner for the language ALCHQ. It also provides a similarity measurement functionality between concepts. To access this functionality a Protégé plugin can be used.

DL reasoners, such as FaCT, FaCT++, RACER, DLP and Pellet, implement the analytic tableau method. KAON2 is implemented by algorithms which reduce a SHIQ(D) knowledge base to a disjunctive datalog program.

Other tools related to Description Logics include the following:

• Protégé is a free, open source ontology editor and knowledge-base framework, which can use DL reasoners which offer a DIG interface as backends for consistency checks.
• DIG Implementation. DIG is an XML interface to DL systems, recommended by the DL Implementation Group. DIG 2.0 is an ongoing effort for a new DIG interface standard.