Ontology (information science)

From Wikipedia, the free encyclopedia

Jump to: navigation, search

In computer science and information science, an ontology is a formal representation of a set of concepts within a domain and the relationships between those concepts. It is used to reason about the properties of that domain, and may be used to define the domain.

Example of an ontology visualized: the Mason-ontology.

In theory, an ontology is a "formal, explicit specification of a shared conceptualisation".[1] An ontology provides a shared vocabulary, which can be used to model a domain — that is, the type of objects and/or concepts that exist, and their properties and relations.[2]

Ontologies are used in artificial intelligence, the Semantic Web, software engineering, biomedical informatics, library science, and information architecture as a form of knowledge representation about the world or some part of it.

Contents

[edit] Overview

The term ontology has its origin in philosophy, and has been applied in many different ways. The core meaning within computer science is a model for describing the world that consists of a set of types, properties, and relationship types. Exactly what is provided around these varies, but they are the essentials of an ontology. There is also generally an expectation that there be a close resemblance between the real world and the features of the model in an ontology.[3]

What ontology has in common in both computer science and in philosophy is the representation of entities, ideas, and events, along with their properties and relations, according to a system of categories. In both fields, one finds considerable work on problems of ontological relativity (e.g., Quine and Kripke in philosophy, Sowa and Guarino in computer science)[4] and debates concerning whether a normative ontology is viable (e.g., debates over foundationalism in philosophy, debates over the Cyc project in AI). Differences between the two are largely matters of focus. Philosophers are less concerned with establishing fixed, controlled vocabularies than are researchers in computer science, while computer scientists are less involved in discussions of first principles (such as debating whether there are such things as fixed essences, or whether entities must be ontologically more primary than processes).

[edit] History

Historically, ontologies arise out of the branch of philosophy known as metaphysics, which deals with the nature of reality – of what exists. This fundamental branch is concerned with analyzing various types or modes of existence, often with special attention to the relations between particulars and universals, between intrinsic and extrinsic properties, and between essence and existence. The traditional goal of ontological inquiry in particular is to divide the world "at its joints", to discover those fundamental categories, or kinds, into which the world’s objects naturally fall.[5]

During the second half of the 20th century, philosophers extensively debated the possible methods or approaches to building ontologies, without actually building any very elaborate ontologies themselves. By contrast, computer scientists were building some large and robust ontologies (such as WordNet and Cyc) with comparatively little debate over how they were built.

Since the mid-1970s, researchers in the field of artificial intelligence have recognized that capturing knowledge is the key to building large and powerful AI systems. AI researchers argued that they could create new ontologies as computational models that enable certain kinds of automated reasoning. In the 1980s, the AI community began to use the term ontology to refer to both a theory of a modeled world and a component of knowledge systems. Some researchers, drawing inspiration from philosophical ontologies, viewed computational ontology as a kind of applied philosophy.[6]

In the early 1990s, the widely cited Web page and paper "Toward Principles for the Design of Ontologies Used for Knowledge Sharing" by Tom Gruber[7] is credited with a deliberate definition of ontology as a technical term in computer science. Gruber introduced the term to mean a specification of a conceptualization. That is, an ontology is a description, like a formal specification of a program, of the concepts and relationships that can exist for an agent or a community of agents. This definition is consistent with the usage of ontology as set of concept definitions, but more general. And it is a different sense of the word than its use in philosophy.

Ontologies are often equated with taxonomic hierarchies of classes, class definitions, and the subsumption relation, but ontologies need not be limited to these forms. Ontologies are also not limited to conservative definitions – that is, definitions in the traditional logic sense that only introduce terminology and do not add any knowledge about the world.[8] To specify a conceptualization, one needs to state axioms that do constrain the possible interpretations for the defined terms.[9]

In the early years of the 21st century, the interdisciplinary project of cognitive science has been bringing the two circles of scholars closer together. For example, there is talk of a "computational turn in philosophy" that includes philosophers analyzing the formal ontologies of computer science (sometimes even working directly with the software), while researchers in computer science have been making more references to those philosophers who work on ontology (sometimes with direct consequences for their methods). Still, many scholars in both fields are uninvolved in this trend of cognitive science, and continue to work independently of one another, pursuing separately their different concerns.

[edit] Ontology components

Contemporary ontologies share many structural similarities, regardless of the language in which they are expressed. As mentioned above, most ontologies describe individuals (instances), classes (concepts), attributes, and relations. In this section each of these components is discussed in turn.

Common components of ontologies include:

  • Individuals: instances or objects (the basic or "ground level" objects)
  • Classes: sets, collections, concepts, types of objects, or kinds of things.[10]
  • Attributes: aspects, properties, features, characteristics, or parameters that objects (and classes) can have
  • Relations: ways in which classes and individuals can be related to one another
  • Function terms: complex structures formed from certain relations that can be used in place of an individual term in a statement
  • Restrictions: formally stated descriptions of what must be true in order for some assertion to be accepted as input
  • Rules: statements in the form of an if-then (antecedent-consequent) sentence that describe the logical inferences that can be drawn from an assertion in a particular form
  • Axioms: assertions (including rules) in a logical form that together comprise the overall theory that the ontology describes in its domain of application. This definition differs from that of "axioms" in generative grammar and formal logic. In these disciplines, axioms include only statements asserted as a priori knowledge. As used here, "axioms" also include the theory derived from axiomatic statements.
  • Events: the changing of attributes or relations

Ontologies are commonly encoded using ontology languages.

[edit] Domain ontologies and upper ontologies

A domain ontology (or domain-specific ontology) models a specific domain, or part of the world. It represents the particular meanings of terms as they apply to that domain. For example the word card has many different meanings. An ontology about the domain of poker would model the "playing card" meaning of the word, while an ontology about the domain of computer hardware would model the "punch card" and "video card" meanings.

An upper ontology (or foundation ontology) is a model of the common objects that are generally applicable across a wide range of domain ontologies. It contains a core glossary in whose terms objects in a set of domains can be described. There are several standardized upper ontologies available for use, including Dublin Core, GFO, OpenCyc/ResearchCyc, SUMO, and DOLCE. WordNet, while considered an upper ontology by some, is not an ontology: it is a unique combination of a taxonomy and a controlled vocabulary[citation needed] (see above, under Attributes).

The Gellish ontology is an example of a combination of an upper and a domain ontology.

Since domain ontologies represent concepts in very specific and often eclectic ways, they are often incompatible. As systems that rely on domain ontologies expand, they often need to merge domain ontologies into a more general representation. This presents a challenge to the ontology designer. Different ontologies in the same domain can also arise due to different perceptions of the domain based on cultural background, education, ideology, or because a different representation language was chosen.

At present, merging ontologies that are not developed from a common foundation ontology is a largely manual process and therefore time-consuming and expensive. Domain ontologies that use the same foundation ontology to provide a set of basic elements with which to specify the meanings of the domain ontology elements can be merged automatically. There are studies on generalized techniques for merging ontologies, but this area of research is still largely theoretical.

[edit] Ontology engineering

Ontology engineering (or ontology building) is a subfield of knowledge engineering that studies the methods and methodologies for building ontologies. It studies the ontology development process, the ontology life cycle, the methods and methodologies for building ontologies, and the tool suites and languages that support them.[11][12]

Ontology engineering aims to make explicit the knowledge contained within software applications, and within enterprises and business procedures for a particular domain. Ontology engineering offers a direction towards solving the interoperability problems brought about by semantic obstacles, such as the obstacles related to the definitions of business terms and software classes. Ontology engineering is a set of tasks related to the development of ontologies for a particular domain.[13]

[edit] Ontology languages

An ontology language is a formal language used to encode the ontology. There are a number of such languages for ontologies, both proprietary and standards-based:

  • Common Algebraic Specification Language is a general logic-based specification language developed within the IFIP working group 1.3 "Foundations of System Specifications" and functions as a de facto standard in the area of software specifications. It is now being applied to ontology specifications in order to provide modularity and structuring mechanisms.
  • Common logic is ISO standard 24707, a specification for a family of ontology languages that can be accurately translated into each other.
  • The Cyc project has its own ontology language called CycL, based on first-order predicate calculus with some higher-order extensions.
  • The Gellish language includes rules for its own extension and thus integrates an ontology with an ontology language.
  • IDEF5 is a software engineering method to develop and maintain usable, accurate, domain ontologies.
  • KIF is a syntax for first-order logic that is based on S-expressions.
  • Rule Interchange Format (RIF) and F-Logic combine ontologies and rules.
  • OWL is a language for making ontological statements, developed as a follow-on from RDF and RDFS, as well as earlier ontology language projects including OIL, DAML and DAML+OIL. OWL is intended to be used over the World Wide Web, and all its elements (classes, properties and individuals) are defined as RDF resources, and identified by URIs.

[edit] Examples of published ontologies

  • Basic Formal Ontology,[14] a formal upper ontology designed to support scientific research
  • BioPAX,[15] an ontology for the exchange and interoperability of biological pathway (cellular processes) data
  • CCO[16] The Cell-Cycle Ontology is an application ontology that represents the cell cycle
  • CContology,[17] an e-business ontology, to support online customer compliant management.
  • CIDOC Conceptual Reference Model, an ontology for cultural heritage[18]
  • COSMO,[19] a Foundation Ontology (current version in OWL) that is designed to contain representations of all of the primitive concepts needed to logically specify the meanings of any domain entity. It is intended to serve as a basic ontology that can be used to translate among the representations in other ontologies or databases. It started as a merger of the basic elements of the OpenCyc and SUMO ontologies, and has been supplemented with other ontology elements (types, relations) so as to include representations of all of the words in the Longman dictionary defining vocabulary.
  • Cyc a large Foundation Ontology for formal representation of the universe of discourse.
  • Disease Ontology[20] designed to facilitate the mapping of diseases and associated conditions to particular medical codes.
  • DOLCE, a Descriptive Ontology for Linguistic and Cognitive Engineering[21]
  • Dublin Core, a simple ontology for documents and publishing.
  • Foundational, Core and Linguistic Ontologies[22]
  • Foundational Model of Anatomy[23] for human anatomy
  • Gene Ontology for genomics
  • GUM (Generalized Upper Model),[24] a linguistically-motivated ontology for mediating between clients systems and natural language technology
  • Gellish English dictionary, an ontology that includes a dictionary and taxonomy that includes an upper ontology and a lower ontology that focusses on industrial and business applications in engineering, technology and procurement. See also Gellish as Open Source project on SourceForge.
  • GOLD[25] General Ontology for Linguistic Description
  • IDEAS Group A formal ontology for enterprise architecture being developed by the Australian, Canadian, UK and U.S. Defence Depts.[26]
  • Linkbase[27] A formal representation of the biomedical domain, founded upon Basic Formal Ontology.
  • LPL Lawson Pattern Language
  • OBO Foundry: a suite of interoperable reference ontologies in biomedicine.
  • Ontology for Biomedical Investigations is an open access, integrated ontology for the description of biological and clinical investigations.
  • Plant Ontology[28] for plant structures and growth/development stages, etc.
  • POPE Purdue Ontology for Pharmaceutical Engineering
  • PRO,[29] the Protein Ontology of the Protein Information Resource, Georgetown University.
  • Program abstraction taxonomy program abstraction taxonomy
  • Protein Ontology[30] for proteomics
  • SBO, the Systems Biology Ontology, for computational models in biology
  • Suggested Upper Merged Ontology, which is a formal upper ontology
  • SWEET[31] Semantic Web for Earth and Environmental Terminology
  • ThoughtTreasure ontology
  • TIME-ITEM Topics for Indexing Medical Education
  • WordNet Lexical reference system

[edit] Ontology libraries

The development of ontologies for the Web has led to the apparition of services providing lists or directories of ontologies with search facility. Such directories have been called ontology libraries.

The following are static libraries of human-selected ontologies.

  • DAML Ontology Library[32] maintains a legacy of ontologies in DAML.
  • Protege Ontology Library[33] contains a set of owl, Frame-based and other format ontologies.
  • SchemaWeb[34] is a directory of RDF schemata expressed in RDFS, OWL and DAML+OIL.

The following are both directories and search engines. They include crawlers searching the Web for well-formed ontologies.

  • OBO Foundry / Bioportal[35] is a suite of interoperable reference ontologies in biology and biomedicine.
  • OntoSelect[36] Ontology Library offers similar services for RDF/S, DAML and OWL ontologies.
  • Ontaria[37] is a "searchable and browsable directory of semantic web data", with a focus on RDF vocabularies with OWL ontologies.
  • Swoogle is a directory and search engine for all RDF resources available on the Web, including ontologies.

[edit] See also

Related philosophical concepts

[edit] References

  1. ^ Tom Gruber (1993). "A translation approach to portable ontology specifications". In: Knowledge Acquisition. 5: 199-199.
  2. ^ Fredrik Arvidsson and Annika Flycht-Eriksson. Ontologies I. Retrieved 26 Nov 2008.
  3. ^ Lars Marius Garshol (2004). Metadata? Thesauri? Taxonomies? Topic Maps! Making sense of it all on www.ontopia.net. Retrieved 13 October 2008.
  4. ^ (Top-level ontological categories. By: Sowa, John F. In International Journal of Human-Computer Studies, v. 43 (November/December 1995) p. 669-85.),
  5. ^ Perakath C. Benjamin et al. (1994). IDEF5 Method Report. Knowledge Based Systems, Inc.
  6. ^ Tom Gruber (2008). "Ontology". To appear in the Encyclopedia of Database Systems, Ling Liu and M. Tamer Özsu (Eds.), Springer-Verlag, 2008.
  7. ^ Gruber, T. R., "Toward Principles for the Design of Ontologies Used for Knowledge Sharing". In: International Journal Human-Computer Studies, 43(5-6):907-928, 1995
  8. ^ Enderton, H. B. (1972). A Mathematical Introduction to Logic. San Diego, CA: Academic Press.
  9. ^ Gruber, T. R. (1993). "A translation approach to portable ontologies". In: Knowledge Acquisition. 5(2):199-220, 1993.
  10. ^ See Class (set theory), Class (computer science), and Class (philosophy), each of which is relevant but not identical to the notion of a "class" here.
  11. ^ Asunción Gómez-Pérez, Mariano Fernández-López, Oscar Corcho (2004). Ontological Engineering: With Examples from the Areas of Knowledge Management, E-commerce and the Semantic Web. Springer, 2004.
  12. ^ A. De Nicola, M. Missikoff, R. Navigli (2009). "A Software Engineering Approach to Ontology Building". Information Systems, 34(2), Elsevier, 2009, pp. 258-275.
  13. ^ Line Pouchard, Nenad Ivezic and Craig Schlenoff (2000). "Ontology Engineering for Distributed Collaboration in Manufacturing", In Proceedings of the AIS2000 conference, March 2000.
  14. ^ Basic Formal Ontology (BFO)
  15. ^ BioPAX http://biopax.org
  16. ^ CCO
  17. ^ CContology
  18. ^ [information.http://cidoc.ics.forth.gr/ CIDOC Conceptual Reference Model]
  19. ^ COSMO
  20. ^ Disease Ontology
  21. ^ DOLCE
  22. ^ Foundational, Core and Linguistic Ontologies
  23. ^ http://sig.biostr.washington.edu/projects/fm/AboutFM.html Foundational Model of Anatomy]
  24. ^ Generalized Upper Model
  25. ^ GOLD
  26. ^ The IDEAS Group Website
  27. ^ Linkbase
  28. ^ Plant Ontology
  29. ^ PRO
  30. ^ Protein Ontology
  31. ^ SWEET
  32. ^ DAML Ontology Library
  33. ^ Protege Ontology Library
  34. ^ SchemaWeb
  35. ^ OBO Foundry / Bioportal
  36. ^ OntoSelect
  37. ^ Ontaria

[edit] Further reading

[edit] External links

Personal tools